- Senior
- Office in Chennai
Job Description:
Key Responsibilities
Strategic Planning & Architecture
- Develop future-ready AI strategies and technical roadmaps aligned with business goals.
- Identify and evaluate opportunities for AI to create value, leveraging emerging trends (e.g., neural networks, quantum computing, hybrid models).
- Architect solutions incorporating AI/ML (including hybrid models), big data, and integration with IoT and edge computing to support real-time and low-latency business needs.
Solution Design, Integration & Delivery
- Lead the architectural design, development, and deployment of reliable, scalable AI/ML systems across Azure, AWS, and hybrid clouds.
- Ensure seamless integration of AI with broader enterprise applications, cloud infrastructures, MCP servers, databases, and IoT/edge devices.
- Select and integrate appropriate tools, platforms, and industry-standard frameworks (e.g., TensorFlow, PyTorch, Hugging Face, Hadoop, Spark, Kafka).
- Expose and secure AI/ML models via REST/gRPC interfaces deployed in microservices architectures; manage API gateways, load balancing, and API security.
Project & Team Leadership
- Lead AI project operations: scope, plan, and drive projects to completion on time and within budget.
- Mentor and develop AI technical teams; facilitate knowledge transfer on best practices, new technologies, and responsible AI use.
- Foster a culture of innovation, transparency, collaboration, accountability, and ethical AI.
- Champion responsible AI: Promote fairness, transparency, and accountability in AI development and deployment.
Evaluation, Optimization, and Governance
- Oversee performance monitoring, optimization, and retraining of production AI systems.
- Implement and maintain robust MLOps pipelines (CI/CD, monitoring, automated retraining) and model management (versioning, rollback, and decommissioning).
- Leverage model registries such as MLflow, SageMaker Model Registry, or Azure ML Registry.
- Apply advanced observability and monitoring (Prometheus, Grafana, OpenTelemetry, DataDog, ELK stack).
- Ensure compliance with security, privacy, regulatory (e.g., HIPAA, SOC2), and ethical AI standards.
- Apply privacy-preserving techniques (differential privacy, federated learning, data anonymization).
- Apply model validation, A/B testing, canary deployments, and adversarial testing for AI reliability.
Stakeholder Engagement & Communication
- Collaborate with business stakeholders, data scientists, and engineers to translate organizational/business needs into actionable AI solutions.
- Clearly articulate AI system benefits, limitations, risks, and future possibilities to technical and non-technical audiences.
Future-Oriented Focus
- Adopt Cutting-Edge AI: Evaluate and leverage new developments (neural networks, quantum computing, hybrid models).
- Drive Hybrid AI Models: Architect solutions combining machine learning, neural nets, and rule-based methods.
- Integrate AI with IoT/Edge: Deploy AI for IoT and edge scenarios (e.g., NVIDIA Jetson, AWS Greengrass, Azure Percept) for real-time, decentralized intelligence.
- ML/LLM Ops: Apply best practices in LLMOps, vector databases (Pinecone, ChromaDB, Weaviate), and prompt engineering for LLM-based solutions.
- Champion Responsible AI: Promote fairness, transparency, and ethical AI across all projects.
Qualifications Required
- Bachelor’s in Computer Science, Engineering, or related field (Master’s preferred)
- 8+ years in technical/data/solution architecture roles, with 4+ years focused on AI/ML systems at enterprise scale
- Demonstrated expertise in Azure, AWS, and AI/ML platforms (TensorFlow, PyTorch, Hugging Face, etc.)
- Advanced hands-on experience with MCP server setup, optimization, and troubleshooting
- Data engineering proficiency (big data, ETL/ELT, Hadoop, Spark, Kafka, etc.)
- Expertise in AI model deployment and serving/inference frameworks (Triton, TensorRT, vLLM, TGI, etc.)
- Programming proficiency (Python, R, Java)
- Experience with DevOps, MLOps, and CI/CD for AI projects; Infrastructure-as-Code skills (Terraform, CloudFormation, ARM)
- Advanced skills in containerization/orchestration (Docker, Kubernetes) and container security
- Practical knowledge of API/microservice architectures and API security best practices
- Experience integrating AI with IoT and edge architectures
- Strong project management, team leadership, and stakeholder communication skills
- Experience with model lifecycle management (development, versioning, monitoring, rollback, decommissioning)
- Experience with monitoring and observability tools for AI/ML workloads (Prometheus, Grafana, DataDog, OpenTelemetry)
- Familiarity with privacy-preserving ML techniques (differential privacy, federated learning)
- Proficiency in model testing, validation, and adversarial robustness
- Strong background in cloud performance & cost optimization and multi-cloud resiliency
Preferred
- Advanced certifications (Azure, AWS, AI/ML)
- Experience with regulatory frameworks and data-interchange standards (HIPAA, SOC2, FHIR, HL7, EDI)
- AI/LLM-specific certifications
- Familiarity with edge AI deployment accelerators (NVIDIA Jetson, AWS Greengrass, Azure Percept)
- Experience with hybrid and multi-cloud AI architectures
Kaleris is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.