We are seeking a DevOps Technical Lead with a strong background in infrastructure automation and cloud architecture, and a keen interest in Generative AI technologies. The ideal candidate will lead the development of a GenAI-powered Infrastructure Agent capable of intelligent provisioning, configuration, observability, and self-healing.
 
Key Responsibilities:
  • Lead architecture & design of an intelligent Infra Agent leveraging GenAI capabilities.
  • Integrate LLMs and GenAI frameworks and APIs (e.g., LangChain, OpenAI, Hugging Face) to enhance DevOps workflows.
  • Build solutions that automate infrastructure provisioning, CI/CD, incident remediation, and drift detection.
  • Develop reusable components and frameworks using IaC (Terraform, Pulumi, CloudFormation) and configuration management tools (Ansible, Chef, etc.).
  • Partner with AI/ML engineers and SREs to design intelligent infrastructure decision-making logic.
  • Implement secure and scalable infrastructure on cloud platforms (AWS, Azure, GCP).
  • Continuously improve agent performance through feedback loops, telemetry, and fine-tuning of models.
  • Drive DevSecOps best practices, compliance, and observability.
  • Mentor DevOps engineers and collaborate with cross-functional teams (AI/ML, Platform, and Product).
 
Required Qualifications:
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 8+ years of experience in DevOps, SRE, or Infrastructure Engineering.
  • Proven experience leading infrastructure automation projects and technical teams.
  • Expertise with one or more cloud platforms: AWS, Azure, GCP.
  • Deep knowledge of tools such as Terraform, Kubernetes, Helm, Docker, and Jenkins, and of GitOps practices.
  • Hands-on experience integrating or building with LLMs / GenAI APIs (e.g., OpenAI, Anthropic, Cohere).
  • Familiarity with LangChain, AutoGen, or custom agent frameworks.
  • Experience with programming/scripting languages: Python, Go, or Bash.
  • Understanding of cloud security, policy as code, and monitoring tools (Prometheus, Grafana, Datadog).
 
Preferred Qualifications:
  • Experience building or fine-tuning LLM-based agents for operations or automation tasks.
  • Contributions to open-source GenAI or DevOps projects.
  • Understanding of MLOps pipelines and AI infrastructure.
  • Certifications in DevOps, cloud, or AI technologies (e.g., AWS DevOps Engineer, Azure AI Engineer).