Job Description
Please read the note before you apply.
Job Title:
DevOps Engineer / Site Reliability Engineer (SRE) (W2)
Company:
Baanyan Software Services Inc
Experience:
7+ Years
Employment Type:
Full-Time, W2 (No C2C Resumes Please)
Location:
Open to Relocation (Multiple Client Locations Across the US)
Visa:
Graduates with a Master's Degree / Bachelor's Degree
Sponsorship:
Yes
Responsibilities:
- Design, implement, and maintain scalable, highly available, and secure cloud infrastructure.
- Automate infrastructure provisioning, deployment, monitoring, and incident response processes.
- Build and manage CI/CD pipelines for application deployments across multiple environments.
- Manage containerized workloads using Docker and Kubernetes in cloud-native environments.
- Collaborate with development, QA, and security teams to improve software delivery and operational excellence.
- Monitor system performance, troubleshoot production issues, and ensure high availability and reliability.
- Implement Infrastructure as Code (IaC) using Terraform, CloudFormation, or similar tools.
- Manage logging, monitoring, alerting, and observability platforms.
- Ensure security, compliance, and best practices across cloud and infrastructure environments.
- Participate in on-call rotations and incident management activities.
- Optimize cloud infrastructure costs and improve system performance.
Required Skills:
- 7+ years of hands-on experience in DevOps, Site Reliability Engineering (SRE), or Cloud Engineering.
- Strong experience with AWS, Azure, or Google Cloud Platform (GCP).
- Expertise in Kubernetes and Docker containerization technologies.
- Hands-on experience with Infrastructure as Code and configuration management tools (Terraform, CloudFormation, Ansible).
- Strong experience building and maintaining CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI/CD, or Azure DevOps.
- Experience with Linux/Unix administration and scripting (Bash, Python).
- Strong understanding of networking concepts, load balancing, DNS, SSL/TLS, VPNs, and security best practices.
- Experience with monitoring and observability tools such as Prometheus, Grafana, Datadog, ELK Stack, Splunk, New Relic, or Dynatrace.
- Experience with source control using Git and platforms such as GitHub or GitLab.
- Strong troubleshooting, incident management, and root cause analysis skills.
- Experience supporting highly available, distributed production systems.
- Familiarity with Agile and DevOps methodologies.
Preferred Skills:
- Experience with service mesh technologies such as Istio or Linkerd.
- Experience with Kubernetes Operators and Helm Charts.
- Knowledge of AWS EKS, ECS, Lambda, EC2, RDS, S3, CloudWatch, IAM, and VPC.
- Experience implementing DevSecOps practices and security automation.
- Familiarity with SRE principles, including SLI, SLO, SLA, error budgets, and reliability engineering.
- Experience with Kafka, RabbitMQ, or other messaging platforms.
- Hands-on experience with disaster recovery, backup strategies, and business continuity planning.
- Certifications such as AWS Certified DevOps Engineer, AWS Solutions Architect, CKA, CKAD, or Terraform Associate.
- Experience using AI-powered development and operations tools such as GitHub Copilot, Windsurf, Cursor AI, Amazon Q, ChatGPT, and AI-driven observability platforms.