09th April, 2026
Our client, a leading network automation solutions company, is looking for a remote Site Reliability Engineer (Cloud Engineering) to join their team!
This is a remote, full-time contract position.
Salary based on experience: $60-$70/HR
IMPACT OF THIS TECHNICAL ROLE:
As a Site Reliability Engineer (SRE) on the Nautobot Cloud Engineering team, you will help deliver and maintain our managed Nautobot SaaS offering. Your primary focus will be operating, supporting, and evolving customer environments in AWS—especially EKS, EC2, and related services—while ensuring uptime, performance, and security. You will also handle occasional escalations for legacy customers running on AKS or on-premises deployments.
This role combines operational excellence with a mindset for continuous improvement. You will work across infrastructure, CI/CD pipelines, and observability tooling, applying DevOps best practices to deliver a reliable, scalable, and secure platform for our customers.
A DAY IN THE LIFE:
- Operate and support Nautobot Cloud deployments in AWS, including EKS, EC2, RDS, and associated services.
- Use Jira to manage operational and project-related tasks, track incidents, and document changes.
- Support the resolution of escalated issues in other Kubernetes environments, including AKS and on-premises deployments, as needed.
- Deploy and update Nautobot instances using Helm charts, Kubernetes manifests, and automation workflows.
- Automate improvements to CI/CD pipelines (GitHub Actions, Terraform, Ansible) for provisioning, upgrades, and configuration management.
- Maintain observability tools (Prometheus, Loki, Grafana) to ensure accurate monitoring, alerting, and logging.
- Troubleshoot application and infrastructure issues across containerized environments.
- Collaborate with engineers across Cloud Operations, Nautobot Core, and Nautobot Apps teams to deliver cross-functional solutions.
- Contribute to documentation for operational runbooks, troubleshooting guides, and architecture diagrams.
- Participate in Agile ceremonies, including standups and retrospectives.
WHAT YOU BRING:
- Passion for reliability, customer success, and operational excellence.
- Ability to troubleshoot complex distributed systems and quickly identify root causes.
- Strong communication skills—able to clearly convey technical concepts to both peers and customers.
- A proactive mindset, looking for opportunities to improve processes and prevent issues before they occur.
- Flexibility to adapt to changing priorities and technologies.
WHAT YOU HAVE:
- 3-5 years of experience applying DevOps or SRE practices to production systems.
- 2+ years of experience operating workloads in AWS, with a focus on EKS, EC2, IAM, and networking.
- 2+ years working with Kubernetes (preferably in production) and Helm.
- Experience with IaC tools such as Terraform and configuration management tools like Ansible.
- Familiarity with CI/CD pipelines (GitHub Actions, Jenkins, CircleCI, etc.).
- Proficiency in scripting languages such as Python or Bash.
- Comfortable working in Linux-based environments.
- Familiarity with monitoring, logging, and alerting solutions (Prometheus, Loki, Grafana, Datadog, ELK).
- Skilled in using Jira to manage operational tasks, incident response, sprint planning, and project tracking. Experience with similar ticketing systems is also a plus.
- Analytical and troubleshooting skills, including using k9s for real-time Kubernetes management and Terraform for diagnosing and resolving Infrastructure-as-Code deployment issues. Prior experience with these tools is a plus.
- Networking fundamentals (equivalent to CCNA-level understanding) are a plus.
LOCATION OF THIS ROLE:
- Anywhere in North America (Eastern time zone preferred)
Applicants must be authorized to work in the U.S.
We are an equal-opportunity employer. We do not discriminate in hiring or employment against any individual based on race, color, gender, national origin, ancestry, religion, physical or mental disability, age, veteran status, sexual orientation, gender identity or expression, marital status, pregnancy, citizenship, or any other factor protected by anti-discrimination laws.