Replicated is the modern way to ship and monitor multi-prem software. Replicated helps software vendors quickly and securely deploy their applications to any customer using a single architecture. The Replicated platform provides all of the tools needed to operationalize and scale the distribution of Kubernetes applications into any enterprise environment.
Our customers include HashiCorp, Puppet, SmartBear, Jama, Swimlane, Tripwire, Acrolinx, Knime, and many other fast-growing enterprise software vendors. We're a Series C funded company (over $80m raised) with long-term investors and a long-term focus.
Replicated is committed to cultivating an efficient, respectful workplace. We know that innovation thrives on teams where diverse points of view come together to solve hard problems in ways that are just now possible. As such, we explicitly seek people who bring diverse life experiences, diverse educational backgrounds, diverse cultures, and diverse work experiences.
We are fully remote and plan to stay that way! We're open to any state in the US. In addition, for some roles, we're open to candidates in Canada, the UK, Israel, Australia, and New Zealand.
Replicated is hiring a Senior Site Reliability Engineer to join our growing team! The Site Reliability Engineering team works to ensure that the company's services maintain the high level of reliability needed by our engineers and vendors. You’ll monitor and troubleshoot systems (Containers, Kubernetes, Cloudflare), use tooling to increase efficiency (Terraform, Datadog), and collaborate with our internal development teams to enhance the quality of our products (GitHub, Flux). We’re proud to champion a blameless culture focused on continuous improvement, and we highly value interpersonal skills as critical in handling incidents, effectively communicating technical issues, and managing conflicts in a productive way.
The Vision for Success:
Manage the Replicated software infrastructure and the Kubernetes clusters it runs on
Troubleshoot failures in tooling and infrastructure and build sustainable fixes
Collaborate with development teams to establish internal SLOs and SLIs and to improve the observability of our systems
Write code and help support new services, products, and features
Grow our SRE practice at Replicated to support a new level of maturity
Participate in on-call responsibility and lead incident response by remediating or triaging issues
The Value You Will Bring:
3+ years of experience as a developer, site reliability engineer, platform engineer, or DevOps engineer
Strong command of at least one programming language
2+ years of experience with Kubernetes, Docker, or other container technology
Deep experience with at least one cloud provider (AWS, GCP, Azure)
Experience with EKS, Terraform, Datadog, and Cloudflare (important)
A growth mindset, the curiosity to learn, and commitment to a culture of psychological safety
Excellent collaboration and technical communication skills
Bonus Points:
Strong, well-reasoned opinions on technology
Development experience in Go, TypeScript, or Bash
Solid understanding of CI/CD and the modern SDLC
Emphasis on the human components that make for a great SRE practice
In your first 30 days
Learn as much as you can about the company, team, and product
Complete hands-on training with the product, complete an onboarding checklist, and meet with team members from across the company
Improve the onboarding process as you notice rough edges, bringing your fresh perspectives to identify areas where we can do better
Spend time with the team understanding our work and current initiatives
Pair with members of the team to grow your knowledge of our infrastructure and ask questions to understand how we got to where we are
In your first 60 days
Contribute individual work to existing initiatives to deepen your understanding of Replicated’s infrastructure
Shadow engineers in the on-call rotation, learning the process and making suggestions for improvements (both technical and procedural)
Examine the systems you are learning, seeking to understand the “why” and challenging the answers as appropriate
Continue to grow your understanding of Replicated: how our products are developed and how our services connect and interact, while building relationships with individuals
In your first 90 days
Take a leadership role for one of the team’s priorities, coordinating within and beyond the infrastructure team to accomplish the goal as you see fit
Use your expertise to improve the reliability of Replicated services in collaboration with development and security teams
Join the on-call rotation as a fully fledged member, leading incident command and learning reviews as needed
Continue to learn and grow. Replicated is committed to ongoing individual growth, and there will always be opportunities that require it
Pay transparency
At Replicated, we value our teammates as individuals who are stronger together. We offer a robust pay and benefits package that rewards employees for their contributions to our success, supports their well-being, and helps all of us create a great remote work environment.
In the US, the salary range for this role is as follows:
Site Reliability Engineer II: $140,000 - $160,000
Sr. Site Reliability Engineer I: $155,000 - $180,000
Sr. Site Reliability Engineer II: $175,000 - $195,000
This is dependent on several factors, including level, qualifications, and experience. We also offer stock options, a strong health insurance package, a unique home office allowance, and a professional development budget. An overview is on our careers page here: https://www.replicated.com/careers/
Not sure you meet 100% of our qualifications? Please apply anyway!
We invest in our team and love candidates who are eager to learn and grow. We have a fantastic team of highly collaborative individuals who enjoy learning, growing, and mentoring others.
OUR CORE VALUES
Care Deeply: Care deeply about the work that you do. Because of that, you are constantly learning and willing to go out on a limb, challenge assumptions, go back to first principles, etc.
Longterm: Treat every interaction as part of a 30-year relationship; you’ll see everyone down the road again as customers, partners, coworkers, etc.
Curious: We're always learning, and we approach everyone and every problem with curiosity. When needed, we challenge assumptions and go back to first principles.
BENEFITS
We offer strong benefits to help you stay healthy and productive. For the US, our benefits are listed below:
Health/Dental/Vision
Life/AD&D
LTD/STD
FSA
401K
Stock options
Partner perk programs
Generous time off: we expect you to take a minimum of 3 weeks off per year
Laptop and the accessories you need to get set up
Generous home office setup allowance or co-working space allowance - up to $10,000 per year!
Curiosity Budget to help you keep learning and growing!
Replicated is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We encourage applicants of all backgrounds and we work to make sure that all team members have an equal opportunity to succeed.
We do not accept unsolicited assistance from any headhunters, recruitment firms or any other third party for any of our job openings. Any unsolicited resumes sent from anyone other than the candidate, in any format, to any person at Replicated, will be considered Replicated property. Replicated will NOT pay a fee for any placement resulting from the receipt of an unsolicited resume.