2023-10-21

Senior Site Reliability Engineer

Description

Purpose

UneeQ is the global standard for digital humans, enabling creators and brands to bring impactful interaction into our digital world. We are seeking a Senior Site Reliability Engineer to work as part of the SRE team to ensure our platform is scalable, resilient, and reliable by:

Providing an infrastructure platform to the development team that handles scaling, failures, and monitoring.
Managing the security and availability of that infrastructure platform.
Recommending infrastructure solutions for business and customer requirements in a way that balances business needs with engineering and architecture best practices.
Collaborating across multiple internal teams to design and develop infrastructure solutions supporting our business and technical strategies.
Facilitating change management via automated CI/CD pipelines.
Providing tooling to the development team to create a build and release pipeline that adheres to change management best practices.
Maintaining incident management processes, including on-call rosters and post-mortems.
Ensuring all our applications and services have measurable SLOs, and developing observability tools, frameworks, and processes.
Monitoring infrastructure costs against an agreed budget and working with technical and business stakeholders to align expectations and address discrepancies.
Working with the development team to expose metrics and open up monitoring opportunities.

This role is New Zealand-based and reports to the Lead SRE. UneeQ is a remote-first workplace, meaning you will mostly be working from home.

What you’re trying to achieve

Increase development team efficiency by providing infrastructure and tooling that makes their lives easier.
Ensure that UneeQ meets or exceeds availability, performance, and security SLAs.
Ensure that our processes (security, change management, incident management, etc.) adhere to best practices.
Ensure we can report to stakeholders about our system performance and availability.
Use vendors to save time and effort while keeping track of the infrastructure spend.
Maintain a culture and habit of continuous improvement.

How we’ll measure success

The primary qualitative metric is the perception of adding value to the rest of the team, which is assessed regularly via peer feedback. Performance against quantitative SLAs includes:

Availability
Average time to respond
Average time to repair
Spend is within budget, etc.

General competencies that will help you

Empathy for peers and stakeholders
Attention to detail
Systems thinking
Process maturity building
Hunger for learning
Desire to be of service to others
Can-do attitude
Healthy skepticism

Requirements

Specific capabilities that will be necessary

Knowledge of current industry trends, tooling, and best practices for infrastructure management, application deployment, scalability, and observability
Knowledge of change management, incident management, and continual improvement best practices
Experience creating and maintaining complex configuration-as-code infrastructure projects
Understanding of foundational programming concepts and patterns sufficient to automate routine infrastructure tasks
Ability to use and develop models to support capacity planning and budget management
Experience with, and good working knowledge of, Terraform
Experience designing and implementing cloud-based solutions using AWS or Azure
Solid understanding of container orchestration and experience using Kubernetes or a similar platform
Experience designing, implementing, and maintaining solutions in some (not all) of the following fields:
- HTTP/WebRTC applications infrastructure.
- Infrastructure for audio and video streaming.
- Database and message broker (MySQL, PostgreSQL, Redis, RabbitMQ, Kafka) management.
- Observability using Prometheus and Grafana or a similar technology.
- CI/CD pipelines, preferably for Go/C++/JavaScript.
- Bash or Python scripting for Linux systems.
- User provisioning.
- Permissions management.

Benefits

Competitive compensation
100% of employee health insurance premiums covered (including vision and dental)
Annual learning allowance to support your continued development and growth

Please let UneeQ know that you found this role at devopsprojectshq.com as a way to support us, so we can keep providing you with awesome DevOps jobs.
