Swarmia was founded to give every team visibility into its own ways of working, a culture of small but continuous improvement, and great tooling that helps improvements stick. Companies like Docker, Miro, and Webflow use Swarmia’s SaaS product.
We are looking for a Senior Platform Engineer with deep infrastructure expertise to help us build and operate production systems that our customers rely on daily.
Our infrastructure mainly runs on Google Cloud and is managed with Terraform. Since starting the company five years ago, we’ve reached 99.9% uptime every year. Our code is continuously deployed to production, with high automated test coverage. While we’re very experienced with building production systems that scale, you will be the first person with this title who is fully dedicated to Platform & Infrastructure work.
Assist product teams in running their Kubernetes workloads and automating manual steps
Design and implement infrastructure changes using Terraform
Collaborate with product teams to optimize application performance and resource utilization
Look near: Notice our message queue getting backed up? Dive in, analyze the bottleneck, and implement a solution before it affects our users
Look far: See our cloud costs trending up? Analyze usage patterns, identify optimization opportunities, and work with the team to implement efficient solutions
Configure VPCs and securely segment production networks
Write documentation and playbooks with the team - we prefer collaborative problem-solving over working in silos
Set up and fine-tune monitoring and SLO alerting in our observability stack to catch issues before they impact customers
Plan when to scale up our PostgreSQL servers or add more nodes to our Kubernetes cluster
Go spelunking in Google Security Command Center, review security posture, and implement improvements to keep our platform secure and compliant
Design and implement automated disaster recovery procedures, then work with the team to practice them regularly
Automate away manual operational tasks - if you find yourself doing something more than twice, it's probably time to automate it
Work with the team to implement and maintain compliance requirements while keeping our development workflow smooth
Jump into a customer support chat when there's a problem with a customer's data sync
Optimize PostgreSQL performance by identifying table and index bloat and tuning pg_repack runs
You don't need to know all of the tools beforehand; we are happy to show you the ropes!
Hosting: Google Cloud Platform and Google Kubernetes Engine
Messaging systems: Google Cloud Pub/Sub
Infrastructure as code: Terraform
Observability: Prometheus, Grafana, GCP logging/tracing/monitoring
Backends: TypeScript/Node.js
Databases: PostgreSQL (Google Cloud SQL), Redis
Data warehouse: BigQuery, Dataform
CI/CD: GitHub Actions
A highly experienced and motivated team
A domain that is very relevant (and hopefully interesting!) to any engineer
70-90k€ annual salary plus a meaningful amount of equity
Paid annual vacation, with ten extra days for new employees
Flexible model of work - pick your own balance of remote/office
Great work/life balance - we're a startup, but we don’t crunch or work at an unsustainable pace (many of us have kids and other responsibilities beyond work)