We are Kaleyra! Our technology makes it possible to safely and securely deliver billions of messages monthly in more than 190 countries. Our APIs in messaging, voice, and video power communications for global brands such as Hyundai, Uber, Flipkart and AirAsia.
The European Site Reliability Engineering (SRE) team based in Milan is on a mission to make Kaleyra synonymous with SRE excellence. We are building an SRE team and practice that rivals big tech. We are looking for enthusiastic SRE leaders with a bias for action and ownership to join our quest!
Who are you?
You either come from a software development background or operations/infrastructure (we are fine with both, as long as you are familiar with both domains and willing to integrate).
You are open and candid about discussing solutions, problems and improvements within your team and others in the organization.
You have a passion for SRE principles and their adoption, to deliver the promised reliability, performance and security of applications, services and systems.
You are a fan of the DevOps approach, promoting loosely coupled, heavily automated, constantly monitored distributed systems, and you always plan for failure and never take anything for granted.
You are located in the European Union. We are based in Milan, and our teams are distributed across the world.
What you'll be doing
Be part of the central European SRE/DevOps team, reporting to its lead.
Improve the reliability, observability and incident management processes of our global customer platform in the European Union (currently on AWS using Kubernetes).
Help the team provide guidelines and blueprints for the DevOps lifecycle.
Work and collaborate with the Cloud Infrastructure team to implement best practices, scale out our services and automate everything as much as possible.
In consultation with Information Security, implement best practices and compliance.
What is expected of you (your experience)
Understand how a distributed application works
Have experience with at least one public cloud (AWS preferred)
Are familiar with containerization technologies like Docker and Kubernetes
Are able to create / refactor deployment pipelines (we use Jenkins)
Have strong coding and scripting experience and are interested in improving your programming skills (Bash, Python, Go).
Are willing to join a 24x7 on-call rotation with the team (there is no on-call right now, but it could be introduced in the future).
Requirements
What we are ideally looking for in the next member of the team:
Humble and curious attitude: you are keen to understand how things work and improve them by continuous iteration.
Clear communicator: confident, self-sufficient, focused and determined, yet you value, listen to and weigh your teammates' opinions.
Strong problem-solving, troubleshooting and system analysis mindset.
Know when to escalate or ask for help vs having to own and resolve the issue in isolation.
Knowledge of technology architecture for large-scale distributed systems is a plus.
The mandatory list of skills and tools:
Linux (Debian/Ubuntu): our workstations and servers run Linux.
Cloud: AWS (preferred), OCI, GCP or Azure are a plus.
Container management: Docker and Kubernetes, as well as managed services such as ECS, EKS, GKE and AKS.
Monitoring and logging tools like New Relic, Datadog, Prometheus and the ELK (Elasticsearch) stack.
Advanced working experience with Git source repositories.
Programming languages: at least one of shell/Bash, Python, Groovy.
Automation tooling, for both infrastructure and configuration, like Terraform, Ansible, Packer, etc.
Continuous integration and delivery ecosystem and tools: Jenkins (preferred) or GitLab/CircleCI, ArgoCD/Flux, etc.
Things we care about
We value attitude and your thinking process over the list of skills and tools in your belt.
We are on a path of standardization, so we are looking for individuals who are keen to dig into complex applications and simplify them.
Done is better than perfect: we understand the value of getting things done within agreed timelines and then iterating on improvements.
Simple is better than magical: you will help create easy-to-use solutions that are more likely to be adopted by stakeholders, who can understand their purpose and collaborate to improve them.
Visibility is key: we value data-driven decisions and business-driven metrics. You will be key to implementing observability tooling used across the engineering function.
Automation everywhere it is reasonable: it's OK to have manual, documented tasks if there is a VERY good reason.
Everything should be reproducible and repeatable: we like boring solutions because they always work.
Infrastructure as code: configuration files and infrastructure deployments should be version controlled and should drive automated deployments.
Security is part of the process: it's a journey and it's definitely an engineering problem.
Well written documentation: if it's not documented it doesn't exist.
English fluency is required (both written and spoken); Italian fluency is a plus.
We offer flexible working with an average of 3 days at the office and 2 days working from home.