Design, deploy, and manage our Kubernetes platform to support scalable and reliable application deployments. Monitor and maintain the platform's health, performance, and security
Oversee the deployment of our Software-as-a-Service applications on the Kubernetes platform. Implement best practices for application scalability, high availability, and disaster recovery
Implement robust monitoring, alerting, and logging systems to proactively identify and resolve potential issues. Ensure high system availability and quick incident response times
Continuously optimize the Kubernetes infrastructure and SaaS applications to achieve maximum performance and efficiency. Conduct performance testing and tuning to meet or exceed service level objectives
Participate in an on-call rotation to respond to incidents promptly and effectively
Conduct thorough post-incident reviews to identify root causes and implement preventive measures
Develop and maintain automation tools and scripts to streamline processes and improve the efficiency of operational tasks
Implement security best practices for Kubernetes and SaaS applications
Collaborate with the security team to ensure compliance with industry standards and regulations
Work closely with cross-functional teams, including development, infrastructure, and product management, to provide expertise and support throughout the software development lifecycle
Identify areas for improvement in the infrastructure, processes, and deployment methodologies. Propose and implement enhancements to increase system reliability and performance
Requirements
5+ years of professional experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role, with a strong focus on Kubernetes platform management and SaaS deployment
Proficiency in managing Kubernetes clusters and related tooling (e.g., Helm, kubectl, operators). Experience with container orchestration, service mesh, and Kubernetes networking
Knowledge of continuous integration and continuous deployment pipelines, preferably with tools like Jenkins, GitLab CI/CD, or Tekton
Experience with monitoring solutions (e.g., Prometheus, Grafana) and centralized logging platforms (e.g., ELK stack)
Familiarity with major cloud providers (e.g., AWS, Azure, GCP) and experience deploying and managing applications on cloud infrastructure
Solid programming skills in languages such as Python or Go. Proficiency in scripting to automate tasks and develop tooling
Understanding of networking concepts and security best practices in the context of Kubernetes and SaaS deployments
Strong analytical and problem-solving abilities to diagnose and resolve complex technical issues
Excellent teamwork and communication skills to collaborate effectively with various teams and stakeholders
Benefits
Challenging projects in a highly professional yet collaborative and supportive environment
Work in small, highly skilled teams
Opportunity for long-term professional growth within our development center
Competitive compensation depending on experience and skills
Respect and support for your professional, family, and personal goals