About Us
Luxury Presence is the fastest-growing digital platform for real estate agents, teams, and brokerages. Our award-winning real estate websites, modern marketing solutions, and AI-powered mobile platform help agents attract more business, work more efficiently, and serve their clients. Since launching in 2016, Luxury Presence has been trusted by more than 11,000 real estate professionals, including over 20 Wall Street Journal Top 100 agents.
Position
Team: Engineering / Infrastructure
Title: Sr. Site Reliability Engineer
Location: Canada, Remote
Compensation: 120-150k, Stock Options, Flex PTO, Health/Vision/Dental, Retreats
Who is the Infrastructure Squad?
The infrastructure squad (internally known as SWAT Squad) is responsible for all cloud resources' reliability, scalability, and performance. We provide the foundation upon which Luxury Presence’s products are built. SREs lead the design, operation, and maintenance of our AWS cloud infrastructure, while also offering Engineering teams valuable frameworks, pipelines, and tooling to streamline production application design, deployment, and management.
What will you do as a Sr. Site Reliability Engineer (SRE)?
As an SRE, your role will involve collaborating with Infrastructure and Product Delivery teams, utilizing tooling and automation to enhance our Kubernetes clusters' readiness, improve observability, scale the system, optimize performance, and promote operational excellence practices.
• Identify and lead strategic initiatives that streamline platform operations, optimize platform performance, and improve system health.
• Design, build, test and maintain scalable and reliable infrastructure that supports workloads for all other teams.
• Manage and optimize Kubernetes clusters on AWS, ensuring high availability, scalability, and efficient resource utilization.
• Automate application deployments, continuous delivery, and monitoring changes to Kubernetes manifests using ArgoCD.
• Continuously improve our GitOps foundation with IaC tools like Terraform and Crossplane.
• Improve monitoring and observability to gain better insights into the platform's health, performance, and potential issues.
• Assist in the resolution of incidents, identifying root causes, and ensuring timely responses to critical issues affecting the platform.
• Analyze and optimize system performance, identifying and resolving bottlenecks in applications or infrastructure components.
• Develop and maintain automation scripts, tools, and workflows to streamline tasks and enhance efficiency.
• Document configurations, processes, and best practices, and sharing knowledge with the Engineering team to foster collaboration and learning.
• Ensure the platform adheres to security best practices and compliance requirements. Implementing necessary measures.
$120,000 - $150,003 a year