Senior Site Reliability Engineer

1 month ago


Montreal, Quebec, Canada Lightspeed Full time

Position Overview: We are seeking a Senior Site Reliability Engineer to join our team at Lightspeed. Our company develops innovative software solutions that empower merchants to expand their business and increase profitability. In this role, you will be instrumental in managing critical aspects such as cloud infrastructure, reliability, incident management, data analytics, and operational efficiency.

Key Responsibilities:

  • Collaborate with development teams to equip them with essential tools and methodologies for monitoring software performance in production, establishing and tracking reliability metrics (SLI, SLO), and overseeing error budgets.
  • Architect, construct, and sustain resilient infrastructure utilizing Google Cloud Platform (GCP) and cloud-native technologies like GKE, Cloud SQL, and BigQuery.
  • Design and oversee CI/CD pipelines to streamline deployment and release processes employing various technologies (GitLab, GitHub, Helm, Terraform).
  • Lead the incident management process and perform post-incident reviews to mitigate future disruptions.
  • Guide junior SREs and developers, sharing best practices in cloud architecture, data management, and software engineering.
  • Conduct performance assessments and implement enhancements to boost system reliability and efficiency.
  • Work with cross-functional teams to identify, design, and execute internal process improvements in a cost-effective manner.
  • Create and maintain robust, scalable, and highly available systems.
  • Develop platform solutions and apply software engineering principles to enhance software reliability and expedite delivery.
  • Manage infrastructure changes through Infrastructure as Code (IaC) methodologies.
  • Participate in the on-call rotation as needed.
  • Stay informed about industry trends and emerging technologies, advocating for the adoption of innovations that enhance product quality and team productivity.

Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.
  • 7-9 years of experience in site reliability engineering, systems administration, or software engineering.
  • Expertise in container orchestration platforms, particularly Kubernetes.
  • Strong knowledge of relational databases (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., MongoDB, Cassandra, Redis).
  • In-depth understanding of network protocols and IP networking, with experience in network troubleshooting.
  • Proficiency in programming languages such as Java, Python, or Go.
  • Demonstrated experience managing large-scale infrastructure in cloud environments like Google Cloud, AWS, or Azure.
  • Familiarity with monitoring tools (e.g., Prometheus, Grafana, Datadog) and logging solutions (e.g., ELK stack).
  • Strong grasp of security best practices.
  • Exceptional problem-solving abilities and the capacity to work under pressure to resolve complex issues.
  • Excellent communication skills for effective collaboration with cross-functional teams.
  • Strong leadership capabilities, with the ability to lead projects and influence engineering decisions across the organization.

Benefits:

  • Flexible work environment.
  • Career advancement opportunities in a growing company.
  • Contribute to a team that values growth and impact.

Lightspeed is committed to fostering an inclusive and barrier-free workplace. We welcome applications from individuals with diverse backgrounds and experiences.



  • Montreal, Quebec, Canada Lightspeed Full time

    About the Role: We are seeking a Senior Site Reliability Engineer to become a vital part of our team at Lightspeed. Our mission is to develop innovative software solutions that empower merchants to enhance their business growth and profitability. Key Responsibilities: You will collaborate with a dedicated team focused on various critical aspects, including: Cloud...


  • Montreal, Quebec, Canada Lightspeed Full time

    We are seeking a Senior Site Reliability Engineer to become a vital part of our team at Lightspeed. Our organization specializes in developing innovative software solutions that empower merchants to enhance their business growth and profitability. In this role, you will collaborate with a team focused on addressing critical areas such as cloud...


  • Montreal, Quebec, Canada Lightspeed Full time

    Position Overview: We are seeking a Senior Site Reliability Engineer to enhance our operations at Lightspeed. This role is pivotal in developing software solutions that empower merchants to expand their business and increase profitability. You will play a crucial role in managing cross-functional concerns, including cloud infrastructure, reliability,...


  • Montreal, Quebec, Canada Banque Nationale du Canada Full time

    Senior Site Reliability Engineer. Work Arrangement: Hybrid. Job Category: Senior Professional. Status: Permanent. Contract Type: Permanent. Work Schedule: Full-Time. Location: Montreal. Area of Expertise: Information Technology. A career in technology at Banque Nationale du Canada involves engaging in transformative initiatives that directly benefit clients. As a Senior...


  • Montreal, Quebec, Canada Lightspeed Full time

    Welcome to Lightspeed! We are in search of a Senior Site Reliability Engineer to become a vital part of our NuORDER by Lightspeed team in North America. At NuORDER by Lightspeed, we develop innovative software solutions designed to enhance the growth and profitability of merchants' businesses. You will collaborate with a team dedicated to addressing...


  • Montreal, Quebec, Canada Royal Bank of Canada Full time

    About the Opportunity: We are seeking a highly skilled Senior Site Reliability Engineer to join our Digital Branch SRE organization at the Royal Bank of Canada. As a key member of our team, you will be responsible for designing, implementing, and supporting Site Reliability Engineering (SRE) solutions for our applications. Key Responsibilities: Develop and...


  • Montreal, Quebec, Canada Lightspeed Full time

    About the Role: We are seeking a highly skilled Staff Site Reliability Engineer to enhance our operations at Lightspeed. This position is pivotal in developing software solutions that empower merchants to expand their business effectively and profitably. You will play a crucial role in managing essential aspects such as cloud infrastructure, reliability,...


  • Montreal, Quebec, Canada Banque Nationale du Canada Full time

    Site Reliability Engineering Developer (SRE). Work Arrangement: Hybrid. Job Category: Senior Professional. Status: Permanent. Contract Type: Permanent. Work Schedule: Full-Time. Location: Montreal. Area of Interest: Information Technology. A career in technology at Banque Nationale du Canada involves engaging in transformative initiatives that directly impact clients. As a...


  • Montreal, Quebec, Canada Axelon Services Corporation Full time

    Job Title: Site Reliability Engineer. About the Role: We are seeking a skilled Site Reliability Engineer to join our Application Infrastructure team. As a Site Reliability Engineer, you will play a critical role in driving the reliability engineering, operations, and customer support services for our ServiceNow SaaS implementation. Key Responsibilities: Deliver...


  • Montreal, Quebec, Canada Banque Nationale du Canada Full time

    Site Reliability Engineering Developer (SRE). Work Arrangement: Hybrid. Job Category: Senior Professional. Status: Permanent. Contract Type: Permanent. Schedule: Full-Time. Location: Montreal. Area of Focus: Information Technology. A career in technology at Banque Nationale du Canada involves engaging in transformative initiatives that directly benefit clients. As a Site...


  • Montreal, Quebec, Canada Royal Bank of Canada Full time

    Job Summary: We are seeking a highly skilled Senior Site Reliability Engineer to join our Digital Branch SRE organization. As a key member of our team, you will be responsible for developing, implementing, and supporting SRE solutions for our applications. Key Responsibilities: Participate in code reviews and non-functional reviews of production-bound SRE...


  • Montreal, Quebec, Canada Lyft Full time

    Lyft is a leading micromobility company dedicated to improving urban transportation systems worldwide. They are seeking a Site Reliability Engineer to join their expanding team. This individual will be responsible for designing, implementing, and maintaining the infrastructure systems to ensure reliability and scalability. Responsibilities: Assist in defining...


  • Montreal, Quebec, Canada Lightspeed Full time

    Position Overview: We are seeking a highly skilled Senior Site Reliability Engineer to enhance our Cloud Networking initiatives at Lightspeed. This role is pivotal in developing software solutions that empower merchants to expand their business and increase profitability. Key Responsibilities: Collaborate with development teams to equip them with essential tools and...


  • Montreal, Quebec, Canada Axelon Services Corporation Full time

    Site Reliability Engineer. About the Role: We are seeking a skilled Site Reliability Engineer to join our HashiVault squad. As an SRE, you will be responsible for implementing new features, dealing with user requests, and reducing repeatable tasks to allow more time for strategic initiatives. About the Company: *** is a leading global...


  • Montreal, Quebec, Canada Lyft Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Lyft in Montreal. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our production systems, platforms, and tools. Key Responsibilities: Define the team's roadmap and architecture based on technological and...

