Senior Site Reliability Engineer

1 month ago


Montreal, Quebec, Canada Lightspeed Full time

About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to enhance our operations at Lightspeed. This position is pivotal in developing software solutions that empower merchants to expand their business effectively and profitably. You will play a crucial role in managing essential aspects such as cloud infrastructure, reliability, incident management, data analytics, and operational efficiency.

Key Responsibilities:

  • Collaborate with development teams to equip them with tools and methodologies for monitoring software performance in production, establishing and tracking reliability metrics (SLI, SLO), and overseeing error budgets.
  • Architect, develop, and sustain resilient infrastructure utilizing Google Cloud Platform (GCP) and cloud-native technologies like GKE, Cloud SQL, and BigQuery.
  • Design and manage CI/CD pipelines to streamline deployment and release processes using technologies such as GitLab, GitHub, Helm, and Terraform.
  • Lead incident management efforts and perform post-incident reviews to mitigate future disruptions.
  • Guide junior SREs and developers, sharing best practices in cloud architecture, data handling, and software engineering.
  • Conduct performance evaluations of systems and implement improvements to enhance reliability and throughput.
  • Work with cross-functional teams to identify, design, and execute internal process enhancements in a cost-effective manner.
  • Design and build robust, scalable, and highly available systems.
  • Apply software engineering principles to enhance the reliability of our software and expedite software delivery.
  • Oversee infrastructure modifications through Infrastructure as Code (IaC) methodologies.
  • Participate in the on-call rotation as needed.
  • Stay abreast of industry trends and emerging technologies, advocating for the integration of new practices that elevate product quality and team efficiency.

Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.
  • 7-9 years of experience in site reliability engineering, systems administration, or software engineering.
  • Expertise in container orchestration, particularly with Kubernetes.
  • Strong knowledge of relational databases (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., MongoDB, Cassandra, Redis).
  • Comprehensive understanding of network protocols and IP networking, along with experience in network troubleshooting.
  • Proficiency in programming languages such as Java, Python, or Go.
  • Demonstrated experience managing large-scale cloud infrastructure, particularly in environments like Google Cloud, AWS, or Azure.
  • Familiarity with monitoring tools (e.g., Prometheus, Grafana, Datadog) and logging solutions (e.g., ELK stack).
  • Strong grasp of security best practices.
  • Exceptional problem-solving abilities and the capacity to work under pressure to resolve complex issues.
  • Excellent communication skills for effective collaboration with diverse teams.
  • Proven leadership capabilities, with the ability to guide projects and influence engineering decisions across the organization.

Benefits:

  • Flexible working environment.
  • Career advancement opportunities in a rapidly growing company.
  • Collaborative team atmosphere that encourages personal impact.

Lightspeed is committed to fostering an inclusive and accessible workplace. We welcome applications from individuals of all backgrounds and experiences.



  • Montreal, Quebec, Canada Lightspeed Full time

    About the Role: We are seeking a Senior Site Reliability Engineer to become a vital part of our team at Lightspeed. Our mission is to develop innovative software solutions that empower merchants to enhance their business growth and profitability. Key Responsibilities: You will collaborate with a dedicated team focused on various critical aspects, including: Cloud...


  • Montreal, Quebec, Canada Lightspeed Full time

    We are seeking a Senior Site Reliability Engineer to become a vital part of our team at Lightspeed. Our organization specializes in developing innovative software solutions that empower merchants to enhance their business growth and profitability. In this role, you will collaborate with a team focused on addressing critical areas such as cloud...


  • Montreal, Quebec, Canada Lightspeed Full time

    Position Overview: We are seeking a Senior Site Reliability Engineer to enhance our operations at Lightspeed. This role is pivotal in developing software solutions that empower merchants to expand their business and increase profitability. You will play a crucial role in managing cross-functional concerns, including cloud infrastructure, reliability,...


  • Montreal, Quebec, Canada Lightspeed Full time

    Position Overview: We are seeking a Senior Site Reliability Engineer to enhance our team at Lightspeed. Our company develops innovative software solutions that empower merchants to expand their business and increase profitability. In this role, you will be instrumental in managing critical aspects such as cloud infrastructure, reliability, incident...


  • Montreal, Quebec, Canada Banque Nationale du Canada Full time

    Senior Site Reliability Engineer. Work Arrangement: Hybrid. Job Category: Senior Professional. Status: Permanent. Contract Type: Permanent. Work Schedule: Full-Time. Location: Montreal. Area of Expertise: Information Technology. A career in technology at Banque Nationale du Canada involves engaging in transformative initiatives that directly benefit clients. As a Senior...


  • Montreal, Quebec, Canada Lightspeed Full time

    Welcome to Lightspeed! We are in search of a Senior Site Reliability Engineer to become a vital part of our NuORDER by Lightspeed team in North America. At NuORDER by Lightspeed, we develop innovative software solutions designed to enhance the growth and profitability of merchants' businesses. You will collaborate with a team dedicated to addressing...


  • Montreal, Quebec, Canada Royal Bank of Canada Full time

    About the Opportunity: We are seeking a highly skilled Senior Site Reliability Engineer to join our Digital Branch SRE organization at the Royal Bank of Canada. As a key member of our team, you will be responsible for designing, implementing, and supporting Site Reliability Engineering (SRE) solutions for our applications. Key Responsibilities: Develop and...


  • Montreal, Quebec, Canada Banque Nationale du Canada Full time

    Site Reliability Engineering Developer (SRE). Work Arrangement: Hybrid. Job Category: Senior Professional. Status: Permanent. Contract Type: Permanent. Work Schedule: Full-Time. Location: Montreal. Area of Interest: Information Technology. A career in technology at Banque Nationale du Canada involves engaging in transformative initiatives that directly impact clients. As a...


  • Montreal, Quebec, Canada Axelon Services Corporation Full time

    Job Title: Site Reliability Engineer. About the Role: We are seeking a skilled Site Reliability Engineer to join our Application Infrastructure team. As a Site Reliability Engineer, you will play a critical role in driving the reliability engineering, operations, and customer support services for our ServiceNow SaaS implementation. Key Responsibilities: Deliver...


  • Montreal, Quebec, Canada Banque Nationale du Canada Full time

    Site Reliability Engineering Developer (SRE). Work Arrangement: Hybrid. Job Category: Senior Professional. Status: Permanent. Contract Type: Permanent. Schedule: Full-Time. Location: Montreal. Area of Focus: Information Technology. A career in technology at Banque Nationale du Canada involves engaging in transformative initiatives that directly benefit clients. As a Site...


  • Montreal, Quebec, Canada Royal Bank of Canada Full time

    Job Summary: We are seeking a highly skilled Senior Site Reliability Engineer to join our Digital Branch SRE organization. As a key member of our team, you will be responsible for developing, implementing, and supporting SRE solutions for our applications. Key Responsibilities: Participate in code reviews and non-functional reviews of production-bound SRE...


  • Montreal, Quebec, Canada Lyft Full time

    Lyft is a leading micromobility company dedicated to improving urban transportation systems worldwide. They are seeking a Site Reliability Engineer to join their expanding team. This individual will be responsible for designing, implementing, and maintaining the infrastructure systems to ensure reliability and scalability. Responsibilities: Assist in defining...


  • Montreal, Quebec, Canada Lightspeed Full time

    Position Overview: We are seeking a highly skilled Senior Site Reliability Engineer to enhance our Cloud Networking initiatives at Lightspeed. This role is pivotal in developing software solutions that empower merchants to expand their business profitability. Key Responsibilities: Collaborate with development teams to equip them with essential tools and...


  • Montreal, Quebec, Canada Axelon Services Corporation Full time

    Site Reliability Engineer. About the Role: We are seeking a skilled Site Reliability Engineer to join our HashiVault squad. As an SRE, you will be responsible for implementing new features, dealing with user requests, and reducing repeatable tasks to allow more time for strategic initiatives. About the Company: *** is a leading global...


  • Montreal, Quebec, Canada Lyft Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Lyft in Montreal. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our production systems, platforms, and tools. Key Responsibilities: Define the team's roadmap and architecture based on technological and...

