Lead Site Reliability Engineer

1 month ago


Montreal, Quebec, Canada Lightspeed Full time

Welcome to Lightspeed

Are you exploring new career avenues? You may have found your next opportunity here.

We are seeking a Lead Site Reliability Engineer to join our NuORDER by Lightspeed division in North America. NuORDER by Lightspeed develops innovative software solutions aimed at empowering merchants to expand their business and increase profitability. You will be part of a team that addresses essential areas such as cloud infrastructure, reliability, incident management, data analytics, cost efficiency, and more. You will also support our expanding development teams by providing the infrastructure and tools they need for continued growth. Your role will involve building and maintaining multi-region infrastructure and networks, and ensuring our products operate reliably, efficiently, and securely by applying and promoting established DevOps principles.

Key Responsibilities:

  • Collaborate closely with development teams to equip them with the tools and methodologies required for monitoring software performance in production, establishing and tracking reliability metrics (SLIs, SLOs), and managing error budgets.
  • Design, construct, and sustain resilient infrastructure utilizing GCP, incorporating cloud-native technologies such as GKE, Cloud SQL, and BigQuery.
  • Develop and oversee CI/CD pipelines for streamlined deployment and release using various technologies (GitLab, GitHub, Helm, Terraform, etc.).
  • Lead the incident management process and perform post-incident reviews to mitigate future disruptions.
  • Guide junior SREs and developers, offering insights on best practices in cloud architecture, data management, and software development.
  • Conduct system performance evaluations and implement improvements to enhance system reliability and throughput.
  • Work with cross-functional teams to identify, design, and execute internal process enhancements in a cost-effective manner.
  • Architect and develop robust, scalable, and highly available systems.
  • Create platform solutions and apply software engineering principles to enhance software reliability and expedite software delivery.
  • Manage infrastructure modifications through infrastructure as code (IaC).
  • Participate in our on-call rotation.
  • Stay updated with industry trends and emerging technologies, advocating for the adoption of innovations that enhance product quality and team productivity.

Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.
  • 9-10+ years of experience in site reliability engineering, systems administration, or software engineering.
  • Extensive knowledge of container orchestration platforms, particularly Kubernetes.
  • Strong grasp of both relational (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., MongoDB, Cassandra, Redis).
  • In-depth understanding of network protocols and IP networking, along with experience in network troubleshooting.
  • Proficiency in programming languages such as Java, Python, or Go.
  • Demonstrated experience managing large-scale infrastructure in cloud environments like Google Cloud, AWS, or Azure.
  • Familiarity with monitoring tools (e.g., Prometheus, Grafana, Datadog) and logging solutions (e.g., ELK stack).
  • Strong understanding of security best practices.
  • Exceptional problem-solving abilities and the capacity to work under pressure to resolve complex issues.
  • Excellent communication skills for effective collaboration with cross-functional teams.
  • Strong leadership capabilities, able to guide projects and influence engineering decisions across the organization.

We recognize that candidates are more than just their resumes. If you feel uncertain about meeting all the qualifications, we encourage you to apply anyway.

What We Offer:

Experience the Lightspeed culture:

  • Flexible work environment;
  • Genuine career advancement opportunities in a rapidly growing company;
  • Work within a team that is large enough for growth yet small enough to make a significant impact.

... and enjoy a variety of benefits designed to keep you happy, healthy, and fulfilled:

  • Lightspeed share scheme (we are all owners)
  • Lightspeed RSU program (we are all owners)
  • Unlimited paid time off policy
  • Flexible working policy
  • Health insurance
  • Health and wellness benefits
  • Paid leave assistance for new parents
  • LinkedIn Learning
  • Volunteer day


Lightspeed is an equal opportunity employer committed to creating an inclusive and barrier-free workplace. We welcome and encourage applications from individuals with disabilities. Accommodations are available upon request for candidates participating in all aspects of the selection process.

About Us:
Lightspeed is dedicated to empowering businesses that form the backbone of the global economy. Our comprehensive commerce platform enables merchants to innovate, simplify, scale, and deliver exceptional customer experiences. Founded in 2005 in Montreal, Canada, Lightspeed is publicly traded on the New York Stock Exchange (NYSE: LSPD) and the Toronto Stock Exchange (TSX: LSPD), serving retail, hospitality, and golf businesses in over 100 countries.


