Senior Reliability Engineer

3 weeks ago


Toronto, Ontario, Canada Flinks Full time

About Flinks

We're shaping the future of finance by empowering consumers with control over their financial data and unlocking its full potential. Our mission is to equip fintechs and banks with cutting-edge data tools, enabling them to create innovative, client-centric products that transform the financial industry.

Flinks is trusted by hundreds of companies and connects over 250 million financial accounts. Our products power digital finance, helping businesses streamline their processes, improve user experiences, and drive the next wave of financial innovation.

About the Reliability Team

As a Senior Reliability Engineer, you'll play a pivotal role in ensuring our systems and applications run reliably while scaling rapidly. You'll handle Site Reliability Engineering (SRE) tasks in a support capacity, driving improvements in system stability and leading the debugging and resolution of complex production issues.

Key Responsibilities

  • Provide live operational support for multiple client software applications, monitoring services and alerts to detect critical failures, ensuring rapid restoration of services and minimal downtime.
  • Develop and maintain code to resolve production issues quickly, leveraging strong development skills to ensure fast service recovery and long-term system stability.
  • Own and resolve incidents reported by clients and internal stakeholders, adhering to client SLA and internal SLO timelines.
  • Troubleshoot complex incidents, perform thorough root cause analyses, and implement solutions to prevent the recurrence of issues.
  • Utilize a data-driven approach to prepare detailed analyses and reports, presenting findings through charts, layouts, and diagrams.
  • Conduct deep technical analyses of product and feature deficiencies, addressing client pain points based on actual use cases.
  • Develop and enhance monitoring systems to proactively detect issues, implementing robust alert mechanisms to ensure continuous system stability.
  • Provide expert guidance on improving operational system stability and scalability.
  • Lead and execute initiatives that automate processes, improving operational efficiency across LiveOps.
  • Facilitate postmortem meetings following incidents, documenting findings and assigning action items for future prevention.
  • Collaborate with cross-functional teams to ensure rapid resolution of production issues, implementing long-term fixes.
  • Lead and motivate project teams, ensuring tasks are completed on schedule and that high-quality standards are consistently met.
  • Mentor and provide ongoing training to reliability engineers, tracking their progress and ensuring adherence to high standards.
  • Actively contribute to maintaining the highest quality standards as the organization continues to scale.
  • Participate in after-hours on-call support as part of the LiveOps rotation.

Requirements

  • Operationally focused with expertise in incident management and resolving live production issues.
  • Strong debugging and troubleshooting skills, particularly in performance optimization of large-scale applications.
  • Proven experience in building and maintaining reliable monitoring and alerting systems in high-demand environments, with a focus on production support.
  • 7+ years of experience with .NET Framework (C#), ensuring production system stability.
  • Strong knowledge of Kubernetes, Docker, and cloud platforms (GCP preferred).
  • Proficiency with monitoring tools like Prometheus, Grafana, and Kibana.
  • Experience with incident ticketing/documentation tools like FreshDesk and Confluence.
  • Critical thinker who can identify system weaknesses and find innovative solutions.
  • Strong project management skills with a focus on scalability and system stability.
  • ITIL Service Management certification (ITIL v3, ITIL v4, or equivalent) is highly desired.
  • Experience with PowerBI, web scraping, or Golang (nice to have).

What's in it for You?

  • Clear Impact: You'll ensure that millions of users have reliable access to their financial data, directly contributing to the success of Flinks and its customers.
  • Autonomy and Ownership: Senior Engineers at Flinks are empowered to lead major initiatives, drive strategy, and influence the direction of our tech stack.
  • Trailblazing Technology: Be part of a cutting-edge company at the forefront of open banking and financial data management, during a pivotal time of growth and innovation.
  • Professional Growth: You'll be continuously challenged by working with a passionate, smart team on a variety of technical and business problems.

The Interview Process

  • People Ops Generalist Interview
  • Team Lead Interview
  • Case Assignment & Presentation
  • Stakeholder Interview
  • Director Interview

