Lead Site Reliability Engineer
4 days ago
FactSet creates flexible, open data and software solutions for over 200,000 investment professionals worldwide, providing instant access to financial data and analytics that investors use to make crucial decisions.
At FactSet, our values are the foundation of everything we do. They express how we act and operate, serve as a compass in our decision-making, and play a big role in how we treat each other, our clients, and our communities. We believe that the best ideas can come from anyone, anywhere, at any time, and that curiosity is the key to anticipating our clients' needs and exceeding their expectations.
About Irwin
Irwin, a FactSet Company, is a leading provider of capital markets-focused financial technology with a mission to seamlessly connect the world's capital seekers and allocators to make them more productive, innovative, and successful. Our flagship product, Irwin, is a software platform used by investor relations and investment banking professionals all over the world. In October 2024, Irwin joined FactSet, adding its investor relations solution to FactSet's existing offering.
Role Overview
We are seeking a seasoned Lead Site Reliability Engineer with deep expertise in AWS to own, architect, and continuously evolve Irwin's core infrastructure. You will plan, build, and optimize the systems that support our web applications and internal tools, ensuring scalability, reliability, observability, and security. Your technical judgment, roadmap planning skills, and hands-on expertise will enable our engineering teams to ship features with velocity and confidence.
Key Responsibilities
- Strategic Roadmapping:
Design and execute long-term strategies for scalable, secure infrastructure to host the Irwin web application and associated tooling on AWS/EKS with PostgreSQL.
- AWS Infrastructure:
Architect and manage highly available cloud environments on EKS/Kubernetes using best practices for cost, performance, and security.
- Database Operations:
Oversee, tune, and ensure the high availability of large-scale PostgreSQL databases; optimize for performance, backup, disaster recovery, and observability. Bonus points for experience using Snowflake or other OLAP systems.
- Infrastructure as Code:
Lead the adoption and maintenance of Terraform workflows to manage infrastructure; ensure reproducibility, modularity, and CI/CD integration.
- Continuous Integration & GitOps:
Build, maintain, and scale CI/CD pipelines using GitOps principles to automate deployments, reduce risk, and speed up delivery cycles.
- Kubernetes:
Design, deploy, and manage production-grade Kubernetes clusters; automate scaling and implement robust security practices.
- Monitoring & Incident Response:
Implement monitoring, logging, and alerting solutions; establish best practices for incident detection and resolution.
- Security & Compliance:
Apply industry best practices for infrastructure and data security; ensure governance and compliance with relevant standards (e.g., SOC2, GDPR).
- Collaboration:
Mentor SRE peers and engineering teams on DevOps/SRE methodologies; document, communicate, and evangelize infrastructure best practices.
Required Skills And Experience
Minimum Requirements:
- 10+ years in a Site Reliability Engineer, DevOps, or similar role in cloud-native environments (AWS focus).
Critical Skills
- Deep technical proficiency with AWS services (EC2, EKS, S3, RDS, IAM, etc.).
- Expert-level experience managing, tuning, and scaling PostgreSQL databases.
- Advanced skill in Terraform (modular design, environment promotion, CI/CD integration).
- Proficient in building and operating CI/CD systems (GitLab CI, GitHub Actions, or equivalent).
- Hands-on experience with GitOps workflows (Argo CD, Flux, etc.).
- Strong knowledge of Kubernetes (deployment, scaling, networking, security).
- Experience with monitoring and logging stacks (Datadog, Prometheus, Grafana, ELK, etc.).
- Track record in designing, communicating, and executing complex infrastructure roadmaps.
- Experience mentoring and enabling engineering teams.
- Strong written and verbal communication skills.
Preferred/Desired Qualifications
- Professional certifications (AWS Solutions Architect, Kubernetes, Terraform).
- Experience in fintech, SaaS, or high-compliance industries.
- Exposure to data privacy regulations and secure software development practices.
Education
- Bachelor's degree in computer science or a similar field.
Why Irwin?
- Influence the technology roadmap at a pivotal growth stage.
- Build infrastructure for mission-critical applications.
- Work with passionate, high-performing teams.
- Competitive compensation, benefits, and equity options.
Here Is What To Expect
- First interview with me (the hiring manager) to assess experience, basic technical skills, and cultural fit - 1 hour
- Second interview with two team members for a deeper technical assessment - 1 hour
- Optional third technical interview with the same two team members if they are not able to cover everything in the first hour
- Final 30-minute interview with the hiring manager to answer any remaining questions before we proceed to an offer.
When You Apply
Please attach your resume and a cover letter describing your approach to architecting scalable infrastructure and your experience with AWS, PostgreSQL, Terraform, GitOps, CI/CD, and Kubernetes.
Irwin is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all.
Company Overview
FactSet (NYSE:FDS | NASDAQ:FDS) helps the financial community to see more, think bigger, and work better. Our digital platform and enterprise solutions deliver financial data, analytics, and open technology to more than 8,200 global clients, including over 200,000 individual users. Clients across the buy-side and sell-side, as well as wealth managers, private equity firms, and corporations, achieve more every day with our comprehensive and connected content, flexible next-generation workflow solutions, and client-centric specialized support. As a member of the S&P 500, we are committed to sustainable growth and have been recognized among the Best Places to Work in 2023 by Glassdoor as a Glassdoor Employees' Choice Award winner. Learn more at and follow us on X and LinkedIn.
At FactSet, we celebrate difference of thought, experience, and perspective. Qualified applicants will be considered for employment without regard to characteristics protected by law.