Senior Site Reliability Engineer
2 weeks ago
Department: Development Operations (DevOps)
Location: Canada
Description
At Vitech, we believe in the power of technology to simplify complex business processes. Our mission is to bring better software solutions to market, addressing the intricacies of the insurance and retirement industries. We combine deep domain expertise with the latest technological advancements to deliver innovative, user-centric solutions that future-proof and empower our clients to thrive in an ever-changing landscape. With over 1,600 talented professionals on our team, our innovative solutions are recognized by industry leaders like Gartner, Celent, Aite-Novarica, and ISG.
We offer a competitive compensation package along with comprehensive benefits that support your health, well-being, and financial security.
Senior Site Reliability Engineer (SRE)
Location: Canada or United States (Remote Role)
Senior Site Reliability Engineer (SRE) – Join Our Global Engineering Team
At Vitech, we believe that excellence in production systems starts with engineering-driven solutions to operational challenges. Our Site Reliability Engineering (SRE) team is at the heart of ensuring seamless performance for our clients, preventing potential outages, and proactively identifying and resolving issues before they arise.
Our SRE team is a diverse group of talented engineers across India, the US, and Canada. We have T-shaped expertise spanning application development, database management, networking, and system administration across both on-premise environments and AWS cloud. Together, we support mission-critical client environments and drive automation to reduce manual toil, freeing our team to focus on innovation.
About the Role: Senior SRE
As an SRE, you'll be a key player in revolutionizing how we operate production systems for single and multi-tenant environments. You'll support SRE initiatives, support production, and drive infrastructure automation. Working in an Agile team environment, you'll have the opportunity to explore and implement the latest technologies, engage in on-call duties, and contribute to continuous learning as part of an ever-evolving tech landscape.
If you're passionate about scalability, reliability, security, and automation of business-critical infrastructure, this role is for you.
What you will do:
- Own and manage our AWS cloud-based technology stack, using native AWS services and top-tier SRE tools to support multiple client environments with Java-based applications and microservices architecture.
- Define SRE strategy, vision, and goals aligned to Vitech's overall objectives. Establish roadmaps and plans for improving system reliability, scalability, and efficiency.
- Collaborate with architecture review boards and solution architects, and engage in reviews and implementations of viable solutions.
- Design, refine, and implement SLIs and SLOs that cover the broad spectrum of SRE: availability, performance, and error budgeting.
- Design, deploy, and manage AWS Aurora PostgreSQL clusters for high availability and scalability. Optimize SQL queries, indexes, and database parameters for performance tuning.
- Automate database operations using Terraform, Ansible, AWS Lambda, and AWS CLI. Manage Aurora's read replicas, auto-scaling, and failover mechanisms.
- Enhance infrastructure as code (IaC) patterns using technologies like Terraform, CloudFormation, Ansible, Python, and SDKs. Collaborate with DevOps teams to integrate Aurora with CI/CD pipelines.
- Provide full-stack support, per the assigned schedule, for applications across technologies such as Oracle WebLogic, AWS Aurora PostgreSQL, Oracle Database, Apache Tomcat, AWS Elastic Beanstalk, Docker/ECS, EC2, and S3.
- Troubleshoot database incidents, perform root cause analysis, and implement preventive measures. Document database architecture, configurations, and operational procedures.
- Ensure high availability, scalability, and performance of PostgreSQL databases on AWS Aurora. Monitor database health, troubleshoot issues, and perform root cause analysis for incidents.
- Embrace SRE principles such as chaos engineering, reliability, and reducing toil.
What We're Looking For:
- Proven hands-on experience as an SRE for critical, client-facing applications, with the ability to dive deep into daily SRE tasks, manage incidents, and oversee operational tools.
- 4+ years of experience developing and/or administering software in the AWS public cloud, with in-depth experience hosting applications in AWS (EC2, EBS, ECS/EKS, Elastic Beanstalk, RDS, CloudWatch).
- 3+ years of experience managing relational databases (Oracle and/or PostgreSQL) in both cloud and on-prem environments, including SRE tasks such as backup/restore, performance troubleshooting, and replication.
- Demonstrable cross-functional, full-stack knowledge of compute, storage, networking, security, and databases.
- Strong understanding of AWS networking concepts (VPC, VPN/DX/Endpoints, Route53, CloudFront, Load Balancers, WAF).
- Experience with containerized applications (Docker, Kubernetes, ECS). Leverage AWS Aurora features (e.g., read replicas, auto-scaling, multi-region deployments) to enhance database performance and reliability.
- Familiarity with data lake architecture, Elasticsearch, ZooKeeper, and DynamoDB is a plus.
- Familiarity with tools like pgAdmin, psql, or other database management utilities. Automate routine database maintenance tasks (e.g., vacuuming, reindexing, patching). Knowledge of backup and recovery strategies (e.g., pg_dump, PITR).
- Set up and maintain monitoring and alerting systems for database performance and availability (e.g., CloudWatch, Honeycomb, New Relic, Dynatrace).
- Work closely with development teams to optimize database schemas, queries, and application performance. Provide database support during application deployments and migrations.
- Hands-on experience with web/application layers (Oracle WebLogic, Apache Tomcat, AWS Elastic Beanstalk, SSL certificates, S3 buckets).
- Automation experience with Infrastructure as Code (Terraform, CloudFormation, Python, Jenkins, GitHub/Actions). Knowledge of multi-region Aurora Global Databases for disaster recovery.
- Scripting experience in Python, Bash, Java, and JavaScript.
- Oversee and streamline change management procedures, efficiently handling daily production change requests to ensure seamless operations.
- Excellent written and verbal communication and critical thinking skills.
Join Us at Vitech
At Vitech, we believe in empowering our teams to drive innovation through technology.
If you thrive in a dynamic environment and are eager to drive innovation in SRE practices, we want to hear from you.
You'll be part of a forward-thinking team that values collaboration, innovation, and continuous improvement. We provide a supportive and inclusive environment where you can grow as a leader while helping shape the future of our organization.