Cloud Reliability Engineer
1 month ago
We are seeking a skilled Cloud Reliability Engineer to collaborate with our IT team in Toronto. In this role, you will work closely with the client services team to diagnose, troubleshoot, and resolve system reliability issues.
Responsibilities:
- Take ownership of customer-reported issues and drive them to resolution.
- Develop proactive measures to prevent recurring issues.
- Escalate unresolved issues to internal teams using standard procedures.
Infrastructure Management:
- Design, configure, deploy, and maintain AWS infrastructure using best practices.
- Implement Infrastructure as Code (IaC) using Terraform for scalability, repeatability, and maintainability.
- Collaborate with the development team to optimize .NET applications for peak performance in a cloud environment.
Monitoring and Alerting:
- Design and implement advanced system monitoring solutions for high performance, availability, and security.
- Use monitoring tools proactively to identify and diagnose infrastructure and application-level issues.
- Collaborate on defining Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets.
Reliability and Availability:
- Optimize cloud resource availability, performance, and cost using best practices.
- Plan and execute disaster recovery drills and ensure high availability of critical systems.
- Respond promptly to system alerts, lead incident resolution, and contribute to post-mortem analyses.
Automation and Optimization:
- Automate repetitive tasks related to infrastructure provisioning, configuration, and deployment.
- Ensure continuous deployment and continuous integration best practices are implemented and maintained.
Collaboration and Knowledge Sharing:
- Collaborate with developers, product managers, and other teams to ensure seamless and stable application deployment.
- Document processes, architectures, and best practices to facilitate knowledge sharing.
Requirements:
- AWS certifications such as AWS Certified Solutions Architect or AWS Certified DevOps Engineer.
- Experience with monitoring and alerting tools in the AWS ecosystem.
- Familiarity with Site Reliability Engineering (SRE) philosophy, SLOs, SLIs, and Error Budgets.
- Strong analytical and troubleshooting skills.
- Excellent communication and collaboration skills.
Why Work at Ascend Fundraising Solutions:
- Intellectual curiosity, dedication, and a team willing to get the job done.
- Opportunity to make a significant impact on the business in the short and long term.
- Contribute to a company that supports charities and NPOs in funding their causes.
- Beautiful downtown Toronto office with lake views and proximity to transit.
- Hybrid work environment.
-
Cloud Reliability Engineer
4 weeks ago
Old Toronto, Canada Mastech Inc. Full time
Mastech Digital is a leading provider of IT staffing and digital transformation services. We are currently seeking a highly skilled Cloud Reliability Engineer to join our client's team in the United States. Responsibilities of the Cloud Reliability Engineer include: Designing and implementing scalable and reliable cloud architectures. Collaborating with...
-
Cloud Reliability Engineer Lead
2 weeks ago
Old Toronto, Canada The Home Depot Canada Full time
About The Job: As a Cloud Reliability Engineer Lead at The Home Depot Canada, you will play a crucial role in ensuring the reliability, performance, and operational support of our eCommerce systems. Job Overview: This position requires a strong background in reliability reviews, performance engineering practices, production engineering, and operational support,...
-
Reliability Engineer
4 weeks ago
Old Toronto, Canada Thomson Reuters Full time
About the Role: We are seeking a skilled Reliability Engineer - Cloud Systems to join our team at Thomson Reuters. As a Reliability Engineer - Cloud Systems, you will be responsible for analyzing and resolving chronic and major issues affecting our cloud-based services. Key responsibilities include: Designing and implementing scalable systems and...
-
Site Reliability Engineer
3 weeks ago
Old Toronto, Canada Thomson Reuters Full time
Site Reliability Engineer Job Description: This role is part of our Service Management Organization and involves IT Service Management, cloud providers, software development, and technology infrastructure experience. The Site Reliability Engineer will analyze chronic and major issues, evaluate products and their services, and make recommendations to improve...
-
AWS Site Reliability Engineer
3 weeks ago
Old Toronto, Canada Sentry Full time
Sentry is on a mission to simplify software development and improve application performance. We need a skilled AWS Site Reliability Engineer to join our team and help us achieve our goals. This role involves ensuring the uptime and reliability of our hosted platform, architecting and automating services and systems to meet scaling demands, and collaborating...
-
Asset Reliability Engineer
2 months ago
Old Toronto, Canada Chelsea Avondale Full time
Chelsea Avondale is the world’s most cutting-edge home insurance group. We have developed sophisticated risk modeling and insurance pricing technologies for home insurance and deploy that technology through our own insurance company. Our team consists of some of the brightest minds in insurance, software development, finance, and operations. Our group...
-
Cloud Engineer
4 weeks ago
Old Toronto, Canada LanceSoft Full time
Description: Business group: Data and Analytics Technology Time Tracking Employees In partnership with the Customer Insights Data and Analytics teams and our IT partners, the Data and Analytics Technology team supports the bank's Data and Analytics needs with tooling, projects, and IT operational support. The Cloud Engineer role will be responsible for...
-
Site Reliability Engineer
4 weeks ago
Old Toronto, Canada Lorien Full time
Hybrid - Manchester. We are currently working with a leading gambling company dedicated to providing exceptional gaming experiences. They are looking for an experienced Site Reliability Engineer with a strong skill set in system reliability to join its world-class technology team. This role is ideal for someone who has 4+ years of experience within the...
-
Cloud Native Site Reliability Engineer
2 weeks ago
Toronto, Ontario, Canada Thomson Reuters Full time
We are seeking an experienced Senior SRE to join our Shared Capabilities, Service Reliability and Operation team in Toronto. As a Cloud Native Site Reliability Engineer, you will be responsible for implementing site reliability engineering and DevOps best practices, building and maintaining monitoring for all aspects of infrastructure, micro-services, usage...
-
Cloud Systems Engineer
3 weeks ago
Old Toronto, Canada Quantumbricks Full time
Job Title: DevOps Engineer. Job Description: Work closely with Engineering stakeholders to design and maintain a reliable, scalable, and secure platform. Collaborate with the Engineering team to identify areas for improvement and implement solutions. Optimize existing deployment tooling and infrastructure, including but not limited to creating and maintaining new...
-
Cloud Infrastructure Engineer
4 weeks ago
Old Toronto, Canada HOOPP Thames Limited Full time
About the Role: We are seeking a highly skilled Cloud Infrastructure Engineer to join our IT Investment Solutions Group at HOOPP. As a Cloud Infrastructure Engineer, you will play a critical role in designing, implementing, and managing our cloud infrastructure to support the organization's strategic objectives. Responsibilities: Design, deploy, and...
-
Site Reliability Engineer
2 weeks ago
Greater Toronto Area, Canada GlossGenius Full time
About GlossGenius: GlossGenius is a leading fintech company empowering small business owners to succeed by offering a range of business management tools, including booking and scheduling, marketing, analytics, payment processing, and more. Our platform serves over 75,000 entrepreneurs daily. As a pioneering force in the industry, GlossGenius is expanding its...
-
Site Reliability Engineer
3 weeks ago
Old Toronto, Canada Sentry Full time
Bad software is everywhere, and we’re tired of it. Sentry is on a mission to help developers write better software faster, so we can get back to enjoying technology. With more than $217 million in funding and 100,000+ organizations that believe we’re on to something, we're building performance and error monitoring tools that help companies like Disney,...
-
Cloud Engineer
2 months ago
Old Toronto, Canada Scotiabank Full time
Requisition ID: 206977. Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture. Scotiabank has embarked on the journey to modernize both development practices and tools. One of the main areas of transformation is the public cloud and the various platform technologies that support both development and operations on...
-
Cloud Engineer
3 weeks ago
Old Toronto, Canada Ontario Health Full time
Job Title: Senior Cloud Engineer. Ongoing development and implementation of cloud-based systems and infrastructure for Ontario Health. Key Responsibilities: Design, implement, and manage cloud-based infrastructure and applications. Collaborate with cross-functional teams to ensure efficient and secure cloud services. Provide expert-level guidance on cloud...
-
Cloud Engineer
1 month ago
Old Toronto, Canada Scotiabank Full time
Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture. Scotiabank has embarked on the journey to modernize both development practices and tools. One of the main areas of transformation is the public cloud and the various platform technologies that support both development and operations on the cloud. The aim is...
-
Site Reliability Engineering Lead
2 weeks ago
Old Toronto, Canada Infotree Global Solutions Full time
About Infotree Global Solutions: Infotree Global Solutions is a leading provider of innovative solutions, and we're seeking an experienced Site Reliability Engineer to lead our team. Your Role: As our Site Reliability Engineering Lead, you will be responsible for supervising a team of skilled engineers and ensuring the reliability and scalability of our global...
-
Reliable Cloud Solutions Architect
3 weeks ago
Toronto, Ontario, Canada LTIMindtree Full time
About Us: LTIMindtree is a global technology consulting and digital solutions company. We enable enterprises to reimagine business models, accelerate innovation, and maximize growth by harnessing digital technologies. Job Title: SRE Engineer. Location: Mississauga, Ontario (Remote). Job Description: We are seeking an experienced Site Reliability Engineer with 10+...
-
Site Reliability Engineering Linux or Windows
2 months ago
Old Toronto, Canada Thomson Reuters Full time
(Canada) Site Reliability Engineer (Contract). Contract (9 months 4 days). Published 3 days ago. New Relic, Data Dog. Site Reliability Engineer - in the Service Management Organization. Do you have experience in IT Service Management, working with cloud providers, software development, and technology infrastructure? The Site Reliability Engineer will analyze chronic...
-
Cloud Platform Engineer
4 weeks ago
Old Toronto, Canada Scotiabank Full time
As a Principal Cloud Engineer – Cloud Operations Engineering, you will contribute to the overall success of the Cloud and Platform Engineering department at Scotiabank. Your primary objective will be to ensure the stability and dependability of our cloud platform, which serves millions of customers every day. Key Responsibilities: You will be responsible for...