Site Reliability Engineer

4 weeks ago


Toronto ON, Canada Nityo Infotech Full time

Job Responsibilities:

  1. Objectives of this Role
  2. Run the IKP clusters by monitoring availability and taking a holistic view of system health
  3. Build tools and automation to manage platform infrastructure and services
  4. Improve reliability, quality, and time to upgrade cluster and service versions
  5. Measure and optimize system performance and resource utilization, and plan for future capacity
  6. Build dashboards and visualizations to graph system health
  7. Define system alerts and automate responses where possible
  8. Provide operational support and engineering for multiple software development teams
Daily and Monthly Responsibilities
  • Gather and analyze metrics from cluster components and services to assist in performance tuning and fault finding
  • Partner with Core Engineering and Services Engineering teams to improve services through rigorous testing and release procedures
  • Participate in system design consulting, platform management, and capacity planning
  • Create sustainable systems and services through automation and uplifts
  • Balance feature development speed and reliability with well-defined service level objectives
Work Experience

Typical candidates will have at least 5-10 years of experience in the technology field, preferably in software engineering.

Education

Bachelor’s degree in Computer Science or related field. Experience Required 5 - 10 Years

Industry Type: IT

Employment Type: Contract

Location: China

#J-18808-Ljbffr

  • Toronto, Canada CB Canada Full time

    Site Reliability Engineer On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer. Site Reliability Engineer – Job Description Azure cloud Jira and confluence CICD Experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App services, Azure...


  • Old Toronto, Canada CB Canada Full time

    Site Reliability Engineer On behalf of our client in the Banking Sector, PROCOM is looking for a Site Reliability Engineer. Site Reliability Engineer – Job Description Azure cloud Jira and confluence CICD Experience with automating (provisioning, configuration management, deployment) and integrating Azure PaaS solutions (Azure App services, Azure...


  • Toronto, Ontario, Canada Zortech Solutions Full time

    Hi,Hope you are doing GreatThis side Priya Rajput from Zortech Solutions trying to reach you for an exciting job opening, kindly have a look to job description and revert me with your positive feedback. My mail ID is or call me on .Role: Site Reliability EngineerLocation: Toronto, ON-OnsiteDuration: Fulltime PermanentSkills and Responsibilities:...


  • Toronto, ON, Canada Tata Consultancy Services Full time

    TCS is an equal opportunity employer, and embraces diversity in race, nationality, ethnicity, gender, age, physical ability, neurodiversity, and sexual orientation, to create a workforce that reflects the societies we operate in. Our continued commitment to Culture and Diversity and is reflected in our people stories across our workforce implemented through...


  • Toronto, ON, Canada Tata Consultancy Services Full time

    TCS is an equal opportunity employer, and embraces diversity in race, nationality, ethnicity, gender, age, physical ability, neurodiversity, and sexual orientation, to create a workforce that reflects the societies we operate in. Our continued commitment to Culture and Diversity and is reflected in our people stories across our workforce implemented through...


  • Toronto, Canada Autodesk Full time

    Position Overview Autodesk, the leading Design and Make Software Company, is looking for a Principal Site Reliability Engineer to join the Autodesk Platform Services Engineering team in Toronto, Canada. On this position, you will help build trusted services of APS (Autodesk Platform Services) as measured by Service Level Objectives (SLOs) and Mean...


  • Old Toronto, Canada Thomson Reuters Full time

    (Canada) Site Reliability Engineer (Contract) Contract (9 months 4 days) Published 3 days ago New Relic Data Dog Site Reliability Engineer - in the Service Management OrganizationDo you have experience in IT Service Management, working with cloud providers, software development, and technology infrastructure?The Site Reliability Engineer will...


  • Old Toronto, Canada Autodesk Full time

    Position Overview Autodesk, the leading Design and Make Software Company, is looking for a Principal Site Reliability Engineer to join the Autodesk Platform Services Engineering team in Toronto, Canada. In this role, you will help build trusted services of APS (Autodesk Platform Services) measured by Service Level Objectives (SLOs) and Mean Time to Recovery...


  • Toronto, Canada eTeam Full time

    Remote work Duration - 4 months - Preference is to find candidates who are willing to be converted to full time employee . The conversion decision will be made based on performance. Job description - ::: Role Desc : Defining and measuring reliability goals—SLIs, SLOs, and error budgets for user journey Designing for and implementing...


  • Old Toronto, Canada eTeam Full time

    Remote Work Duration 4 months - Preference is to find candidates who are willing to be converted to full-time employees. The conversion decision will be made based on performance. Job Description Role Description: Defining and measuring reliability goals—SLIs, SLOs, and error budgets for user journey. Designing for and implementing observability (ELK,...


  • Toronto, Canada Rakuten Kobo Full time

    The Role At Rakuten Kobo, we develop software that covers a rich set of domains, including hardware devices, eCommerce, content rendering, and an expanding data ecosystem. Our SRE team provides the safety net that empowers our 50+ product developers to move fast. We are seeking an experienced Site Reliability Engineer III to help ensure the reliability...


  • Toronto, Canada BMO Full time

    Application Deadline: 04/29/2024Address:33 Dundas Street WestThis role is Hybrid (1-2 days per week in the office)The Director - Site Reliability Engineering will lead a team that will work with application teams, infrastructure teams, and business partners to continuously improve the stability, reliability and efficiency of Finance and Enterprise Risk...


  • Old Toronto, Canada Lightspeed Full time

    Hi there! Thanks for stopping by. Are you actively looking for a new opportunity? Or just checking the market? Well… you might just be in the right place! We’re looking for a Principal Site Reliability Engineer to join our NuOrder by Lightspeed team in North America. NuORDER by Lightspeed builds software solutions that help merchants grow the size and...


  • Toronto, ON, Canada Behavox Full time

    Behavox is shaping the future for how businesses harness their most important raw material - data. Organize enterprise data into actionable information that protects and promotes the business growth of multinational companies around the world. From managing enterprise risk and compliance to maximizing revenue and value, our data operating platform presents...


  • Old Toronto, Canada Nityo Infotech Full time

    Job Responsibilities: Objectives of this Role Run the IKP clusters by monitoring availability and taking a holistic view of system health Build tools and automation to manage platform infrastructure and services Improve reliability, quality, and time to upgrade cluster and service versions Measure and optimize system performance and resource utilization,...


  • Toronto, Canada Tata Consultancy Services Full time

    TCS is an equal opportunity employer, and embraces diversity in race, nationality, ethnicity, gender, age, physical ability, neurodiversity, and sexual orientation, to create a workforce that reflects the societies we operate in. Our continued commitment to Culture and Diversity and is reflected in our people stories across our workforce implemented through...


  • Toronto, Canada Tata Consultancy Services Full time

    TCS is an equal opportunity employer, and embraces diversity in race, nationality, ethnicity, gender, age, physical ability, neurodiversity, and sexual orientation, to create a workforce that reflects the societies we operate in. Our continued commitment to Culture and Diversity and is reflected in our people stories across our workforce implemented through...


  • Toronto, Canada Tata Consultancy Services Full time

    TCS is an equal opportunity employer, and embraces diversity in race, nationality, ethnicity, gender, age, physical ability, neurodiversity, and sexual orientation, to create a workforce that reflects the societies we operate in. Our continued commitment to Culture and Diversity and is reflected in our people stories across our workforce implemented through...


  • Old Toronto, Canada ClickHouse Full time

    We are committed to providing our customers with reliable and secure services so we are building out our newly formed Site Reliability Engineering team. As one of the first joiners to our Reliability Engineering Team at ClickHouse, you will be responsible for building and leading processes to ensure the reliability, availability, scalability, and performance...


  • Old Toronto, Canada ClickHouse Full time

    We are committed to providing our customers with reliable and secure services so we are building out our newly formed Site Reliability Engineering team. As one of the first joiners to our Reliability Engineering Team at ClickHouse, you will be responsible for building and leading processes to ensure the reliability, availability, scalability, and performance...