Production and Reliability Management Expert

2 days ago


Montreal, Canada Compunnel, Inc. Full time

05/29/2025 Contract Active

Job Description

Job Summary

The client is seeking a skilled Production & Reliability Management Expert to join its Cyber Data Risk & Resilience (CDRR) team within the Identity & Access Management (IAM) domain. In this role, you will be a key member of a global team responsible for safeguarding the firm through the reliability and operational excellence of IAM control platforms. You will manage incident response, support Agile development integration, and drive automation initiatives while working with the latest cloud and data technologies. This is an opportunity to contribute to cybersecurity defense at a global financial leader through cutting-edge technologies and agile development principles.

Key Responsibilities

  • Manage critical production incidents and communicate effectively with key business and technology stakeholders
  • Embed production support principles in Agile/DevOps development cycles to ensure high standards for production readiness
  • Own issue resolution and incident management, including leading incident calls and coordinating cross-functional teams
  • Reduce support costs through automation, optimization, and development of operational tools
  • Analyze technical debt and operational inefficiencies to prioritize remediation and stability improvements
  • Identify, design, and implement automation solutions for business process improvements
  • Develop, test, and deploy automation code; monitor and troubleshoot automation workflows
  • Collaborate with stakeholders to understand requirements and deliver scalable and reliable solutions
  • Work within Agile, Scrum, DevOps, and Site Reliability Engineering (SRE) frameworks to ensure continuous delivery and operational excellence

Required Qualifications

  • Bachelor’s degree in Computer Science, Software Engineering, or a related technical field
  • 4–5+ years of industry experience in software development and production support
  • Strong Java development experience in building multi-threaded, scalable applications
  • Proficiency in Python and Shell scripting
  • Hands-on experience with web programming and developing REST/SOAP APIs
  • Strong SQL skills and familiarity with DB2, Sybase, or Snowflake
  • Experience with automated testing, SDLC pipelines, and automated deployment practices
  • Solid working knowledge of Unix/Linux environments and infrastructure components like load balancing
  • Familiarity with DevOps tools such as Ansible, GitHub, or other CI/CD and release management tools
  • Excellent problem-solving skills and ability to work independently in high-pressure environments
  • Strong interpersonal and communication skills to effectively interact across all organizational levels

Preferred Qualifications (if any)

  • Experience working in financial services or cybersecurity operations
  • Familiarity with IAM platforms and concepts such as user lifecycle, entitlements, and privileged access management
  • Understanding of cloud technologies, infrastructure-as-code, and enterprise monitoring systems
  • Certifications in Agile, DevOps, or SRE methodologies (e.g., SAFe, CKA, SRE Practitioner)

Certifications (if any)

  • Relevant technical certifications (e.g., Java, Python, DevOps, Cloud, or SRE) are a plus but not required.



  • Montreal, Canada Capgemini Full time

    Production & Reliability Management Expert (contract) 3 months ago We are seeking a Production & Reliability Management Expert to drive operational excellence and system reliability for Identity & Access Management platforms within Morgan Stanley’s Cyber Data Risk & Resilience (CDRR) division. This role requires strong...


  • Senior Production

    2 days ago


    Montreal, Canada Capgemini Full time

    A technology consulting firm in Montreal is seeking a Production & Reliability Management Expert to optimize operational excellence in Identity & Access Management platforms. The ideal candidate will manage production incidents, enhance system reliability, and drive automation processes. Strong skills in Java, Python, and Agile methodologies are essential....


  • Montreal (administrative region), Canada Lightspeed Full time

    Are you actively seeking a new opportunity, or simply exploring the market? Either way, you might have just found the right place! We’re looking for a Senior SRE to join our Lightspeed Retail group in North America, a team responsible for multiple POS systems infrastructure and developer experiences. The team is at the helm of providing a stable, reliable...


  • Montreal, Canada Compunnel, Inc. Full time

    The Reliability Production Engineer (RPE) plays a critical role in providing production support services within the RPE organization. This role involves developing automation and tooling to support Site Reliability Engineering (SRE) activities, with a focus on improving system reliability and supportability—such as reducing manual toil, optimizing...

