Technical Lead

1 week ago


Mississauga, Ontario, Canada Citi Full time
Discover your future at Citi

Working at Citi is far more than just a job. A career with us means joining a team of more than 230,000 dedicated people from around the globe. At Citi, you'll have the opportunity to grow your career, give back to your community and make a real impact.

Job Overview
We are seeking a highly skilled and experienced individual to fill a unique hybrid role that combines senior-level DevOps and Infrastructure Engineering with the responsibilities of a Working Scrum Master. This position is for a hands-on engineer who actively contributes to the design, implementation, and maintenance of our infrastructure and automation, while simultaneously facilitating the agile development process for their technical team. The ideal candidate will be a strong technical leader, a passionate advocate for agile practices, and a driver of continuous improvement within a complex engineering environment. This role is for someone who thrives on both coding and coaching, with an additional understanding of the infrastructure needs and operational considerations for Artificial Intelligence and Machine Learning initiatives.
Responsibilities
Hands-on DevOps & Infrastructure Engineering
  • Design & Implementation: Lead the design, implementation, and ongoing management of secure, scalable, and resilient infrastructure components.
  • Secret & Certificate Management: Administer and maintain secret and certificate management solutions using HashiCorp Vault, including policy definition and integration.
  • Database Management: Perform hands-on administration and optimization of database systems (PostgreSQL, Oracle, MongoDB), including performance tuning, backup, and recovery strategies.
  • Workflow Orchestration: Deploy, monitor, and troubleshoot data orchestration workflows using Apache Airflow, and develop/optimize DAGs.
  • Messaging Systems: Implement and manage messaging queues such as Kafka and IBM MQ, including cluster setup and configuration.
  • API Integrations: Develop, maintain, and troubleshoot RESTful API and SOAP integrations critical for system connectivity.
  • Build Automation: Implement and optimize build and deployment processes using Gradle.
  • Container Orchestration: Design, implement, and manage container orchestration platforms with Kubernetes and Helm, including integration with CyberArk and HashiCorp Vault for secrets management. Create, debug, and troubleshoot Kubernetes Pods, Jobs, and Deployments using YAML.
  • Storage Management: Configure and manage persistent storage solutions including PVC, SONiC NAS, and S3, with an awareness of storage requirements for AI/ML workloads.
  • Networking & Load Balancing: Set up and maintain load balancing solutions (e.g., Nginx, HAProxy, AWS ELB/ALB, Kubernetes Ingress controllers) for high availability and performance.
  • Monitoring & Logging: Implement, configure, and utilize comprehensive monitoring and logging solutions (Prometheus, Grafana, ELK Stack) to ensure system health and proactively identify issues, including those relevant to AI/ML applications.
  • Automation & Scripting: Develop robust automation scripts and tools using Python, Bash, Go, or similar languages to streamline operations and enhance efficiency.
  • Incident Response: Participate actively in on-call rotations, responding to and resolving critical incidents with hands-on troubleshooting.
  • Documentation: Create and maintain technical documentation, architecture diagrams, and runbooks for infrastructure components and processes.
Working Scrum Master & Agile Facilitation
  • Agile Facilitation: Facilitate all Scrum ceremonies (Sprint Planning, Daily Scrum, Sprint Review, Sprint Retrospective) for the DevOps/Infrastructure engineering team.
  • Technical Coaching: Coach the team on advanced engineering practices, self-organization, cross-functionality, and continuous improvement in the context of infrastructure development, including support for AI/ML initiatives.
  • Impediment Resolution: Proactively identify and resolve technical impediments and process bottlenecks within the team and across organizational boundaries, paying special attention to unique challenges posed by AI/ML infrastructure.
  • Backlog Refinement: Collaborate closely with stakeholders (e.g., product owners, technical leads) to ensure a well-defined and prioritized backlog for infrastructure work, technical debt, operational improvements, and AI/ML platform needs.
  • Process Improvement: Drive continuous improvement in the team's agile and DevOps practices, helping them adapt and optimize their workflow for maximum efficiency and quality.
  • Team Shielding: Protect the team from external distractions, allowing focused time for hands-on engineering work.
Required Skills and Experience
Hands-on DevOps & Infrastructure Engineering Expertise
  • Secret & Certificate Management: Proven hands-on experience with HashiCorp Vault (installation, configuration, policy management, integrations).
  • Database Administration: Strong hands-on experience with at least two of PostgreSQL, Oracle, or MongoDB (installation, tuning, replication, backup/restore).
  • Workflow Orchestration: Hands-on experience deploying, managing, and developing DAGs for Apache Airflow.
  • Messaging Systems: Solid hands-on experience with Kafka and/or IBM MQ (cluster setup, topic management, producer/consumer configuration).
  • Container Orchestration: In-depth hands-on experience with Kubernetes and Helm, including YAML configuration, troubleshooting Pods/Jobs/Deployments, and integrations with secrets management (CyberArk, HashiCorp Vault).
  • Storage Management: Practical experience with Kubernetes PVCs, Persistent Volumes, S3, and/or enterprise NAS solutions (e.g., SONiC NAS).
  • Monitoring & Logging: Strong hands-on experience with Prometheus, Grafana, and the ELK Stack (setup, dashboard creation, query optimization, alert configuration).
  • Scripting & Automation: High proficiency in Python, Bash, or Go for automation, tooling development, and system administration.
  • Cloud Platforms: Extensive hands-on experience with at least one major cloud provider (AWS, Azure, GCP).
  • Infrastructure as Code (IaC): Proficiency with IaC tools such as Terraform or Ansible.
  • CI/CD: Experience designing, implementing, and maintaining CI/CD pipelines (e.g., Jenkins, GitLab CI, GitHub Actions).
  • API Integration: Experience with RESTful API and SOAP web services.
  • Build Tools: Proficiency with Gradle for build automation.
AI/ML Awareness & Support
  • AI/ML Infrastructure Concepts: Understanding of the specific infrastructure requirements for deploying, managing, and scaling Artificial Intelligence and Machine Learning workloads (e.g., GPU resources, specialized storage, MLOps pipelines).
  • Data for AI/ML: Awareness of data management strategies and data governance principles relevant to AI/ML models and training datasets.
  • Monitoring AI/ML Systems: Familiarity with metrics and monitoring approaches for the performance and health of AI/ML applications and their underlying infrastructure.
Agile & Leadership Skills
  • Working Scrum Master Experience: Proven experience acting as a Scrum Master within a technical team where you also performed significant hands-on engineering.
  • Agile & Scrum Mastery: In-depth knowledge and practical application of Agile principles and the Scrum framework.
  • Facilitation & Coaching: Excellent facilitation, coaching, and mentoring skills within a technical context.
  • Communication: Strong verbal and written communication skills, able to bridge technical and process discussions.
  • Technical Leadership: Ability to guide technical discussions, influence architectural decisions, and drive best practices.
Preferred Qualifications
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
  • Certified ScrumMaster (CSM) or Professional Scrum Master (PSM) certification.
  • Relevant cloud certifications (e.g., AWS Certified DevOps Engineer, Azure DevOps Engineer Expert, GCP Professional Cloud DevOps Engineer).
  • Experience with site reliability engineering (SRE) principles and practices.
  • Familiarity with other Agile scaling frameworks (e.g., SAFe, LeSS).
  • Exposure to MLOps platforms or tools (e.g., Kubeflow, MLflow).

Job Family Group: Technology

Job Family: Applications Development

Time Type: Full time

Primary Location Full Time Salary Range: $120,800.00 - $170,800.00

Most Relevant Skills: Please see the requirements listed above.

Other Relevant Skills: For complementary skills, please see above and/or contact the recruiter.

Automated Processing and AI

We use automated processing, including artificial intelligence, for our legitimate business interests (or our reasonable and appropriate business purposes) to identify and align the candidate's skills and abilities with a specific job opening. Additionally, if you so choose, or consent, we can match your skills and abilities to other suitable roles at Citi.

Importantly, all our hiring processes and decisions, including determining your suitability for a role, are conducted, checked, and decided by individuals. Our automated processing and AI do not involve relying on automatic or autonomous decision-making. Please refer to any Jurisdictional Considerations, with specific provisions for your country (where relevant) for further details.

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View Citi's EEO Policy Statement and the Know Your Rights poster.

