Site Reliability Engineering

1 week ago


Montreal, Quebec, Canada Cisco Full time
```html

Who We Are
As a part of Cisco, Accedian is a leader in performance analytics and end-user experience solutions for service providers and mid-to-large-size enterprises. The Accedian Skylight service assurance platform offers granular end-to-end visibility within "the massive multi" - multi-layer, multi-domain, and multi-vendor networks. Accedian's open and scalable platform removes roadblocks to innovation, enabling cloud-native analytics and empowering customers to launch new assured services based on 5G, SD-WAN, and edge technologies.

Who You Are
You are an expert in deployment and network operations, skilled in using scripts and automation tools to enhance software processes. With a passion for scripting and automation, you contribute to effective software strategies, oversee maintenance, and optimize systems. Proficient with Kubernetes and Docker Swarm, you seek new ways to monitor deployment health and performance. Your proactive nature and dedication to tech excellence make you a valuable team member in operational efficiency and reliability.

Who You'll Work With
Our team prioritizes your growth in technical, business, and soft skills within a culture that values team strength and investment. We adopt a "You build it, you run it" approach, empowering team members to actively manage and improve our software. Committed to continuous learning, we support mastering new technologies and champion a culture of ambition and innovation in cloud computing.

What You'll Do
Our growing team is looking for a dedicated Site Reliability Engineering (SRE) professional to work with a small, innovative team of industry experts to help perfect our platform by improving our automation processes around deployment and operations.

You will take charge of enhancing the product life cycle, manage configuration, assist in deployment and scripting for management purposes, and collaborate within a cross-functional team. You will spearhead initiatives and orchestrate the DevOps cycle. Your responsibilities will include:

  1. Monitoring our cloud and customer on-premise infrastructure: Assess its health to offer 24/7 service to our customers.
  2. Detecting potential issues: Configure monitoring to intercept them before an outage occurs.
  3. Participating in system troubleshooting: Recommend improvements to our platform and tools, regular and systematic code testing, and deployment.
  4. Supporting our public cloud deployments: Research, propose and participate in the implementation of security best practices for public cloud deployments and data management.
  5. Prioritizing and escalating: Raise problems to Development, and collaborate with our Operations lead and on-call engineer to investigate operational issues impacting users and identify root causes.
  6. Driving automation development: Build configuration management tools and scripts to address operational incidents.
  7. Improving our security posture: Enforce policies for environment security and their application to our DevOps tools.
This role includes periodic participation in an on-call rotation approximately once every six weeks.

Minimum Qualifications:
  1. 12 years of related experience as a Software Engineer, DevOps Engineer, Site Reliability Engineer, or a role in a related field.
  2. Experience administering Cloud or Virtualized environments using UNIX/LINUX command line and scripting.
  3. IT support experience focused on handling and troubleshooting system-wide solutions.
  4. Demonstrated experience deploying multi-service applications on cloud platforms such as AWS, Google Cloud, or Azure using a modern toolset.
  5. Experience in developing continuous monitoring and automated alerting systems to ensure the stability and reliability of IT systems.

Preferred Qualifications:

  1. Experience with configuration management tools such as Ansible, Salt, Puppet, Chef, or similar.
  2. Bachelor's degree in a STEM-related discipline.
  3. A deep understanding of Docker containerization and orchestration, with Kubernetes experience.
  4. Knowledge of IP networking, VPNs, DNS, load balancing, and firewall management.
  5. Familiarity with infrastructure management solutions; experience with HashiCorp Terraform and HashiCorp Vault is a plus.
  6. Experience in setting up and maintaining continuous integration and deployment pipelines.
  7. Ability to write and speak French.

```

  • Montreal, Quebec, Canada Cisco Systems, Inc. Full time

    Site Reliability Engineering - Technical Leader Location: Alternate Location Area of Interest Compensation Range CAD CAD Job Type Professional Cloud and Data Center, Software Development Job Id Who We Are As a part of Cisco, Accedian is a leader in per


  • Montreal, Quebec, Canada Noverka Conseil Full time

    At Noverka, our values illustrate who we are and define our beliefs: Human, Transparent, Passionate. We are driven by innovation and success, both in our relationships and in our practices. Finding the right job for the right person is what we do best. Our client, an organization in the banking industry, is looking for a Site Reliability Engineering (SRE)...


  • Montreal, Quebec, Canada Lyft Full time

    At Lyft, our mission is to enhance people's lives with top-notch transportation services. We strive to foster an inclusive and diverse environment in our community, valuing the unique contributions of each team member. Our goal is to revolutionize the way the world approaches transportation, envisioning a future where cities feel more connected and...


  • Montreal, Quebec, Canada SAP Full time

    We help the world run better. Our company culture is focused on helping our employees enable innovation by building breakthroughs together. How? We focus every day on building the foundation for tomorrow and creating a workplace that embraces differences, values flexibility, and is aligned to our purpose-driven and future-focused work. We offer a highly...


  • Montreal, Quebec, Canada Socotra, Inc. Full time

    At Lyft, our mission is to improve people's lives with the world's best transportation. Imagine cities where streets are safe, communities thrive, and personal cars are a thing of the past. We envision a future where shared and active transportation modes are the norm, fostering vibrant, connected neighborhoods. As a leader in micromobility, Lyft powers...


  • Montreal, Quebec, Canada Lightspeed Full time

    Hi there! Thanks for stopping by. Are you actively looking for a new opportunity? Or just checking the market? Well... you might just be in the right place. We're looking for a Principal Site Reliability Engineer to join our NuOrder by Lightspeed team in North America. NuORDER by Lightspeed builds software solutions that help merchants grow the size and...


  • Montreal, Quebec, Canada Lightspeed Full time

    Welcome to NuOrder by Lightspeed. Are you actively looking for a new opportunity? Or just checking the market? Well... you might just be in the right place. We're looking for a Principal Site Reliability Engineer to join our NuOrder by Lightspeed team in North America. NuORDER by Lightspeed builds software solutions that help merchants grow the size and the...


  • Montreal, Quebec, Canada CGI Full time

    Position Description: CGI is a dynamic and innovative technology firm committed to delivering cutting-edge solutions. We are currently seeking a highly skilled and motivated individual to join our team as a FinOps and Site Reliability Engineer (SRE). This role is pivotal in bridging our finance and technology teams to ensure the successful implementation and...


  • Montreal, Quebec, Canada Cisco Systems, Inc. Full time

    Cloud and Data Center, Software Development As a part of Cisco, Accedian is a leader in performance analytics and end user experience solutions for service providers and mid-to-large size enterprises. The Accedian Skylight service assurance platform offers granular end-to-end visibility within "the massive multi" - multi-layer, multi-domain, and...


  • Montreal, Quebec, Canada Lightspeed Full time

    Hi there! Thanks for stopping by. Are you actively looking for a new opportunity? Or just checking the market? Well... you might just be in the right place. We're looking for a Principal Site Reliability Engineer to join our NuOrder by Lightspeed team in North America. NuORDER by Lightspeed builds software solutions that help merchants grow the size and the...


  • Montreal, Quebec, Canada Behavox Full time

    About Behavox: Behavox is shaping the future for how businesses harness their most important raw material - data. Our mission is bold: Organize enterprise data into actionable information that protects and promotes the business growth of multinational companies around the world. From managing enterprise risk and compliance to maximizing revenue and value, our...


  • Montreal, Quebec, Canada Behavox Full time

    About Behavox: Behavox is shaping the future for how businesses harness their most important raw material - data. Our mission is bold: Organize enterprise data into actionable information that protects and promotes the business growth of multinational companies around the world. From managing enterprise risk and compliance to maximizing revenue and value, our...


  • Montreal, Quebec, Canada TMX Full time

    Venture outside the ordinary - TMX Careers The TMX group of companies includes leading global exchanges such as the Toronto Stock Exchange, Montreal Exchange, and numerous innovative organizations enhancing capital markets. United as a global team, we're connecting cross-functionally, traversing industries and geographies, moving opportunity into action,...


  • Montreal, Quebec, Canada Genpact Full time

    SRE Engineer Montreal Canada, Fulltime, Onsite role Responsibilities Overall, experience in IT Infrastructure Management Services, Service Delivery Management, IT Operation Management, Project Management, Multi Cloud Delivery Management, Transitioning IT services and Account Management Work


  • Montreal, Quebec, Canada LanceSoft, Inc. Full time

    Job Title: Production Reliability & Support Expert (SRE). Location: Montreal (office attendance from Day 1, hybrid mode 3x per week). Years of experience: 3 to 5 years. Ensure Production Management is closely aligned/embedded in the Agile software development process and our code meets prod


  • Montreal, Quebec, Canada National Bank Full time

    As a Site Reliability Specialist, Business Intelligence and Data Management, you will play a key role within a DevOps squad that is working to innovate, develop new ways of integrating data into our assets and maintain the availability and reliability of our assets in production. You will be tasked


  • Montreal, Quebec, Canada TMX Full time

    Through a rich exchange of ideas, meaningful collaboration, and a nimble operating model, we're powering some of the nation's most critical systems, fueling capital formation and innovation, bringing increased opportunity to business visionaries, product ingenuity to consumers, and career exploration to our team. Global Technology Services (GTS) GTS is one...


  • Montreal, Quebec, Canada National Bank Full time

    As a Site Reliability Specialist, Business Intelligence and Data Management, you will play a key role within a DevOps squad that is working to innovate, develop new ways of integrating data into our assets and maintain the availability and reliability of our assets in production. You will be tasked with helping clients and consumers more easily use the data...


  • Montreal, Quebec, Canada Hunter Bond Full time $125,000

    Job Title: Application Support Engineer. Client: Fintech. My client is looking to expand their Application Support team and would like someone with prior front office experience to provide technical support and engineering functions in support of their proprietary and third-party trading systems. Automate software configuration and work towards flawless...


  • Montreal, Quebec, Canada Banque Nationale du Canada Full time

    Site Reliability Engineering Developer SRE Hybrid Job Number 21241 Category Senior Professional Status: Permanent Type of Contract Permanent Schedule: Full-Time Full Time / Part Time? Full-Time 06-Jun-2024 City Montreal Province/State Area of Interest: Information technology A career in technology at National Bank means participating...