Site Reliability Engineer

3 weeks ago


Montreal, Canada Atlantis IT Group Full time

Overview

Site Reliability Engineer (Linux / Cloud Infrastructure) role requiring hands-on experience across Linux, distributed systems, scripting, databases, monitoring, containers, cloud SaaS integrations, messaging, load balancers, security, and incident management.

Responsibilities

  • Provide hands-on administration of Linux 7.x and related infrastructure.
  • Work with Service Oriented Architecture (SOA), distributed systems, and scripting (Python, shell).
  • Manage relational databases (e.g., Sybase, DB2, SQL, Postgres), including application integration, configuration, and troubleshooting.
  • Operate observability, monitoring, and automation tools: OpenTelemetry, Prometheus, Grafana, Splunk, Ansible.
  • Manage web servers (Apache, Nginx) and application servers (Tomcat, JBoss), including integration and troubleshooting.
  • Work with Docker containers, Kubernetes, and SaaS platform integration.
  • Understand messaging systems (e.g., Kafka) and their role in the architecture.
  • Design and implement load balancing, web proxies, and storage platforms (NAS/SAN).
  • Apply basic security policies for secure hosting solutions, including Kerberos and encryption methods (SSL/TLS).
  • Manage large web-based, multi-tier (n-tier) applications in secure cloud environments.
  • Apply SRE principles with an appropriate tooling approach, drawing on strong knowledge of Linux/Unix administration, storage, networking, and web technologies.
  • Troubleshoot application issues and manage incidents effectively.
  • Communicate clearly, both verbally and in writing.

Qualifications

  • 5+ years of hands-on, advanced-level experience with the Linux 7.x operating system.
  • Hands-on experience with SOA, distributed systems, and scripting (Python, shell).
  • Experience with relational databases (Sybase, DB2, SQL, Postgres).
  • Exposure to tools: OpenTelemetry, Prometheus, Grafana, Splunk, Ansible.
  • Hands-on experience with web servers (Apache, Nginx) and application servers (Tomcat, JBoss).
  • Experience with Docker, Kubernetes, and SaaS platform integration.
  • Experience with Kafka and messaging technologies.
  • Understanding of load balancers, web proxies, and NAS/SAN storage from an implementation perspective.
  • Familiarity with security policies for secure hosting, Kerberos, and SSL/TLS.
  • Experience managing large web-based n-tier applications in secure cloud environments.
  • Strong knowledge of SRE principles and tooling.
  • Strong infrastructure knowledge in Linux/Unix administration, storage, networking, and web technologies.
  • Excellent troubleshooting and incident management capabilities.

Seniority level: Mid-Senior level
Employment type: Contract
Job function: Information Technology
Industries: IT Services and IT Consulting



  • Montreal, Canada ApTask Full time

    Looking for an intermediate candidate with 2 to 5 years' experience. The Application Infrastructure (AI) department is seeking a Site Reliability Engineer (SRE) to help drive the reliability engineering, operations and customer support services for the client's ServiceNow SaaS implementation. Reporting to a Site Reliability...


  • Montreal, Canada DevOps projects Full time

    Site Reliability Engineer We're hiring a Site Reliability Engineer for a remote work engineering position at Botpress Technologies Inc. This role focuses on ensuring the stability, scalability, and security of our platform, making it a critical software engineer position for maintaining high service performance. Key Responsibilities Architect and maintain...


  • Montreal, Canada Botpress Full time

    3 weeks ago. Help bring AI agents to companies worldwide. Over the next decade, autonomous agents will redefine how we work. Botpress allows companies to build and deploy advanced AI agents that move beyond conversation into real business logic. Our product works today and at scale, across industries, regions, and limitless use...


  • Montreal, Canada Open Systems Technologies Full time

    Site Reliability Engineer (SRE), ServiceNow, Application Infrastructure Location: Montreal – Hybrid – 3 days/week The Application Infrastructure (AI) department is seeking a Site Reliability Engineer (SRE) to help drive reliability engineering, operations and customer support services for client’s ServiceNow SaaS implementation. Reporting to a Site...

