SRE - Python - Airflow - Dynatrace

5 days ago


Toronto, Ontario, Canada | TECONICA SOFTWARES | Full-time

Job Description

We're looking for an SRE to elevate the reliability, performance, and efficiency of mission-critical batch workloads across Capital Markets Technology. You'll be the technical lead for hands-on automation, application development, host systems engineering, and observability via Dynatrace, with a primary focus on optimizing batch runtimes. If you love shaving milliseconds off latency, removing toil with code, and building resilient systems that just don't fail, you'll thrive here.

This role is critical to our operational excellence strategy and central to maturing our reliability engineering practices across the Capital Markets domain.

Key Responsibilities:

Reliability & Performance: Ensure the stability of batch processing pipelines and optimize them; reduce runtimes and failure rates, and engineer for resiliency.

Observability: Implement and maintain monitoring with Dynatrace; create dashboards, alerts, and runbooks.

Systems Engineering: Manage and tune Linux and Windows systems for performance and resilience.

Automation & Orchestration: Create, modify, and optimize Airflow DAGs; build CI/CD pipelines for automation.

Incident Management: Lead incident response, root cause analysis, and postmortems; enforce SLOs and reliability practices.

Security & Compliance: Apply security best practices and ensure regulatory compliance in systems and automation.

Qualifications:

Expert-level Python: Advanced coding, performance tuning, concurrency (async/multiprocessing), testing, and packaging.

Linux Systems Expertise: Kernel/OS tuning, networking, filesystem optimization, process management, and troubleshooting.

Dynatrace Mastery: Custom dashboards, KPIs, anomaly detection, tagging strategy, and alerting configuration.

Airflow Expertise: DAG design best practices, SLA management, scheduler/executor tuning, and scaling strategies.

Proven experience optimizing batch workloads for performance, reliability, and cost.

Strong understanding of distributed systems concepts: retries, idempotency, backpressure, and data integrity.

Strong understanding of backend systems and database optimization.

Proficiency with CI/CD pipelines (GitHub Actions, Azure DevOps, Jenkins) and Infrastructure as Code (Terraform, Ansible).

Proven experience with containers and orchestration (Docker, Kubernetes).

Excellent incident management and root cause analysis skills.

Strong communication and collaboration abilities.


