Site Reliability Engineer
1 week ago
We're Hiring
We're looking for brilliant thinkers to join our #Rocketeers. If you've ever wondered what it's like to work in a place where people enjoy their work and where talent is more important than the title, then keep reading.
What is ScalePad?
ScalePad is a market-leading software-as-a-service (SaaS) company with headquarters in Vancouver, Toronto, Montreal and Phoenix, AZ. However, we are proud to say our employee reach is now global so we can best serve our partners all over the world.
Our success is no accident: ScalePad provides MSPs of every size with the knowledge, technology, and community they need to deliver increased client value while navigating the continuously changing terrain of the IT landscape. With a suite of integrated products that automate and standardize MSPs' operations, analyze and uncover new opportunities, and expand value to clients, ScalePad is equipping the MSP adventure.
ScalePad has received awards such as MSP Today's Product of the Year, G2's 2024 Fastest Growing Product, and 2024 Best IT Management Product. In 2023, it was named a Best Workplace in Canada by Great Place to Work. ScalePad is a privately held company serving over 12,000 MSPs across the globe.
You can contribute to our innovation and appreciate how your work is helping take this company to a higher level of operational maturity.
Your mission, should you choose to accept it.
As a Site Reliability Engineer (SRE) at ScalePad, you play a crucial role in ensuring the reliability, scalability, and efficiency of our infrastructure and development platforms. You support developer experience, automate operational tasks, and optimize system performance to maintain high availability and seamless deployments. Your expertise in monitoring, incident management, and automation helps ensure that our applications run smoothly and meet reliability targets.
Qualifications.
- Strong proficiency in system operations, observability, and infrastructure monitoring
- Full understanding of AWS offerings, including core compute, networking, storage, and IAM
- Experience with Infrastructure as Code (IaC) tools such as Terraform
- Proficiency in scripting and automation using Python, Bash, or equivalent languages
- Basic knowledge of Java, Go, and Python is a strong plus
- Knowledge of CI/CD pipelines and best practices for continuous integration and delivery
- Experience with containerization and orchestration technologies such as Kubernetes and Docker
- Strong understanding of SLOs, SLAs, and incident management best practices
- Ability to troubleshoot and resolve complex system issues in a high-availability environment
- Familiarity with Agile methodologies and DevOps culture
Responsibilities.
- System Operations and Reliability
  - Maintain and improve system uptime and reliability according to established Service Level Objectives (SLOs)
  - Monitor and optimize system performance using observability tools like Prometheus and Grafana
  - Implement and maintain alerting systems to proactively detect and resolve issues
  - Execute capacity planning and scaling activities, ensuring infrastructure efficiency
  - Participate in the 24/7 on-call rotation, responding to and resolving system outages
- Incident Management
  - Respond to and resolve production incidents within defined Service Level Agreements (SLAs)
  - Document incident responses and contribute to post-mortem analysis to improve system resilience
  - Implement preventive measures based on insights from incidents
  - Manage escalations and coordinate with teams to resolve complex system issues
- Development and Automation
  - Develop and maintain Infrastructure as Code (IaC) to enable automated infrastructure management
  - Create and optimize CI/CD pipelines, ensuring smooth and reliable software releases
  - Write automation scripts for routine operational tasks, reducing manual workload
  - Implement monitoring solutions and dashboards to provide real-time system visibility
- Collaboration
  - Work closely with development teams, ensuring seamless integration of SRE principles into application design
  - Participate in team planning and retrospective meetings, contributing to continuous improvement
  - Document technical processes and procedures, making knowledge accessible across teams
  - Contribute to knowledge base maintenance, sharing best practices and troubleshooting insights
What You'll Love About Working As A Rocketeer:
- Everyone's an Owner: Through our Employee Stock Option Plan (ESOP), each team member has a stake in our success. As we scale, your contributions directly shape our future – and you share in the rewards.
- Growth, Longevity and Stability: Benefit from insights and training from our leadership and founder, whose extensive experience in funding and scaling successful software companies creates a stable environment for your long-term career growth. Their proven track record fosters a culture of lasting success.
- Annual Training & Development: Every employee receives an annual budget for professional development, empowering you to advance your skills and career on your terms.
- Hybrid Flexibility: Enjoy world-class offices at our headquarters in downtown Vancouver, Toronto, and Montreal.
- Cutting-Edge Gear: Whether in the office or at home, you'll be set up for success with top-of-the-line hardware.
- Wellness at Work: Our Vancouver office features a fitness facility and outdoor ping-pong tables.
- Comprehensive Benefits: We've got you covered with an extensive benefits package with 100% medical and dental coverage fully employer-paid, RRSP matching after one year of employment, and even a monthly stipend to help offset the costs of the hybrid experience.
- Flexible Time Off: Our unlimited flex-time policy, in addition to all accrued vacation, allows you to take the time you need to recharge and thrive.
Dream jobs don't knock on your door every day.
ScalePad is not your typical software company. When we hire you, we aren't just offering you a job, but rather we are committing to investing in both you and your long-term career. You'll help shape how this modern SaaS company operates and make a genuine impact on the future of our people, product, and partners.
We invite all qualified candidates to apply. Please note, you must be eligible to work in Canada to be considered for this role. We thank you for your interest. However, only successful applicants will be contacted.
At ScalePad, we believe in the power of Diversity, Equity, Inclusion, and Belonging (DEIB) to drive innovation, collaboration, and success. We are committed to fostering a workplace where every individual's unique experiences and perspectives are valued, and where employees from all backgrounds can thrive. Our dedication to DEIB is woven into the fabric of our culture, guiding our actions and decisions as we build a stronger and more inclusive future together.
Join us and be part of a team that celebrates differences, embraces fairness, and ensures that everyone has an equal opportunity to contribute and grow. Together, we're creating an environment where diverse voices are not only heard but also amplified, where everyone feels valued, and where we can all achieve our full potential.
Please, no recruiters or phone calls.