Senior AI Platform Engineer

2 weeks ago


Vancouver, British Columbia, Canada | Semantic Enterprise AI | Full-time

About the Role

Semantic Enterprise AI (SEAI) builds next-generation Decision Engine workflows that integrate machine learning, agentic automation, and advanced reasoning tools into enterprise products that empower organizations to make better upside decisions faster.

As a Senior AI Platform Engineer, you'll architect and build the foundational platform infrastructure that powers SEAI's Decision Engine at enterprise scale. You'll drive technical architecture decisions, establish platform patterns and best practices, and build the critical systems that enable reliable multi-agent orchestration for Fortune 1000 clients. This is a senior technical role where you'll shape the platform's technical direction, solve complex distributed systems challenges, and build infrastructure that supports client-critical AI workflows.

What You'll Do

  • Architect and implement highly scalable multi-agent orchestration platforms that handle thousands of concurrent agent executions with sub-second latency requirements.
  • Design advanced state management systems for distributed agent coordination, including checkpoint/recovery mechanisms, distributed locking strategies, and event sourcing architectures for full workflow reproducibility.
  • Build sophisticated configuration management systems that enable declarative workflow definitions with versioning, A/B testing capabilities, canary deployments, and automatic rollback mechanisms.
  • Architect zero-trust security models for agent-to-agent communication, including mTLS implementation, service mesh integration, secret rotation systems, and fine-grained RBAC for multi-tenant isolation.
  • Design and implement advanced observability platforms specifically for AI workflows, including distributed tracing across agent boundaries, custom metrics for LLM performance, cost attribution systems, and automated anomaly detection.
  • Create sophisticated evaluation frameworks that combine multiple validation strategies (rule-based, statistical, LLM-as-judge) with automatic performance regression detection and workflow reliability scoring.
  • Build intelligent resource optimization systems including predictive scaling for agent workloads, intelligent request routing based on model capabilities, and cost-aware execution planning for LLM inference.
  • Design fault-tolerant integration patterns for external services, including circuit breakers, intelligent retry mechanisms with exponential backoff, and graceful degradation strategies when downstream services fail.
  • Architect data pipeline infrastructure for agent context management, including vector database optimization, semantic caching layers, and efficient state hydration for long-running workflows.

Required Qualifications

  • Bachelor's or Master's degree in Computer Science, Distributed Systems, or a related technical field (or equivalent practical experience).
  • 8+ years of production engineering experience, including 5-7 years focused specifically on platform infrastructure and distributed systems architecture.
  • Expert-level Python proficiency with deep understanding of async programming, concurrency patterns, and performance optimization at scale.
  • Extensive production experience with modern agent frameworks (LangChain, LlamaIndex, AutoGen, CrewAI) and workflow orchestration systems (Temporal, Cadence, Airflow, Prefect) including custom extensions and performance tuning.
  • Advanced cloud architecture expertise (AWS, GCP, or Azure) including serverless patterns, container orchestration (ECS, GKE, AKS), service mesh implementations, and multi-region deployment strategies.
  • Deep Infrastructure-as-Code expertise, with production experience managing complex multi-environment deployments using Terraform or comparable tools.
  • Proven track record designing and building enterprise multi-tenant platforms with production experience in data isolation patterns, tenant resource quotas, cross-tenant security boundaries, and compliance framework implementation.
  • Expert-level distributed systems knowledge including consensus algorithms, distributed transactions, event-driven architectures, and sophisticated service failure management.
  • Production experience with advanced observability stacks including distributed tracing (OpenTelemetry), time-series databases (Prometheus, InfluxDB), log aggregation at scale, and custom instrumentation for AI/ML workloads.
  • Strong background in platform reliability engineering including SLI/SLO definition, load testing frameworks, and incident response automation.

Preferred Qualifications

  • Experience building production LLM infrastructure including prompt caching systems, semantic routing, model gateway design, and inference optimization strategies (batching, quantization, distillation).
  • Deep knowledge of distributed state machines, workflow DAG optimization, dynamic task scheduling, and building domain-specific languages (DSLs) for workflow definition.
  • Production experience with vector databases (Pinecone, Weaviate, Qdrant) including index optimization, hybrid search strategies, and scaling to billions of embeddings.
  • Background in AI safety and governance including prompt injection detection, output validation frameworks, PII redaction systems, and audit trail implementation for regulatory compliance.
  • Experience with advanced testing strategies for AI systems including property-based testing, metamorphic testing, adversarial testing, and building synthetic test data generation pipelines.
  • Track record of technical leadership including driving architecture reviews, creating technical RFCs, establishing engineering standards, and mentoring teams on complex technical topics.
  • Contributions to open-source projects in the agent/LLM/workflow orchestration space, published technical articles, or conference speaking experience.
  • Relevant advanced certifications (AWS Solutions Architect Professional, Google Cloud Professional Architect, CKS, or similar).

We value diverse perspectives and encourage all qualified candidates to apply, even if you don't match every qualification perfectly.

*We are currently seeking candidates who are legally authorized to work in the United States or Canada. Preference will be given to applicants located in Washington, Oregon, or British Columbia. We are committed to providing equal employment opportunities and do not discriminate based on race, color, religion, sex, national origin, age, disability, or genetic information.*

Salary Range:
Up to $137,000 USD (US) / $175,000 CAD (Canada), depending on experience and location.


