Data Infrastructure Engineer

3 days ago


Canada MeshyAI Full time

About Meshy

Headquartered in Silicon Valley, Meshy is the leading 3D generative AI company on a mission to Unleash 3D Creativity by transforming the content creation pipeline. Meshy makes it effortless for both professional artists and hobbyists to create unique 3D assets, turning text and images into stunning 3D models in just minutes. What once took weeks and cost $1,000 now takes just 2 minutes and $1.

Our world-class team of top experts in computer graphics, AI, and art includes alumni from MIT, Stanford, and Berkeley, as well as veterans from Nvidia and Microsoft. Our talent spans the globe, with team members distributed across North America, Asia, and Oceania, fostering a diverse and innovative multi-regional culture focused on solving global 3D challenges. Meshy is trusted by top developers, backed by premier venture capital firms like Sequoia and GGV, and has raised $52 million in funding.

Meshy is the market leader, recognized as No. 1 in popularity among 3D AI tools (according to 2024 A16Z Games) and No. 1 in website traffic (according to SimilarWeb, with 3 million monthly visits). The platform has over 5 million users and has generated 40 million models.

Founder and CEO Yuanming (Ethan) Hu earned his Ph.D. in graphics and AI from MIT, where he developed the acclaimed Taichi GPU programming language (27K stars on GitHub, used by 300+ institutes). His work has earned an honorable mention for the SIGGRAPH 2022 Outstanding Doctoral Dissertation Award and over 2,700 research citations.

About the Role

We are seeking a Data Infrastructure Engineer to join our growing team. In this role, you will design, build, and operate distributed data systems that power large-scale ingestion, processing, and transformation of datasets used for AI model training.
These datasets span traditional structured data as well as unstructured assets such as images and 3D models, which often require specialized preprocessing for pretraining and fine-tuning workflows. This is a versatile role: you’ll own end-to-end pipelines (from ingestion to transformation), ensure data quality and scalability, and collaborate closely with ML researchers to prepare diverse datasets for cutting-edge model training.

What You’ll Do

Core Data Pipelines
• Design, implement, and maintain distributed ingestion pipelines for structured and unstructured data (images, 3D/2D assets, binaries).
• Build scalable ETL/ELT workflows to transform, validate, and enrich datasets for AI/ML model training and analytics.

Distributed Systems & Storage
• Architect pipelines across cloud object storage (S3, GCS, Azure Blob), data lakes, and metadata catalogs.
• Optimize large-scale processing with distributed frameworks (Spark, Dask, Ray, Flink, or equivalents).
• Implement partitioning, sharding, caching strategies, and observability (monitoring, logging, alerting) for reliable pipelines.

Pretrain Data Processing
• Support preprocessing of unstructured assets (e.g., images, 3D/2D models, video) for training pipelines, including format conversion, normalization, augmentation, and metadata extraction.
• Implement validation and quality checks to ensure datasets meet ML training requirements.
• Collaborate with ML researchers to quickly adapt pipelines to evolving pretraining and evaluation needs.

Infrastructure & DevOps
• Use infrastructure-as-code (Terraform, Kubernetes, etc.) to manage scalable and reproducible environments.
• Integrate CI/CD best practices for data workflows.

Data Governance & Collaboration
• Maintain data lineage, reproducibility, and governance for datasets used in AI/ML pipelines.
• Work cross-functionally with ML researchers, graphics/vision engineers, and platform teams.
• Embrace versatility: switch between infrastructure-level challenges and asset/data-level problem solving.
• Contribute to a culture of fast iteration, pragmatic trade-offs, and collaborative ownership.

What We’re Looking For

Technical Background
• 5+ years of experience in data engineering, distributed systems, or similar.
• Strong programming skills in Python (Scala, Java, or C++ is a plus).
• Solid skills in SQL for analytics, transformations, and warehouse/lakehouse integration.
• Proficiency with distributed frameworks (Spark, Dask, Ray, Flink).
• Familiarity with cloud platforms (AWS/GCP/Azure) and storage systems (S3, Parquet, Delta Lake, etc.).
• Experience with workflow orchestration tools (Airflow, Prefect, Dagster).

Domain Skills (Preferred)
• Experience handling large-scale unstructured datasets (images, video, binaries, or 3D/2D assets).
• Familiarity with AI/ML training data pipelines, including dataset versioning, augmentation, and sharding.
• Exposure to computer graphics or 3D/2D data processing is strongly preferred.

Mindset
• Comfortable in a startup environment: versatile, self-directed, pragmatic, and adaptive.
• Strong problem solver who enjoys tackling ambiguous challenges.
• Commitment to building robust, maintainable, and observable systems.

Nice to Have
• Kubernetes for distributed workloads and orchestration.
• Data warehouses or lakehouse platforms (Snowflake, BigQuery, Databricks, Redshift).
• Familiarity with GPU-accelerated computing and HPC clusters.
• Experience with 3D/2D asset processing (geometry transformations, rendering pipelines, texture handling).
• Rendering engines (Blender, Unity, Unreal) for synthetic data generation.
• Open-source contributions in ML infrastructure, distributed systems, or data platforms.
• Familiarity with secure data handling and compliance.

Our Values
• Brain: We value intelligence and the pursuit of knowledge. Our team is composed of some of the brightest minds in the industry.
• Heart: We care deeply about our work, our users, and each other.
  Empathy and passion drive us forward.
• Gut: We trust our instincts and are not afraid to take bold risks. Innovation requires courage.
• Taste: We have a keen eye for quality and aesthetics. Our products are not just functional but also beautiful.

Why Join Meshy
• Competitive salary, equity, and benefits package.
• Opportunity to work with a talented and passionate team at the forefront of AI and 3D technology.
• Flexible work environment, with options for remote and on-site work.
• Opportunities for fast professional growth and development.
• An inclusive culture that values creativity, innovation, and collaboration.
• Unlimited, flexible time off.

Benefits
• Stock options available for core team members.
• Comprehensive health, dental, and vision insurance.

Seniority Level: Mid-Senior level
Employment Type: Full-time
Job Function: Information Technology
Industries: Technology, Information and Internet


