Software Engineer – Inference Serving
4 weeks ago
At Taalas we believe that fundamental progress is achieved by those who are willing to understand and assail a problem end-to-end, without regard for commonly accepted abstractions and boundaries. We are building a team of hands-on technologists who dislike overspecialization and seek to excel in both depth and breadth.
In this position the successful candidate will build software infrastructure for an inference serving cluster built around Taalas hardcore AI model chips.
Job Responsibilities
• Adapt open-source inference servers like vLLM and Punica to interface with Taalas’ hardcore AI models
• Implement a highly efficient LoRA swapping solution for multi-{tenant,LoRA} environments
• Build and test a scalable inference serving cluster using K8 and Traefik or similar
Qualifications
• Bachelor’s or higher degree in Computer Science, or Electrical/Computer Engineering
• Experience with K8, HTTP load balancers, web-servers
• Good knowledge of computer architecture and low-level programming: Linux virtual memory and page table management, direct memory access, CUDA
• Familiarity with ML, Python and PyTorch
Interested in joining our team? Submit your resume to careers@taalas.com to be considered for the exciting opportunity.
Seniority level: Entry level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Semiconductor Manufacturing
-
Software Engineer – Inference Serving
4 weeks ago
Toronto, Canada | Taalas | Full time
At Taalas we believe that fundamental progress is achieved by those who are willing to understand and assail a problem end-to-end, without regard for commonly accepted abstractions and boundaries. We are building a team of hands-on technologists who dislike overspecialization and...
-
Inference Serving Engineer — Scalable AI Infra
4 weeks ago
Toronto, Canada | Taalas | Full time
A technology firm specializing in AI is seeking a Software Engineer – Inference Serving. This entry-level role involves building software infrastructure for an inference serving cluster. Responsibilities include adapting open-source inference servers and implementing efficient solutions for AI models. Ideal candidates should have a relevant degree and...
-
Senior Software Engineer, AI Inference Platform
2 weeks ago
Toronto, Canada | Cerebras Systems | Full time
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users...
-
Engineering Manager, Inference Platform
2 weeks ago
Toronto, Canada | Cerebras Systems | Full time
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to...
-
Engineering Manager, Inference Platform
3 weeks ago
Toronto, Canada | Cerebras Systems | Full time
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to...