LLM Evaluation Scientist — Benchmark Innovation

4 weeks ago

Toronto, Canada Cohere Full time

A leading AI research firm in Toronto is seeking a Senior Research Scientist in Model Evaluation. The role involves creating novel evaluation benchmarks, refining measurement techniques, and working collaboratively to improve AI capabilities. Candidates should have strong software engineering skills and a passion for rigorous evaluation methods. This position offers comprehensive benefits and a supportive work environment.

LLM Evaluation Scientist — Benchmark Innovation

3 weeks ago

Toronto, Canada Cohere Full time

A leading AI research firm in Toronto is seeking a Senior Research Scientist in Model Evaluation. The role involves creating novel evaluation benchmarks, refining measurement techniques, and working collaboratively to improve AI capabilities. Candidates should have strong software engineering skills and a passion for rigorous evaluation methods. This...
LLM Evaluation Scientist — Benchmark Innovation

4 weeks ago

Toronto, Montreal, Calgary, Vancouver, Edmonton, Old Toronto, Ottawa, Mississauga, Quebec, Winnipeg, Halifax, Saskatoon, Burnaby, Hamilton, Victoria, Surrey, Halton Hills, London, Regina, Markham, Brampton, Vaughan, Kelowna, Laval, Southwestern Ontario, R, Canada Cohere Full time

A leading AI research firm in Toronto is seeking a Senior Research Scientist in Model Evaluation. The role involves creating novel evaluation benchmarks, refining measurement techniques, and working collaboratively to improve AI capabilities. Candidates should have strong software engineering skills and a passion for rigorous evaluation methods. This...
Technical Lead Manager, Evaluation

2 weeks ago

Toronto, Canada Waabi Innovation Inc. Full time

Waabi, founded by AI pioneer and visionary Raquel Urtasun, is an AI company building the next generation of self‑driving technology. With a world class team and an innovative approach that unleashes the power of AI to “drive” safely in the real world, Waabi is bringing the promise of self‑driving closer to commercialization than ever before. Waabi is...
Technical Lead Manager, Evaluation

3 weeks ago

Toronto, Canada Waabi Innovation Inc. Full time

Waabi, founded by AI pioneer and visionary Raquel Urtasun, is an AI company building the next generation of self‑driving technology. With a world class team and an innovative approach that unleashes the power of AI to “drive” safely in the real world, Waabi is bringing the promise of self‑driving closer to commercialization than ever before. Waabi is...
Technical Lead Manager, Evaluation

1 week ago

Toronto, Canada Waabi Innovation Inc. Full time

Waabi, founded by AI pioneer and visionary Raquel Urtasun, is an AI company building the next generation of self‑driving technology. With a world class team and an innovative approach that unleashes the power of AI to “drive” safely in the real world, Waabi is bringing the promise of self‑driving closer to commercialization than ever before. Waabi is...
Senior Research Scientist, Model Evaluation

4 days ago

Toronto, Canada Cohere Full time

Overview: Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI. Cohere is a team of...
Senior Research Scientist, Model Evaluation

3 weeks ago

Toronto, Canada Cohere Full time

Who are we? Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises building AI systems that power content generation, semantic search, RAG, and agents. We believe our work is instrumental to the widespread adoption of AI and that each...
Senior Research Scientist, Model Evaluation

4 weeks ago

Toronto, Canada Cohere Full time

Who are we? Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises building AI systems that power content generation, semantic search, RAG, and agents. We believe our work is instrumental to the widespread adoption of AI and that each...
