Senior Research Scientist, Model Evaluation

2 weeks ago

Toronto, Canada Cohere Full time

Who are we?

Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises building AI systems that power content generation, semantic search, RAG, and agents. We believe our work is instrumental to the widespread adoption of AI and that each person on the team contributes to increasing the capabilities of our models and the value they bring to customers.

Why this role?

Evaluation is critical to making progress in scaling intelligence. As models become superhuman in many real-world use cases, we continue to develop new evaluation techniques that accurately reflect current capabilities and set the agenda for future progress. In this role you will create next‑generation evaluation methods and infrastructure to measure LLM progress.

Responsibilities

Create ambitious new evaluation benchmarks that push the limits of what our models can accomplish.
Work cross‑functionally with teams to translate model feedback into trustworthy, repeatable evaluations.
Conduct research to advance the state of the art in LLM evaluation methods, including training LLM judges, refining LLM‑based data synthesis pipelines, and improving evaluation efficiency.
Build scalable and reusable tools for digging into model performance.

Qualifications

You rapidly build prototypes that demonstrate LLM boundaries and develop resources to measure those capabilities.
You have spent significant time reviewing complex data and LLM outputs to ensure high data quality.
You are obsessive about rigorously measuring AI capabilities and ensuring measurements align with desired outcomes.
You have strong software engineering skills.

If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply.

Inclusive Hiring

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.

Perks

Open and inclusive culture and work environment.
Work closely with a team on the cutting edge of AI research.
Weekly lunch stipend, in‑office lunches & snacks.
Full health and dental benefits, including a separate budget for mental health.
100% parental leave top‑up for up to 6 months.
Personal enrichment benefits towards arts, culture, fitness, well‑being, quality time, and workspace improvement.
Remote‑flexible offices in Toronto, New York, San Francisco, London, and Paris, plus a co‑working stipend.
6 weeks of vacation (30 working days).

Seniority Level: Mid‑Senior level
Employment Type: Full‑time
Job Function: Other
Industries: Software Development

Senior Research Scientist, Model Evaluation

1 week ago

Toronto, Canada Cohere Full time

Who are we? Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises building AI systems that power content generation, semantic search, RAG, and agents. We believe our work is instrumental to the widespread adoption of AI and that each...
Senior Research Scientist, Model Evaluation

4 weeks ago

Toronto, Canada Cohere Full time

Overview Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI. Cohere is a team of...
Senior Research Scientist, Model Evaluation

2 weeks ago

Toronto, Ontario, Canada Cohere Full time

Who are we? Our mission is to scale intelligence to serve humanity. We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI. We obsess over what we...
Senior Research Engineer, Model Evaluation

1 week ago

Toronto, Canada Cohere Full time

Who are...
Senior Research Engineer, Model Evaluation

1 week ago

Toronto, Canada Cohere Full time

Who are we? Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI. We obsess over what we...
Senior Research Engineer, Model Evaluation

6 hours ago

Toronto, Canada Cohere Full time

Who are we? Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI. We obsess over what...
Senior Research Engineer, Model Evaluation

1 week ago

Toronto, Ontario, Canada Cohere Full time

Who are we? Our mission is to scale intelligence to serve humanity. We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI. We obsess over what we...
Senior Applied Scientist – Research Products

6 days ago

Toronto, Canada Refinitiv Full time

Remote type: Hybrid. Locations: CAN-Toronto-19 Duncan Street. Time type: Full time. Posted on: Yesterday. Job requisition ID: JREQ196295. Senior Applied Research Scientist. About the...
Senior Applied Scientist – Research Products

6 days ago

Toronto, Canada Refinitiv Full time

Remote type: Hybrid. Locations: CAN-Toronto-19 Duncan Street. Time type: Full time. Posted on: Yesterday. Job requisition ID: JREQ. Senior Applied Research Scientist. About the Role: Senior Applied...
LLM Evaluation Scientist — Benchmark Innovation

2 weeks ago

Toronto, Canada Cohere Full time

A leading AI research firm in Toronto is seeking a Senior Research Scientist in Model Evaluation. The role involves creating novel evaluation benchmarks, refining measurement techniques, and working collaboratively to improve AI capabilities. Candidates should have strong software engineering skills and a passion for rigorous evaluation methods. This...
