Researcher, AI Evaluation

Mercor
Full-time
Remote
Worldwide
$180,000 - $300,000 USD yearly

About Mercor

Mercor is training models that predict how well someone will perform on a job better than a human can. Similar to how a human would review a resume, conduct an interview, and decide who to hire, we automate all of those processes with LLMs. Our technology is so effective it’s used by all of the top 5 AI labs.

We crossed a $100M revenue run rate and have averaged 59% month-over-month growth for the last 6 months, making us the fastest-growing company in the world. The team is small and we remain extremely profitable because we can’t hire great people as fast as revenue is growing.

About the Role

Silicon Valley’s top AI companies work with Mercor to find domain experts who can help train and evaluate their models. As a researcher on the evaluation team at Mercor, you will be responsible for advancing the frontier of model evaluations to drive model improvements across the industry that create real-world economic value. You will frequently publish impactful papers with industry-leading collaborators, have ample resources to create high-impact datasets, and have access to the frontier of evaluation and training data. You will work closely with Mercor’s Forward Deployed Research, Applied AI, and Operations teams and have unmatched access to evaluate frontier models.

We are looking for an experienced AI researcher. A track record of LLM evaluation publications is preferred, but publication experience in evaluating other types of models, or other AI-related publications, is of interest as well.

Key Responsibilities

  • Build benchmarks that measure the real-world value of AI models.
  • Publish LLM evaluation papers in top conferences with the support of the Mercor Applied AI and Operations teams.
  • Push the frontier of understanding data ROI in model development including multi-modality, code, tool-use, and more.
  • Design and validate novel data collection and annotation offerings for the leading industry labs and big tech companies.

What Are We Looking For?

  • PhD or M.S. and 2+ years of work experience in a computer science, electrical engineering, econometrics, or another STEM field that provides a solid understanding of ML and model evaluation.
  • Strong publication record in AI research, ideally in LLM evaluation. Dataset and evaluation papers are preferred.
  • Strong understanding of LLMs and the data on which they are trained and evaluated.
  • Strong communication skills and ability to present findings clearly and concisely.
  • Familiarity with data annotation workflows.
  • Good understanding of statistics.

Compensation

  • Base cash compensation of $180K-$300K
  • Generous equity grant
  • A $20K relocation bonus (if moving to the Bay Area)
  • A $10K housing bonus (if you live within 0.5 miles of our office)
  • A $1K monthly stipend for meals
  • Free Equinox membership
  • Health insurance

We consider all qualified applicants without regard to legally protected characteristics and provide reasonable accommodations upon request.