This is a remote, project-based role for machine learning researchers with deep expertise in mechanistic interpretability. You will complete tasks at the frontier of interpretability research, including analyzing internal model representations, reverse-engineering learned circuits, and developing tools and techniques to understand how neural networks compute. Work runs over the next 2–3 weeks, is asynchronous, and is assigned on a project-by-project basis, with an expected commitment of 10–20 hours per week for the projects you accept. The position offers exceptional pay, exposure to cutting-edge AI safety and interpretability research, and a strong addition to your research portfolio.
Commitment: 10–20 hours/week | Pay: $150–$200/hr | Type: Contract
Responsibilities
- Conduct mechanistic interpretability research on transformer-based and other neural network architectures
- Identify, isolate, and analyze computational circuits responsible for specific model behaviors
- Apply and extend techniques such as activation patching, probing, sparse autoencoders, and attention analysis (an activation-patching sketch follows this list)
- Develop tools and frameworks to automate or scale interpretability workflows across model families
- Document methodologies, findings, and technical approaches clearly and reproducibly
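For a flavor of the day-to-day work, here is a minimal activation-patching sketch. It assumes GPT-2 loaded through TransformerLens (one of the libraries named under Preferred Qualifications); the prompt pair, patch position, and layer sweep are illustrative choices, not a prescribed workflow.

```python
# Minimal activation-patching sketch (illustrative, not a prescribed workflow).
from transformer_lens import HookedTransformer, utils

model = HookedTransformer.from_pretrained("gpt2")

# Clean/corrupt prompt pair differing in one token (" France" vs " Italy" are
# each a single GPT-2 token, so positions line up across the two runs).
clean_tokens = model.to_tokens("The capital of France is")
corrupt_tokens = model.to_tokens("The capital of Italy is")

# Cache all activations on the clean run.
_, clean_cache = model.run_with_cache(clean_tokens)

SUBJECT_POS = 4  # index of the country token, given the prepended BOS

def patch_residual(resid, hook):
    # Overwrite the corrupt run's residual stream at the subject position
    # with the clean run's cached activations at the same hook point.
    resid[:, SUBJECT_POS, :] = clean_cache[hook.name][:, SUBJECT_POS, :]
    return resid

paris = model.to_single_token(" Paris")

# Sweep layers, patching resid_pre at each, and record how much of the clean
# answer's logit is restored at the final position.
for layer in range(model.cfg.n_layers):
    patched_logits = model.run_with_hooks(
        corrupt_tokens,
        fwd_hooks=[(utils.get_act_name("resid_pre", layer), patch_residual)],
    )
    print(f"layer {layer:2d}: logit(' Paris') = {patched_logits[0, -1, paris].item():.2f}")
```

Layers where patching restores the clean answer's logit are candidate locations of the circuit carrying the relevant information; isolating and analyzing such circuits is the core of the role.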
Required Qualifications
- Published researcher with at least one first-author publication in a peer-reviewed venue (e.g., NeurIPS, ICML, ICLR, or equivalent)
- Master's or PhD in Machine Learning, Artificial Intelligence, Computer Science, or a related quantitative field
- Demonstrated expertise in mechanistic interpretability, model analysis, or AI safety research
- Deep familiarity with transformer architectures and modern large language model internals
- Strong problem-solving skills and ability to work independently on open-ended research tasks
Preferred Qualifications
- Hands-on experience with interpretability tools and libraries (e.g., TransformerLens, baukit, or similar)
- Familiarity with sparse autoencoders, superposition, and feature geometry research (see the sketch after this list)
- Background in TA'ing or teaching deep learning, NLP, or AI safety courses
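The sparse-autoencoder work mentioned above is compact enough to sketch: a small PyTorch SAE trained to reconstruct cached activations under an L1 sparsity penalty. The dimensions, ReLU encoder, and penalty weight below are illustrative defaults, not any particular library's implementation.

```python
# Minimal sparse-autoencoder sketch (illustrative defaults throughout).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_features: int, l1_coeff: float = 1e-3):
        super().__init__()
        self.enc = nn.Linear(d_model, d_features)
        self.dec = nn.Linear(d_features, d_model)
        self.l1_coeff = l1_coeff

    def forward(self, acts: torch.Tensor):
        # Encode activations into an overcomplete, non-negative feature basis.
        features = torch.relu(self.enc(acts))
        recon = self.dec(features)
        # Reconstruction loss plus an L1 sparsity penalty on the features.
        loss = (recon - acts).pow(2).mean() + self.l1_coeff * features.abs().sum(-1).mean()
        return recon, features, loss

# Usage: train on cached residual-stream activations of shape [batch, d_model].
sae = SparseAutoencoder(d_model=768, d_features=768 * 8)
acts = torch.randn(32, 768)  # stand-in for real cached activations
recon, features, loss = sae(acts)
loss.backward()
```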
Why Apply
- Flexible Time Commitment – Work on your schedule while tackling meaningful research challenges
- Startup Exposure – Work directly with an early-stage Y Combinator-backed company, gaining hands-on experience that sets you apart
- Exceptional Pay – Project-based pay ranges from $150–$200/hour
- Portfolio Building – Gain experience on frontier interpretability and AI safety research problems
- Professional Growth – Sharpen your skills on varied, challenging model analysis and reverse-engineering tasks