Machine Learning Evaluation Specialist
About the Role
We're looking for domain experts with strong machine learning backgrounds to design challenging ML evaluation problems that test the boundaries of state-of-the-art AI systems. You'll draw on your specialized research expertise to craft problems that go beyond textbook knowledge — the kind of challenges that require deep, nuanced domain understanding to solve correctly.
Your work directly shapes how we measure and improve the next generation of AI models.
- Organization: Alignerr
- Type: Hourly Contract
- Compensation: $200–$400/hour
- Location: Remote
- Commitment: 10–40 hours/week
What You'll Do
- Propose complex, original machine learning problems rooted in your domain of expertise
- Design evaluation tasks that require advanced domain knowledge beyond standard ML pipelines
- Draw from your own research experience to craft problems that would challenge a highly capable LLM
- Define clear problem statements, evaluation criteria, and gold-standard solutions
- Assess AI-generated ML solutions for correctness, creativity, and methodological rigor
- Document problem difficulty, required domain knowledge, and expected failure modes
- Collaborate asynchronously with a global team of researchers and engineers
Who You Are
- Graduate-level expertise (MS or PhD preferred) in a scientific or technical domain that intersects with machine learning
- Strong working knowledge of ML methods — model selection, feature engineering, evaluation metrics, and pipeline design
- Deep familiarity with active research problems in your field
- Ability to identify where general ML knowledge falls short and specialized domain insight becomes critical
- Experience publishing or conducting original research is highly valued
- Excellent written communication — able to articulate complex problems clearly and precisely
- Self-motivated and comfortable working independently on intellectually demanding tasks
Example Domains (Not Exhaustive)
- Computational biology, genomics, or bioinformatics
- Climate science and environmental modeling
- Medical imaging and healthcare ML
- Materials science and computational chemistry
- Astrophysics and signal processing
- Natural language processing for low-resource or specialized corpora
- Robotics, control theory, or reinforcement learning in complex environments
- Financial modeling and quantitative analysis
Why Join Us
- Work at the frontier of AI evaluation and safety research
- Collaborate with top research labs pushing the boundaries of what AI can do
- Leverage your hard-earned domain expertise in a high-impact, meaningful way
- Full autonomy, flexible schedule, and global collaboration
- Potential for ongoing work, contract extension, and deeper research involvement
- Build your profile as a contributor to cutting-edge AI development
Application Process (Takes 10–15 min)
- Submit your resume highlighting your domain expertise and ML experience
- Complete a short screening assessment
- Get matched to a project and complete onboarding
PS: Our team reviews applications daily. Please complete all application steps to be considered for this opportunity.