Software Engineer – AI Code Evaluator
About the Role
We're looking for experienced software engineers in Poland to evaluate and improve frontier AI models. You'll use your deep expertise in TypeScript, Ruby, Java, or C++ to identify bugs, hallucinations, and subtle failure modes in AI-generated code.
- Organization: Alignerr
- Type: Hourly Contract
- Compensation: $50–$100/hour
- Location: Remote
- Commitment: 10–40 hours/week
What You'll Do
- Evaluate the performance of frontier language models on complex software engineering tasks
- Identify bugs, logical errors, hallucinations, and reliability issues in model outputs
- Design and review prompts, test cases, and evaluation scenarios for advanced coding workflows
- Provide precise written feedback explaining model strengths, weaknesses, and edge cases
- Work across multiple languages and codebases to assess generalization and correctness
Who You Are
- 3–4+ years of professional software engineering experience
- Strong proficiency in at least one of: TypeScript, Ruby, Java, or C++
- Excellent written and spoken English
- Demonstrated ability to reason about complex systems and debug non-obvious issues
- Familiarity with modern development tooling (Git, CLI workflows, testing frameworks) and AI/LLM-assisted workflows
- Ability to critically evaluate model behavior rather than simply using model outputs
Why Join Us
- Competitive pay and flexible remote work
- Work on cutting-edge AI projects with top research labs
- Freelance perks: autonomy, flexibility, and global collaboration
- Potential for ongoing work and contract extension
Application Process (Takes 10–15 min)
- Submit your resume
- Complete a short screening
- Project matching and onboarding
PS: Our team reviews applications daily. Please complete all application steps to be considered for this opportunity.