Our client is seeking a senior-level AI Quality Assurance Leader specializing in Generative AI, LLM systems, and AI agents, responsible for defining and driving end-to-end quality strategy for scalable and responsible AI deployments. (Remote, USA)
Must-Have Skills
- Generative AI (GenAI) system testing – 7+ years of overall QA experience with hands-on GenAI validation
- Large Language Models (LLMs) – 5+ years of experience validating LLM outputs for accuracy, safety, and bias
- Retrieval-Augmented Generation (RAG) systems – 5+ years of experience testing pipeline performance and retrieval quality
- Azure AI Foundry – 3+ years of experience testing Azure-based AI solutions
- LangGraph – 3+ years of experience validating orchestration and multi-agent workflows
- Test Automation Frameworks for AI Systems – 5+ years of experience building automated validation and evaluation pipelines
- LangSmith evaluation workflows – 3+ years of experience in LLM evaluation and monitoring
- Multi-agent AI architectures – 3+ years of experience testing agent coordination and decision logic
- Responsible AI practices – 5+ years of experience in bias detection, safety validation, and compliance testing
- Performance and load testing for AI systems – 5+ years of experience validating scalability of AI services
- CI/CD integration for AI pipelines – 5+ years of experience embedding QA into deployment workflows
- AI evaluation metrics design (BLEU, ROUGE, custom scoring, etc.) – 3+ years of experience defining quality benchmarks
Responsibilities
• Define and lead QA strategy for Generative AI pipelines, RAG systems, and multi-agent workflows
• Validate LLM outputs for accuracy, safety, bias, and performance across environments
• Oversee quality assurance of Azure AI Foundry-based AI solutions
• Ensure quality across LangGraph orchestration and LangSmith evaluation workflows
• Establish automated testing frameworks and AI-specific evaluation metrics
• Lead QA teams to ensure scalable, reliable, and responsible AI deployments