Our client is seeking an AI Engineer to design, build, and productionize agentic AI systems, LLM applications, and RAG pipelines that improve production operations, reliability, automation, and governance. (Hybrid, Toronto, 3 days on site per week)
Must-Have Skills
- 5+ years of software development experience in one or more languages such as Python, C/C++, Go, or Java.
- Strong hands-on experience building and maintaining large-scale Python applications.
- 3+ years of experience designing, architecting, testing, and launching production ML systems.
- Experience with model deployment and serving, evaluation and monitoring, data processing pipelines, and model fine-tuning workflows.
- Practical experience with Large Language Models, including API integration, prompt engineering, fine-tuning or adaptation, RAG, and tool-using agents.
- Experience building agentic AI systems using vector retrieval, function calling, secure tool execution, structured reasoning, and guardrails.
- Understanding of commercial and open-source LLMs such as OpenAI's GPT models, Gemini, Llama, Qwen, and Claude.
- Solid understanding of applied statistics, core machine learning concepts, algorithms, and data structures.
Nice-to-Have Skills
- Proficiency in building and operating cloud infrastructure, ideally on AWS.
- Experience with AWS containerized services such as ECS or EKS.
- Experience with serverless technologies such as AWS Lambda.
- Experience with AWS data services such as S3, DynamoDB, and Redshift.
- Experience with orchestration tools such as AWS Step Functions.
- Experience with model serving platforms such as SageMaker.
- Experience with infrastructure-as-code tools such as Terraform or CloudFormation.
Responsibilities
- Design and implement tool-calling agentic AI systems that combine retrieval, structured reasoning, and secure action execution.
- Build AI agents using function calling, change orchestration, policy enforcement, and Model Context Protocol (MCP) patterns.
- Engineer robust safety guardrails to support compliance, governance, and least-privilege access.
- Build evaluation frameworks for open-source and foundation LLMs.
- Develop retrieval pipelines, prompt synthesis, response validation, and self-correction loops for production operations.
- Connect AI agents to observability, incident management, and deployment systems.
- Enable automated diagnostics, runbook execution, remediation, and post-incident summarization with full traceability.
- Partner with production engineers and application teams to translate operational pain points into agentic AI roadmaps.
- Define objective functions tied to reliability, risk reduction, cost optimization, and business-aligned outcomes.
- Build validator models, adversarial prompts, policy checks, deterministic fallbacks, circuit breakers, and rollback strategies.
- Instrument continuous evaluations for usefulness, correctness, and risk.
- Optimize cost and latency through prompt engineering, context management, caching, model routing, and distillation.
- Use batching, streaming, and parallel tool calls to meet stringent SLOs under real-world production load.
- Build and maintain RAG pipelines, including domain knowledge curation, data-quality validation, feedback loops, and knowledge freshness frameworks.
- Drive design reviews, rigorous experimentation, and high-quality engineering practices.
- Mentor peers on agent architectures, evaluation methodologies, and safe deployment patterns.