Design and ship production agentic AI, RAG, and LLM systems for clients across BPO, tech media, healthtech, legal, e-commerce, talent, and B2B SaaS. Build agents that plan and execute, retrieval pipelines over 250K+ documents, and LLM workloads serving millions of users annually.
Agentic orchestration, Claude/GPT/Mistral, vision models, vector databases, cloud.
We deliver working production engineering at scale, not POCs.
About the Role
We're looking for a Senior AI Engineer to join our AI/ML practice and lead the design and delivery of production-grade agentic, RAG, and LLM systems for our clients.
You'll build the kind of systems described on our capabilities page: autonomous agents that plan, execute, and iterate; RAG pipelines indexing hundreds of thousands of documents with hybrid search and re-ranking; document intelligence with OCR and vision models; and conversational and voice AI grounded in proprietary data. We currently have 20+ AI projects in production, the largest serving millions of users annually, with 250K+ documents in active RAG pipelines.
You'll collaborate directly with client tech leads as part of our embedded engineering teams covering backend, data, and DevSecOps. These are long-term engagements with real ownership, not throwaway POCs.
What You'll Do
Design and build agentic AI workflows (plan → execute → evaluate → human checkpoint) using frameworks like LangChain, LlamaIndex, LangGraph, or custom orchestration
Build RAG pipelines end-to-end: ingestion, semantic chunking, embedding strategies, hybrid retrieval, re-ranking, and citation tracking
Integrate frontier and open LLMs (Claude Opus/Sonnet/Haiku, GPT-4o/5, Gemini, Llama, Mistral, Qwen, DeepSeek) with proper guardrails, grounding, and cost optimization
Design and ship production tool-calling and multi-agent systems with reliable evaluation, tracing, and rollback
Operate vector and hybrid search infrastructure (Pinecone, Qdrant, Milvus, pgvector, OpenSearch) at scale
Build document intelligence pipelines using OCR and vision models for extraction, classification, and structuring
Implement evaluation harnesses, offline and online metrics, and golden datasets for LLM-driven systems
Deploy and operate AI workloads on AWS, Azure, or OVH (Lambda, EKS, SageMaker, Bedrock, Azure AI Services) with IaC and CI/CD
Collaborate with client product, data, and engineering leads to scope, architect, and deliver AI features end-to-end
What We're Looking For
Strong proficiency in Python, including async, typing, and modern API frameworks (FastAPI)
5+ years of professional software engineering experience, with at least 2 building production LLM applications: RAG, agents, tool use, structured output
Hands-on experience with LangChain, LlamaIndex, Haystack, or equivalent orchestration frameworks
Deep understanding of embeddings, vector search, hybrid retrieval, and re-ranking strategies
Working knowledge of at least one vector database (Pinecone, Qdrant, Milvus, pgvector, OpenSearch)
Experience integrating multiple LLM providers (Anthropic, OpenAI, Bedrock, open-source via vLLM/Ollama)
Solid grasp of LLM evaluation: regression suites, eval harnesses, hallucination and grounding metrics
Experience deploying to AWS or Azure with IaC (Terraform), containers (Docker/Kubernetes), and CI/CD
Strong collaboration and written communication skills — you'll work directly with client teams
Nice to Have
Experience building multi-agent systems (planner/executor/critic patterns) or with agent frameworks (LangGraph, AutoGen, CrewAI, OpenAI Agents SDK, Claude Agent SDK)
LLM fine-tuning with LoRA/QLoRA/PEFT, or training custom embedding models for domain adaptation
Self-hosted LLM and GPU deployment experience (vLLM, TGI, SGLang, Triton on Kubernetes, RunPod, Modal, or bare-metal GPU)
Background in classical ML: training, fine-tuning, or distillation with PyTorch, TensorFlow, or scikit-learn
Experience with voice AI (Whisper, ElevenLabs, real-time STT/TTS pipelines)
Document AI experience (Textract, Azure Document Intelligence, vision LLMs)
Knowledge of LLM observability tooling (LangSmith, Langfuse, Arize, Helicone)
Open-source contributions to AI/ML tooling, models, or infrastructure
Awareness of AI safety, prompt injection defence, data privacy, and compliance (GDPR, SOC2)
What We Offer
Long-term, meaningful engagements building real AI products in production, not POCs
Embedded inside multi-disciplinary teams covering backend, data, DevOps, and AI
Cutting-edge stack: Claude (Opus/Sonnet/Haiku/Vision), GPT-4o/5, Bedrock, LangChain, LlamaIndex, Pinecone, Snowflake, AWS, Azure
Variety across domains (BPO, tech media, pharma, legal, e-commerce, talent, travel, fintech) and problem types (agents, RAG, voice, vision)
Remote-friendly with flexible hours
Backed by Eastvantage Group — global delivery, local culture
Part of an official Anthropic partner company, with certification and upskilling available
Engineering culture that values ownership, transparency, and craft