
Senior AI Engineer

AI / ML · Senior · Sofia, Bulgaria / Remote · Full-time

Design and ship production agentic AI, RAG, and LLM systems for clients across BPO, tech media, healthtech, legal, e-commerce, talent, and B2B SaaS. Build agents that plan and execute, retrieval pipelines over 250K+ documents, and LLM workloads serving millions of users annually. The work spans agentic orchestration, Claude/GPT/Mistral, vision models, vector databases, and cloud infrastructure. We deliver working production engineering at scale, not POCs.

Tech Stack

Python, FastAPI, Claude, GPT-4o/5, Bedrock, Pinecone, Qdrant, pgvector, OpenSearch, AWS, Azure, Terraform, Kubernetes, vLLM

About the Role

We're looking for a Senior AI Engineer to join our AI/ML practice and lead the design and delivery of production-grade agentic, RAG, and LLM systems for our clients.

You'll build the kind of systems described on our capabilities page: autonomous agents that plan, execute, and iterate; RAG pipelines indexing hundreds of thousands of documents with hybrid search and re-ranking; document intelligence with OCR and vision models; conversational and voice AI grounded in proprietary data. We currently have 20+ AI projects in production, the largest serving millions of users annually, with 250K+ documents in active RAG pipelines.

You'll collaborate directly with client tech leads while being part of our embedded engineering teams covering backend, data, and DevSecOps. These are long-term engagements with real ownership, not throwaway POCs.

What You'll Do

  • Design and build agentic AI workflows (plan → execute → evaluate → human checkpoint) using frameworks like LangChain, LlamaIndex, LangGraph, or custom orchestration
  • Build RAG pipelines end-to-end: ingestion, semantic chunking, embedding strategies, hybrid retrieval, re-ranking, and citation tracking
  • Integrate frontier and open LLMs (Claude Opus/Sonnet/Haiku, GPT-4o/5, Gemini, Llama, Mistral, Qwen, DeepSeek) with proper guardrails, grounding, and cost optimization
  • Design and ship production tool-calling and multi-agent systems with reliable evaluation, tracing, and rollback
  • Operate vector and hybrid search infrastructure (Pinecone, Qdrant, Milvus, pgvector, OpenSearch) at scale
  • Build document intelligence pipelines using OCR and vision models for extraction, classification, and structuring
  • Implement evaluation harnesses, offline and online metrics, and golden datasets for LLM-driven systems
  • Deploy and operate AI workloads on AWS, Azure, or OVH (Lambda, EKS, SageMaker, Bedrock, Azure AI Services) with IaC and CI/CD
  • Collaborate with client product, data, and engineering leads to scope, architect, and deliver AI features end-to-end

What We're Looking For

  • Strong proficiency in Python, including async, typing, and modern API frameworks (FastAPI)
  • 5+ years of professional software engineering experience, with at least 2 building production LLM applications: RAG, agents, tool use, structured output
  • Hands-on experience with LangChain, LlamaIndex, Haystack, or equivalent orchestration frameworks
  • Deep understanding of embeddings, vector search, hybrid retrieval, and re-ranking strategies
  • Working knowledge of at least one vector database (Pinecone, Qdrant, Milvus, pgvector, OpenSearch)
  • Experience integrating multiple LLM providers (Anthropic, OpenAI, Bedrock, open-source via vLLM/Ollama)
  • Solid grasp of LLM evaluation: regression suites, eval harnesses, hallucination and grounding metrics
  • Experience deploying to AWS or Azure with IaC (Terraform), containers (Docker/Kubernetes), and CI/CD
  • Strong collaboration and written communication skills — you'll work directly with client teams

Nice to Have

  • Experience building multi-agent systems (planner/executor/critic patterns) or with agent frameworks (LangGraph, AutoGen, CrewAI, OpenAI Agents SDK, Claude Agent SDK)
  • LLM fine-tuning with LoRA/QLoRA/PEFT, or training custom embedding models for domain adaptation
  • Self-hosted LLM and GPU deployment experience (vLLM, TGI, SGLang, Triton on Kubernetes, RunPod, Modal, or bare-metal GPU)
  • Background in classical ML: training, fine-tuning, or distillation with PyTorch, TensorFlow, or scikit-learn
  • Experience with voice AI (Whisper, ElevenLabs, real-time STT/TTS pipelines)
  • Document AI experience (Textract, Azure Document Intelligence, vision LLMs)
  • Knowledge of LLM observability tooling (LangSmith, Langfuse, Arize, Helicone)
  • Open-source contributions to AI/ML tooling, models, or infrastructure
  • Awareness of AI safety, prompt injection defence, data privacy, and compliance (GDPR, SOC2)

What We Offer

  • Long-term, meaningful engagements building real AI products in production, not POCs
  • Embedded inside multi-disciplinary teams covering backend, data, DevOps, and AI
  • Cutting-edge stack: Claude (Opus/Sonnet/Haiku/Vision), GPT-4o/5, Bedrock, LangChain, LlamaIndex, Pinecone, Snowflake, AWS, Azure
  • Variety across domains: BPO, tech media, pharma, legal, e-commerce, talent, travel, fintech — agents, RAG, voice, vision
  • Remote-friendly with flexible hours
  • Backed by Eastvantage Group — global delivery, local culture
  • Part of an official Anthropic partner company, with certification and upskilling available
  • Engineering culture that values ownership, transparency, and craft