Design and optimize scalable data pipelines powering AI-driven products for a leading UK consumer advocacy platform. AWS, Snowflake, Airflow, Python — production engineering at scale.
About the Role
We're looking for a Senior Data Engineer to join our team and work on one of our flagship long-term client engagements — a leading UK consumer advocacy organization transforming how millions of people make purchasing decisions.
You'll design, build, and optimize scalable data pipelines powering AI-driven products, including a RAG-powered knowledge assistant that searches 100K+ product reviews with semantic understanding. This is production engineering at scale — not a proof of concept.
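To give a flavour of the kind of retrieval a semantic review assistant rests on, here is a minimal, self-contained sketch: ranking toy review "embeddings" by cosine similarity to a query embedding. The vectors and identifiers are illustrative stand-ins — the production system's embedding model and vector store are not described here.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, review_vecs, k=2):
    """Rank review embeddings by similarity to the query embedding."""
    scored = [(rid, cosine_similarity(query_vec, vec))
              for rid, vec in review_vecs.items()]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]

# Toy 3-dimensional "embeddings" standing in for a real model's output.
reviews = {
    "r1": [0.9, 0.1, 0.0],
    "r2": [0.1, 0.9, 0.2],
    "r3": [0.8, 0.2, 0.1],
}
print(top_k([1.0, 0.0, 0.0], reviews))
```

In a real RAG pipeline the top-ranked reviews would be passed to the language model as grounding context; at 100K+ documents the brute-force scan above is replaced by an approximate-nearest-neighbour index.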
You'll collaborate directly with the client's data tech lead while being part of our embedded 10-person engineering team covering mobile, backend, data, and DevSecOps.
What You'll Do
Design and develop scalable ETL/ELT pipelines using Python, SQL, and PySpark
Build reusable data frameworks on AWS, Snowflake, and Managed Airflow
Implement infrastructure-as-code using Terraform for cloud-based data environments
Develop and maintain data models, transformations, and orchestration workflows
Ensure data quality, observability, and lineage tracking across the ecosystem
Optimize query performance, storage costs, and compute resources in Snowflake and AWS
Support AI/ML workloads — feeding clean, structured data into RAG pipelines and recommendation systems
Implement CI/CD pipelines for data infrastructure automation
Monitor and troubleshoot data pipelines to maintain SLAs
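The data-quality work above often takes the shape of validation gates inside a pipeline. A minimal sketch of the pattern (field names and checks here are hypothetical, not the client's actual schema): rows failing basic checks are quarantined for inspection rather than dropped silently.

```python
def validate_rows(rows, required_fields=("id", "price")):
    """Split raw records into clean rows and quarantined rows.

    Rows missing a required field are routed aside with the list of
    failing fields, so downstream loads only see clean data.
    """
    clean, quarantined = [], []
    for row in rows:
        missing = [f for f in required_fields if row.get(f) is None]
        if missing:
            quarantined.append({"row": row, "errors": missing})
        else:
            clean.append(row)
    return clean, quarantined

raw = [
    {"id": 1, "price": 9.99},
    {"id": 2, "price": None},   # fails the null check
    {"id": 3, "price": 4.50},
]
clean, bad = validate_rows(raw)
print(len(clean), len(bad))
```

The same idea scales up as PySpark filters or dbt/Snowflake tests; the quarantine counts also make a natural observability metric for SLA monitoring.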
What We're Looking For
Strong proficiency in SQL, Python, and PySpark for data processing and transformation
Hands-on experience with AWS (S3, Glue, Lambda, Redshift, Bedrock)
Expertise in Snowflake — performance tuning, schema design, cost optimization
Experience with Terraform for infrastructure automation
Proficiency in Airflow or other orchestration tools
Understanding of data observability, monitoring, and governance best practices
Experience with Git and CI/CD for data pipelines
Strong collaboration skills — you'll work directly with the client's team
Experience with code-based ETL/ELT tools
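For candidates less familiar with orchestration tools: at its core, an orchestrator like Airflow runs tasks in dependency order over a DAG. A stdlib-only sketch of that idea (the stage names are illustrative; Airflow itself is not required to run this):

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline stages mapped to their upstream dependencies,
# mirroring what an Airflow DAG declares.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "quality_check": {"transform"},
    "load": {"quality_check"},
}

def run_pipeline(dag, tasks):
    """Execute callables in dependency order, as an orchestrator would."""
    order = list(TopologicalSorter(dag).static_order())
    results = [tasks[name]() for name in order]
    return order, results

# Stub tasks; in Airflow these would be operators with retries and SLAs.
tasks = {name: (lambda n=name: f"{n} done") for name in dag}
order, _ = run_pipeline(dag, tasks)
print(order)
```

What Airflow (or Managed Airflow on AWS) adds on top is scheduling, retries, backfills, and monitoring — the parts that keep the SLAs mentioned above.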
Nice to Have
Experience with RAG pipelines, vector databases (Pinecone, OpenSearch), or AI/ML data preparation
Exposure to Data Mesh and distributed data ownership
Experience with Docker, Kafka, Kinesis
Knowledge of data security and compliance frameworks (GDPR, PCI DSS, SOC2)
Experience in cost optimization for cloud-based data architectures
What We Offer
Long-term, meaningful work on a high-profile product used by millions
A place on a 10-person embedded team — you won't be a solo contractor
Work with cutting-edge AI/ML stack (Amazon Bedrock, Claude, Pinecone, Snowflake)
Remote-friendly with flexible hours
Backed by Eastvantage Group — global delivery, local culture
Engineering culture that values ownership, transparency, and craft