Design and optimize scalable data pipelines powering AI-driven products for a leading UK consumer advocacy platform. AWS, Snowflake, Airflow, Python — production engineering at scale.
About the Role
We're looking for a Senior Data Engineer to join our team and work on one of our flagship long-term client engagements — a leading UK consumer advocacy organization transforming how millions of people make purchasing decisions.
You'll design, build, and optimize scalable data pipelines powering AI-driven products, including a RAG-powered knowledge assistant that searches 100K+ product reviews with semantic understanding. This is production engineering at scale — not a proof of concept.
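To give a flavour of the kind of retrieval a semantic review assistant rests on, here is a minimal, self-contained sketch: ranking toy review "embeddings" by cosine similarity to a query embedding. The vectors and identifiers are illustrative stand-ins — the production system's embedding model and vector store are not described here.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, review_vecs, k=2):
    """Rank review embeddings by similarity to the query embedding."""
    scored = [(rid, cosine_similarity(query_vec, vec))
              for rid, vec in review_vecs.items()]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]

# Toy 3-dimensional "embeddings" standing in for a real model's output.
reviews = {
    "r1": [0.9, 0.1, 0.0],
    "r2": [0.1, 0.9, 0.2],
    "r3": [0.8, 0.2, 0.1],
}
print(top_k([1.0, 0.0, 0.0], reviews))
```

In a real RAG pipeline the top-ranked reviews would be passed to the language model as grounding context; at 100K+ documents the brute-force scan above is replaced by an approximate-nearest-neighbour index.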
You'll collaborate directly with the client's data tech lead while being part of our embedded 10-person engineering team covering mobile, backend, data, and DevSecOps.
What You'll Do
Design and develop scalable ETL/ELT pipelines using Python, SQL, and PySpark
Build reusable data frameworks on AWS, Snowflake, and Managed Airflow
Implement infrastructure-as-code using Terraform for cloud-based data environments
Develop and maintain data models, transformations, and orchestration workflows
Ensure data quality, observability, and lineage tracking across the ecosystem
Optimize query performance, storage costs, and compute resources in Snowflake and AWS
Support AI/ML workloads — feeding clean, structured data into RAG pipelines and recommendation systems
Implement CI/CD pipelines for data infrastructure automation
Monitor and troubleshoot data pipelines to maintain SLAs
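The data-quality work above often takes the shape of validation gates inside a pipeline. A minimal sketch of the pattern (field names and checks here are hypothetical, not the client's actual schema): rows failing basic checks are quarantined for inspection rather than dropped silently.

```python
def validate_rows(rows, required_fields=("id", "price")):
    """Split raw records into clean rows and quarantined rows.

    Rows missing a required field are routed aside with the list of
    failing fields, so downstream loads only see clean data.
    """
    clean, quarantined = [], []
    for row in rows:
        missing = [f for f in required_fields if row.get(f) is None]
        if missing:
            quarantined.append({"row": row, "errors": missing})
        else:
            clean.append(row)
    return clean, quarantined

raw = [
    {"id": 1, "price": 9.99},
    {"id": 2, "price": None},   # fails the null check
    {"id": 3, "price": 4.50},
]
clean, bad = validate_rows(raw)
print(len(clean), len(bad))
```

The same idea scales up as PySpark filters or dbt/Snowflake tests; the quarantine counts also make a natural observability metric for SLA monitoring.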
What We're Looking For
Strong proficiency in SQL, Python, and PySpark for data processing and transformation
Hands-on experience with AWS (S3, Glue, Lambda, Redshift, Bedrock)
Expertise in Snowflake — performance tuning, schema design, cost optimization
Experience with Terraform for infrastructure automation
Proficiency in Airflow or other orchestration tools
Understanding of data observability, monitoring, and governance best practices
Experience with Git and CI/CD for data pipelines
Strong collaboration skills — you'll work directly with the client's team
Experience with code-based ETL/ELT tools
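For candidates less familiar with orchestration tools: at its core, an orchestrator like Airflow runs tasks in dependency order over a DAG. A stdlib-only sketch of that idea (the stage names are illustrative; Airflow itself is not required to run this):

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline stages mapped to their upstream dependencies,
# mirroring what an Airflow DAG declares.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "quality_check": {"transform"},
    "load": {"quality_check"},
}

def run_pipeline(dag, tasks):
    """Execute callables in dependency order, as an orchestrator would."""
    order = list(TopologicalSorter(dag).static_order())
    results = [tasks[name]() for name in order]
    return order, results

# Stub tasks; in Airflow these would be operators with retries and SLAs.
tasks = {name: (lambda n=name: f"{n} done") for name in dag}
order, _ = run_pipeline(dag, tasks)
print(order)
```

What Airflow (or Managed Airflow on AWS) adds on top is scheduling, retries, backfills, and monitoring — the parts that keep the SLAs mentioned above.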
Nice to Have
Experience with RAG pipelines, vector databases (Pinecone, OpenSearch), or AI/ML data preparation
Exposure to Data Mesh and distributed data ownership
Experience with Docker, Kafka, Kinesis
Knowledge of data security and compliance frameworks (GDPR, PCI DSS, SOC2)
Experience in cost optimization for cloud-based data architectures
What We Offer
Long-term, meaningful work on a high-profile product used by millions
A place on a 10-person embedded team — you won't be a solo contractor
Work with cutting-edge AI/ML stack (Amazon Bedrock, Claude, Pinecone, Snowflake)
Remote-friendly with flexible hours
Backed by Eastvantage Group — global delivery, local culture
Engineering culture that values ownership, transparency, and craft