Job Title: Remote Data Engineer (UK-based)
Location: Remote (UK-based)
Employment Type: Full-time
Experience: 3+ years of commercial experience
Job Overview:
As a Data Engineer, you will be responsible for designing, building, and maintaining scalable data pipelines and infrastructure. You will work closely with data scientists, analysts, and other engineers to ensure our data is clean, organised, and easily accessible. This role requires hands-on experience with Python, SQL, and AWS services, along with strong analytical and problem-solving skills.
Key Responsibilities:
- Data Pipeline Development: Design, develop, and maintain ETL pipelines that ingest, process, and store large datasets from multiple sources.
- Database Management: Build and manage scalable databases using SQL and cloud-based technologies.
- Cloud Infrastructure: Deploy, monitor, and optimise data infrastructure on AWS, leveraging services such as S3, Redshift, RDS, Lambda, and EC2.
- Data Quality & Governance: Ensure data integrity, accuracy, and consistency through proper validation, testing, and monitoring.
- Collaboration: Work with data scientists, analysts, and business stakeholders to understand data requirements and implement solutions accordingly.
- Automation & Optimisation: Automate repetitive tasks, optimise performance, and troubleshoot any issues that arise in the data pipeline or infrastructure.
- Documentation: Maintain clear and comprehensive documentation of all data processes and pipelines.
Required Qualifications:
- Experience: 3+ years working commercially in data engineering or a related field.
- Programming: Proficiency in Python for data manipulation and ETL processes.
- SQL: Strong experience with SQL for querying and managing relational databases (e.g., MySQL, PostgreSQL).
- AWS: Hands-on experience with AWS services such as S3, Redshift, RDS, Lambda, EC2, and IAM.
- Data Warehousing: Knowledge of data warehousing concepts and best practices.
- ETL: Experience designing and building ETL processes.
- Version Control: Familiarity with Git or other version control systems.
- Problem-Solving: Strong analytical skills and the ability to troubleshoot data issues effectively.
- Communication: Excellent written and verbal communication skills with the ability to collaborate across teams.
Preferred Qualifications:
- Experience with containerisation and container orchestration technologies such as Docker and Kubernetes.
- Familiarity with Apache Airflow, Kafka, or other data orchestration and messaging tools.
- Experience working with NoSQL databases (e.g., MongoDB, DynamoDB).
- Knowledge of machine learning and data science workflows.
- Experience with infrastructure as code (e.g., Terraform, CloudFormation).
- Familiarity with CI/CD pipelines for data engineering processes.