- Professional
- Office in Bangalore
Job description
What You’ll Do
As a Data Engineer, you'll be a key contributor to building and scaling robust data solutions across the organization. You will:
- Architect Scalable Data Pipelines: Design, develop, and maintain reliable ETL/ELT workflows using Databricks, Spark, and Python.
- Enable Data Access & Analytics: Partner with analytics, product, and engineering teams to ensure timely, accurate, and governed access to data for downstream reporting and analytics.
- Optimize Data Workflows: Improve performance, reduce latency, and streamline processes by tuning SQL, optimizing Spark jobs, and enhancing cloud data pipelines.
- Leverage Cloud Infrastructure: Utilize AWS services (e.g., S3, Glue, Lambda) to manage and scale data engineering workloads.
- Drive Best Practices: Establish and maintain data engineering standards, including code quality, data security, version control, and documentation.
- Build & Maintain Data Models: Construct and support dimensional and normalized data models that serve cross-functional use cases and reporting needs.
- Automate & Monitor: Set up robust pipeline orchestration (e.g., with Airflow, Databricks Jobs, or AWS Step Functions) and monitoring/alerting systems.
- Collaborate Cross-Functionally: Work with data analysts, scientists, and business users to understand requirements and transform raw data into business-ready datasets.
What You’ll Bring
- 5+ years of experience as a Data Engineer or in a similar role.
- Strong hands-on experience with Databricks (Spark, Delta Lake) and Python-based ETL frameworks.
- Solid experience working with AWS cloud services for data processing and storage.
- Proficient in SQL for data wrangling, transformation, and performance tuning.
- Experience with data lake architectures, ELT/ETL development, and orchestration tools.
- Familiarity with software engineering best practices, including CI/CD, version control, and code reviews.
- Strong communication and collaboration skills; comfortable working with technical and non-technical stakeholders.
- Experience with Power BI or other BI tools (e.g., Tableau, Looker) to support data visualization and self-service reporting.
- Exposure to data governance and data quality frameworks.
- Understanding of data cataloging tools and metadata management.