Senior/Lead Data Engineer - Databricks at Srijan Technologies PVT LTD
Srijan Technologies PVT LTD · Gurgaon, India · Onsite
- Senior
- Office in Gurgaon
Location: Gurgaon
Senior/Lead Data Engineer - Databricks
We are seeking a highly skilled and experienced Data Engineering Lead with a strong background in the retail domain and exceptional programming abilities. As a Lead, you will play a pivotal role in implementing and optimizing the data architecture that supports our retail business operations and analytics initiatives. Your expertise in Spark programming and optimization techniques, together with your familiarity with Databricks and CI/CD practices, will be instrumental in managing our data ecosystem efficiently and effectively.
Important: the skills listed below should appear in the CV.
Data Platform (Data Engineers)
- Data Lake: AWS S3 / Azure / Dell Cloud Data Lake with Delta Lake format
- Data Warehouse: Databricks, IOMETE, Google BigQuery for analytical workloads
- Batch Processing / Stream Processing: batch and real-time processing of data pipelines
- Database: PostgreSQL for transactional data, NoSQL (e.g. MongoDB, Cassandra) for document storage
Responsibilities:
- Design and develop data models, data integration processes, and data pipelines to capture, transform, and load structured and unstructured data from various retail sources.
- Hands-on programming in Spark to develop and optimize data processing applications and analytics workflows.
- Apply optimization techniques to enhance the performance and efficiency of data processing and analytical tasks.
- Evaluate and implement appropriate tools and technologies, including Databricks, to streamline data operations and ensure scalability and reliability.
- Work closely with other team members to ensure data integrity, consistency, and accessibility across the organization.
- Define and enforce best practices for data governance and data management, including data quality, metadata management, and data security.
- Collaborate with DevOps teams to establish and maintain CI/CD pipelines for data engineering and analytics workflows.
- Peer-review team members' deliverables.
- Stay updated with the latest advancements and trends in the retail domain, data architecture, and programming languages to drive continuous improvement.
Requirements:
- At least 5 years of experience in the data engineering domain.
- Proven experience as a Senior Data Engineer/Lead, preferably within the retail industry.
- Strong programming skills with expertise in PySpark programming and optimization techniques.
- Hands-on experience with Databricks, Delta Lake, and their components for data processing and analytics.
- Hands-on experience in data modelling, data integration, and ETL/ELT processes.
- Experience working with GitLab pipelines and an in-depth understanding of CI/CD pipeline design.
- Experience with data governance, data quality, and metadata management.
- Strong analytical and problem-solving abilities with a detail-oriented mindset.
- Excellent communication and collaboration skills to work effectively with cross-functional teams.
- Ability to adapt to a fast-paced and evolving environment while managing multiple priorities.
- Good to have: experience with at least one cloud vendor (AWS / Azure / GCP).
- Good to have: experience with streaming technologies.