- Senior
- Optional office in Kochi
Skill: Azure Databricks + Spark
Experience: 6 to 9 years
Location: AIA Kochi
We are seeking a Senior Developer with 6 to 9 years of experience to join our dynamic team. The ideal candidate will have expertise in Spark in Scala, Delta Sharing, and Databricks Unity Catalog administration. This role involves working with technologies such as the Databricks CLI, Delta Live Tables pipelines, and Structured Streaming. The candidate will contribute to risk management solutions and leverage tools such as Apache Airflow, Amazon S3, and Amazon Redshift. Proficiency in Python, Databricks SQL, and Delta Lake is essential.
Responsibilities
- Develop and implement scalable data processing solutions using Spark in Scala to enhance data-driven decision-making.
- Manage and administer Databricks Unity Catalog to ensure data governance and security compliance (a short Unity Catalog sketch follows this list).
- Utilize Delta Sharing to facilitate secure and efficient data sharing across platforms (see the consumer-side sketch after this list).
- Configure and maintain the Databricks CLI for seamless integration and automation of workflows.
- Design and run Delta Live Tables pipelines to streamline data ingestion and transformation processes (sketched after this list).
- Implement Structured Streaming solutions to handle real-time data processing and analytics (see the streaming sketch after this list).
- Collaborate with cross-functional teams to integrate risk management strategies into data solutions.
- Leverage Apache Airflow to orchestrate complex data workflows and ensure timely execution (an Airflow sketch follows this list).
- Optimize data storage and retrieval using Amazon S3 and Amazon Redshift to improve performance.
- Develop Python scripts to automate data processing tasks and enhance operational efficiency.
- Utilize Databricks SQL for querying and analyzing large datasets to derive actionable insights.
- Implement Databricks Delta Lake to ensure data reliability and consistency across the platform.
- Manage Databricks Workflows to automate and streamline data engineering processes.
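To make a few of these duties concrete, here are short, non-authoritative sketches. First, day-to-day Unity Catalog administration is largely a matter of managing securables and grants; this minimal sketch assumes a catalog `main`, a schema `risk`, and a group `risk_analysts` (all hypothetical names), and uses the `spark` session that Databricks notebooks and jobs provide:

```python
# Routine Unity Catalog administration expressed as SQL; all object and
# group names below are hypothetical placeholders.
spark.sql("CREATE SCHEMA IF NOT EXISTS main.risk")
spark.sql("GRANT USE SCHEMA ON SCHEMA main.risk TO `risk_analysts`")
spark.sql("GRANT SELECT ON TABLE main.risk.exposures TO `risk_analysts`")
```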
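Consuming a shared dataset with the open `delta-sharing` Python client; the profile file and the share/schema/table coordinates are placeholders, not details from this posting:

```python
import delta_sharing

# The .share profile file is issued by the data provider; the
# share/schema/table path below is hypothetical.
table_url = "config.share#risk_share.risk.exposures"
exposures = delta_sharing.load_as_pandas(table_url)
print(exposures.head())
```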
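A Delta Live Tables pipeline step; this runs only inside a DLT pipeline on Databricks, and the upstream table name `events_raw` is an assumption:

```python
import dlt
from pyspark.sql import functions as F

# A DLT table definition with a basic data-quality expectation; rows
# with a NULL amount are dropped. "events_raw" is a hypothetical source.
@dlt.table(comment="Events with basic quality filtering applied")
@dlt.expect_or_drop("valid_amount", "amount IS NOT NULL")
def events_clean():
    return dlt.read_stream("events_raw").withColumn(
        "ingested_at", F.current_timestamp()
    )
```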
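A Structured Streaming job that lands events in a Delta table, which also illustrates the Delta Lake reliability point above; the source path, schema, checkpoint location, and table name are illustrative only:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("events-ingest").getOrCreate()

# Stream JSON events from cloud storage; the bucket and schema are
# hypothetical stand-ins for the real sources.
events = (
    spark.readStream
    .format("json")
    .schema("event_id STRING, ts TIMESTAMP, amount DOUBLE")
    .load("s3://example-bucket/events/")
)

# Append into a Delta table; the checkpoint provides exactly-once
# delivery across restarts.
(
    events.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://example-bucket/_chk/events")
    .outputMode("append")
    .toTable("main.risk.events_raw")
    .awaitTermination()
)
```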
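Finally, orchestrating a Databricks job from Apache Airflow with the official Databricks provider; the job id and connection id are assumptions:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import (
    DatabricksRunNowOperator,
)

# A minimal daily DAG that triggers an existing Databricks job; the
# job_id and connection id below are hypothetical.
with DAG(
    dag_id="daily_risk_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    DatabricksRunNowOperator(
        task_id="run_ingest_job",
        databricks_conn_id="databricks_default",
        job_id=12345,
    )
```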
Qualifications
- Possess strong expertise in Spark in Scala and Databricks Unity Catalog administration.
- Demonstrate proficiency in Delta Sharing and the Databricks CLI for data management.
- Have experience with Delta Live Tables pipelines and Structured Streaming for real-time analytics.
- Show capability in integrating risk management requirements into data solutions.
- Be skilled in using Apache Airflow, Amazon S3, and Amazon Redshift for data orchestration and storage.
- Exhibit strong Python programming skills for automation and data processing.
- Have a solid understanding of Databricks SQL and Delta Lake for data analysis and reliability.
- Experience with PySpark for distributed data processing is highly desirable (a brief sketch follows).
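As a small illustration of the PySpark skill set, a minimal batch aggregation; the table and column names are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("exposure-rollup").getOrCreate()

# Daily exposure totals per counterparty; all names are placeholders.
rollup = (
    spark.table("main.risk.exposures")
    .groupBy("counterparty", F.to_date("ts").alias("day"))
    .agg(F.sum("amount").alias("total_exposure"))
)
rollup.write.format("delta").mode("overwrite").saveAsTable(
    "main.risk.exposure_daily"
)
```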