- Professional
Required Skills & Qualifications:
- Bachelor's degree in Computer Science, Engineering, or related technical field
- 5+ years of software development experience
- Strong proficiency in Python, Java, or Scala
- Extensive hands-on experience with Apache Spark, including:
  - Spark SQL
  - Spark Streaming and its data sources (Kafka, CDC feeds, etc.)
  - Performance optimization and tuning
  - Data transformation and processing
 
- Hands-on experience with cloud-based Spark platforms (Databricks, AWS EMR, AWS Glue)
- Strong understanding of Hive, Unity Catalog, and Glue Catalog
- Hands-on experience with any data quality framework or tool
- Hands-on experience with observability and monitoring for data processing pipelines
- Strong understanding of distributed computing concepts
- Proficiency in version control systems (GitHub/GitLab/Bitbucket)
Technical Requirements:
- Experience building and maintaining high-volume data processing pipelines
- Knowledge of data modeling and ETL / ELT best practices
- Familiarity with SQL and NoSQL databases
- Understanding of data warehouse concepts and dimensional modeling
Preferred Qualifications:
- Experience with real-time data processing and streaming architectures
- Knowledge of Delta Lake or similar data lakehouse technologies
- Experience with CI/CD pipelines
- Cloud platform expertise (AWS/Azure/GCP)
- Contributions to open-source projects
Key Responsibilities:
- Design and implement scalable data pipelines
- Optimize existing data workflows for performance and reliability
- Collaborate with data scientists and analysts to support their data needs
- Implement data quality checks and monitoring
- Maintain documentation for data processes and architectures
Soft Skills:
- Strong problem-solving abilities
- Excellent communication skills
- Team collaboration
- Ability to work independently
- Strong attention to detail
The ideal candidate should have a proven track record of developing and maintaining production-grade data pipelines and be passionate about working with large-scale data systems.