Data Engineer – Azure & Databricks Focus at Invictus Capital Partners / Verus Mortgage Capital
Invictus Capital Partners / Verus Mortgage Capital · Bloomington, United States of America · Hybrid
- Professional
- Office in Bloomington
In this role, you will design and deliver high-performance, governed data pipelines that integrate data from multiple sources—SQL Server Managed Instance, other Azure-based systems, and third-party platforms—using PySpark, SQL, and Databricks utilities.
Responsibilities and Duties:
- Design, develop, and optimize data pipelines in Azure Databricks using PySpark and SQL, applying Delta Lake and Unity Catalog best practices.
- Build modular, reusable libraries and utilities within Databricks to accelerate development and standardize workflows.
- Implement Medallion architecture (Bronze, Silver, Gold layers) for scalable, governed data zones.
- Integrate external data sources via REST APIs, SFTP file delivery, and SQL Server Managed Instance, implementing validation, logging, and schema enforcement.
- Use parameter-driven jobs and manage compute using Spark clusters and Databricks serverless.
- Collaborate with data analytics teams and business stakeholders to understand requirements and deliver analytics-ready datasets.
- Monitor and troubleshoot Azure Data Factory (ADF) pipelines (jobs, triggers, activities, data flows) to identify and resolve job failures and data issues.
- Automate deployments and manage code using Azure DevOps for CI/CD, version control, and environment management.
- Contribute to documentation, architectural design, and continuous improvement of data engineering best practices.
- Support the design and readiness of the data platform for AI and machine learning initiatives.
Education and Experience:
- Bachelor’s degree in Computer Science, Data Engineering, or Information Systems.
- 5+ years of hands-on data-engineering experience in Azure-centric environments.
- Expertise with Azure Databricks, PySpark, Delta Lake, and Unity Catalog.
- Strong SQL skills with experience in Azure SQL Database or SQL Server Managed Instance.
- Proficiency in Azure Data Factory for troubleshooting and operational support.
- Experience integrating external data using REST APIs and SFTP.
- Working knowledge of Azure DevOps for CI/CD, version control, and parameterized deployments.
- Ability to build and maintain reusable Databricks libraries, utility notebooks, and parameterized jobs.
- Proven track record partnering with data analytics teams and business stakeholders.
- Excellent communication, problem-solving, and collaboration skills.
- Interest or experience in AI and machine learning data preparation.
- Experience implementing Medallion architecture and working within governed data environments.
- Knowledge of data governance, RBAC, and secure access controls in Azure.
- Familiarity with dimensional modeling, data warehousing concepts, and preparing datasets for BI tools (e.g., Power BI).
- Understanding of Spark cluster management, serverless compute, and performance optimization.
- Exposure to creating and managing Databricks utility widgets and leveraging Delta Lake features like time travel and schema enforcement.
- Mortgage or financial-services industry experience (a plus but not required).
- Hands-on experience preparing datasets for AI/ML models.
- Azure Data Engineering Expertise: Skilled in Azure Databricks, PySpark, Delta Lake, Unity Catalog, and SQL-based environments.
- Data Pipeline Development: Proven ability to design, optimize, and maintain scalable ETL/ELT pipelines using Databricks and Azure Data Factory.
- Data Architecture & Governance: Knowledge of Medallion architecture, schema enforcement, RBAC, and secure access controls.
- Integration Skills: Experience ingesting and validating data from REST APIs, SFTP, SQL Server Managed Instance, and other Azure sources.
- DevOps & Automation: Strong proficiency with Azure DevOps for CI/CD, version control, and automated deployments.
- Performance Optimization: Ability to tune Spark clusters, leverage serverless compute, and optimize processing at scale.
- Integrity: Builds secure, governed, and trustworthy data solutions.
- Collaboration: Partners effectively with stakeholders and teams.
- Excellence: Delivers high-quality, optimized, and scalable pipelines.
- Critical Curiosity: Learns, questions, and innovates with new technologies.
Benefits
- Great compensation package
- Attractive benefits plans and paid time off
- 401(k) with company matching
- Professional learning and development opportunities
- Tuition reimbursement
- And much more!
A reliable, uninterrupted high-speed internet connection is required for hybrid or remote positions.