Job Description
Responsibilities
- Apply a strong understanding of data warehousing, data lakes, and best practices for data quality, security, and performance.
- Design, develop, deploy, and maintain scalable data pipelines and ETL/ELT processes using Azure Data Factory (ADF), Azure Databricks, and other Azure data services (e.g., Azure Data Lake Storage, Azure Synapse Analytics).
- Implement efficient, reusable, and testable code for data transformation, cleansing, and analysis using Python and PySpark within Databricks notebooks and jobs.
- Use ADF pipelines to orchestrate data movement and trigger Databricks notebooks or jobs, ensuring seamless integration between various data sources and destinations.
- Manage data storage, ensure data integrity, optimize data processing, and troubleshoot performance issues related to Databricks and ADF solutions.
- Collaborate closely with data scientists, data analysts, and business stakeholders to ...
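To illustrate the kind of reusable, testable transformation code the role calls for, here is a minimal sketch of a data-cleansing step such as might run inside a Databricks notebook. It is simplified to plain Python for readability; in practice the same logic would be expressed over a PySpark DataFrame, and the field names shown are hypothetical.

```python
def cleanse_record(record):
    """Trim string values, drop empty fields, and normalise keys to snake_case.

    A small, pure function like this is easy to unit-test before it is
    wired into a Databricks job or ADF-triggered notebook.
    """
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip()
        if value in ("", None):
            # Drop empty fields so downstream loads stay consistent.
            continue
        cleaned[key.lower().replace(" ", "_")] = value
    return cleaned

raw = {"Customer Name": "  Acme Corp ", "Region": "", "Order Count": 3}
print(cleanse_record(raw))  # {'customer_name': 'Acme Corp', 'order_count': 3}
```

Keeping transformation logic in small pure functions like this, rather than inline in notebook cells, is what makes the "efficient, reusable, and testable code" requirement practical to meet.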