Introduction to Azure Data Factory

Are you new to Azure, or looking to make the move and curious about what Azure Data Factory is? Azure Data Factory is Microsoft’s cloud-based data integration and orchestration tool. It lets you move data from one place to another and transform it along the way. Here, I’d like to introduce the four main steps, or components, of how Azure Data Factory works.

Step 1 – Connect and Collect

Connect and collect is where you define the sources you’re pulling your data from, such as SQL databases, web applications, or SSAS. You then collect that data into one centralized location like Azure Data Lake or Azure Blob Storage.
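To make this step a little more concrete, here is a minimal sketch in Python that builds a dictionary shaped like an Azure Data Factory v2 pipeline definition containing a single Copy activity, the activity typically used to move data from a source into centralized storage. The pipeline, activity, and dataset names are hypothetical, and the `SqlSource` and `BlobSink` types are just one possible source and sink pairing. Treat it as an illustration of the idea rather than a definition that is ready to deploy.

```python
import json


def build_copy_pipeline(source_dataset: str, sink_dataset: str) -> dict:
    """Return a dict shaped like an ADF v2 pipeline with one Copy activity.

    All names here are hypothetical placeholders; the datasets they refer
    to would have to be defined separately in the data factory.
    """
    return {
        "name": "CollectToBlobPipeline",  # hypothetical pipeline name
        "properties": {
            "activities": [
                {
                    "name": "CopySqlToBlob",  # hypothetical activity name
                    "type": "Copy",
                    # Where the data is read from.
                    "inputs": [
                        {"referenceName": source_dataset, "type": "DatasetReference"}
                    ],
                    # Where the data lands (the centralized storage).
                    "outputs": [
                        {"referenceName": sink_dataset, "type": "DatasetReference"}
                    ],
                    # Assumed source/sink pairing: SQL database to Blob Storage.
                    "typeProperties": {
                        "source": {"type": "SqlSource"},
                        "sink": {"type": "BlobSink"},
                    },
                }
            ]
        },
    }


# Hypothetical dataset names, used only for illustration.
pipeline = build_copy_pipeline("SourceSqlTable", "LandingBlobFolder")
print(json.dumps(pipeline, indent=2))
```

The point of the sketch is that the step boils down to naming a source, naming a destination, and letting a copy activity connect the two.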

Step 2 – Transform and Enrich

In this step, you take the data from your centralized storage and transform and enrich it using compute services such as HDInsight, Spark, or Data Lake Analytics.

Step 3 – Publish

The next step is to publish the data to a place where end users can better use and consume it. Any BI tool, such as Power BI or Reporting Services, is a great choice.

Step 4 – Monitor

The last step is to monitor your pipelines to be sure jobs are running and data is flowing properly. Monitoring also matters for ensuring data quality. It can be done with tools like PowerShell, Microsoft Operations Manager, or Azure Monitor, which lets you monitor from inside the Azure portal.
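The four steps above can be pictured as one small pipeline. The sketch below is plain Python working on in-memory data, not Azure Data Factory code: the function names, the `RunLog` class, and the sample rows are all made up for illustration. It only shows how collect, transform, publish, and monitor fit together.

```python
from dataclasses import dataclass, field


@dataclass
class RunLog:
    """Stand-in for monitoring: records how many rows each step handled."""
    events: list = field(default_factory=list)

    def record(self, step: str, rows: int) -> None:
        self.events.append((step, rows))


def connect_and_collect(sources: dict) -> list:
    """Step 1: pull rows from every source into one central list."""
    return [row for rows in sources.values() for row in rows]


def transform_and_enrich(rows: list) -> list:
    """Step 2: add a derived 'total' column to each row."""
    return [{**row, "total": row["qty"] * row["price"]} for row in rows]


def publish(rows: list, store: list) -> int:
    """Step 3: write rows to the place where end users consume them."""
    store.extend(rows)
    return len(rows)


def run_pipeline(sources: dict, store: list) -> RunLog:
    """Run steps 1 to 3 and log each one (step 4: monitor)."""
    log = RunLog()
    collected = connect_and_collect(sources)
    log.record("collect", len(collected))
    enriched = transform_and_enrich(collected)
    log.record("transform", len(enriched))
    published = publish(enriched, store)
    log.record("publish", published)
    return log


# Hypothetical sample data standing in for a SQL database and a web app.
sources = {
    "sql": [{"qty": 2, "price": 3.0}],
    "web": [{"qty": 1, "price": 5.0}],
}
store = []
log = run_pipeline(sources, store)
for step, rows in log.events:
    print(f"{step}: {rows} rows")
```

In a real data factory each function would be replaced by a managed activity or service, but the order and the hand-off between steps stay the same.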