Customer requirements and expectations are changing in relation to data integration. The need among users to iteratively develop and debug their extract-transform-load (ETL) and extract-load-transform (ELT) workflows is therefore becoming more imperative. Azure Data Factory helps you iteratively build and debug data factory pipelines as you develop your data integration solution. When authoring a pipeline on the pipeline canvas, you can test your activities and pipelines by using the debug capability.

In Azure Data Factory, there is no need to publish changes to the pipeline or activities before you debug. This is helpful in a scenario where you want to test your changes and see whether they work as expected before you save and publish them. Sometimes you don't want to debug the whole pipeline, but only test a part of it. A debug run allows you to do just that: you can test a pipeline end to end, or set a breakpoint. By doing so in debug mode, you can interactively see the results of each step while you build and debug your pipeline.

As you create or modify a pipeline, you can see the results of each activity in the Output tab of the pipeline canvas. After a test run succeeds and you are satisfied with the results, you can add more activities to the pipeline and continue debugging in an iterative manner. If you are not satisfied, or want to stop debugging, you can cancel a test run while it is in progress.

Be aware that selecting the Debug slider actually runs the pipeline. Therefore, if the pipeline contains, for example, a copy activity, the test run will copy data from source to destination. A best practice is to use test folders in your copy activities and other activities when debugging; then, once you are satisfied with the results and have debugged the pipeline, you switch to the actual folders for your normal operations. To debug the pipeline, select "Debug" on the toolbar.
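The test-folder practice can be wired into the pipeline definition itself rather than edited by hand. The sketch below is illustrative only: the pipeline, dataset, and folder names are hypothetical, not from the source. A pipeline parameter defaults to a test folder, and the copy activity's sink dataset reads its folder path from that parameter, so debug runs write to the test location until you pass the production value.

```json
{
  "name": "CopyToDestination",
  "properties": {
    "parameters": {
      "outputFolder": {
        "type": "String",
        "defaultValue": "test/output"
      }
    },
    "activities": [
      {
        "name": "CopySourceToSink",
        "type": "Copy",
        "inputs": [
          { "referenceName": "SourceDataset", "type": "DatasetReference" }
        ],
        "outputs": [
          {
            "referenceName": "SinkDataset",
            "type": "DatasetReference",
            "parameters": {
              "folderPath": {
                "value": "@pipeline().parameters.outputFolder",
                "type": "Expression"
              }
            }
          }
        ]
      }
    ]
  }
}
```

When you trigger the published pipeline for normal operations, you would supply the actual folder (for example, `production/output`) as the `outputFolder` parameter value instead of relying on the default.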
You can then see the status of the pipeline run in the Output tab at the bottom of the window. Once the pipeline runs successfully, select "Publish All" in the top toolbar. This action publishes the entities (datasets and pipelines) you created to Data Factory. Finally, to see notification messages, select "Show Notifications" (the bell button) at the top right.

While building mapping data flows, you can interactively watch how the data shapes and transformations are executing so that you can debug them. To use this functionality, it is first necessary to turn on the Data Flow Debug feature. The debug session can be used both in data flow design sessions and during pipeline debug execution of data flows. Once debug mode is on, you build the data flow against an active Spark cluster. The Spark cluster closes once you turn debug off.

You have a choice of what compute to use for debugging. Using an existing debug cluster reduces the startup time; however, for complex or parallel workloads, you might want to spin up your own just-in-time cluster. A best practice for debugging data flows is to keep debug mode on and check and validate the business logic included in the data flow; visually viewing the data transformations and shapes helps you see the changes. If you want to test the data flow in a pipeline that you've created, it is best to use the Debug button on the pipeline panel. While data preview doesn't write data, a debug run within your data flow will, just like debugging a pipeline, write data to your sink destination.

As mentioned, each debug session started from the Azure Data Factory user interface is considered a new session with its own Spark cluster. To monitor the sessions, you can use the debug session monitoring view to manage debug sessions for each Data Factory you have set up.
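The session behavior described above can be sketched as a small simulation. This is purely illustrative: the `DebugSessionManager` class is hypothetical and not part of any Azure SDK; it only mirrors the rules stated in the text (one Spark cluster per session, warm reuse reducing startup time, and per-factory tracking in the monitoring view).

```python
from dataclasses import dataclass

@dataclass
class DebugSession:
    """One debug session with its own Spark cluster, per the text above."""
    factory: str
    cluster_warm: bool = False  # a warm (existing) cluster reduces startup time

class DebugSessionManager:
    """Tracks active debug sessions per Data Factory, like the monitoring view."""

    def __init__(self):
        self.sessions = {}  # factory name -> list of active sessions

    def start_session(self, factory, reuse_existing=True):
        # Reusing an existing debug cluster reduces startup time; complex or
        # parallel workloads may instead warrant a fresh just-in-time cluster.
        existing = self.sessions.get(factory, [])
        session = DebugSession(factory, cluster_warm=reuse_existing and bool(existing))
        self.sessions.setdefault(factory, []).append(session)
        return session

    def stop_session(self, factory, session):
        # Turning debug off terminates the session's Spark cluster.
        self.sessions[factory].remove(session)

    def active_count(self, factory):
        return len(self.sessions.get(factory, []))
```

A first session for a factory starts with a cold cluster; a second session started while the first is active can reuse warm compute.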
To see whether a Spark cluster is ready for debugging, check the cluster status indicator at the top of the design surface: if it's green, the cluster is ready. If the cluster wasn't already running when you entered debug mode, the waiting time can be around 5-7 minutes, since the cluster needs to spin up. It is a best practice to switch debug mode off once you've finished debugging, so that the Spark cluster terminates.

When you're debugging, you can edit the data preview settings of a data flow by selecting "Debug Settings." Examples of changes to the data preview are the row limit, or the file source in case you use source transformations. When you select the staging linked service, you can use Azure Synapse Analytics as a source. If you have parameters in your data flow, or in any of its referenced datasets, you can specify what values to use during debugging by selecting the Parameters tab.

During debugging, sinks are not required and are ignored in the data flow. If you want to test and write the transformed data to your sink, you can execute the data flow from a pipeline and use the debug execution from the pipeline. To do so, set a breakpoint on the activity up to which you want to test, and then select "Debug." The Debug Until option appears as an empty red circle at the upper right corner of the element; after you select it, it changes to a filled red circle to indicate the breakpoint is enabled. Azure Data Factory then makes sure that the test only runs until that breakpoint activity in the pipeline. This feature is especially useful when you want to test only a subset of the activities in a pipeline.

In most scenarios, the debug features in Azure Data Factory are sufficient. However, sometimes it is necessary to test changes to a pipeline in a cloned sandbox environment.
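The Debug Until behavior can be sketched as a simple simulation (hypothetical code, not an Azure SDK API; the activity names are made up): activities run in pipeline order, and execution stops once the breakpoint activity has run, skipping everything after it.

```python
def debug_until(activities, breakpoint_name):
    """Run activities in pipeline order, stopping at the breakpoint activity.

    Mirrors Debug Until: Azure Data Factory runs the test only until the
    activity marked with the filled red circle; later activities are skipped.
    """
    executed = []
    for name in activities:
        executed.append(name)
        if name == breakpoint_name:
            break  # remaining activities are not run in this debug run
    return executed

# Hypothetical pipeline: four activities in sequence.
pipeline = ["CopyRaw", "TransformDataFlow", "LoadWarehouse", "SendNotification"]
print(debug_until(pipeline, "TransformDataFlow"))
```

With the breakpoint on `TransformDataFlow`, only the first two activities execute, which is exactly the "test a subset of the pipeline" scenario described above.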
One use case for doing so is parameterized ETL pipelines: for instance, when you'd like to test how a particular parameter behaves when triggered. In that case, cloning the factory into a sandbox environment might be suitable. Since Azure Data Factory mostly charges by the number of runs, a second Data Factory doesn't have to lead to additional fees.

To monitor debug runs, you can check the Output tab, but only for the most recent run in the current browsing session, since the tab doesn't show history. If you would like to view the history of debug runs, or see all active debug runs, navigate to the Monitor tab. Keep in mind that the Azure Data Factory service only keeps debug run history for 15 days. To monitor your data flow debug sessions, you also navigate to the Monitor tab.
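Because debug run history is retained for only 15 days, any tooling you build around that history has to work within that window. A minimal sketch (the run records and field names are made up for illustration):

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 15  # Azure Data Factory keeps debug run history for 15 days

def runs_within_retention(runs, now=None):
    """Filter debug run records to those still inside the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=RETENTION_DAYS)
    return [r for r in runs if r["start"] >= cutoff]

# Hypothetical records: one recent debug run, one past the retention window.
now = datetime(2024, 6, 30, tzinfo=timezone.utc)
runs = [
    {"id": "debug-run-1", "start": now - timedelta(days=3)},
    {"id": "debug-run-2", "start": now - timedelta(days=20)},  # already purged
]
print([r["id"] for r in runs_within_retention(runs, now=now)])
```

Anything older than the cutoff will no longer appear in the Monitor tab, so a run from 20 days ago is simply gone from the history.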