In this video we will put everything together. We will implement a small data simulator workflow which simulates Node-RED running on an IoT device and publishing data through the IBM Watson IoT Platform using the MQTT protocol. The IBM Watson IoT Platform includes an MQTT broker as a service, with built-in fault tolerance, load balancing, and failover in over 50 IBM data centers spread around the globe. Then, again using Node-RED, we'll subscribe to the data created by our virtual device. And finally, we will stream the data into Apache CouchDB. So let's do it together, step by step.

Let's open the Node-RED instance you created in week one and get rid of the sample flow by selecting it and pressing the Delete key on your keyboard. Take flow1.js from the Coursera download page and open it using a text editor. Select everything and paste it into Node-RED by clicking on the menu > Import > Clipboard. Paste the flow from the JSON file into the text area, click on Import, and place the imported flow on the panel. Click on the + symbol to create a new panel for a new flow. We are now importing flow2.js; again, we select it, copy it to the clipboard, and paste the flow into Node-RED.

The Cloudant connector should be configured to use the Cloudant database from your Node-RED application running on Bluemix. As long as the service dropdown is not empty, you are fine. On the Watson IoT Platform connector side, you basically subscribe to all data without filtering it by certain device types, device IDs, or event names. Note how fast this is. Once we activate the debug node, we will see test data generated by the test data generator. We'll have a look at this later. But it is important to notice that this particular flow normally doesn't run in the cloud, but on an IoT device or gateway, reading raw sensor data directly from the built-in sensors and transmitting it directly to the cloud. This is hypothetical sensor data coming from a washing machine.

Let's deactivate the debug node and add one on the cloud side. Now we can see the very same data, but now we are officially in the cloud and not pretending to be on an IoT device anymore. We are currently streaming this data into Cloudant at a rate of more than one record per second.

So now let's have a brief look at the test data generator. This inject node simulates a sensor node you would normally have on an IoT device; on a Raspberry Pi, for instance, you would use a GPIO node instead. This inject node creates a message every second with some simulated sensor data concerning the fluid in the washing machine. Every three seconds we create a message containing information on the voltage and frequency. And every five seconds we sample the actual speed of the motor.

Now let's create some randomness. This is implemented in JavaScript and is beyond the scope of the course, but basically we create data fluctuating around some average value, and occasionally also some odd values, which we call outliers. We'll detect those during a later stage in the course. We do the same for the voltage and for the motor. In addition, we add a timestamp to each message, independently of which sensor generated it. Note that since Node-RED simply passes JSON objects between the nodes, and JavaScript can access individual parts of a JSON object using so-called dot notation, we can assign the timestamp to this JSON object in a very convenient way. Finally, we have to publish these messages via MQTT to the IBM Watson IoT Platform.
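By the way, if you are curious what such a device event publish looks like at the MQTT level outside of Node-RED, here is a minimal Python sketch using the paho-mqtt client. The organization ID, device type, device ID, token, and field names below are made up for illustration; in the flow itself, the Watson IoT output node takes care of all of this for you.

```python
import json
import random
import time

import paho.mqtt.client as mqtt

# Placeholder values -- in the course flow the Watson IoT output node
# handles connection and authentication via the bound Bluemix service.
ORG = "your6digitorgid"          # hypothetical Watson IoT organization id
DEVICE_TYPE = "WashingMachine"   # hypothetical device type
DEVICE_ID = "simulator-1"        # hypothetical device id
AUTH_TOKEN = "your-device-auth-token"

# Watson IoT devices connect with a client id of the form d:<org>:<type>:<id>
client = mqtt.Client("d:{}:{}:{}".format(ORG, DEVICE_TYPE, DEVICE_ID))
client.username_pw_set("use-token-auth", AUTH_TOKEN)
client.connect("{}.messaging.internetofthings.ibmcloud.com".format(ORG), 1883, 60)

# One simulated reading: a value fluctuating around an average, plus a timestamp
payload = {"d": {"fluidlevel": 80 + random.uniform(-5, 5),
                 "ts": int(time.time() * 1000)}}

# Device events are published to iot-2/evt/<eventType>/fmt/<format>;
# here the event type is "status"
client.publish("iot-2/evt/status/fmt/json", json.dumps(payload))
client.disconnect()
```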
Bluemix Service means that the credentials needed to send messages to the IBM Watson IoT Platform MQTT message broker are injected into the node via Cloud Foundry, so you basically don't have to care about them. What we want to create from each message is a so-called Device Event. Device Events are meant to send sensor data upstream to the cloud. Every IoT device has to have a device type in order to correctly classify it, and every device needs a unique device ID. The event type is a way to assign some sort of category to each message, which is used for subscribing to events. So, for example, we could also publish events as alerts, and a downstream subscriber could subscribe only to alerts, basically ignoring status messages.

So let's have a look at the Cloudant user interface to see whether the data we are storing actually arrives in the database. We open the Bluemix user interface and scroll down to the Cloudant service which is bound to our Node-RED application. You can easily identify the correct one, since by default it has the same name as the URL of your application. Now we click on LAUNCH in order to be taken to the Cloudant dashboard, which is running outside of Bluemix. Let's click on washing, which is the database we are streaming data to. Now let's have a look at one single JSON document, which represents one single sensor measurement. You can see that this value is coming from the shaft sensor the motor is attached to. Now let's have a look at another measurement. This entry reflects a measurement from the current sensor, so we are mixing schemas here, which is not a problem in NoSQL but very tedious in a SQL database.

In order to access Cloudant data from Apache Spark, we need to obtain the Cloudant credentials. So again, we enter the IBM Bluemix dashboard and click on our Node-RED application. Now we want to check which services are bound to our application by clicking on Connections. We see that there are three services bound to it: a relational database called dashDB, the IBM Watson IoT Platform, and the Cloudant service where we actually store our data. Therefore, we have to obtain the credentials of the database. This is the username, this is the password, and this is the host name. The database name is washing, as defined in the Node-RED flow. So now we have all the information needed to access the database from Apache Spark, so let's have a look at how this works.

Let's again have a look at the code from the previous video. There are basically four parameters you have to specify when accessing Cloudant from Apache Spark using the Cloudant connector. Three of them you have obtained in the previous step by directly accessing the user interface of IBM Bluemix, and the last one you basically defined in the Cloudant connector node of Node-RED when providing the database name. This is the host name of Cloudant, as in the credentials. This is the username, and this is the password. This is the Cloudant database name.

So let's sum this up. Of course, IoT data comes from IoT devices, and on many of them Node-RED can be run as well. Therefore, it is easy and convenient to simulate test data using Node-RED in the cloud, too. As we have seen, it is very easy and straightforward to stream this data to Cloudant or any other data store covered in the previous video of this week. The Apache Spark Cloudant connector makes it very easy to create an Apache Spark data frame out of a Cloudant database.
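To make those four parameters concrete, here is a minimal Python sketch of such a read, assuming the Apache Bahir Cloudant data source is available on your Spark service; the credentials below are placeholders, and the exact data source name used in the previous video's notebook may differ.

```python
from pyspark.sql import SparkSession

# Placeholder values -- copy the real ones from the Cloudant credentials
# shown in the Bluemix "Connections" view; the database name is the one
# configured in the Node-RED Cloudant node.
hostname = "xxxxxxxx-bluemix.cloudant.com"
username = "xxxxxxxx-bluemix"
password = "********"
database = "washing"

spark = SparkSession.builder \
    .config("cloudant.host", hostname) \
    .config("cloudant.username", username) \
    .config("cloudant.password", password) \
    .getOrCreate()

# Read the Cloudant database into a Spark data frame via the connector
# (here the Apache Bahir Cloudant data source; older notebooks used
# "com.cloudant.spark" instead)
df = spark.read.load(database, "org.apache.bahir.cloudant")

df.printSchema()
df.show(5)
```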
And from there on, you have the full power of the Apache Spark data frame API and can basically forget where your data resides. So now that you know the basic tooling, let's dive into some math in order to lay the foundations for exploratory data analysis of IoT sensor data. This will be fun, so let's get started.