In this video, we're going to talk about analytics services on IBM Cloud. We'll start by looking at the different types of analytics and then go into the big open source projects in the analytics space. We'll then look at IBM Cloud's offerings in this area and end with a demo that shows how to modify a database table and store the results in Cloud Object Storage. Let's get started.
Data analytics is the science of analyzing raw data in order to draw conclusions about that information. Any type of information can be subjected to data analytics techniques to get insight that can be used to improve things. Let's talk about a few different types of analytics and understand how they enable us to make better decisions. Descriptive analytics looks at past performance and understands that performance by mining historical data to look for the reasons behind past success or failure. Diagnostic analytics examines data or content to answer the question, why did this happen? It is characterized by techniques such as drill-down, data discovery, data mining, and correlations. Predictive analytics encompasses a variety of statistical techniques from data mining, predictive modeling, and machine learning that analyze current and historical facts to make predictions about future or otherwise unknown events. Prescriptive analytics takes advantage of the results of descriptive and predictive analytics and suggests a decision.
Spark is a unified analytics engine for big data processing with built-in modules for streaming, SQL, machine learning, and graph processing. It is an open source project with 750 contributors from 200 organizations. Hadoop is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. Like Spark, it is also open source. Hadoop uses the MapReduce programming model for parallel processing of large volumes of data in a distributed environment.
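To make the MapReduce model a bit more concrete, here is a minimal sketch in plain Python. It is not Hadoop's actual API and it runs on a single machine; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions made up for illustration. It only shows the shape of the model: a map step emits key-value pairs, a shuffle step groups them by key, and a reduce step combines each group.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over all input splits."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input splits; in Hadoop these would be blocks of a large file.
    splits = ["big data big clusters", "data in motion"]
    print(word_count(splits))
```

In a real cluster, each call to the map step could run on a different machine against a different block of the data, which is where the parallelism comes from.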
There are five main analytics services on IBM Cloud. We have Analytics Engine, Streaming Analytics, Db2 Warehouse, Cognos Dashboard, and Information Server. Let's dive a bit deeper into each one.
Analytics Engine lets you develop and deploy applications using open source Apache Spark and Apache Hadoop. It also has on-demand scalability and is HIPAA ready in the Dallas region. It also gives you the ability to customize the environment with third-party analytics libraries and packages.
The Streaming Analytics service is used to ingest, analyze, monitor, and correlate data in real time. It evaluates a broad range of streaming data, from unstructured text, video, and audio data to geospatial and sensor data. It performs real-time analysis on data in motion, and it can connect with virtually any data source, whether unstructured, structured, or streaming, and integrate with Hadoop and Spark. Lastly, it has built-in domain analytics like machine learning, natural language, spatial-temporal, text, acoustics, and more to create adaptive stream applications.
Db2 Warehouse is a fully managed, elastic cloud data warehouse that delivers independent scaling of storage and compute and is built for machine learning. You can train and run models directly in the Db2 Warehouse engine using SQL, Python, and R. It is highly scalable, and you can easily manage and independently scale compute and storage. It is secure. You can control and monitor activity on your database with fine-grained access control and database auditing capabilities. And it's also Oracle compatible. You can leverage Db2's Oracle compatibility to run your existing Oracle applications on Db2 Warehouse.
The Cognos Dashboard service lets you add end-to-end data visualizations to your application. The visualizations allow users to interact; for instance, they can drag and drop to quickly find valuable insights on their own. The visualizations have a live connection to the underlying data.
Updates to the data will be reflected in the visualizations in real time. You can also embed visualizations in your applications. Data can be explored using filters and navigation paths.
Information Server is a market-leading data integration platform, which includes a family of products that enable you to understand, cleanse, monitor, transform, and deliver data, and to collaborate to bridge the gap between business and IT. The DataStage tool allows you to create jobs that can extract, transform, and load data. The Information Governance Catalog allows you to track data lineage. The Information Analyzer provides data profiling and analysis to accurately evaluate the content and structure of your data for consistency and quality.
Now let's see an example of Information Server in action. So, here in our Information Server we have our connections, and we can see we have our cloud database and our local Db2 Warehouse. We also have table definitions at the top. So, these are the tables that we have from Db2 Warehouse. We have parameter sets and jobs. So, now we're going to go ahead and create a job. And we'll make it a parallel job, and in here we can look for some connections. So, we'll pass in a connection from the cloud Db2 Warehouse. And here we will get the schema, so we'll have different schemas, and in this BLUADMIN schema we can go for seven different types. And here we can search for a specific column. And we'll leave all of these employee columns. It will add these to the job. So, basically, we have all of these employee columns from that database, from Db2 Warehouse, and now we can go ahead and cleanse and do some transformations on this data.
So, here we can do a peek, and that's just going to show us the top 10 rows of our data. So now we've connected that database table to the peek, and then we have a remove duplicates stage, so that will just remove any duplicate data in our table, and then we have a sort. So, we're going to sort this table. So, here's what we really want to do.
In the end, we want to sort the employee salary from highest to lowest. And now we've added a Cloud Object Storage connection, so we're going to store the updated table in Cloud Object Storage. So now we're going to go ahead and find the Cloud Object Storage connection. So, we have the login URL, and we're going to go back to our Cloud Object Storage instance, and we'll grab our resource instance ID and API key. So, here's our Cloud Object Storage. We have our resource instance ID. And then we have our API key. So, we'll paste that in there. And the region is US East, so I'll paste that in there. And then that's pretty much it. We'll click OK and that will connect us to our Cloud Object Storage instance.
And now we're just going to compile and run this job, and we see that the run was successful. And now if we look in our Cloud Object Storage instance, we should be able to see this new table, which will have the salary in descending order. So, if we refresh, we have this output.csv file, and if we go ahead and download it, we can open it up. So, we'll open it up with TextEdit, and here we see the database table and we can see the descending order. So, we can see the salaries: 152,000, 98, 96, 94, 89 thousand, etc. So that's how we do a really quick and simple example on Information Server to transform some data and then send that to Cloud Object Storage.
And now to summarize: it's all about the data, whether you're cleaning it, transforming it, or visualizing it. Analytics is all about helping you make data-driven decisions. IBM Cloud has a full set of products that can help you in making use of your data. At IBM, our Analytics Engine offering is based on the popular open source projects Hadoop and Spark.
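Looking back at the demo, the job we built (peek, remove duplicates, sort by salary from highest to lowest, write a CSV file) can be sketched in plain Python. This is only an illustration of the logic, not DataStage or any IBM API: the sample rows, the column names `EMPNO` and `SALARY`, and every function name here are hypothetical, and the sketch returns the CSV as text instead of uploading it to Cloud Object Storage.

```python
import csv
import io

# Hypothetical sample rows standing in for the employee table from the demo.
EMPLOYEES = [
    {"EMPNO": "E1", "SALARY": 89000},
    {"EMPNO": "E2", "SALARY": 152000},
    {"EMPNO": "E3", "SALARY": 96000},
    {"EMPNO": "E2", "SALARY": 152000},  # exact duplicate of an earlier row
]


def peek(rows, n=10):
    """Return the first n rows so we can inspect the data."""
    return rows[:n]


def remove_duplicates(rows):
    """Drop rows identical to an earlier row, keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def sort_by_salary(rows):
    """Sort rows by salary from highest to lowest."""
    return sorted(rows, key=lambda row: row["SALARY"], reverse=True)


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def run_job(rows):
    """Chain the stages in the same order as the demo job."""
    return to_csv(sort_by_salary(remove_duplicates(rows)))


if __name__ == "__main__":
    print(peek(EMPLOYEES))
    print(run_job(EMPLOYEES))
```

Keeping each stage as its own small function mirrors how the job is assembled from separate stages on the canvas, so a stage can be swapped or reordered without touching the others.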