From MongoDB we now move to Aerospike, which is a key-value store. Key-value stores typically offer an API, that is, a way to access data from a programming language like Python or Java. We will take a very brief look at Aerospike, which offers both programmatic access and a limited amount of query access to data.

The data model of Aerospike is illustrated here. Data are organized in namespaces, which can be in memory or on flash disks. Namespaces are the top-level data containers. The way you collect data in namespaces relates to how the data is stored and managed. A namespace contains records, indexes, and policies. Policies dictate namespace behavior: how data is stored (in RAM or on disk), how many replicas exist for a record, and when records expire. A namespace can contain sets, which you can think of as tables. Here there are two sets, people and places, plus a group of records that are not in any set. Within a record, data is stored in one or more bins. A bin consists of a name and a value.

The example written here is in Java, but you don't have to know Java to follow the main point. Here we are creating an index on key-value data that's handled by Aerospike. This data set comes from Twitter. Each field of a tweet is extracted and put into Aerospike as a key-value pair. We declare the namespace to be example, the record set to be tweet, the name of the index to be TestIndex, and the name of the bin to be user name. Since this index is stored on disk, an SQL-like command such as SHOW INDEX shows the content of the index, as you can see here.

This routine shows how data can be inserted into Aerospike programmatically. Again, the goal is to point out a few salient aspects of data insertion regardless of the syntax of the language. Since this is a key-value store, one first needs to define the key.
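To make the hierarchy concrete, here is a minimal sketch of the data model described above (namespace, then set, then record, then bins), written as plain nested maps in Java. It is not the Aerospike client API. The class DataModelSketch, its put and get methods, the tweet ID, and the screen name are all made up for illustration, and an empty string stands in for records that belong to no set.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the layout described in the lecture.
// This is NOT the Aerospike client; every name here is illustrative.
class DataModelSketch {

    // namespace -> set -> record key -> (bin name -> bin value)
    private final Map<String, Map<String, Map<String, Map<String, Object>>>> namespaces =
            new HashMap<>();

    // Write one bin of the record identified by (namespace, set, key),
    // creating the namespace, set, and record on first use.
    void put(String namespace, String set, String key, String binName, Object value) {
        namespaces
                .computeIfAbsent(namespace, n -> new HashMap<>())
                .computeIfAbsent(set, s -> new HashMap<>())
                .computeIfAbsent(key, k -> new HashMap<>())
                .put(binName, value);
    }

    // Read one bin, or return null if any level of the hierarchy is missing.
    Object get(String namespace, String set, String key, String binName) {
        Map<String, Map<String, Map<String, Object>>> sets = namespaces.get(namespace);
        if (sets == null) return null;
        Map<String, Map<String, Object>> records = sets.get(set);
        if (records == null) return null;
        Map<String, Object> bins = records.get(key);
        return bins == null ? null : bins.get(binName);
    }

    public static void main(String[] args) {
        DataModelSketch store = new DataModelSketch();
        // One record in namespace "example", set "tweet", with two bins.
        store.put("example", "tweet", "12345", "id", "12345");
        store.put("example", "tweet", "12345", "username", "some_user");
        // A record that is in the namespace but not in any set (assumption: "" means no set).
        store.put("example", "", "r1", "note", "no set");
        System.out.println(store.get("example", "tweet", "12345", "username")); // prints some_user
    }
}
```

The point of the nesting is that a key only identifies a record once the namespace and the set are fixed, which is why an insertion has to name all three before it can attach any bins.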
This line of the routine says that the key, in the namespace called example and the set called tweet, is the value of the function getId, which returns the ID of a tweet. When the data is populated, we are essentially creating bins. Here, user name is the attribute, and the screen name obtained from the tweet is the value. The actual insertion happens here in the client.put statement, where we need to mention the key and the bins we have just created. Now why are we inserting two bins at a time, with the tweet ID and the user name? This is an idiosyncrasy of the Aerospike client.

After data is inserted, one can query the data using AQL, which is very much like SQL. This screenshot shows part of the output of a simple SELECT * query. In your hands-on exercise, you'll be able to play with the Aerospike data. This is just a screenshot showing the basic query syntax of AQL, that is, Aerospike Query Language, and a few examples. The last two lines show a couple of interesting features of the language. The operation BETWEEN 0 AND 99 is a nicer way of stating a range query, which gives lower and upper limits on a variable. The last line shows the operation CAST, which transforms one type of data to another type. Here it transforms coordinates, that is, latitude and longitude data, to a JSON format called GeoJSON, which is designed to represent geographic data in a JSON structure.

We will finish our coverage of queries with a quick reference to an advanced topic, which is beyond the scope of this course. You have seen in prior courses that streaming data is complex to process because a stream is infinite in nature. Does this have any impact on query languages and evaluation? The answer is that it absolutely does. We'll mention one such impact here. This shows a pictorial depiction of streaming data. The data is segmented into five pieces, shown in the white boxes in the upper row. These can be gathered, for example, every few seconds or every 200 data objects.
The query defines a window to select k of these data objects as a unit of processing. Here k is three, so three units of data are picked up in a window. To get the next window, it is moved by two units; this is called the slide of the window. Since the window size is three and the slide is two, one unit of information overlaps between two consecutive windows. The lower line in this diagram shows the initialized item (let's ignore it), followed by two windows of data records for processing. The query language will therefore have to specify a query, that is, an SQL query, over a window, which is also specified in the query. In a traffic data stream example, the SQL statement might look like this, where the window size is 30 seconds and the slide is the same size as the window, giving output every 30 seconds. So streaming data results in changes in both the query language and the way queries are processed.
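The window and slide arithmetic described above can be sketched in a few lines of plain Java. This is only an illustration of the idea, not the syntax of any streaming query engine. The class SlidingWindowSketch, the method windows, and the unit labels u1 to u5 are assumptions made for the example, and the sketch emits only full windows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: cuts a sequence of data units into windows.
class SlidingWindowSketch {

    // Returns every full window of `size` consecutive units,
    // moving the start of the window forward by `slide` units each time.
    static <T> List<List<T>> windows(List<T> units, int size, int slide) {
        if (size <= 0 || slide <= 0) {
            throw new IllegalArgumentException("size and slide must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start + size <= units.size(); start += slide) {
            // Copy the sub-list so each window is independent of the input.
            result.add(new ArrayList<>(units.subList(start, start + size)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Five units of data, window size 3, slide 2, as in the diagram.
        List<String> units = Arrays.asList("u1", "u2", "u3", "u4", "u5");
        System.out.println(windows(units, 3, 2)); // prints [[u1, u2, u3], [u3, u4, u5]]
    }
}
```

With size 3 and slide 2, the unit u3 appears in both windows, which is the one-unit overlap mentioned above. Setting the slide equal to the size, as in the 30-second traffic example, gives windows that do not overlap at all.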