In this module, we will look at data and storage services for Cloud applications. You will be visiting Cloud provider websites to identify services the providers offer. Data is what we store. Storage is where we store it.
Let's look at a secret example. Here's Kim's password. Here's the storage for Kim's password. Computer systems can apply access rights to password storage. Kim and the operating system can read or write this password, but no one else. Now the essential question is, what are we protecting? Does this email leak Kim's password? The data is there. It's no longer in protected storage. Most of our security measures rely on keeping sensitive data in protected storage. Kim's password in the email is only protected by its lack of context. Of course, we could rewrite the email message to provide a little context.
Secrecy poses a particular problem in data protection. The data itself is what matters, not just what its particular configuration is at the time. For example, Kim's password is secret information, so we have to protect where it's stored and we have to prevent its replication except when it's necessary and appropriate. This is the transitivity problem. Anybody who knows the secret has the option of sharing it with somebody else. Now, we have mechanisms we try to put in computer systems to reduce the risk of copying information like this. Some work better than others.
We also use data for decision-making, and there we get into the data integrity issue. Bad data yields bad decisions. We have to store integrity-sensitive data in protected storage areas, places where it's not going to be modified. Now with respect to integrity, we're not as concerned about secrecy or transitivity, but we are concerned with provenance, the history of the data. Who created it? How was it created? What has happened to it since then? Has it been preserved in its proper state the whole time? Have all modifications been justified?
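To make the integrity question a little more concrete, here is a minimal sketch of one common way to detect that data has been modified: record a cryptographic digest when the data is created, and compare against it later. This is an illustration under stated assumptions, not a method prescribed in this module. The function names and the sample price record are hypothetical, and the recorded digest itself would have to live in protected storage for the check to mean anything.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, recorded_digest: str) -> bool:
    """Check the data against a digest recorded at an earlier time."""
    return hmac.compare_digest(digest(data), recorded_digest)


# Hypothetical record: the digest is taken when the data is created and
# is assumed to be kept in protected storage.
original = b"unit_price=19.99"
recorded = digest(original)

print(is_unmodified(original, recorded))            # True
print(is_unmodified(b"unit_price=1.99", recorded))  # False
```

A check like this only tells us whether the bytes changed. It does not tell us who created the data or whether a modification was justified, so it covers just one part of the provenance questions above.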
Now, in a lot of systems, we simply rely on the data moving from one protected storage component to another in order to infer the provenance of data that we really depend on. That's not the only way to do it, but it's a common approach.
Okay, we have data and we have metadata. Metadata is data about data, obviously. In other words, the data's attributes. The most obvious example for people who sit down at a computer might be a file system and the notion that a file has attributes. There's the name. I mean, yeah, you've got the contents, but the contents aren't an attribute, that's the data, the stuff you care about. But the name is an attribute, and access permissions are attributes. A location in a directory structure is an attribute. Those attributes tend to be stored in places other than the actual data of the file. So file properties might go into a file header or they might go into a directory of some sort.
In database management systems, there's a similar thing, especially obvious in relational systems. You've got table and field names. Those are obviously attributes of the particular records and the particular fields, and each field also has attributes and properties associated with how you can process it. Is it a string? Is it a date? Is it an integer of some sort?
Now let's talk about deploying data in the different Cloud service models. We'll start with software as a service, and we actually won't get to platform and infrastructure until the next video. Here we go with software as a service. Remember, the way it works is the provider takes care of most of what's going on. It's up to the consumer to provide the actual working data needed for their end users. Remember, with software as a service, you'll have end users connecting to the service using resources provided by the Cloud consumer, in order to fulfill the consumer's application requirements. Let's use an example, say WordPress, WordPress.com specifically.
There's a website that offers hosting for webpages. They've essentially provided all of the content management right there. How do you manage your data and provide a website using that SaaS-style content management system? Well, the consumer provides the site configuration and provides the content and the media. The consumer has to do the uploading and the setting up: say what the name of the site is, what the format's going to look like, choose a theme, perhaps provide a logo and taglines, other marketing activities, and connections to social media if that's appropriate. The software provider, on the other hand, provides everything else. They make sure that all the software is up to date. They take care of most of the security issues. There are a few that land on the Cloud consumer, but most of it's handled by the provider.
Now, SaaS may have some real advantages from a security standpoint, since we're able to delegate most everything to the Cloud provider. But as Cloud consumers, we still have issues we need to address.
Leaked administrator credentials: that is the thing that kills everybody. Essentially, that's the problem of making sure nobody can spoof the administrator credentials and become an administrator relative to the Cloud consumer's application. For example, I have a website. I don't want my credentials to leak so that somebody else can attack my website.
Another example is incorrect data protection. If you have a large-scale website and you have a lot of people putting data into it, you don't want someone to grab a data file, put it in, and then discover that this is a data file that really isn't supposed to be public. For example, let's say you have confidential price lists that people from sales will give to select individuals who are authorized to receive them. If you put that in the public area, everybody sees it. That's a very common mistake people make.
Plug-in vulnerabilities.
This refers to the notion that a lot of times you can build a software module that can be downloaded into the SaaS system, to operate as part of your application as a Cloud consumer. Now, sometimes these plug-ins may have vulnerabilities that allow back doors into your resources. That's a problem.
Cloud provider risks are another thing. We've talked about this a little bit before. It comes down to the question of what kinds of things could leak your information to the Cloud provider, or give the Cloud provider access to your information in an unnecessary and inappropriate manner. For example, having stuff in plaintext that ought to be encrypted when it's in storage. We have to address those risks.
Finally, there are surprises in the SOC 2 report. Remember, we have this SOC 2 report that we discussed in the last class, which provides insight into the types of access controls, strategies, and techniques used by the Cloud provider in order to maintain the security of the Cloud consumer's systems and applications. Be sure to read that SOC 2 report, and look for things that might be unexpected or unfortunate.