Hello, everyone, and welcome back. In this lecture, we're continuing our series on data storage formats, and we're going to explore shapefiles. This one should be a little quicker than the previous ones. We're mostly going to be talking about the limitations of shapefiles relative to other data types, and their one big benefit, which is that they are widely supported. So first, I have a shapefile loaded in my map document right now, and I have a series of shapefiles in a folder over here. Now, similar to file geodatabases, these show up as a single item in ArcGIS, in our Catalog window here. But in Windows, they're a little different. So let's take a look. Let's put these side by side again. So I have an HDflowline.shp here. Let's collapse this a little more. It's .shp, just like I have over here. But I have all these other files with the same name, and also a lock file here that says that nobody else can edit it while I have it open. What these all are is the supporting files for the shapefile. So a shapefile isn't actually one file; it's more shapefiles than shapefile. It's many different pieces that are understood by different pieces of software. So this .shp file here has the actual feature information, but this .dbf file here has all of the data table information in it. The .prj file has the projection information, and I can right-click and edit that, and we'll see it's just a text string describing what the projection is. And then we have some indexing information in this one, we have metadata, and we have all sorts of other files that support the shapefile. If I want to send somebody this shapefile, then, similar to a file geodatabase, I need to select all of these and send them one way or the other. I need to either give them all of these files or, if I want to send it over email, I need to, again, zip the files into a single file, and then they can download it and decompress it.
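If you'd rather not select all of those pieces by hand, here's a rough sketch in Python of gathering every file that shares the shapefile's base name and zipping them together. The function names and the example path are made up for illustration, and it assumes the lock files end in `.lock` and that everything else with the same base name belongs to the shapefile.

```python
import zipfile
from pathlib import Path


def shapefile_parts(shp_path):
    """Return the files in the same folder that belong to the given .shp file."""
    shp = Path(shp_path)
    base = shp.stem.lower()          # e.g. "roads"
    full = shp.name.lower() + "."    # e.g. "roads.shp." (matches roads.shp.xml)
    parts = []
    for p in sorted(shp.parent.iterdir()):
        name = p.name.lower()
        if not p.is_file():
            continue
        # Assumption: lock files end in .lock; also skip any earlier zip output.
        if name.endswith((".lock", ".zip")):
            continue
        if p.stem.lower() == base or name.startswith(full):
            parts.append(p)
    return parts


def zip_shapefile(shp_path, zip_path=None):
    """Write all components of the shapefile into one zip and return its path."""
    shp = Path(shp_path)
    zip_path = Path(zip_path) if zip_path else shp.with_suffix(".zip")
    parts = shapefile_parts(shp)     # collect before the zip file is created
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for part in parts:
            zf.write(part, arcname=part.name)
    return zip_path


if __name__ == "__main__":
    # Hypothetical path; point this at a real shapefile to try it.
    example = Path("data/roads.shp")
    if example.exists():
        print(zip_shapefile(example))
```

The person on the other end just unzips it into one folder, and ArcGIS should see a single shapefile again.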
So once again, ArcGIS is making things nice by showing them as a single unit for us, but it's a little odd on the back end, and it takes a little getting used to. Make sure you send all of those files if you're trying to send somebody a shapefile. Now let's go explore the attributes. So remember how we ended the last lecture looking at Select By Attributes? Well, shapefiles do something different again. So we have file geodatabases, which need nothing around their field names, and then personal geodatabases, which need brackets around them, and then shapefiles, which need double quotes. So again, use the double-click editor if you're not sure, or at least take a look at the fields in the double-click editor up here before you write your queries, because that's going to help you figure out what you need to put in the box down here. Similarly, fields are a little different. This is backed by a really old storage engine called dBASE IV, which is why there's a .dbf file among the many files that make up the shapefile here. And in fact, I can open up the .dbf file in Microsoft Excel or similar spreadsheet editors, but I shouldn't edit it in there, because there's no guarantee it's going to write it back out the same way that it read it in. So I can view it in here, but don't treat this like we do Microsoft Access with personal geodatabases. When we save it back out, we're not necessarily keeping the same data structure. So you can view it here, you can maybe do a little bit of analysis, but do not save it back out. So it's nice that it's stored in a compatible format, but like I said, that's also because it's so old. So if we take a look at some field properties, we can see that we have fewer options down here than we do in the others. We can't make a field nullable, because the format doesn't have the same concept of null. And as I said before, field names are limited to 10 characters in length: letters, numbers, underscores, that type of stuff.
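To make the Select By Attributes delimiter rules concrete, here's a small sketch in Python that builds a where clause for each workspace type. The delimiter table simply mirrors what I described in this lecture, and the workspace labels, function names, and field names are made up for illustration. If you're scripting with arcpy, I believe `arcpy.AddFieldDelimiters` does this lookup for you against the real data source, so treat this as an explanation rather than a replacement.

```python
# Delimiters as described in this lecture (an assumption, not read from ArcGIS):
# file geodatabase -> nothing, personal geodatabase -> [ ], shapefile -> " "
FIELD_DELIMITERS = {
    "file_gdb": ("", ""),
    "personal_gdb": ("[", "]"),
    "shapefile": ('"', '"'),
}


def delimit_field(workspace_type, field_name):
    """Wrap a field name in the delimiters used by the given workspace type."""
    opening, closing = FIELD_DELIMITERS[workspace_type]
    return f"{opening}{field_name}{closing}"


def where_clause(workspace_type, field_name, operator, value):
    """Build a simple 'field operator value' expression for numbers or strings."""
    if isinstance(value, str):
        # String values go in single quotes; double any embedded single quote.
        value_sql = "'" + value.replace("'", "''") + "'"
    else:
        value_sql = str(value)
    return f"{delimit_field(workspace_type, field_name)} {operator} {value_sql}"


if __name__ == "__main__":
    # LENGTH_KM is a hypothetical field name.
    for kind in FIELD_DELIMITERS:
        print(kind, "->", where_clause(kind, "LENGTH_KM", ">", 5))
```

The same query comes out three slightly different ways, which is exactly why it's worth glancing at the field list in the query builder before typing anything by hand.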
And I don't recommend starting with numbers. I don't know if that's even possible with shapefiles. Similarly, if we go to Add Field and I select the Double type, I need to set a precision and a scale as well. Precision describes the total number of digits that can be stored in the field, and scale sets the number of decimal places that are available. This is useful, but it's a lot nicer with file geodatabases, where all of that is managed for us. Note also that there's no alias here. I can't set an alias when I'm adding the field. And finally, shapefiles don't store relationships. We can't create relationship classes between different shapefiles. They don't store relationships to other tables, or even relationships between features, that topology that we used when we were editing features before, and that we'll explain in more depth a little later in this class. So without that information, without that awareness of other data around them, they're a little slower for many operations. The one thing that they're faster for is just raw rendering. So if I zoom in slightly on the shapefile to force it to re-render, they're much faster at this than file geodatabases. But if I were to symbolize based on an attribute, so if I go to Properties and set my Symbology based upon one of these attributes, all of a sudden shapefiles are much, much slower. Even if I were to go in and manually create the indexes we need for the shapefile, it would still be much slower than the file geodatabase at that point, and in geoprocessing operations, they're going to be slower as well. So I don't recommend that you use them as your primary storage mechanism for your projects. If you need to use them to feed into something that only supports shapefiles, or to send them to somebody who only supports shapefiles, you can.
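Going back to precision and scale for a moment, here's a quick sketch in Python of what those two numbers mean when you add a Double field. It uses the definitions from this lecture: precision is the total number of digits, and scale is how many of those sit after the decimal point. The function name is made up, and it's a simplification that ignores the sign and the decimal point and only handles ordinary finite numbers, so don't take it as the exact rule the .dbf format applies.

```python
from decimal import Decimal


def fits_field(value, precision, scale):
    """Return True if value fits in `precision` total digits with `scale` decimals.

    Simplified model: the sign and the decimal point are not counted.
    """
    digits, exponent = Decimal(str(value)).as_tuple()[1:]
    decimals = max(0, -exponent)                      # digits after the point
    integer_digits = max(0, len(digits) + exponent)   # digits before the point
    return decimals <= scale and integer_digits <= precision - scale


if __name__ == "__main__":
    # With precision 5 and scale 2 there is room for 3 digits before the point.
    for number in (123.45, 1234.5, 0.125):
        print(number, fits_field(number, precision=5, scale=2))
```

So with a precision of 5 and a scale of 2, 123.45 fits, but 1234.5 has too many digits in front of the decimal point and 0.125 has too many behind it.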
But I highly recommend that you use file geodatabases as your primary data storage when you're working on your projects. They're going to impose far fewer limitations, your data integrity is going to be better, and you're going to run into fewer problems. Okay, the last thing to mention here, one of those final problems, is that each component of a shapefile, each of those files we looked at before, so each one of these, is limited to 2 GB in size. That's its max. So if we run over that, we usually end up with data corruption. The big ways that I've seen people run over are really complex shape information, feature information, or lots and lots of rows or many columns in their table, which causes their .dbf file to go over 2 GB, and then they're going to start losing data after that. So again, use file geodatabases and you won't run into those problems. Okay, that's it for this lecture. In this lecture, I showed you the core components of shapefiles: all the different files that make up a shapefile, how we can use them with Select By Attributes, how to add fields and the differences there in shapefiles, and some of the performance and size advantages of not using shapefiles. In the next lecture, we'll finish up this series on data storage workspaces by taking a look at an experimental new feature in ArcGIS that supports using SQLite databases as a data storage format. See you there.
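One follow-up on the 2 GB limit from this lecture: here's a minimal sketch in Python for checking how close a shapefile's main components are to that limit before you keep adding to it. The function name, the 90 percent warning threshold, and the example path are my own choices for illustration, and it only looks at the .shp, .dbf, and .shx files.

```python
from pathlib import Path

LIMIT_BYTES = 2 * 1024**3  # 2 GB per component, the limit mentioned in this lecture


def components_near_limit(shp_path, limit_bytes=LIMIT_BYTES, warn_fraction=0.9):
    """Return {file name: size in bytes} for components at or above the warning level."""
    shp = Path(shp_path)
    flagged = {}
    for suffix in (".shp", ".dbf", ".shx"):
        part = shp.with_suffix(suffix)
        if part.exists():
            size = part.stat().st_size
            if size >= warn_fraction * limit_bytes:
                flagged[part.name] = size
    return flagged


if __name__ == "__main__":
    # Hypothetical path; an empty result means nothing is close to the limit.
    print(components_near_limit("data/roads.shp"))
```

If anything shows up in that result, that's a good sign it's time to move the data into a file geodatabase.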