Hello again. In our last lecture we talked about some basic principles of using data. Now we'd like to give some specific examples. Because data can mean so many different things, and can be used in so many different ways, we really need to talk through these examples in order to help people get into the mode of thinking about using data. In each case, we'll be asking first exactly what type of material the data is. Some of the inputs of research are subject to copyright, and as you've said, some are not. Second, we'll examine exactly what the planned use is. Because, as we'll see, different answers to these questions result in different decisions about the usability of certain kinds of data. So, as Lisa already said, when we say data, many people think about that spreadsheet of numbers, and by now we all know that facts, including numerical results obtained by experimentation, for example, are not subject to copyright, at least not by themselves. Those raw numbers are not protected, and if a user wanted to pull them from the spreadsheet and do her own analysis of the numbers in a different way, for example, there would be no copyright impediment at all. But it's important to remember that the way the data is selected and arranged and displayed may be subject to copyright. So, it can be original expression. For example, a pie chart with different colors, a bar graph, all of those different expressions that display data in an organized way will probably have a copyright because of the original selection and arrangement of the data that they represent. This is why a user who wants to republish a chart or a figure from a previous article does need to think about copyright. It's often the case that the reuse will be fair use, but the analysis has to be done because the visualization of the data may well be subject to copyright. The raw data probably isn't and can be pulled out and analyzed in a different way.
But if you reproduce the original expression of the data, you do have to think about copyright. >> Well, another area where this comes up is GIS data and maps. Talk about visualization. Maps are primarily visualizations or pictorial representations of facts, organized in such a way as to make the information portrayed easier to understand. So if you are, for example, in epidemiology and you are studying a collection of data on where people who suffer from a particular disease are located, and then you want to pair that information with data about the causes of that disease, you would have two data sets. Both of those are factual sets of data. However, once they're used as coordinates on a map and you get to that visualization, they become expression. And they can be subject to copyright. So this is often kind of a murky area: the point where you move from something being simply data collected to it becoming a visualization and potentially subject to copyright. So another common type of data is text. Here it's possible for the data itself to have copyright. If you're doing a text mining project on a corpus of modern novels, for example, you have to take into account the copyright of the novels. That's pretty clear. But in this case, you're actually using them as a collection of words, as underlying data. And so, depending on how it's done, the output from that may very well not implicate copyright at all, because it may be only a tiny part of the original texts that is actually included in any result set from doing that text mining. Or even if it's not a minuscule amount, even if it's a larger amount, it still may be fair use. So the underlying texts are protected by copyright, which puts limits on how the researcher could share that entire corpus of texts. But the text mining results may be a very different situation.
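To make the text mining point a little more concrete, here is a minimal sketch of what such an output can look like. It assumes a hypothetical two-item corpus of invented placeholder sentences (standing in for the full texts of the novels) and a hypothetical helper named `word_counts`; none of these names come from the lecture. The point is only that the result set holds words and counts rather than passages. Whether a given output implicates copyright is still a legal question, not something the code decides.

```python
from collections import Counter
import re

# Hypothetical stand-in corpus (invented sentences, not real novels).
# A real project would load the full, copyrighted texts here.
corpus = {
    "novel_a": "The sea was calm. The sea was grey.",
    "novel_b": "A calm morning, a grey sky, and the sea.",
}

def word_counts(text):
    """Lowercase the text, split it into words, and count each word."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

# Aggregate counts across the whole corpus.
# The result contains individual words and numbers, not passages of text.
totals = Counter()
for text in corpus.values():
    totals += word_counts(text)

# Report the most frequent words as (word, count) pairs.
top_words = totals.most_common(3)
print(top_words)
```

In this sketch the researcher could share `top_words` without sharing `corpus`, which mirrors the distinction drawn above between the protected underlying texts and the mining results.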
So another example of textual data that sometimes comes up is spreadsheets of citations. Citations themselves are really collections of facts, but sometimes researchers want to study publication patterns, so they pull together a large number of citations to analyze. A lot will depend on what is harvested. If it is merely the citation, which is essentially author, title, journal name, things that very much are facts, then this is a lot like the example of the numerical data on a spreadsheet. There's no underlying copyright in the data. But if the harvest includes things like abstracts, which do have some expression and do have some originality for those articles, then the data itself might be subject to copyright. So, as with our example of text, the output may not use the copyrighted content, or that use may be a fair use, but where abstracts and the like are concerned, copyright may need to be considered. The other thing that may need to be considered is where those citations came from. If they came from a database that has contractual provisions, those contractual provisions may place some restrictions or limitations on how that data can be collected and used. >> And for a final example, let's talk about a sort of unusual and interesting example where images are the underlying data. Most images have copyright, of course. So, the full analysis of how the images and facts would be used is important in order to decide whether you can use the data or not. But in the example we're thinking about here, we have NASA satellite photographs. And the thing that's so unusual about the NASA satellite photographs is that they were made by federal employees, and works made by federal employees in the course of their jobs are in the public domain and free to be used by all of us. So, NASA cannot copyright the images, but they do try to restrict use of the NASA logo through trademark, and so on.
And so, this is another example of how we need to think about things in addition to copyright when we're dealing with data. There may be a contractual agreement, like in the example Lisa just gave, or there may be a trademark issue. And those things can't be discounted when talking about copyright and thinking about what you can and can't do. >> Thanks for watching and listening.