Hi, I am Andrea Vitaletti, and in today's lecture we are going to talk about techniques and methods to share digital resources on the Internet. We have seen in previous lectures that archaeology is a patient, methodical and scientific activity, and when successful, this activity leads to new discoveries that can shed light on our past. We have also seen how to digitize such objects, and so now we want to show how such digital copies of the objects can be brought onto the Internet, with the main purpose of preserving and disseminating our cultural heritage.

So, in the context of saving the universal heritage, digitization has two main purposes. One is preservation, the other is dissemination. It's pretty clear that if we have a digital object on the Internet, we can disseminate this content all around the world. As for preservation, having multiple copies, we can distribute those copies across the Internet. And in general, digital contents are more robust than physical contents.

However, I want to stress that there is a particular problem, the so-called digital obsolescence. To clarify this problem, I want to give an example. A few years ago, one of the most common ways to listen to music was tapes; nowadays it's not so easy to find devices to play music recorded on tapes. So, the problem with digital obsolescence is that it's necessary to preserve not only the digital contents, but also the tools, the devices, and the software necessary to understand that content.

When we talk about digital contents, we will focus in particular on resources. A resource is, in general, a digital object that is accessible over the network. What is the problem? It is not sufficient to make a copy. We need to identify a resource, to describe it, to locate it, and to discover it in the most effective way. So let's take an example. Suppose that you have a collection of music files in MP3 format.
It is not sufficient to have only the information on track one, two, three, and four. Most likely, if you want to be effective in finding the music you want to listen to, you need to organize this music by author, by artist, by album, and so on and so forth.

Another example from previous lectures. We have seen how it's possible to digitize physical objects, and through digitization we can distinguish among the different parts of a monument. As an example, in the picture you see here, you have a column that is made of three main parts. So what is important is to understand the role of each part, namely to describe all those parts.

The reference scenario we will consider is the following. We will have digital objects, namely digital resources, that will be distributed over the Internet and stored in a digital library. For the sake of simplicity, let's think of digital libraries as websites in which we can find digital contents. On the other side, you have people interested in finding those digital objects, and so we have to provide suitable tools to identify, search and find those objects.

Okay, as we observed before, it is not sufficient to provide raw data; we need to organize the information. Let's take another example. Suppose you are reading a paper and you find the information in the red box in the picture. You see, this is Paolo Matthiae, Ebla: An Empire Rediscovered, 1980. From your experience, it looks like a reference. And what is a reference, actually? A reference is information describing a book. So, it is data about data: the so-called metadata. So, metadata are a convenient tool to describe information.

But let's go into a bit more detail. By convention, I not only know that this is a reference, but I also know that the structure of the reference can provide us with further information. As an example, I know that Paolo Matthiae is the author of this book.
I know that Ebla: An Empire Rediscovered is the title of this book. And I also know that 1980 is likely the year of publication. Okay, this information is implicit in the structure of the reference, but we can make it even more explicit. Namely, we can make a kind of table in which the categories of information are explicitly listed. So you have a document type, in this case a book. You have the author's first name, in this case Paolo. You have the author's last name, in this case Matthiae, and so on and so forth.

So the general problem is: we have data as raw facts, and we need to structure those data in order to provide information. In order to structure those data, we need to conceptualize our knowledge. Our knowledge allows us to define the structure of the data. Okay? So knowledge leads to information, which determines the data. At the same time, data can provide us with new information that reinforces our knowledge of the domain. So, there is this cycle. Okay? So, we have already seen roughly what raw data are, and that metadata are a convenient tool to describe the information.

Let's have a look at how we can model our knowledge. The purpose of this slide is to show different methods and techniques to structure our data through the description of our knowledge. At the lowest level, we have a vocabulary. What is a vocabulary? It's simply a list of shared terms, namely the list of terms that are allowed in a specific domain. Then, if we structure these terms in a hierarchy from the general to the particular, we have a taxonomy. If we introduce other relationships, such as the associative and the equivalence relationships, we have a thesaurus. And finally, the ontology. The ontology is a formal model. It's a shared conceptualization of a specific domain.

Let's clarify those concepts with a simple example. So, let's assume that our vocabulary is made of these terms. Mammal, human, dog.
Canine, residential, buildings, commercial, house. Okay, so this is the list of terms allowed in our domain. Now, clearly a human is a mammal. A dog is a mammal, a canine is a mammal. Similarly, buildings can be either residential or commercial, and a house is a residential building. So now we have defined a hierarchy, the taxonomy, that allows us to better represent our knowledge of the domain. Then we can say that a dog is equivalent to a canine, and similarly that a human is related to a house, so this is the thesaurus. Finally, we can introduce another relationship. We can say that a human is the owner of a dog. Notice that the converse is not true. So you see, we are introducing different levels of complexity that allow us to better structure, represent and model our knowledge.

To conclude: in this lecture, we have seen that it is important to structure data on the basis of the knowledge of the domain of interest, in order to share the data over the Internet in the most effective way. In the next lecture we will see in greater detail some tools for data modelling.
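
As a supplement to the mammal and buildings example from this lecture, the four levels (vocabulary, taxonomy, thesaurus, ontology) can be sketched in a few lines of Python. This is only an illustrative sketch under simple assumptions, not a tool presented in the lecture: the names VOCABULARY, BROADER, EQUIVALENT, RELATED and OWNS, and the helper functions, are hypothetical and were introduced here just to make the distinctions concrete. A real ontology would normally be written in a dedicated modelling language rather than plain dictionaries.

```python
# Illustrative sketch only: all names below are hypothetical, chosen to mirror
# the lecture's example. They do not come from any standard or library.

# 1. Vocabulary: the flat list of terms allowed in the domain.
VOCABULARY = {
    "mammal", "human", "dog", "canine",
    "buildings", "residential", "commercial", "house",
}

# 2. Taxonomy: each term points to its more general term ("is a").
BROADER = {
    "human": "mammal",
    "dog": "mammal",
    "canine": "mammal",
    "residential": "buildings",
    "commercial": "buildings",
    "house": "residential",
}

# 3. Thesaurus: symmetric relationships, stored as unordered pairs.
EQUIVALENT = {frozenset({"dog", "canine"})}
RELATED = {frozenset({"human", "house"})}

# 4. Ontology-style relationship: directed, so order matters.
OWNS = {("human", "dog")}


def ancestors(term):
    """Return the chain of more general terms, nearest first."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain


def is_a(term, general):
    """True if 'general' appears above 'term' in the taxonomy."""
    return general in ancestors(term)


def are_equivalent(a, b):
    """True if the two terms are the same or listed as equivalent."""
    return a == b or frozenset({a, b}) in EQUIVALENT


def are_related(a, b):
    """True if the two terms are listed as associated (in either order)."""
    return frozenset({a, b}) in RELATED


def owns(owner, owned):
    """True only in the stated direction: a human owns a dog, not vice versa."""
    return (owner, owned) in OWNS


if __name__ == "__main__":
    print(ancestors("house"))
    print(is_a("dog", "mammal"), are_equivalent("canine", "dog"))
    print(owns("human", "dog"), owns("dog", "human"))
```

The point of the design is that each level adds one kind of structure: the set only says which terms exist, the dictionary adds a hierarchy, the unordered pairs add symmetric links, and the ordered pairs add a relationship with a direction.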