In this section, I will talk about the different factors that you will need to consider when planning your data collection modality. These factors include the structure of the data itself; the physical location, that is, whether study personnel will meet face to face with the subjects or whether the data will be collected remotely; and, finally, whether the data already exist and you are reusing existing data or whether you need to capture data from scratch.

The classic form in which clinical research data has been captured is the case report form. This usually consists of a small set of well-defined data elements, such as the elements you would find on a web form in your browser or on a paper survey. Data is typically entered by study subjects or study personnel, either into a paper format or into an electronic data capture format. Notice that when the data is captured on paper, for it to be analyzed it would still need to be transcribed into an electronic format as well. It's considered bad practice to enter data directly into spreadsheets or into database software. It's always more intuitive and less likely to introduce errors if you use a structured form.

Another type of data that you will capture is complex machine-captured data. These data are digital, but they are not as easy to interpret as case report forms. They could be the results of biometric or physiological measurements that are recorded electronically, such as EEG or EKG, or image files that are captured from radiology or histology studies. This kind of data is stored electronically as large machine-readable binary files, and you will need to annotate your study database with the correct markers to point to these files. [INAUDIBLE]

Sometimes, you will need to capture unstructured data, for example during subject interviews or observations.
In these kinds of interactions, data is typically captured via recording, videotaping, or note taking. This data is then transcribed into text that can be entered or stored electronically as large unstructured blobs of text. You can impose structure on this kind of data by coding: study personnel go in and identify parts of the text that can be coded in a certain manner. This way, the results of the coding can be extracted and entered into the database itself.

The second factor relates to the physical location. If you are doing in-person data collection, it is important to know the context. Is it a dedicated study visit, or is the study data being captured as part of a different process? For example, are you capturing data from patients during a routine clinical visit? Another piece of context that's important is who will be doing the data entry. Is the patient going to do the data entry himself or herself, or will study personnel do it for them? This may impact your choice and design of forms. Forms that are filled in once by patients or study subjects, for example, will need to be simple and intuitive, whereas study personnel may be able to use more complex forms, because they only have to learn how to use them once, and once that learning curve is overcome they can efficiently capture the same data over and over from multiple study subjects.

When you're capturing data remotely, this could be done via a paper form, such as a survey, where the subjects just fill in the blanks. Another option is to automate part of the process by using bubble forms, the scannable forms used for exams.
You can also use optical character recognition, where the subjects enter data into these forms by hand and software then identifies and digitizes that content. You can also use electronic data capture, as we've mentioned before. The transport could be via a browser, which requires internet access, or you can deploy specialized data collection software to the study subjects or to the site of data collection, where the software provides the interface and the transport can be part of that software process. You can capture data by phone using interactive voice response systems, or by SMS: there exist systems with which you can establish an SMS dialogue, where the patients provide answers via their SMS messages in a menu or structured format, and that is all automatically entered into your research data set.

When collecting data remotely, it's also important to take into account the security of the data collection and of the transfer pipe back to your database. If you are collecting sensitive patient data, such as protected health information, then that factor is very important.

The third factor that you need to take into account when choosing your data collection modality is whether you're collecting data from scratch or reusing existing data. Notice that when you are reusing existing data, it's less important to worry about data capture and more important to worry about data extraction from that existing source. A source of information that is typically reused for clinical research is the medical record, or the electronic medical record.
Data in the electronic medical record can exist in a structured format, for example tables of lab values that can be automatically extracted, transformed, and then loaded into your study database. However, there exist portions of the medical record, such as clinical notes written by health care providers, that consist of unstructured text. Capturing that data requires abstraction by expert coders, professionals who can understand the clinical terminology and identify the portions or units of information that are needed for the study.

When deciding to reuse existing data from the medical record, it's very important to know whether you have the rights to access the medical record. It's also important for that work to be done by personnel who are knowledgeable about clinical documentation. Notice that even in this context, extraction from unstructured data can be somewhat automated using information extraction techniques like natural language processing, which can identify facts or assertions made in natural-language text. Those facts or assertions are then translated into structured information types that are loaded into your study database.

Another source of information that can be reused for clinical research is data from previous studies. In that case, the data has already been collected and exists in well-maintained, or hopefully well-maintained, study databases. The considerations you will need to have before reusing that data are whether you have permission, whether and how you can have access to that data, and finally research design issues. The contexts of the studies may be different, and replicating the same context from one study to the other may not be straightforward.
So the applicability of the data or the interpretation of that data may be different in the two settings. But other than that, it's a relatively straightforward extraction mechanism.

Finally, you can reuse data that exists in public health registries or other publicly available data sources. When extracting from these sources, it's important to be aware of the data quality, the level of completeness of those data sources, and how those data sources will need to be transformed so they can match your study design and the database structure that you have.

So to recap, I've just described the following factors that affect your choice of data collection modality: the structure of your data; the location, that is, whether study personnel will need to meet with your subjects or whether the data is collected remotely; and finally, whether you are reusing existing data or collecting data from scratch.
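
To make the earlier point about structured forms concrete, here is a minimal sketch in Python of how a case report form with a small set of well-defined data elements might check each entry before it reaches the study database. This is only an illustration under stated assumptions: the field names (`subject_id`, `age_years`, `sex`, `systolic_bp_mmhg`), the value ranges, and the helper `validate_record` are all hypothetical and do not come from any particular electronic data capture system.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FieldSpec:
    """One well-defined data element on a (hypothetical) case report form."""
    name: str
    type_: type
    required: bool = True
    allowed: Optional[Tuple[Any, ...]] = None  # permitted values, if restricted
    min_value: Optional[float] = None
    max_value: Optional[float] = None


def validate_record(form: List[FieldSpec], record: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the entry is clean."""
    errors: List[str] = []
    for spec in form:
        value = record.get(spec.name)
        if value is None or value == "":
            if spec.required:
                errors.append(f"{spec.name}: missing required value")
            continue
        try:
            typed = spec.type_(value)  # e.g. "42" typed on a web form becomes 42
        except (TypeError, ValueError):
            errors.append(f"{spec.name}: expected {spec.type_.__name__}, got {value!r}")
            continue
        if spec.allowed is not None and typed not in spec.allowed:
            errors.append(f"{spec.name}: {typed!r} is not one of {spec.allowed}")
        if spec.min_value is not None and typed < spec.min_value:
            errors.append(f"{spec.name}: {typed} is below minimum {spec.min_value}")
        if spec.max_value is not None and typed > spec.max_value:
            errors.append(f"{spec.name}: {typed} is above maximum {spec.max_value}")
    # Entries that are not on the form are flagged rather than silently stored.
    known = {spec.name for spec in form}
    for key in record:
        if key not in known:
            errors.append(f"{key}: unknown field")
    return errors


# Hypothetical form definition (all names and ranges are assumptions).
FORM = [
    FieldSpec("subject_id", str),
    FieldSpec("age_years", int, min_value=0, max_value=120),
    FieldSpec("sex", str, allowed=("F", "M", "other")),
    FieldSpec("systolic_bp_mmhg", int, required=False, min_value=50, max_value=300),
]

good = {"subject_id": "S-001", "age_years": "42", "sex": "F"}
bad = {"subject_id": "S-002", "age_years": "forty", "sex": "X", "weight": "70"}

print(validate_record(FORM, good))
for problem in validate_record(FORM, bad):
    print(problem)
```

A free-form spreadsheet would accept the second entry without complaint, whereas the structured form reports each problem at the point of entry. That is the practical reason for preferring a structured form over direct entry into a spreadsheet or database.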