[MUSIC] Well, Sam, it's a pleasure to be able to talk to you in person, in front of the audience. >> Thank you. >> They've seen both of us for a long time, and maybe can get to hear a bit more about what you do in your real life. >> Okay. >> So, from the organization's point of view, what is your role at Johns Hopkins? >> So I'd say my main role is a DBA, database administrator, on the Epic data warehouse, which is a large repository of clinical data across the enterprise. >> So, sometimes that Epic data warehouse we call the EDW, for enterprise data warehouse. >> Correct. >> Because we hate the actual formal name that Epic gives to that warehouse, which we love renaming here. Now, in that role as DBA, what is it that the organization expects you to accomplish for them? >> That role of DBA is a little oversimplified as a term. Essentially it is a broad role that makes sure that data is consistent, data is available, data is reliable, and data is accessible and understandable across the enterprise. >> All right, and so there are DBAs of multiple types. There's DBAs for small databases, for large databases, the enterprise, so your responsibility is pretty large. So let's go over it. You said the data are available, data are, >> Accessible. >> Accessible. What's the difference between available and accessible? >> Available means that there are not outages of the system. Accessible means the right people, the right teams, have access to the data they need. The system might be up, but a particular user might not be getting the data they need. >> So it's available and accessible, and another was what? >> Reliable. >> Reliable, so what does that mean? Or how do you know when data is unreliable? >> So for example, a good example was social security numbers, which is a very highly protected item. But sometimes there have been systems I've worked on where people didn't know the social security number for people.
And in order to get through the interface where they type in the actual data, they would just type 999-99-9999. And there were hundreds of people with that social security number of all nines, where the semantic meaning of a social security number of that value meant unknown. >> Right. >> But it actually made the data unreliable. And, so that's an example. >> So, there's notions of trust, there's notions of meaning, in all this. And was there something else? I think the last thing you said was something like, that people can understand what the data mean. >> Right, so in addition to the database itself, we've built a knowledge base and other training material. It's very important to think about; sometimes people, especially software people, think that software just means my software is functioning. But I've learned to think of software as kind of multi-layered. There's the actual software, the data, but there's also the training materials, the understanding, it's all one broader package. And so in order to provide the semantic meaning, the understanding, we have various tools, something called a console that gives data models for people to understand, read about them. We have a knowledge base, and we have training seminars and online training that we've built to help people here. >> So you actually care about people, not just your data? >> [LAUGH] Well, we try to. >> [LAUGH] Okay. So I think, since you know the language of the stack, you understand that what I've been asking is, what would be the above-the-line functions of the DBA. And I think it's important for the students to understand that even a technology role like a DBA actually has above-the-line responsibilities. >> Right. >> Then the fun part is what do you do below the line to accomplish all of that. So the first thing you said was availability, so what do you do below the line to ensure availability?
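The all-nines placeholder Sam describes is a classic sentinel-value problem: the value looks syntactically valid, so only a semantic check catches it. A minimal sketch in Python; the field names, record shape, and sentinel list here are illustrative assumptions, not the actual Hopkins schema:

```python
# Flag records whose SSN is a known placeholder rather than real data.
# A format check alone passes these, since all-nines is nine digits.
SENTINEL_SSNS = {"999999999", "000000000"}

def unreliable_ssns(records):
    """Return the ids of records carrying a placeholder SSN."""
    flagged = []
    for rec in records:
        digits = rec.get("ssn", "").replace("-", "")
        if digits in SENTINEL_SSNS:
            flagged.append(rec["id"])
    return flagged

patients = [
    {"id": 1, "ssn": "123-45-6789"},
    {"id": 2, "ssn": "999-99-9999"},  # typed in just to get past the interface
    {"id": 3, "ssn": "999999999"},
]
print(unreliable_ssns(patients))  # [2, 3]
```

A real warehouse would run a check like this as a scheduled validation query rather than in application code, but the idea is the same: encode the known sentinel semantics so the data can be marked as "unknown" instead of silently treated as real.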
>> Okay so, below the line for availability we have a very dependable backup and restore capability where the data is backed up frequently. There's a variety of full backups and incremental backups. Based on performance, size, and storage predictions, we vary the relationship of full backups to incremental backups. Then we've adopted another technology. We are on the Microsoft SQL Server platform, and we're using something called AlwaysOn, which is a type of replication where the data is replicated from one server to another server. So if one server would go down, or one data center, we have multiple data centers here. If one data center would go down, the other one would be immediately available. And for availability also, we've set up different servers, one for the loading of data and one for more of the general reporting of data. I think that's pretty much it for availability. >> Right, but with backup, it's always important to make sure that you can restore from backups. >> Right, that's correct. >> How often do you check that? >> Well, we just went through a system upgrade where we checked that several times during the process of the upgrade. In general, we are forced, for one scenario or another, to check it once a month or something like that. >> Once a month. >> But that's actually very critical, what we call DR, disaster recovery testing, where we do a full-stack rebuild and then testing. >> And what's the relationship between business continuity and what you're calling availability, are they the same thing or are they a little different? >> That's a good question. So, business continuity in my mind, which is above the line, really has the general objective to make sure the business doesn't go down or the enterprise doesn't go down. Where it gets a little more subjective is the type of system that you are supporting.
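The full-versus-incremental relationship Sam mentions matters most at restore time: you apply the most recent full backup, then every incremental taken after it, in order. A toy sketch of that restore chain (the dates and list structure are made up for illustration; a real SQL Server restore uses `RESTORE DATABASE` against backup files):

```python
# Compute which backups are needed to restore to a given point in time:
# the last full backup at or before the target, plus all later incrementals.
from datetime import date

backups = [
    ("full", date(2024, 1, 1)),
    ("incr", date(2024, 1, 2)),
    ("incr", date(2024, 1, 3)),
    ("full", date(2024, 1, 7)),
    ("incr", date(2024, 1, 8)),
]

def restore_chain(backups, target):
    """Return the ordered backups needed to restore as of `target`."""
    usable = [b for b in backups if b[1] <= target]
    last_full = max(i for i, b in enumerate(usable) if b[0] == "full")
    return usable[last_full:]

print(restore_chain(backups, date(2024, 1, 8)))
# [('full', date(2024, 1, 7)), ('incr', date(2024, 1, 8))]
```

This also shows why the DR testing Sam describes is critical: the chain is only as good as its weakest file, so an untested incremental can silently break every restore after it.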
If you are supporting a system that is a bedside clinical system where people's lives are directly on the line, that data availability has a much higher requirement. So what we have is an SLA, a service level agreement, between our teams here. We have a team here that represents server management. We have a team that represents security management. And my team, that represents the data warehouse, is focused more on just the data. So we have service level agreements between each other about how much downtime we can tolerate. A data warehouse is used for reporting, which means, by nature, our data warehouse is populated 24 hours after the clinical system. We have a nightly ETL, or extract, transform, and load, from the clinical system into the data warehouse. So we're already 24 hours behind, so we're not at the same level as a bedside system. Therefore, our tolerance for downtime might be a little bit more than that. So our service level agreements might state between six or 24 hours recovery time. Well, so again, at the high level, business continuity ensures the general principle that the business cannot go down. Each department or each application has a little bit more latitude in determining what availability means for them. So there is an above-the-line, below-the-line range. >> The second thing was accessibility, so what do you do to deliver accessibility? >> So, one of the luxuries of working with a nice big enterprise like Johns Hopkins is we have a very robust security infrastructure. And that uses LDAP, or Active Directory services, where there is a unified directory of every user on any Hopkins IT resource. We try to leverage that at the database level as well, where Active Directory groups are created that are in parallel with database roles. Let me explain. Let's say I have an entry-level analyst or an entry-level reporter, somebody who wants to write reports.
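The tiered SLA idea above can be expressed as a simple recovery-time check. Only the 6-to-24-hour warehouse window comes from the conversation; the bedside figure and the tier names are illustrative assumptions:

```python
# Maximum tolerable downtime (recovery time objective) per system tier.
RTO_HOURS = {
    "bedside_clinical": 0.5,  # lives directly on the line (assumed figure)
    "data_warehouse": 24,     # reporting system, already a day behind
}

def within_sla(tier, outage_hours):
    """True when an outage of `outage_hours` stays inside the tier's SLA."""
    return outage_hours <= RTO_HOURS[tier]

print(within_sla("data_warehouse", 6))    # True
print(within_sla("bedside_clinical", 6))  # False
```

The point of the sketch is the asymmetry: the same six-hour outage is acceptable for one system and a serious SLA breach for another, which is exactly why each application negotiates its own agreement under the general business-continuity objective.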
For the kind of data they need, they don't need access to more sensitive data like a social security number, or people's psychiatric notes. Or things like their claim history, or another area we have is patient surveys, the Press Ganey patient surveys. That's some sensitive data. Maybe some people are expressing opinions about providers that maybe shouldn't be in everybody's access. So what we've got is Active Directory groups, aligned with SQL logins and database roles, to do additive permissions. We start off at a base level where there's no granted access to special data. If somebody justifies it through a variety of HIPAA or data use agreements, then we add on another Active Directory group, and the summation of the Active Directory groups and permissions is cumulative. So, it adds on, it grants access to some of those other special data elements. That has given us a lot of flexibility to take on new data. To take on claims data or third-party data, where we want to channel various sets of users to that data, and keep it away from other sets of users. >> Right, sounds like a lot of work. Now before that it was reliability, what do you do to deliver reliability? >> So reliability, again, the luxury of working in a great big enterprise like Johns Hopkins is that they have done a lot of investment in a variety of solid hardware platforms. We just went through a large system upgrade where reliability, I would say, means the servers don't go down. And if something does go down, there's a redundant system to take over. The servers are adequately specced for capacity; we passed a threshold recently where we have a database server with literally a terabyte of RAM, which is a- >> That's a lot. >> That's a lot of RAM, [LAUGH] a staggering volume, actually, for older DBAs like myself. >> Let's be clear. Most PCs have about a gig of RAM? >> No, probably a few more, 16 gigs or- >> 16 gigs of RAM, so this is like 60 times that?
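The additive-permission scheme Sam describes — base access by default, plus extra Active Directory groups whose grants accumulate — can be sketched as a set union. The group and data-element names below are hypothetical, chosen only to mirror the examples in the conversation:

```python
# Each AD group maps to the data elements it grants; a user's effective
# access is the cumulative union of the groups they belong to.
GROUP_GRANTS = {
    "edw_base_reporting": {"demographics", "encounters"},
    "edw_sensitive":      {"ssn", "psych_notes"},
    "edw_claims":         {"claim_history"},
}

def effective_access(user_groups):
    """Additive permissions: the union of every group's grants."""
    access = set()
    for group in user_groups:
        access |= GROUP_GRANTS.get(group, set())
    return access

entry_analyst = effective_access(["edw_base_reporting"])
senior_analyst = effective_access(["edw_base_reporting", "edw_claims"])
print("ssn" in entry_analyst)             # False
print("claim_history" in senior_analyst)  # True
```

Because grants only ever add, onboarding a new data set (claims, third-party data) means defining one new group, rather than re-auditing every existing user's permissions.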
>> Right, it has, in a sense, made my job easier, because of that amount of processing power, a terabyte of RAM and 64 processors. One of the traditional roles of a DBA is optimizing SQL code, and understanding how to do things in an efficient manner. With that kind of firepower on the hardware, it makes almost anything run efficiently. But regardless, that kind of reliability... reliability doesn't just mean I can connect to a server. It means that the server will process my request in a reasonable, timely manner. And so that's a large part of DBA operations. >> But before, you said reliability was about the data being reliable. >> That's also true. I was talking just now about the environment. >> That's okay, human beings don't have to be totally consistent. We're not logical animals, and neither is the system, sort of thing. So, what do you do to keep the data reliable? >> So we have a battery of validation scripts that we run at the installation of our warehouse. We have a variety of analysts that we work with who analyze the data for consistency, for null value checking, for discrepancies. And our most powerful [LAUGH] validation team is actually our end customers, who know their reports very well; we work very hand in hand with them, and they give us feedback when something doesn't look right. >> When they give feedback, there are two types of feedback. One type of feedback is that what you did is wrong, which you can fix. Another type of feedback is that the original data was wrong, all right? We really don't have recourse to change original data, right, because it's a legal document. >> Well, that is correct; however, we have recourse in that we can identify that. >> Okay. >> So when we get a validation question, a lot of times it is a detective hunt. Where did the bad data come from? How did it get there? What happened? Last night I got a request; somebody said, we haven't seen an update since a week ago, our data is getting old, where we thought the process was running fine.
And sure enough, deep down in a certain particular process that was loading data, there had been a missed folder path or something like that, but it didn't report it. So we thought everything was fine, the client thought everything was fine, but sure enough there wasn't any data. So that is an example. But there are other examples of data reliability. As somebody recently mentioned, there is a system that updated provider data, and it inadvertently overwrote patients' primary care provider pointer. So what ended up happening in the system was that patients all of a sudden were being assigned to new primary care providers. And those providers would pull up their list of patients and there would be new patients on the list, or their old patients wouldn't be on the list. And it caused a tremendous amount of havoc. >> Well, thank goodness they knew who their patients were, right? >> Well, they had it in personal memory, right? So this was not here at Hopkins; this was another place I heard about. What was happening is that, excuse me, that was due to a flaw in the update mechanism, just writing the data to a wrong field. So that's the kind of debugging, that kind of detective work, tracing the source. So the reliability of the data is critical. And that's again, going back to the backups and restore: you can back up, you can compare against a backup, and you can sometimes correct for a software error, or something like that.
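The silent load failure in that story — a nightly ETL that quietly stopped delivering data without reporting it — is exactly what a data-freshness check catches. A minimal sketch; the one-day threshold and the timestamps are illustrative assumptions:

```python
# A freshness check: if the most recent load is older than the expected
# nightly cadence, raise a flag instead of waiting for a client to notice.
from datetime import datetime, timedelta

def is_stale(last_load, now, max_age=timedelta(days=1)):
    """True when the nightly ETL has silently stopped delivering data."""
    return now - last_load > max_age

now = datetime(2024, 3, 8, 6, 0)
print(is_stale(datetime(2024, 3, 8, 2, 0), now))  # False: loaded overnight
print(is_stale(datetime(2024, 3, 1, 2, 0), now))  # True: a week old
```

In the anecdote, both the team and the client "thought everything was fine" for a week; an automated check like this turns that week into one morning alert.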
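And the battery of validation scripts mentioned earlier — consistency checks, null-value checking, discrepancy hunting — could look, in miniature, like this. The column names and plausibility rules are assumptions for illustration; the real warehouse would run such checks as SQL against loaded tables:

```python
# A toy validation pass over loaded rows: report null values and
# out-of-range discrepancies per record.
rows = [
    {"mrn": "A1", "age": 42,   "sex": "F"},
    {"mrn": "A2", "age": None, "sex": "M"},
    {"mrn": "A3", "age": 214,  "sex": "F"},  # implausible age
]

def validate(rows):
    """Return (record id, issue) pairs for every failed check."""
    issues = []
    for r in rows:
        if r["age"] is None:
            issues.append((r["mrn"], "null age"))
        elif not 0 <= r["age"] <= 120:
            issues.append((r["mrn"], "age out of range"))
    return issues

print(validate(rows))  # [('A2', 'null age'), ('A3', 'age out of range')]
```

Note that, as the interview stresses, the script only identifies bad data; since the source is a legal record, the fix is tracing where the value came from, not silently rewriting it.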