Hello and welcome to this course, in which we're going to talk about performing reconnaissance with Python. In this video, we're going to start out with an introduction to reconnaissance, so we'll talk about some of the fundamentals of reconnaissance and also the goals for this particular course. We'll have two sections in this video: we'll start out with that introduction to reconnaissance, laying the groundwork for our discussion, and then we'll discuss how we're going to go about performing reconnaissance with Python. So let's get started. A good starting point for our discussion of reconnaissance is: why perform reconnaissance? Reconnaissance is time consuming, and it's not as exciting as other parts of the cyberattack process. If you're performing a penetration test, what you probably want to do is actually try to break into systems, so all of this information gathering at the front end isn't as exciting. However, it's really important to perform reconnaissance when going into that engagement, because reconnaissance can help to identify potential weaknesses, vulnerabilities, attack vectors, etc. Before you can really dive into exploitation effectively, it's useful to know a little bit about the system. The more you know, the quieter you can be in your actual attack. There are two main types of reconnaissance, and we're going to look at both of them in this course. There's active reconnaissance, where you're engaging directly with the target system. So things like network scans count as active reconnaissance, because you're sending packets and receiving responses. The other type is passive reconnaissance, and this is really anything where you're not actively engaging with that target system. There are a lot of open sources of information about different environments that you can go and look to for intelligence.
Also, you can gain a lot of knowledge about the system even just by monitoring network traffic, if you can connect to, say, a public Wi-Fi network. Those are more the passive reconnaissance, because if you do them properly, nobody really even knows you're there. In this course we're focusing on using Python for reconnaissance. So why are we choosing programming in general, and Python in particular? Let's start with that "why automation" question. The first reason for automation is speed. Reconnaissance might not be that exciting, so the faster you can get it done the better, and the faster you can get it done, the more time you can spend on the more important parts of that ethical hacking engagement: things like actually determining if the systems are vulnerable, enumerating weaknesses, etc. So if we can speed things up a bit via automation, it makes sense to do so. The second reason to use Python, or automation in general, for reconnaissance is scale. The modern enterprise is huge, so there are a number of different systems that we need to learn about. You certainly can do everything that we're going to demonstrate in this course by hand, but it's going to take a lot of time, and all of that time spent on one system could be spent doing something else. If we can automate the process, we can scan more of the attack surface more quickly and really distill the information down to what's valuable without wasting precious time. And then thirdly, just efficiency. You're going to be doing this a lot, so the more smoothly and efficiently you can perform reconnaissance on a system, the better, and automating common repetitive tasks is always a good idea. So that's the argument for automation in general. We're going to be using Python because the Python programming language has all of those different advantages we talked about in the previous course: it's easy to use, it's powerful, etc.
So that's some of the high-level stuff. What are we actually going to be doing in this course? As I mentioned, we can perform reconnaissance in a variety of different ways, and we're going to be focusing on two of them. One of them is open technical sources. These are all of those databases of information that you can access to learn about a particular environment, either because someone's already done all the active work for you, or because information about that environment has to be registered with some authority. If we can just download a bunch of useful information and intelligence that we can use for reconnaissance, it's much better than trying to dig that all up by ourselves. However, we can't get all of the information that we need via those passive methods that take advantage of open sources. That's where we'll do a little bit of active reconnaissance with some network scanning. Our goal is to make the network scanning that we perform as targeted as possible. If you, say, start running Nmap and set it to scan every single port on every single system in a target environment, you're going to raise some red flags, because that's very noisy. And most of that effort should hopefully be wasted, because that network should be locked down, so the only ports where you'd actually be able to get a response are the ones that should be publicly accessible. So, say, for a web server, you should potentially get a response from ports 80 and 443 and hopefully not much else. If you're scanning thousands of ports and only getting two responses, you're wasting effort and you're making a lot of noise that you don't have to. We want to avoid that where possible, so we're going to use as much open source information as we can to target our scanning efforts, which is more efficient and more stealthy.
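To make the "targeted scan" idea concrete, here is a minimal sketch of checking only a short list of expected ports rather than sweeping thousands. It uses plain TCP connections from Python's standard `socket` module instead of Nmap; the function name `check_ports`, the port list, and the placeholder host are assumptions made for illustration, not something taken from the course.

```python
import socket

# Assumption: the ports we expect a public web server to expose.
WEB_PORTS = [80, 443]


def check_ports(host, ports, timeout=2.0):
    """Attempt a TCP connection to each port and return the ones that accepted."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


if __name__ == "__main__":
    # Placeholder host; substitute a system you are authorized to test.
    print(check_ports("example.com", WEB_PORTS))
```

Two connection attempts per host is far less traffic than a full port sweep, which is the trade-off the discussion above is pointing at.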
Our goal here is to use Python to go from essentially no knowledge of the target environment to identifying potential exploitation vectors, which is the end goal of reconnaissance. If we can automate that entire process, that's great. The diagram that we've got here on the screen shows how the various pieces that we talk about in the following videos fit together to allow us to achieve those goals. When I say that we're starting out with little or no knowledge, I mean that we're starting out by knowing, okay, here's the organization that we're trying to scan. So maybe you've been hired to do an ethical hack of Google. You know that they're called Google LLC and you know that their domain name is google.com. From that information, can we get to "okay, this is how we attack them"? The answer is yes. We're going to look at a few different sources of information, and we're going to make sure that we're accessing all of them using Python. From that name and domain, we're going to target two sources of open information. First, we're going to take a look at Shodan, which is essentially a search engine for internet-connected devices. If you've got a system and it's connected to the internet, it's probably showing up on Shodan. There's a lot of information there, so just based off of a company name, we can get IP addresses associated with that company, the open ports on the computers that are internet facing, and data about the services or programs that are running on those ports. That data might be extremely targeted, something like a program name and version number, or it might be more general, like the result of banner grabbing for some protocols, or of a GET or HEAD request to a web server to get its default response and header data. We'll be using Python to interact with Shodan and pull that sort of information out for future reference.
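As a rough sketch of what pulling that information out of Shodan might look like, the code below assumes the third-party `shodan` package (`pip install shodan`) and a valid API key. The helper names `summarize_matches` and `search_org` are hypothetical, and whether the `org:` search filter is available depends on the account's API plan. The `product` and `version` fields are only present on some results, which is why they are read with `.get`.

```python
def summarize_matches(results):
    """Reduce a Shodan search result dict to the fields discussed above."""
    rows = []
    for match in results.get("matches", []):
        rows.append({
            "ip": match.get("ip_str"),
            "port": match.get("port"),
            # Present only when Shodan identified the service.
            "product": match.get("product"),
            "version": match.get("version"),
            # Raw banner text returned by the service.
            "banner": (match.get("data") or "").strip(),
        })
    return rows


def search_org(api_key, org_name):
    """Query Shodan for hosts registered to an organization (needs network access)."""
    import shodan  # third-party package, imported here so the parser works without it

    api = shodan.Shodan(api_key)
    return summarize_matches(api.search(f'org:"{org_name}"'))


if __name__ == "__main__":
    # Placeholder key; supply your own.
    for row in search_org("YOUR_API_KEY", "Google LLC"):
        print(row["ip"], row["port"], row["product"], row["version"])
```

Keeping the parsing separate from the network call means the same summary step can be reused on saved results without querying Shodan again.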
Certainly we could gather it ourselves; essentially, Shodan has just done a lot of port scanning and things like that. But it's much easier just to ask for it, download it, and be done in a few seconds. The other source of open information we'll look at is the Domain Name System, DNS. DNS is the address book of the internet: if you provide it with a domain name, it tells you the IP address associated with it. Our goal here is to take advantage of DNS queries, which sort of straddle that line between passive and active reconnaissance, to try to learn a little bit about some of the systems on the target network. For example, if we can determine which IP address is associated with mail.google.com, we've got an IP address and we have a hint about what is probably going on on that computer. It's probably running a mail server of some sort, and that helps us to target our future analysis. We're going to take that information from Shodan and DNS and feed it into another Python script designed to try to convert this information into program names and version numbers. For example, if we know we've got a server called mail.google.com, we would love to come out the other end knowing, okay, on these ports it's running these programs, and it's this particular version of those programs. We'll talk about why we want that in a moment. So how do we do that? In some cases Shodan hands that to us, and we just have to pull it out of the records from Shodan. In other cases, we're going to do some analysis of the banners that we get from Shodan or from the active scanning that you see underneath that service analysis section. Those banners will often say what the program name and version number of a particular application are. If we can identify those and pull them out, we can use them further down our reconnaissance path.
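The two steps described here, asking DNS for the address behind a name and pulling a program name and version out of a banner, can be sketched roughly as follows. The function names and the regular expression are assumptions for illustration; the pattern only handles banners shaped like `Name/1.2.3`, `Name_1.2` or `Name 1.2.3`, and real banners vary a lot more than that.

```python
import re
import socket


def resolve(hostname):
    """Return the IPv4 addresses that DNS reports for a hostname (needs network access)."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


# Assumption: a product name followed by '/', '_' or a space, then a dotted version.
BANNER_RE = re.compile(r"([A-Za-z][A-Za-z0-9]*)[/_ ]v?(\d+(?:\.\d+)+)")


def parse_banner(banner):
    """Return (program, version) if the banner contains one, otherwise None."""
    match = BANNER_RE.search(banner)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(resolve("mail.google.com"))
    print(parse_banner("Server: Apache/2.4.41 (Ubuntu)"))
    print(parse_banner("SSH-2.0-OpenSSH_8.2p1 Ubuntu"))
```

A banner with no recognizable version, such as a bare SMTP greeting, simply yields `None`, so later steps can fall back on other hints like the hostname.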
I've mentioned that active scanning because now we know which ports to look at. Maybe we've learned something from Shodan, or based off of the DNS we've decided, hey, it's a mail server, so we should look at port 25 because it's probably got port 25 open. Or, if we know nothing else about a server, we'll check some of the common ports, like HTTP ports, SMTP ports, etc., to see if anything is running on those. That's much more targeted than the thousands of ports we could check in just a general scan. When we do that active scan, we'll get banner information, just like from Shodan, that goes back into our service analysis to see if we can grab those program names and version numbers. Coming out of that service analysis, we would love to know, hey, this server is running Apache version whatever, because that information maps very easily to vulnerabilities, or CVEs, which are just listings of known vulnerabilities. Those listings will often say that the vulnerability affects particular versions of a particular piece of software. So if we can say, I've got this program and this version number, are there any known vulnerabilities out there, then we come out with, okay, here are vulnerabilities we might be able to exploit, and that's great. Through this process, which we can automate from end to end, we start out with a company name and maybe a domain name associated with it, something like google.com, and end up, hopefully, with a list of potentially exploitable vulnerabilities. And if we automate the entire process, all we have to do is start it at the beginning and wait for something to come out the other side. So, just in summary here, we started out with an introduction to reconnaissance, talking about what reconnaissance is and why we're bothering with it. After that, we discussed our goals for this particular course.
We also covered how we're going to look at a few different Python scripts and how we can piece them together to automate a lot of that reconnaissance process. So let's get started. Thank you.