We were asked to contribute an article to PenTest magazine, and chose to write up an introductory how-to on footprinting. We’ve republished it here for those interested.
Network foot printing is, perhaps, the first active step in the reconnaissance phase of an external network security engagement. This phase is often highly automated with little human interaction as the techniques appear, at first glance, to be easily applied in a general fashion across a broad range of targets. As a security analyst, footprinting is also one of the most enjoyable parts of my job as I attempt to outperform the automatons; it is all about finding that one target that everybody forgot about or did not even know they had, that one old IIS 5 webserver that is not used, but not powered off.
With this article I am going to share some of the steps, tips and tricks that pentesters and hackers alike use when starting on a engagement.
Approach
As with most things in life having a good approach to a problem will yield better results and overtime as your approach is refined you will consume less time while getting better results. By following a methodology, your footprinting will become more repeatable and thus reliable. A basic footprining methodology covers reconnaissance, DNS mining, various information services (e.g. whois, Robtex, routes), network registration information and active steps such as SSL host enumeration.
While the temptation exists to merely feed a domain name into a tool or script and take the output as your completed footprint, this will not yield a passable footprint for two reasons. Firstly, a single tool will not have access to all the disparate information sources that one should consult, and secondly the footprinting process is inherently iterative and continuous. A footprint is almost never complete; instead, a fork of the footprint data provides the best current view of the target, but the information could change tomorrow as new sites are brought online, or old sites are taken offline. As a new piece of data is found that could expand the footprint, a new iteration of the footprinting process triggers with that datum as the seed, and the results are combined with all discovered information.
Know your target
The very first thing to do is to get to know your target organisation. What they do, who they do it for, who does it for them, where they do it from – both online and in the kinetic world, what community or charity work they are involved in. This will give you an insight into what type of network/infrastructure you can expect. Reading public announcements, financial reports and any other documents published on or by the organisation might also yield interesting results. Any organisation that must publish regular reports (e.g. listed companies), provide a treasure trove of information for understanding the target’s core business units, corporate hierarchy and lines of business. All these become very useful when selecting targets.
Dumpster diving, if you are up for it and have physical access to the target, means sifting through trash to get useful information, but in recent times social media can provide us with even more. Sites like LinkedIn, Facebook and Twitter can provide you with lists of employees and projects that the organisation is involved with and perhaps even information about third party products and suppliers that are in use.
One should even keep an eye out for evidence of previous breaches or loss of credentials. It has become common place for hackers to post information about security breaches on sites like pastebin.com. The most likely evidence would be credentials in the form of corporate emails and passwords being reused on unrelated sites that are hacked, and have their user databases uploaded. In addition, developers use sites like Pastebin to share code, ideas and patches, and if you are lucky you might just find a little snippet of code sitting out in the open on Pastebin, that will give you the edge.
DNS
“The Domain Name System (DNS) is a hierarchical distributed naming system for computers, services, or any resource connected to the Internet or a private network.” – WikiPedia
In a nutshell, DNS is used to convert computer names to their numeric addresses.
Start by enumerating every possible domain owned by the target. This is where the information from the initial reconnaissance phase comes in handy, as the target’s website will likely point to external domains of interest and also help you guess at possible names. With a list of most discovered domains in hand, move on to a TLD (Top level domain) expand. TLDs are the highest level subdomains in DNS; .com, .net, .za, .mobi are all examples of TLDs (The Mozilla Organization maintains a list of TLDs https://wiki.mozilla.org/TLD_List).
In the next step, we take a discovered discovered domain and check to see if there are any other domains with the same name, but with a different TLD. For example, if the target has the domain victim.com, test whether the domains victim.net, victim.info, victim.org etc. exist and if they exist check to see if they are owned by our target organization. To determine whether a domain exists or not, one should examine the SOA (start of authority) DNS record for the domain. Using commands like nslookup under Microsoft Windows or the dig/host commands under most of the *nix family will reveal SOA records.
Using dig, “dig zonetransfer.me soa”.
Figure 1: Using dig to get the SOA (Start of authority) record for a domain
If, by verifying the SOA, it is confirmed that the domain exists, then the next step is to track down who it belongs to. At this point the whois service is called upon. ‘Whois’ is simply a registry that contains the information of the owner of a domain. Note that it is not entirely reliable and certainly not consistent. The following very simple query “whois zonetransfer.me” provides us with the owner of the domain “zonetransfer.me” detail.
Figure 2: Using whois to get the domain owner detail
After finding domains, running them through a TLD expansion and verifying their whois information, it is time to track down hosts. First we need to get the NS or name server records for the domains. Again using “dig zonetransfer.me ns” returns a list of all the name servers used by this domain. In many cases the name server will not be part of the target’s network and is often out-of-scope, but they will still be used in the next step.
DNS yields much interesting information, but the default methods for extracting information from foreign servers effectively relies on a brute force. However, DNS supports a trick where all DNS information for a zone can be downloaded if the server allows it, and this is called a “zone transfer”. When enabled, they are extremely useful as they negate the need for guessing or brute-forcing; sadly they are commonly disabled. Still, given the usefulness of zone transfers it is always worth testing for. Zone transfers should be performed against all the name servers that are specified in the NS records of a domain as the data contained in each name server should be the same, but the security configuration might be different. Using dig, the following command will attempt to perform a zone transfer “dig axfr @ns12.zoneedit.com zonetransfer.me”
Figure 3: Performing a zone transfer using dig
As mentioned previously, zone transfers are not that common. When we cannot download the zone file, there are a couple of other tricks that might work. One is to brute force or guess host names: by using a long list of common hostnames one can test for names such as “fw.victim.com”, “intranet.victim.com”, “mail.victim.com” and so on. The names can be commonly seen hostnames, generated names when computers are assigned numeric or algorithmic names, or from sets of related names such as characters from a book series. When brute forcing DNS, be sure to check the following DNS records: CNAME, A and AAAA. Again this is easy using a tool like dig. “dig www.google.com a” produces the DNS configuration for www.google.com, note that the hostname www.google.com actually has multiple DNS entries, one CNAME record, and multiple A records. Looking at the IP addresses it is clear that there are several different hosts (2 in the screenshot below).
Figure 4: Using dig to get the a record for a host entry
Doing this manually seems easy and quick, (and it is) but if we want to brute force or guess many host names, then this will take too long. Of course, it is easy enough to script these commands to automate the process; however there are existing tools written specifically for this purpose. One of the most popular tools, Fierce, is a perl script written by RSnake (http://ha.ckers.org/fierce/), which is easy to use and has many useful functions. Additionally, there are tools like Paterva’s Maltego and SensePost’s Yeti (a tool I wrote) which provide graphical tools for this purpose.
If we happen to have a list of IP addresses or IP netblocks of the target, then a further DNS trick is to convert the addresses into hostnames using reverse lookups to get the PTR record entry. This is useful since reverse records are easily brute forced in IPv4. Bear in mind that DNS does not require a PTR record (reverse entry) or that entries in the reverse zone must match entries in the forward zone. But the result can give you an idea of whether the host is a shared host, owned and hosted by the company or just remote hosted.
To test once more, try using dig, “dig 104.66.194.173.in-addr.arpa ptr”. While this too can be easily automated, the previously mentioned tools will also handle PTR records.
Search engines:
DNS interrogation and mining forms the bulk foot printing, but thanks to modern search engines like Google and Bing, finding targets has become much easier.
Apart from the normal searching for your target, as you would do in your initial phase, you can actually use the data that you discovered during the course of the DNS mining to try and get further information using search engines. Bing from Microsoft provides us with two really useful search operators: “ip:” and “site:”. When using the “ip:” operator, Bing will return a list of hosts that it has indexed that resolve to the IP address that you have specified. Alternatively the “site:” operator when used with a domain name, will return a list of host names that have been indexed by the search engine and belong to the domain specified. Quick and easy, and Bing also provides you with a very simple free API that you can use to automate these searches.
Address mapping
All this fuss with DNS is important, but it is only useful insofar as they lead us to addresses. The next step is discovering where the target exists within the IP address space. Luckily useful tools and resources exist to help us uncover these ranges, by automating a combination of manual techniques such as whois querying, traceroute and netblock calculators. In the previous section the whois tool was used to get the domain owner information. The same tool can be used to discover the ownership/assignment details of a specific IP address. Let’s take www.facebook.com; one of the IP addresses that it resolves to is 69.63.190.10. “whois 69.63.190.10” produces the following output.
Figure 5: Getting the netblock and owner using whois
From the whois output we get really useful information. First is a netblock range 69.63.176.0-69.63.190.255 as well as the owner of this net block, namely Facebook, Inc. In this case we are lucky and the netblock is registered to facebook, but often you will only get the network service provider to which the netblock is allocated to. In that case, you will have to query the service provider in order to gain more info about the specific netblock. Online resources can also be very useful, for example ARIN (American Registry for Internet Numbers) or any of the other regional registries (RIPE, AfriNIC, APNIC and LACNIC) provides a reverse whois search interface where one can search for organisation names and other terms, even performing wild card searches. Giving Facebook a second look, we try a search on the reverse whois interface found at http://whois.arin.net/ with the term “facebook”, and get a list of five additional network ranges.
Figure 6: Search results for ARIN reverse whois
SSL Certificates
Lastly, we turn to SSL. SSL may be more familiar as a “protection” against nasty eavesdroppers and men-in-the-middle, but it is useful for footprinters. How? It is really simple actually, one of the security checks performed by browsers when deciding on the validity of a SSL certificate is whether the Common Name contained in the certificate matches the DNS name of the host requested from the browser. How does this help? Say a list of IP addresses has been produced; the next step would be to perform a reverse lookup of all these addresses. However, if no reverse entry is present and Bing has no record of the IP, then some creativity is called for. If an HTTPS website is hosted on that address then simply browse to that IP address and, when presented with the invalid certificate error, message, look for the “real” host name.
Figure 7: Firefox reporting the common name contained in a SSL certificate for a host
Again, this is something that is easily automated, so we have included a module in Yeti to actually do this for you.
Conclusion
Foot printing might at first glance appear to be simple and mundane, but the more you do it, the more you will realise that very few organisations have a handle on exactly what they have and what they present to the Internet. As the Internet and networks evolve so will the way companies and organisations use it, and so will their footprint. A year-old footprint could be hopelessly outdated, and ongoing footprinting helps organisations maintain a current view of their threat landscape.
With the ongoing move away from local infrastructure to hosted infrastructure, the footprint expands, spreads and grows, and so will our quest to find as much as possible.