Features Reconnaissance Tools Lead image: © Wojciech Kaczkowski, 123RF.com

What tools do intruders use to


Professional attackers have much more pointed at your site than just Nmap, and you should too if you want to test your network's security. We'll show you some tools intruders use to gather information. By David Dodd

During reconnaissance, intruders gather information from public sources to learn about the target: the nature of the business, the technical infrastructure, architecture, products, and network configuration. The actions required to discover this information might seem harmless and might even be overlooked by security administrators as "network noise," but the information gathered in the reconnaissance phase can be useful for launching a network attack. Social engineering – manipulating people into divulging confidential information or tricking them into doing things that benefit the attacker – might become prevalent at this stage. If the reconnaissance is pulled off successfully, the target might not know until it is too late.

In this article, I describe some tools and techniques used in the planning, scoping, and recon portion of a penetration test. If you learn to use these recon tools, you'll get a head start on the intruder by finding these vulnerabilities before they are subject to attack.

Domain Tools

Intruders and penetration testers use a number of tools to obtain DNS information. Many of these tools are very familiar to IT professionals. Tools such as nslookup and dig provide information on domain names, name servers, and network hosts accessible through the Internet. The popular whois service also offers a means for discovering domain information. The Nmap scanner's -sL option (nmap -sL) performs a reverse DNS lookup on every IP address in the scan and queries the DNS server each time an IP address is listed.
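The reverse lookups behind a list scan are easy to picture in code. The following is a minimal Python sketch, not how Nmap itself is implemented: it builds the PTR query name for each host address in a range and, optionally, asks the system resolver for the hostname. The function names and the 192.0.2.0/30 documentation range are illustrative assumptions.

```python
import ipaddress
import socket


def ptr_names(cidr):
    """Return the PTR query name for every host address in a CIDR block."""
    network = ipaddress.ip_network(cidr, strict=False)
    return [addr.reverse_pointer for addr in network.hosts()]


def reverse_lookup(ip):
    """Ask the system resolver for the hostname of one IP; None if there is no PTR record."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None
```

Calling ptr_names("192.0.2.0/30") yields the two in-addr.arpa names a resolver would query for that block; reverse_lookup() performs a real DNS query, so run it only against ranges that are in scope.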

In addition to these standard DNS tools, a pair of lesser known utilities also inhabit the toolboxes of many experienced pen testers. Dnsrecon [1], written by Carlos Perez, provides different methods for enumerating targets, such as querying for Start of Authority (SOA) and top-level domain records, zone transfers, reverse-record lookups, service record enumeration, and brute force attacks on subdomain and host records with a wordlist:

# ./dnsrecon.rb -t std -d packetstormsecurity.org

Fierce [2], written by RSnake, queries the DNS servers of the target and attempts to dump the SOA records:

$ ./fierce.pl -dns <target> -wide -file output.txt

Fierce is interesting to run in larger organizations that have vast networks. If it finds anything, Fierce will scan up and down looking for anything else with the same domain name using reverse lookups.

A search option that allows you to find non-related domain names (Figure 1) is:

$ ./fierce.pl -dns <target> -search searchoption1,searchoption2
Figure 1: Searching out non-related domain names with Fierce.

Here, searchoption1 and searchoption2 are different names that the target goes by, such as acme.com and acmecompany.com. Fierce has wordlist support, so you can supply your own dictionary using the -wordlist option:

$ ./fierce.pl -dns <target> -wordlist dicfile.txt -file target.txt
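The idea behind a wordlist scan can be sketched in a few lines of Python. This is an illustration of the general technique, not Fierce's own code: every word is prepended to the domain, and the names that resolve are kept. The function names and the pluggable resolve parameter are assumptions made for this example.

```python
import socket


def system_resolver(hostname):
    """Resolve a hostname with the system resolver; None if it does not exist."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None


def brute_force_hosts(domain, words, resolve=system_resolver):
    """Try word.domain for every word and return the names that resolve."""
    found = {}
    for word in words:
        word = word.strip().lower()
        if not word or word.startswith("#"):
            continue  # skip blank lines and comments in the wordlist
        hostname = f"{word}.{domain}"
        address = resolve(hostname)
        if address is not None:
            found[hostname] = address
    return found
```

Because an open file iterates line by line, a dictionary file can be passed directly as the words argument. Note that a domain with wildcard DNS will make every candidate resolve, so the results of a scan like this still need to be verified.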

Dnsrecon, Fierce, and the other DNS tools will likely identify numerous systems that are directly and indirectly associated with the target. You might identify many systems that are out of scope of your initial target, and you must then verify their inclusion in or exclusion from your target scope.

When querying DNS servers, you get some interesting information, indicating which machines are mail servers or name servers. Table 1 shows a list of DNS record types.

Table 1: DNS Record Types

Record   Description
NS       Nameserver record
A        Address record
HINFO    Host Information record
MX       Mail Exchange record
TXT      Text record
CNAME    Canonical Name record
SOA      Start of Authority record
RP       Responsible Person record
PTR      Pointer for inverse lookups record
SRV      Service location record

Searching for Metadata

Metadata (data about data) resides in email files, spreadsheets, and other electronic document formats. This type of information drew public attention when it helped solve the 30-year-old case of the BTK killer in Wichita, Kansas. Metadata is information about a document, such as who created a file, the date it was created, and when it was last modified. The amount of metadata depends on the properties of the file format. Pen testers can use a tool called Metagoofil [3] to help find metadata on websites. Metagoofil is an information-gathering tool designed for extracting metadata from public documents (PDF, DOC, XLS, PPT, ODP, ODS) available on target websites. To install Metagoofil on Ubuntu, you need libextractor installed on your system:

$ sudo apt-get install libextractor-plugins extract

Edit the metagoofil.py file and have the extcommand read as:


The metagoofil.py file is executable, but on some systems, you might not be able to issue the simple command:

$ ./metagoofil.py

and will instead have to enter the following:

$ python metagoofil.py

Once Metagoofil is running, issue the following command to search a website for useful documents (see Figure 2):

$ python metagoofil.py -d warnerbros.com -f all -l 50 -o warnerbros.html -t deadfile
Figure 2: Finding useful documents with metagoofil.

The -d option specifies the website to search, -f specifies the file type (I selected all), -l limits the results to 50, -o names the output file (in this case, HTML), and -t names the target directory for the downloaded files.

Now open up a web browser and look at the results of the warnerbros.html file (see Figure 3).

Figure 3: Metagoofil results.

Scroll through the HTML page and find all the important metadata from each file that was found during the scan. At the end of the document is a list of total authors found (potential users) along with path disclosures (see Figure 4).

Figure 4: Finding potential users and path disclosures.
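To see where such author names come from, it helps to look inside a single document. The following Python sketch is not part of Metagoofil and does not use libextractor; it assumes an OpenDocument file (ODP, ODS, ODT), which is a ZIP archive that carries its metadata in a member named meta.xml. The function and field names are illustrative.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by meta.xml inside OpenDocument files
NS = {
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Output key -> element to look for in meta.xml
FIELDS = {
    "initial_creator": "meta:initial-creator",
    "creator": "dc:creator",
    "created": "meta:creation-date",
    "modified": "dc:date",
    "generator": "meta:generator",
}


def odf_metadata(source):
    """Return selected metadata fields from an OpenDocument file.

    source may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("meta.xml"))
    result = {}
    for name, tag in FIELDS.items():
        element = root.find(f".//{tag}", NS)
        if element is not None and element.text:
            result[name] = element.text
    return result
```

Running odf_metadata() over a directory of downloaded files and collecting the creator values gives a rough list of potential usernames, and the generator field often reveals the office suite and version in use.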

Accounts and Hostnames

A valuable tool for social engineering and intelligence gathering is theHarvester [4], which collects email accounts, usernames, hostnames, and subdomains from different public sources, such as search engines and PGP key servers. The sources supported are Google, Google profiles, Bing, PGP, LinkedIn, and Exalead. New features were added as of March 4, 2011, with the version 2.0 release, including time delays between requests, XML results export, searching a domain in all sources, and virtual host verification. To issue a search (see Figure 5), use:

$ ./theHarvester.py -l 100 -b all -d target.com
Figure 5: Harvesting public information with theHarvester.

You can redirect the output to a text file to read later. To use the Bing feature, you need an API key; otherwise, you will get an error when you use the all option. Open vi or your favorite editor and edit the file ~/theHarvester-ng/discovery/bingsearch.py, then look for the line that reads self.bingApi="" and enter your API key between the quotes.

Metasploit can also search for email accounts using the gather option. This option in Metasploit is located in the auxiliary options. Just type the following at the msf > prompt:

msf > use auxiliary/gather/search_email_collector
msf > set domain sempra.com
msf > run

This function is useful within Metasploit, but it is not as powerful as using theHarvester. For instance, Metasploit's use of the gather tool does not allow you to search for PGP accounts, although it will search for email in Google, Bing, and Yahoo.
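Whatever the source, harvesting email accounts comes down to pulling addresses for the target domain out of text returned by a search. The following Python sketch shows that one step under simple assumptions; it is not code from theHarvester or Metasploit, and the function name and regular expression are illustrative.

```python
import re


def extract_emails(text, domain):
    """Return the sorted, de-duplicated addresses at domain (or a subdomain) found in text."""
    pattern = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*"
        + re.escape(domain)
        # Do not accept longer names such as target.community or target.com.au
        + r"(?![A-Za-z0-9-])(?!\.[A-Za-z0-9])",
        re.IGNORECASE,
    )
    return sorted({match.lower() for match in pattern.findall(text)})
```

Feeding the function the saved output of a search, or the text of downloaded documents, produces a list of addresses that can be compared with the results the tools report.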

Network Discovery with Paterva's Maltego

Paterva's Maltego [5] is a general-purpose reconnaissance tool that runs on Windows, Linux, and Mac OS X. (This article focuses on the Linux version.) Maltego is available in two versions: a free community edition and a commercial version. The differences are that the community version has a maximum of 12 results per transform, runs slower, and won't provide updates until the next major version.

Maltego is built on the concept of transforms, taking one piece of information and performing a lookup to determine another piece of information. For instance, a Maltego transform will perform a DNS lookup and find the IP address. You can then apply another transform to map the IP address to an organization's name via a netblock lookup. Follow this with a whois lookup on the .org name to determine the public PGP key. Next, you can map that key to the names of people who have signed the key to get names of more people.
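The chaining of transforms can be modeled as a breadth-first walk over a graph of entities. The Python sketch below illustrates the concept only; it is not Maltego's API, and the entity types, function names, and max_depth limit are assumptions chosen for this example.

```python
from collections import deque


def run_transforms(start, transforms, max_depth=3):
    """Breadth-first expansion of an entity graph.

    start      -- (entity_type, value) tuple, e.g. ("domain", "example.com")
    transforms -- dict mapping an entity type to a list of functions; each
                  function takes a value and returns a list of (type, value) tuples
    Returns a set of edges (source_entity, target_entity).
    """
    edges = set()
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        entity, depth = queue.popleft()
        if depth >= max_depth:
            continue  # stop expanding once the depth limit is reached
        entity_type, value = entity
        for transform in transforms.get(entity_type, []):
            for result in transform(value):
                edges.add((entity, result))
                if result not in seen:
                    seen.add(result)
                    queue.append((result, depth + 1))
    return edges
```

Each transform function would wrap one lookup (DNS, netblock, whois, and so on), and the returned edges are exactly what a graph view draws as links between entities.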

The issue that presents itself once you start this search is the vast amount of information. It is difficult for the human brain to see obscure links between seemingly unrelated data, but commonalities between pieces of information are much easier to see when they are displayed graphically. Maltego can graphically display the links between different kinds of data, such as people, organizations, domain names, IP addresses, and documents.

To create a new graph, use either the Ctrl+T keyboard command or click on the + button next to the application icon. Once the graph is available, you can add entities and run transforms to change those entities. The palette, which contains a default collection of entities, is available from the Manage tab, where it is listed under Windows (see Figure 6).

Figure 6: Managing reconnaissance information with Maltego.

Select a node from the palette and drag it onto the graph; to edit the value, double-click on the text. Left-click on the node you want to select (you should see a yellow rectangle appear around it), and you will see a list of transforms to run. All the transforms can be displayed and a selection made by clicking on a transform name. Transforms can also be grouped logically by the user into sets. At the top is the Maltego application button that provides access to additional functionality and resources. Maltego loads and saves graphs in files with an .mtgx extension.

When you right-click on the entity and get a list of available transforms, you can choose any one of the associated transforms or apply all by choosing All transforms. This option will take some time to complete and generates a lot of traffic. The information pulled back from various public sources is displayed hierarchically, and you can view it in several ways (see Figure 7).

Figure 7: Displaying information in Maltego.

Shodan [6] is a search engine that lets you find specific computers (routers, servers, etc.) using a variety of filters. The bulk of the data is taken from banners, which are metadata the server sends back to the client: information about the server software, the options the service supports, and banner messages or anything else the client would like to know before interacting with the server. For example, entering SCADA city:"San Diego" country:US in the search box returns SCADA systems running in San Diego. This type of search can be very helpful in penetration tests for public utilities.
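To make the notion of a banner concrete, the following Python sketch splits the header block of an HTTP response into its status line and header fields. It is an illustration only and does not use the Shodan API; the function name is an assumption, and any banner you feed it is simply text a server returned.

```python
def parse_http_banner(banner):
    """Split a raw HTTP response header block into a status line and a header dict."""
    lines = banner.replace("\r\n", "\n").strip().split("\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header block
        name, sep, value = line.partition(":")
        if sep:
            # Header names are case-insensitive, so store them in lowercase
            headers[name.strip().lower()] = value.strip()
    return status_line, headers
```

The server and x-powered-by fields are the kind of detail a banner search indexes: they often name the web server, its version, and the platform behind it.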

You will even find a Shodan add-on for Maltego, which requires Maltego version 3 or later and a Shodan API key. The Shodan add-on gives you six transforms: searchShodan, searchExploitDB, searchMetasploit, getHostProfile, searchShodanDomain, and searchShodanNetblock (see Figure 8).

Figure 8: The Shodan add-on for Maltego offers additional transforms.

Google Search Directives

Google is quite useful for helping you find vulnerable systems in your target environment. At the Black Hat 2011 conference in Las Vegas, researchers warned that "You can do a Google search with your web browser and start operating circuit breakers … ." Among their search results was one referencing an "RTU pump status" for a remote terminal unit, like those used in water treatment plants and pipelines, that appeared to be connected to the Internet. The result also included a password: 1234.

Many Google search directives will turn up interesting intrusion information. The site: directive allows an attacker to search for pages on just a single site or domain, narrowing down and focusing the search. The link: directive shows sites that link to a given website. The intitle: directive allows you to search within title text. The inurl: directive searches for specific terms in the URL of a given site. An all prefix on a directive, as in allintext:, allintitle:, and allinurl:, indicates that you want pages containing all of the terms used in the search. For more on forming useful Google queries, see the Syngress book Google Hacking for Penetration Testers [7].
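When you test many directive combinations against one target, it can help to assemble the queries programmatically. The following Python sketch simply builds query strings and search URLs from the directives described above; the function names are illustrative, and it does not send any requests.

```python
from urllib.parse import urlencode


def google_query(terms="", **directives):
    """Combine free-text terms with directives such as site=, intitle=, inurl=."""
    parts = [terms] if terms else []
    for name, value in directives.items():
        if " " in value:
            value = f'"{value}"'  # quote multi-word values so they stay together
        parts.append(f"{name}:{value}")
    return " ".join(parts)


def google_url(query):
    """Return a URL-encoded search URL for the query."""
    return "https://www.google.com/search?" + urlencode({"q": query})
```

For example, google_query("password", site="example.com", intitle="index of") produces a query restricted to one domain and to pages whose title contains the phrase "index of"; paste the result of google_url() into a browser to run it.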

A very good source to find different Google search options is the GHDB (Google Hacking Database) [8], hosted by Hackers for Charity, a group I do volunteer work for. The GHDB site catalogs Google queries that will turn up interesting information on website vulnerabilities. Many of the searches locate insecure systems or servers that expose valuable information that can be used to launch an attack. For instance, the Vulnerable Servers page includes the following entry:

"html allowed guestbook"
When this is typed in Google, it finds
websites which have HTML-enabled
guestbooks. This is really stupid because
users could totally mess up their guestbook
by typing...

followed by a series of JavaScript and HTML statements that could potentially compromise a guestbook.

Other tools that implement many of the search terms contained in the GHDB are SiteDigger [9], Wikto [10], and Gooscan. SiteDigger runs on Windows and generates its searches from a user-provided domain, as well as the contents of either the GHDB or Foundstone's own FSDB of Google searches. SiteDigger is now maintained by McAfee. Wikto, which also runs on Windows, performs Google searches using the GHDB against one or more user-provided domains. Wikto offers several features, including a scan of the target web servers that looks for well-known vulnerable scripts. Gooscan, which runs on Linux and does not require a Google API key, formulates queries for Google's regular human interface web page and scrapes the results it gets back. The use of Gooscan could violate Google's terms of service.


The information in this article will be useful in preparing for your penetration test engagements. The reconnaissance phase of a penetration test or ethical hacking project gathers the information you will leverage for the remainder of the project.