Find Subdomains of a Website Using Python Programming
Crawling Subdomains
A subdomain is the part of a hostname that comes before the main domain
name, and it belongs to that main domain. In the Domain Name System (DNS)
hierarchy, a subdomain is a child domain under a larger parent domain name.
For a name such as mail.google.com, com is the top-level domain, google is the
second-level domain and mail is a third-level domain. Subdomains are commonly
used to organize site content.
Example –
Subdomain.target.com
mail.google.com
plus.google.com
apps.facebook.com
Therefore, if we want to hack any website we need to test all
of its subdomains. A lot of companies have hidden subdomains that they do not
expose to everyone because they may contain critical information, such as
subdomains for the employees who work in the company or a subdomain for a beta
version of the website. These subdomains are great places to look for
vulnerabilities or security bugs, because most of the time they are not as well
secured as the main website. Hence it is very important to discover all the
subdomains of a website in order to test them, which shows how important
information gathering is.
The first program we are going to write will discover the
subdomains of the target website. We will try a number of candidate names, and
if we get a positive response for a name, then that subdomain exists.
In order to do that we need a way of
communicating with our target website or web server from our Python program.
This can be achieved by sending HTTP requests to the web server from our
Python program.
Sending GET Requests to the Web Server
For sending GET requests to the web server we are using the
Python module called Requests.
Requests allows you to send organic, grass-fed HTTP/1.1 requests, without
the need for manual labor. There’s no need to manually add query strings to
your URLs, or to form-encode your POST data. Keep-alive and HTTP connection
pooling are 100% automatic.
Installation of Requests
Requests is installed with pip, for example by running pip install requests in a terminal.
Make a request
Making a request with Requests is very easy.
First, import the Requests module.
Now, let’s try to fetch a web page. Here we are using GitHub’s website as
an example.
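A minimal sketch of these two steps is shown here. The helper name fetch_page and the timeout value are assumptions made for illustration; the only Requests call used is requests.get.

```python
import requests


def fetch_page(url, timeout=10):
    """Send an HTTP GET request to `url` and return the Response object."""
    # The timeout stops the program from waiting forever on a silent server.
    return requests.get(url, timeout=timeout)
```

Calling response = fetch_page("https://github.com") performs a real network request; when the site is reachable, response.status_code should be 200.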
Now we have a Response object called “response”, and we can get all the information we need from this object.
Similarly, we can make other types of HTTP requests, such as POST, PUT, DELETE, HEAD and OPTIONS.
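To illustrate other request types without contacting any server, the sketch below only builds the requests and inspects them. The example.com URLs and the parameter names are placeholders, not part of the original walkthrough.

```python
import requests

# Build (but do not send) a GET request; Requests adds the query string for us.
prepared_get = requests.Request(
    "GET", "https://example.com/search", params={"q": "subdomains"}
).prepare()
print(prepared_get.url)

# Build (but do not send) a POST request; Requests form-encodes the data for us.
prepared_post = requests.Request(
    "POST", "https://example.com/login", data={"user": "alice"}
).prepare()
print(prepared_post.method, prepared_post.body)
```

To actually send such requests, the module offers requests.post, requests.put, requests.delete, requests.head and requests.options alongside requests.get.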
Response Content
We can read the content of the server’s response. Consider the GitHub example again. Requests will automatically decode content from the server. Most Unicode charsets are seamlessly decoded.
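A small sketch of sending a request and reading the decoded content could look like this. It keeps the variable names url and res used in the walkthrough; the helper name get_page_text and the timeout value are assumptions.

```python
import requests


def get_page_text(url, timeout=10):
    """Request `url` and return a (status_code, decoded_body) tuple."""
    res = requests.get(url, timeout=timeout)
    # res.text is the body decoded to str; res.content holds the raw bytes.
    return res.status_code, res.text
```

The url must include the scheme, for example "http://google.com". Passing just "google.com" makes Requests raise a MissingSchema error.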
In the above script we have imported the Requests module. We
have stored the address we want to request (google.com) in a variable called
“url”, then we have called a function called get, which allows us to send a GET
request using the Requests library. We pass url to it as an argument, and
we store the result returned by this function in a variable called res.
On running the
above script we get a response, and the response code is 200, which means
success. Hence we have successfully sent a GET request to the web server and
received a response for the domain. Now let’s try the same script for a subdomain.
As we can see, the above script also works fine with
subdomains. So with our code we are now able to communicate with websites: we
can send GET requests and verify whether a request completed or
failed. Using this we are able to test whether a certain subdomain exists on
a website or not.
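One way to turn this check into a reusable function is sketched below. It assumes that any HTTP response, whatever its status code, means the host exists, and that any error raised by Requests means it does not. The function name subdomain_exists is hypothetical.

```python
import requests


def subdomain_exists(url, timeout=5):
    """Return True if a GET request to `url` receives any HTTP response."""
    try:
        requests.get(url, timeout=timeout)
    except requests.exceptions.RequestException:
        # Base class for Requests errors, e.g. failed DNS lookups,
        # refused connections and timeouts: treat the host as absent.
        return False
    return True
```

For example, subdomain_exists("http://mail.google.com") should return True when the network is available, while a made-up name that does not resolve should return False.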
Discovering Subdomains
So far we have built a program which can communicate with a website over HTTP. As an example we built a very simple program that requests a certain domain and, if that domain exists, prints the response; otherwise it shows nothing. Now we are going to write a program which will discover the subdomains of our target. To do that we are going to use a wordlist. A wordlist is a simple text file that contains a number of words, one per line. Our Python program will read this file one line at a time, each time trying the word as a subdomain to see whether that subdomain exists. This technique is used a lot during penetration testing, for example to crack keys or to gain access. The whole idea is to take a file with a lot of words that might answer your question and try all of the possibilities.
The first thing we need to do in our Python program is to open this file. The key function for working with files in Python is the open() function. The two parameters of open() that matter here are the file name and the mode.
There are four different methods (modes) for opening a file:
"r" - Read - Default value. Opens a file for reading, error if the file does not exist
"a" - Append - Opens a file for appending, creates the file if it does not exist
"w" - Write - Opens a file for writing, creates the file if it does not exist and overwrites its content if it does
"x" - Create - Creates the specified file, returns an error if the file exists
In addition you can specify if the file should be handled in binary or text mode:
"t" - Text - Default value. Text mode
"b" - Binary - Binary mode (e.g. images)
Syntax
To open a file for reading it is enough to specify the name of the file:
file = open("testfile.txt")
The code above is the same as:
file = open("testfile.txt", "rt")
Because "r" for read and "t" for text are the default values, you do not need to specify them.
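The modes listed above can be tried out with a short sketch like the following. It works in a temporary directory so that no real files are touched, and the file name and words are made up for illustration.

```python
import os
import tempfile

# Work in a throwaway directory that is deleted when the block ends.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "testfile.txt")

    with open(path, "w") as f:   # "w": create (or overwrite) and write
        f.write("mail\n")
    with open(path, "a") as f:   # "a": append to the end of the file
        f.write("plus\n")
    with open(path) as f:        # no mode given, so the default "rt" is used
        contents = f.read()

    try:
        open(path, "x")          # "x": fails because the file already exists
        create_error = None
    except FileExistsError as exc:
        create_error = type(exc).__name__

print(repr(contents))
print(create_error)
```

Using with open(...) closes the file automatically when the block ends, which is why no explicit close() call appears.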
Note: Make sure the file exists, or else you will get an error (FileNotFoundError).
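Reading a wordlist one line at a time might be sketched as follows. The function name read_wordlist and the demo words are assumptions; the demo writes its own small temporary wordlist so that the example runs anywhere.

```python
import os
import tempfile


def read_wordlist(path):
    """Return the non-empty words found in the file at `path`, one per line."""
    words = []
    with open(path, "r") as wordlist_file:
        for line in wordlist_file:
            word = line.strip()  # drop the trailing newline and any spaces
            if word:             # skip blank lines
                words.append(word)
    return words


# Demo with a small temporary wordlist.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "wordlist.txt")
    with open(demo_path, "w") as f:
        f.write("mail\nplus\n\napps\n")
    demo_words = read_wordlist(demo_path)

for word in demo_words:
    print(word)
```

Stripping each line matters because every line read from a file still ends with a newline character, which would otherwise end up inside the hostname.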
In the above program we can see that it prints each
word of the file on a separate line. It iterates over the lines of the file
and executes the print statement once in every iteration. Now we can
use these words to build our possible subdomains. We need to put each word
in front of the target domain, separated by a dot, so that we build a complete
subdomain and can test it.
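Putting the pieces together, the discovery loop could be sketched like this. The function names, the http:// scheme and the timeout are assumptions, and any HTTP response at all is treated as a sign that the subdomain exists.

```python
import requests


def build_candidates(words, target_domain):
    """Build one candidate URL per word, e.g. 'mail' -> 'http://mail.google.com'."""
    return ["http://" + word + "." + target_domain for word in words]


def discover_subdomains(words, target_domain, timeout=5):
    """Return the candidate URLs that answered a GET request."""
    found = []
    for url in build_candidates(words, target_domain):
        try:
            requests.get(url, timeout=timeout)
        except requests.exceptions.RequestException:
            continue  # no response: treat this subdomain as absent
        found.append(url)
    return found
```

For example, discover_subdomains(["mail", "plus"], "google.com") sends one real request per word and returns the URLs that responded. Only run it against targets you are allowed to test.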
Hence our above program has successfully discovered the subdomains of the
target website. We can now also interact with a website from our Python
program using the Requests module.
Thanks,
MAYANK BARSAINYA
Founder, M7 SECURITY