Find Subdomains of a Website Using Python Programming


Crawling Subdomains

A subdomain is the part of a hostname that comes before the main domain name. In the Domain Name System (DNS) hierarchy it is a child domain under a larger parent domain. For a parent such as google.com, a subdomain like mail.google.com is a third-level domain, and subdomains are commonly used to organize site content.
Examples:
subdomain.target.com
mail.google.com
plus.google.com
apps.facebook.com
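
To make the hierarchy concrete, here is a small sketch that splits a hostname into its subdomain and parent domain. The helper name split_host is made up for illustration, and it assumes the parent domain is exactly the last two labels (true for google.com, but not for suffixes such as co.uk).

```python
def split_host(hostname):
    """Split a hostname into (subdomain, parent_domain).

    Assumption: the parent domain is the last two labels, e.g. google.com.
    This does not hold for multi-label suffixes such as example.co.uk.
    """
    labels = hostname.lower().rstrip(".").split(".")
    subdomain = ".".join(labels[:-2])   # everything before the parent domain
    parent = ".".join(labels[-2:])      # the last two labels
    return subdomain, parent


for host in ["mail.google.com", "plus.google.com", "apps.facebook.com"]:
    print(host, "->", split_host(host))
```
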
Therefore, if we want to hack a website, we need to test all of its subdomains. Many companies have hidden subdomains that they do not expose to everyone because they may contain critical information, such as subdomains for the company's employees or for a beta version of the website. These subdomains are great places to look for vulnerabilities and security bugs, because they are often not as well secured as the main website. Discovering all the subdomains of a website so that we can test them is therefore very important, and it shows how much information gathering matters.
The first program we are going to write discovers subdomains of the target website. We will try a number of candidate names, and if we get a positive response for a name, then that subdomain exists.
To do that, we need a way to communicate with the target website or web server from our Python program. This can be achieved by sending HTTP requests to the web server from our Python program.

Sending GET Requests to a Web Server

To send GET requests to a web server, we use the Python module called Requests.
Requests allows you to send organic, grass-fed HTTP/1.1 requests, without the need for manual labor. There’s no need to manually add query strings to your URLs, or to form-encode your POST data. Keep-alive and HTTP connection pooling are 100% automatic.

Installation of Requests

To install Requests, run this command in your terminal of choice:

    pip install requests

Make a request

Making a request with Requests is very easy.
First, import the Requests module.


Now, let's try to fetch a web page. Here we use GitHub's website as an example.
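
The two steps above (importing the module and fetching a page) can be sketched as follows. The function name fetch_page and the timeout value are assumptions added for this sketch; the Requests call itself is requests.get.

```python
import requests


def fetch_page(url, timeout=5):
    """Send a GET request and return the Response object."""
    # The timeout is an addition for safety: without one, a request
    # to an unresponsive server can wait indefinitely.
    return requests.get(url, timeout=timeout)


if __name__ == "__main__":
    try:
        response = fetch_page("https://github.com")
        print(response.status_code)
    except requests.exceptions.RequestException as error:
        print("Request failed:", error)
```
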



Now we have a Response object called "response", and we can get all the information we need from this object.
Similarly, we can send other types of HTTP requests.
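
As a rough illustration of the other request types, the sketch below sends one request per HTTP method. The target httpbin.org is a public test service chosen for this sketch and is not part of the original post; the function name send_examples is likewise made up.

```python
import requests

# httpbin.org is a stand-in target assumed for this sketch.
BASE_URL = "https://httpbin.org"


def send_examples(timeout=5):
    """Send one request per HTTP method and return {method: status code}."""
    responses = {
        "POST": requests.post(BASE_URL + "/post", data={"key": "value"}, timeout=timeout),
        "PUT": requests.put(BASE_URL + "/put", data={"key": "value"}, timeout=timeout),
        "DELETE": requests.delete(BASE_URL + "/delete", timeout=timeout),
        "HEAD": requests.head(BASE_URL + "/get", timeout=timeout),
        "OPTIONS": requests.options(BASE_URL + "/get", timeout=timeout),
    }
    return {method: response.status_code for method, response in responses.items()}


if __name__ == "__main__":
    try:
        print(send_examples())
    except requests.exceptions.RequestException as error:
        print("Request failed:", error)
```
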






Response Content

We can read the content of the server's response. Consider the GitHub example again.
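
A minimal sketch of reading the response body is shown here. The helper name page_text is hypothetical; response.text and response.encoding are attributes of the Requests Response object.

```python
import requests


def page_text(url, timeout=5):
    """Return (encoding, text): the charset Requests chose and the decoded body."""
    response = requests.get(url, timeout=timeout)
    return response.encoding, response.text


if __name__ == "__main__":
    try:
        encoding, text = page_text("https://github.com")
        print("Encoding:", encoding)
        print(text[:200])  # only the first 200 characters, to keep the output short
    except requests.exceptions.RequestException as error:
        print("Request failed:", error)
```
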



Requests will automatically decode content from the server. Most Unicode charsets are seamlessly decoded.
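
Putting these pieces together, here is a minimal sketch of a script that requests a domain and prints the response. It is a reconstruction under assumptions, not the original listing: the function name send_get, the timeout, and the http:// prefix are additions, while the variable names url and res follow the post.

```python
import requests


def send_get(url, timeout=5):
    """Send a GET request; return the Response, or None if the request fails."""
    try:
        return requests.get(url, timeout=timeout)
    except requests.exceptions.RequestException:
        return None  # the host did not answer, so there is nothing to show


if __name__ == "__main__":
    # Requests needs a scheme, so "google.com" is written as "http://google.com".
    url = "http://google.com"
    res = send_get(url)
    if res is not None:
        print(res)
```
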



In the above script we import the Requests module. We store the address we want to request (google.com) in a variable called "url". Then we call the function get, which lets us send a GET request using the Requests library, passing url as its argument, and we store the result returned by this function in a variable called "res". Note that Requests needs the scheme in the URL, for example http://google.com.

Running the above script gives us a response with status code 200, which means success: we sent the GET request to the web server and received a reply from the domain. Now let's try the same script on a subdomain.
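
The same idea applied to a subdomain can be sketched as a yes/no check. The function name subdomain_exists is an assumption, and so is the rule it encodes: a hostname counts as existing if an HTTP GET to it gets any response at all.

```python
import requests


def subdomain_exists(hostname, timeout=5):
    """Return True if an HTTP GET to the hostname gets any response."""
    try:
        requests.get("http://" + hostname, timeout=timeout)
    except requests.exceptions.RequestException:
        return False  # no answer: treat the subdomain as absent
    return True


if __name__ == "__main__":
    print("mail.google.com exists:", subdomain_exists("mail.google.com"))
```

Keep in mind that this only detects subdomains that serve HTTP; a hostname that exists in DNS but runs no web server would be reported as absent.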


As we can see, the above script also works fine with subdomains. Using our code we can now communicate with websites: we can send GET requests and verify whether each request completed or failed. This lets us test whether a certain subdomain exists on a website or not.


Discovering Subdomains

So far we have built a program that can communicate with a website over HTTP. As an example, we built a very simple program that requests a certain domain: if that domain exists it prints the response, otherwise it shows nothing. Now we are going to write a program that discovers subdomains of our target using a wordlist. A wordlist is a simple text file containing a number of words, one per line. Our Python program will read this file one line at a time, try each word as a subdomain, and see whether that subdomain exists. This technique is used a lot during penetration testing, for example to crack keys or gain access. The whole idea is to take a file with a lot of candidate words and try every possibility.
The first thing we need to do in our Python program is to open this file. The key function for working with files in Python is the open() function. The open() function takes two parameters: filename and mode.
There are four different methods (modes) for opening a file:
"r" - Read - Default value. Opens a file for reading, error if the file does not exist
"a" - Append - Opens a file for appending, creates the file if it does not exist
"w" - Write - Opens a file for writing, creates the file if it does not exist
"x" - Create - Creates the specified file, returns an error if the file exists
In addition you can specify if the file should be handled as binary or text mode
"t" - Text - Default value. Text mode
"b" - Binary - Binary mode (e.g. images)

Syntax

To open a file for reading it is enough to specify the name of the file:
file = open("testfile.txt")
The code above is the same as:
file = open("testfile.txt", "rt")
Because "r" for read, and "t" for text are the default values, you do not need to specify them.
Note: Make sure the file exists, or else you will get an error.
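
The file handling described above can be sketched as a short program that reads a wordlist and prints each word. So that the sketch runs on its own, it first writes a tiny sample wordlist into a temporary folder; the file name subdomains.txt, the sample words, and the helper name read_words are all illustrative assumptions.

```python
import os
import tempfile


def read_words(path):
    """Return the non-empty lines of a wordlist file, stripped of whitespace."""
    with open(path) as wordlist_file:  # same as open(path, "rt")
        return [line.strip() for line in wordlist_file if line.strip()]


# Build a small sample wordlist in a throwaway folder.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "subdomains.txt")
    with open(path, "w") as wordlist_file:   # "w" creates the file
        wordlist_file.write("mail\nwww\n")
    with open(path, "a") as wordlist_file:   # "a" appends to it
        wordlist_file.write("\nftp\n")
    words = read_words(path)

# Iterate over the words and print one per line.
for word in words:
    print(word)
```

The sketch uses with open(...) instead of a bare file = open(...) so that each file is closed automatically; both forms open the file in the same way.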

In the above program we can see that each word of the file is printed on a separate line. The program iterates over the lines of the file and executes the print statement once per iteration. We can now use these words to build candidate subdomains: we join each word to the front of the target domain, separated by a dot, to form a complete subdomain that we can test.
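
A minimal sketch of this last step follows. It is a reconstruction under assumptions rather than the original listing: the function names build_url and discover, the timeout, the http:// prefix, and the sample words are all chosen for illustration.

```python
import requests


def build_url(word, target):
    """Join a candidate word and the target domain into a URL to test."""
    return "http://" + word.strip() + "." + target


def discover(words, target, timeout=5):
    """Return the candidate URLs that answered an HTTP GET."""
    found = []
    for word in words:
        if not word.strip():
            continue  # skip blank lines in the wordlist
        url = build_url(word, target)
        try:
            requests.get(url, timeout=timeout)
        except requests.exceptions.RequestException:
            continue  # no answer: treat the subdomain as absent
        found.append(url)
    return found


if __name__ == "__main__":
    # Sample words for illustration; in practice they come from the wordlist file.
    for url in discover(["mail", "www", "plus"], "google.com"):
        print("[+] Discovered subdomain -->", url)
```
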


Our program has now discovered subdomains of the target website. We can also interact with websites from our Python programs using the Requests module.

               Thanks,
                               MAYANK BARSAINYA
                               Founder, M7 SECURITY
 




 
