Bypassing uncomfortably easy CAPTCHAs using python PIL in my free time

3la2kb · Published in System Weakness · 6 min read · Aug 10, 2023


Late one night, I was going through my recon data for multiple bug bounty targets without really having the energy for the usual testing, when an endpoint of an old target caught my attention…I thought it was new, but when it opened it reminded me of a really annoying CAPTCHA, annoying precisely because it was so simple…the text itself felt easy for OCR to recognize, yet I couldn’t bypass it: none of the usual CAPTCHA answer-parameter tricks worked, and no OCR could read it.

And this is, for example, how one of the CAPTCHAs looks:

It looks so easy, but tesseract returned either empty output or unprintable bytes (with every psm and oem combination), and online OCR tools returned at most one or two correct characters.
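To give an idea of what “every psm and oem” means in practice, the attempts were roughly along these lines. A minimal sketch, assuming tesseract is installed and the sample image is saved locally as captcha.png (a placeholder name):

import subprocess

# A few representative page segmentation modes; the oem values can be
# swept the same way with --oem 0..3.
for psm in (3, 6, 7, 8, 10, 11, 13):
    out = subprocess.run(
        ["tesseract", "captcha.png", "stdout", "--psm", str(psm)],
        capture_output=True, text=True,
    ).stdout.strip()
    print(f"psm {psm}: {out!r}")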

One of the better attempts with an online OCR tool.

I didn’t really feel like pentesting, but I needed something else to geek out on for fun, so I decided I would figure it out on my own…either by writing a script with a stone-age custom OCR engine to detect the characters specifically for this case, or by writing some sort of extension to one of the existing tools.

After some time thinking about how I would approach this, I realized I had to analyze how random the CAPTCHA actually is: what colors it uses, how random the character placements are, whether the characters intersect, and a lot of other questions that I will wrap up here:

Q: How random are the characters?
A: I wasn’t able to detect any particular pattern.

Q: Do the characters intersect, and how random are the placements?
A: Yes, they intersect, but they are never on top of each other, and the placements are random.

Q: Do the characters rotate?
A: No.

Q: What colors are used?
A: That was the vital part: the background uses random colors, but the text is always the same color, RGB(128, 0, 128).
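If you want to reproduce that last check, counting pixel values makes the constant text color jump out immediately. A minimal sketch, again assuming a sample saved as captcha.png:

from collections import Counter
from PIL import Image

# Count how often each RGB value shows up; the background colors change
# between samples, but RGB(128, 0, 128) is always there for the text.
im = Image.open("captcha.png").convert("RGB")
for color, count in Counter(im.getdata()).most_common(10):
    print(color, count)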

Brainstorming

After the analysis, I came up with an idea that might or might not work, but it was all about simplifying these images enough for tesseract-ocr to be able to give a correct reading.

My initial plan was :

  • separate the text from the background (make the background white and the text black).
  • detect the locations of continuous empty spaces and of the characters separately.
  • crop each character to the center of a new image.
  • pass each new image to tesseract, in order, with the right configuration.

Separating text from background:

A quick search on the web turned up this code snippet:

def seperate(im):
    newimdata = []
    targetcolor = (128, 0, 128)
    whitecolor = (255, 255, 255)
    for color in im.getdata():
        # exact match only: a pixel has to be exactly the text color
        if color == targetcolor:
            newimdata.append((0, 0, 0))
        else:
            newimdata.append(whitecolor)
    newim = Image.new(im.mode, im.size)
    newim.putdata(newimdata)
    return newim

but that only left a thin, cracked line of dark pixels on a white background, so I made a small change:

def seperate(im):
    newimdata = []
    targetcolor = (128, 0, 128)
    whitecolor = (255, 255, 255)
    for color in im.getdata():
        if abs(color[0]-targetcolor[0]) <= 100 and abs(color[1]-targetcolor[1]) <= 100 and abs(color[2]-targetcolor[2]) <= 100:
            newimdata.append((0, 0, 0))
        else:
            newimdata.append(whitecolor)
    newim = Image.new(im.mode, im.size)
    newim.putdata(newimdata)
    return newim

I made the script accept a color difference up to a certain limit, 100 on each RGB channel. I figured out this exact number by trial and error…nothing fancy.
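If you need to re-derive that limit for a different CAPTCHA, a quick sweep over a few candidate tolerances makes the trial and error painless. A minimal sketch, assuming a sample saved as captcha.png:

from PIL import Image

# Save the black/white mask at a few candidate tolerances and compare the
# results by eye; the glyphs should stay solid without background bleeding in.
im = Image.open("captcha.png").convert("RGB")
target = (128, 0, 128)
for tol in (30, 60, 100, 140):
    out = []
    for px in im.getdata():
        if all(abs(c - t) <= tol for c, t in zip(px, target)):
            out.append((0, 0, 0))
        else:
            out.append((255, 255, 255))
    mask = Image.new("RGB", im.size)
    mask.putdata(out)
    mask.save("mask_" + str(tol) + ".png")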

This gave me a clean white background with the text in black:

Now I had to either split each character out of the picture and pass it to tesseract-ocr, or find a pixel pattern for each character and determine the text from that.

So, to see whether the first (easier) solution would work, I did the script’s job by hand this time: I split each character out manually and passed it to tesseract-ocr. It worked, but with some mistakes, maybe because the images were too small. We will get to that later; it felt pretty doable.
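That manual check was nothing more than something like this (a sketch; the crop boundaries below are placeholders picked by eye for one sample, not the real ones):

import subprocess
from PIL import Image

# Crop each character by hand-picked x ranges from the cleaned image and
# OCR each crop as a single character (--psm 10).
im = Image.open("res.png")
boxes = [(0, 30), (30, 60), (60, 90), (90, 120)]  # picked by eye
for i, (x0, x1) in enumerate(boxes):
    crop_name = "char_" + str(i) + ".png"
    im.crop((x0, 0, x1, im.height)).save(crop_name)
    out = subprocess.run(
        ["tesseract", crop_name, "stdout", "--psm", "10"],
        capture_output=True, text=True,
    ).stdout.strip()
    print(i, out)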

So my plan was :

  • get the left-most black pixel of the image and consider it my starting point.
  • get the next first occurrence of a white pixel and consider it the start of a new image.
  • repeat the steps above vertically and horizontally after adjusting the cropping point.
  • run tesseract-ocr on the resulting images one by one.
  • detect OCR mistakes and adapt to them, like removing unwanted characters and checking for repeated mistakes.

This was the final code, if you are craving some spaghetti:

from PIL import Image
import sys
import os

imagePath = sys.argv[1]
newImagePath = 'res.png'
im = Image.open(imagePath)


# Turn every pixel close enough to the text color into black and
# everything else into white.
def seperate(im):
    newimdata = []
    target_color = (128, 0, 128)
    white_color = (255, 255, 255)
    for color in im.getdata():
        if abs(color[0]-target_color[0]) <= 100 and abs(color[1]-target_color[1]) <= 100 and abs(color[2]-target_color[2]) <= 100:
            newimdata.append((0, 0, 0))
        else:
            newimdata.append(white_color)
    newim = Image.new(im.mode, im.size)
    newim.putdata(newimdata)
    return newim


seperate(im).save(newImagePath)

# Load the cleaned image as a list of rows so pixels[y][x] addressing works.
im2 = Image.open(newImagePath)
pixels = list(im2.getdata())
width, height = im2.size
pixels = [pixels[i * width:(i + 1) * width] for i in range(height)]


# Scan from the bottom-right corner for the first black pixel.
def lowest_point(pixels, width, height):
    x = width
    y = height
    while pixels[y-1][x-1] != (0, 0, 0) and y >= 0:
        y -= 1
        x = width
        while pixels[y-1][x-1] != (0, 0, 0):
            if x == 0:
                break
            x -= 1
    return [x, y]


# Find the first black pixel when scanning column by column from `start`.
def upperleft_point(pixels, start, width, height):
    if width == None or start == None:
        return None
    for x in range(start, width):
        for y in range(height):
            if pixels[y][x] == (0, 0, 0):
                return [x, y]


# Find the first fully white column after the character that starts at `start`.
def split_points(pixels, start, width, height):
    upperleft = upperleft_point(pixels, start, width, height)
    if upperleft == None or width == None:
        return None
    initial = upperleft[0]
    y = height-1
    for x in range(initial, width):
        while y >= 0 and (0, 0, 0) != pixels[y][x]:
            if y == 0:
                return x
            y -= 1


# Save the columns from `start` up to `width` as a new image file.
def output_image(pixels, start, width, height, fname):
    res = []
    for y in range(height):
        for x in range(start, width):
            res.append(pixels[y][x])
    im_pil = Image.new(im.mode, (width-start, height))
    im_pil.putdata(res)
    im_pil.save(fname)


# Return one full column of pixels at position x.
def pixel_column(pixels, x):
    res = []
    height = len(pixels)
    for i in range(height):
        res.append(pixels[i][x])
    return res


# Collect every fully white column to the right of the first character.
init_x = upperleft_point(pixels, 0, width, height)[0]
fn = 0
start = 0
whites = []
for x in range(init_x, width):
    if (0, 0, 0) not in pixel_column(pixels, x):
        whites.append(x)


# Turn the runs of white columns into a list of x positions to crop at
# (the midpoint of each gap between characters).
def crop_x(pixels, width):
    global whites
    res = []
    start = whites[0]
    point2 = int(upperleft_point(pixels, 0, width, height)[0]/2)
    res.append(point2)
    for x in range(len(whites)):
        if x == len(whites)-1:
            break
        if whites[x+1] - whites[x] != 1:
            point = int((whites[x] + start)/2)
            res.append(point)
            start = whites[x+1]
    point = int((whites[-1]+width)/2)
    res.append(point)
    return res


cx = crop_x(pixels, width)

# Write each character between consecutive crop points to its own file.
for x in range(len(cx)-1):
    crop_width = cx[x+1]-cx[x]
    output_image(pixels, cx[x], cx[x+1], height, str(x)+".png")

# Run tesseract on every crop in single-character mode (--psm 10).
res = ""
flag = 0
for x in range(len(cx)-1):
    cmd = os.popen("tesseract "+str(x)+".png stdout --dpi 70 --psm 10").read().strip()
    if len(cmd) > 1:
        flag = 1
    res += cmd

# Clean up the usual OCR mistakes: merged look-alike sequences and
# duplicated upper/lowercase pairs like "Aa" -> "a".
if len(res.strip()) > 6:
    i = 0
    if flag:
        res = res.replace("Ic", "k")
    l = len(res)-1
    while i < l:
        if res[i] == res[i+1].upper() and res[i] != res[i+1]:
            res = res.replace(res[i]+res[i+1], res[i+1])
            l -= 1
        i += 1

# Swap the symbols tesseract keeps confusing for the characters they stand for.
res = res.replace("|", "I").replace("°", "o").replace("©", "c").replace("I<", "k").replace("I¢", "k").replace("I‘", "k").replace("¥", "y")

print("[+] Captcha :", res)
