This project is a Google Colab-compatible image processing pipeline that:
- Removes red backgrounds from images
- Extracts white Turkish text
- Uses Tesseract OCR to convert the image content into readable text
When you upload an image with a red background and white text, the script:
- Detects red areas using HSV color masking
- Inverts the mask to isolate white text
- Applies morphological operations to clean up noise
- Runs Tesseract OCR (Turkish language) to extract the text
Works especially well for:
- Posters with red backgrounds
- Documents or visuals with strong red coloring
- White-on-red warning labels, graphics, or forms
- Converts image to HSV color space
- Masks red areas using two HSV ranges
- Inverts the mask so text appears white
- Optionally applies morphological operations
- Uses Tesseract with `--psm 6` layout mode
- Language set to `tur` (Turkish)
- Extracts text from the masked image
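The OCR settings above could be passed to Tesseract along these lines. The sketch assumes the `pytesseract` wrapper, which this README does not name explicitly; `build_tesseract_args` and `ocr_mask` are hypothetical helper names. `--psm 6` tells Tesseract to treat the image as a single uniform block of text.

```python
def build_tesseract_args(psm=6, lang="tur"):
    """Build keyword arguments for pytesseract (hypothetical helper)."""
    return {"lang": lang, "config": f"--psm {psm}"}


def ocr_mask(mask, psm=6, lang="tur"):
    """Run Tesseract on a binary mask and return the recognized text.

    Assumes pytesseract is installed and the Tesseract binary plus the
    Turkish language data are available on the system.
    """
    import pytesseract  # imported lazily so the helper above works without it

    return pytesseract.image_to_string(mask, **build_tesseract_args(psm, lang))
```

With a mask from the previous step, a call would look like `text = ocr_mask(mask)`.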
- OpenCV
- NumPy
- Matplotlib
- Tesseract OCR
- Turkish OCR package:
tesseract-ocr-tur
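Before running the pipeline, it can be useful to confirm that the dependencies listed above are present. The following check is a sketch under assumptions: `check_environment` is a hypothetical helper, and it relies on the `tesseract` executable being on `PATH` and on its `--list-langs` option to look for the `tur` language data.

```python
import importlib.util
import shutil
import subprocess


def check_environment():
    """Report which dependencies appear to be installed (hypothetical helper)."""
    report = {}

    # Python packages, identified by their import names.
    for module in ("cv2", "numpy", "matplotlib"):
        report[module] = importlib.util.find_spec(module) is not None

    # The Tesseract binary and its Turkish language data.
    tesseract = shutil.which("tesseract")
    report["tesseract"] = tesseract is not None
    report["tur"] = False
    if tesseract:
        result = subprocess.run(
            [tesseract, "--list-langs"], capture_output=True, text=True
        )
        # Some versions print the language list to stderr, so check both.
        report["tur"] = "tur" in (result.stdout + result.stderr).split()

    return report


if __name__ == "__main__":
    for name, present in check_environment().items():
        print(f"{name}: {'ok' if present else 'missing'}")
```

If `tur` is reported as missing, installing the `tesseract-ocr-tur` package named above should provide it on Debian-based systems such as Colab.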
- Only red/white contrast (white text on a red background) is currently supported
- Works best with clean, high-contrast images
- Use `RGBA`- or `RGB`-compatible formats (e.g., .png, .jpg)