Installing Tesseract on Windows
PyTesseract is a Python wrapper for Tesseract, a widely used open-source OCR engine that reads and recognizes text in images. Tesseract checks whether text lines are fixed pitch and, where they are, slices the words into characters based on the pitch. While it is known for its accuracy and versatility, it can be challenging to install in a Windows environment.
Installation steps
1. Download and install Tesseract
2. Add TESSDATA_PREFIX in the System Environment Variables:
Variable Name - TESSDATA_PREFIX
Variable Value - C:\Program Files (x86)\Tesseract-OCR\tessdata
3. Add another environment variable named tesseract:
Variable Name - tesseract
Variable Value - C:\Program Files (x86)\Tesseract-OCR\tesseract.exe
4. Add the Tesseract installation folder to the PATH environment variable:
Variable Value - C:\Program Files (x86)\Tesseract-OCR
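Once the variables are set, it can help to confirm that a newly opened Python session actually sees them. The following is a minimal sketch, not part of the original steps: the helper name check_tesseract_setup is hypothetical, and it only reads the environment and searches PATH using the standard library, so it does not require PyTesseract itself to be installed.

```python
import os
import shutil


def check_tesseract_setup(env=None):
    """Report which settings from the installation steps are visible.

    `env` is any mapping of environment variables; it defaults to the
    current process environment. (Hypothetical helper for illustration.)
    """
    env = os.environ if env is None else env
    return {
        # Step 2: folder containing the language data files
        "TESSDATA_PREFIX": env.get("TESSDATA_PREFIX"),
        # Step 3: full path to tesseract.exe
        "tesseract": env.get("tesseract"),
        # Step 4: shutil.which searches the given PATH string for the executable
        "on_path": shutil.which("tesseract", path=env.get("PATH", "")),
    }


if __name__ == "__main__":
    for name, value in check_tesseract_setup().items():
        print(f"{name}: {value or 'NOT FOUND'}")
```

If any entry prints NOT FOUND, the usual cause is a terminal or IDE that was opened before the variables were saved; environment changes on Windows only apply to processes started afterwards.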