SimpOCR is a simple OCR program that uses the Python keras-ocr package, together with TensorFlow and pretrained models, to detect and recognize text in images and PDFs. It supports PNG, JPG, and PDF input. PDF files are first converted to images, and the text is then extracted from those images. SimpOCR is a command-line tool that can print the extracted text to the console or save it to an output file. The input file and the optional output file are passed to the tool as command-line arguments.
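The flow described above can be sketched roughly as follows. This is a minimal illustration, not SimpOCR's actual source. The positional argument layout, the function names (`build_parser`, `load_images`, `extract_text`, `main`), and the use of the pdf2image package for PDF conversion are all assumptions made for the example. The original description does not say which PDF converter is used or how the arguments are named.

```python
import argparse
from pathlib import Path

# Image suffixes treated as directly readable (assumption: .jpeg is accepted alongside .jpg).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def build_parser():
    """Hypothetical CLI: one required input path, one optional output path."""
    parser = argparse.ArgumentParser(
        prog="simpocr", description="Extract text from an image or PDF."
    )
    parser.add_argument("input", type=Path, help="PNG, JPG, or PDF file to read")
    parser.add_argument(
        "output", type=Path, nargs="?", default=None,
        help="optional text file to write; prints to the console if omitted",
    )
    return parser


def load_images(path):
    """Return a list of RGB arrays: one per image, or one per PDF page."""
    suffix = path.suffix.lower()
    if suffix in IMAGE_SUFFIXES:
        import keras_ocr  # imported lazily because it pulls in TensorFlow
        return [keras_ocr.tools.read(str(path))]
    if suffix == ".pdf":
        # Assumption: pdf2image (which needs poppler installed) does the conversion.
        import numpy as np
        from pdf2image import convert_from_path
        return [np.array(page.convert("RGB")) for page in convert_from_path(str(path))]
    raise ValueError(f"unsupported file type: {suffix}")


def extract_text(images):
    """Run the keras-ocr detector and recognizer over every image."""
    import keras_ocr
    pipeline = keras_ocr.pipeline.Pipeline()  # fetches pretrained weights on first use
    predictions = pipeline.recognize(images)  # one list of (word, box) pairs per image
    # Words are joined in the order keras-ocr returns them,
    # which is not guaranteed to be reading order.
    return "\n".join(" ".join(word for word, _box in page) for page in predictions)


def main(argv=None):
    args = build_parser().parse_args(argv)
    text = extract_text(load_images(args.input))
    if args.output is None:
        print(text)
    else:
        args.output.write_text(text, encoding="utf-8")
```

With this sketch, calling `main(["scan.pdf"])` would print the recognized text, while `main(["scan.pdf", "out.txt"])` would write it to `out.txt`. The file names here are placeholders.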