A snipping tool written in Python that automatically identifies text in a snipped image and performs a Google search.
Like many programmers, I find myself copying and pasting text very often. Most of the time I’m copying some sort of error message from my IDE or terminal to search on Google. The rest of the time I’m copying a method or function name into Google to read the documentation. The process of highlighting the text, copying, opening a browser, pasting and then searching can get a little tedious over time (first world problems, I know). So I decided to automate this process using Python.
I want a desktop app that will sit in the background and act much like the Windows Snipping Tool. It will have two functions: Snip & Search and Snip & Copy.
Both functions will allow me to snip a region of text as an image, perform character recognition on the image, and then return a string. The Snip & Search function will then automatically open a browser tab and search the string from the snipped image in Google. The Snip & Copy function, on the other hand, will save the string to my clipboard.
Below is a demonstration of how to use the program. It’s actually quite useful when copying text from places where you can’t copy in the usual way, such as the terminal, images or YouTube tutorials.
You can find the full code or download the .exe file on my GitHub. Please note that, at the moment, the app has only been tested on my PC running Windows 10. Modifications will be needed to make the app work on other operating systems. I might get to this in the future.
So how did I make it…
As I previously mentioned, the app is written in Python, with the main libraries being PyQt5, cv2, Pillow and PyTesseract. The GUI allows me to set the browser I wish to use and waits for a function button to be pressed. Once a function button is pressed, an instance of the snipping tool widget is created and waits for the user to snip a region of their screen using click and drag. Credit must be given here to harupy on GitHub, who has created a great snipping tool app written in Python. His code forms the basis for the snipping tool portion of this project.
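To make the click-and-drag step concrete, here is a minimal sketch of how the two mouse positions could be turned into a screen capture. It is not the code from the repository or from harupy's project: the names `drag_to_bbox` and `grab_region` are hypothetical, and it assumes the press and release positions are already available as `(x, y)` screen coordinates and that Pillow's `ImageGrab` is used for the capture.

```python
def drag_to_bbox(press, release):
    """Turn mouse press/release points (x, y) into (left, top, right, bottom).

    The user may drag in any direction, so the corners are sorted here.
    """
    (x1, y1), (x2, y2) = press, release
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


def grab_region(press, release):
    """Capture the dragged screen region as a PIL image (hypothetical helper)."""
    # Pillow is imported here so drag_to_bbox can be used without it installed.
    from PIL import ImageGrab

    return ImageGrab.grab(bbox=drag_to_bbox(press, release))
```

Sorting the corners matters because dragging from bottom-right to top-left would otherwise produce a box with negative width and height.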
Once a region is snipped, the image is converted to a greyscale array using the cv2 library and then passed to the Tesseract Optical Character Recognition (OCR) engine using the PyTesseract library. PyTesseract is a Python wrapper for Google’s open source Tesseract-OCR Engine.
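The greyscale-then-OCR step can be sketched roughly as follows. This is an illustration under assumptions rather than the app's actual code: `snip_to_text` and `tidy` are hypothetical names, the snip is assumed to arrive as a PIL image, and the `tidy` clean-up step is an addition of this sketch. Running the OCR call also requires the Tesseract binary to be installed separately from the `pytesseract` package.

```python
def snip_to_text(snip):
    """Run OCR on a snipped region (a PIL image) and return the raw text."""
    # Third-party imports live inside the function so that tidy() below
    # can be used without the whole OCR stack installed.
    import cv2
    import numpy as np
    import pytesseract

    rgb = np.array(snip.convert("RGB"))           # PIL image -> H x W x 3 uint8 array
    grey = cv2.cvtColor(rgb, cv2.COLOR_RGB2GRAY)  # single-channel greyscale array
    return pytesseract.image_to_string(grey)      # needs the Tesseract binary


def tidy(text):
    """Collapse OCR output into one line of single-spaced text."""
    # str.split() with no arguments splits on any run of whitespace,
    # including newlines and form feeds, and drops empty pieces.
    return " ".join(text.split())
```

Flattening the result to a single line is a design choice that suits a search query; for Snip & Copy you might prefer to keep the original line breaks.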
Once the snipped image has been analysed and the text recognised, the string is either saved to the clipboard using the pyperclip library, or a search is performed in the specified browser. The GUI then resets and waits for another snip.
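Both outcomes fit in a few lines. The sketch below is one possible way to do it, not necessarily how the app does: the function names are hypothetical, the search is assumed to go through Python's standard `webbrowser` module, and the query is passed to Google through the usual `search?q=` URL.

```python
import webbrowser
from urllib.parse import quote_plus


def google_search_url(query):
    """Build a Google search URL, escaping characters that are unsafe in URLs."""
    return "https://www.google.com/search?q=" + quote_plus(query)


def snip_and_search(text, browser=None):
    """Open a new tab with the search results (hypothetical helper).

    `browser` is a name registered with the webbrowser module;
    None selects the system default browser.
    """
    webbrowser.get(browser).open_new_tab(google_search_url(text))


def snip_and_copy(text):
    """Put the recognised text on the clipboard (hypothetical helper)."""
    import pyperclip  # third-party; imported here so the search path works without it

    pyperclip.copy(text)
```

Escaping the query matters for error messages in particular, since they tend to contain colons, quotes and other characters that would otherwise break the URL.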
The text recognition isn’t 100% accurate, particularly on long sections and when the text is in certain colours. I am sure I could improve this with some more advanced image processing in cv2 before passing the image into the Tesseract OCR engine. I will update this article if I find the time to implement any improvements. Suggestions are welcome. Have fun snipping!