I'm not even a manga reader, but I can imagine how tedious it must be to clean the text out of speech bubbles once you've decided to translate a manga into another language.
For Linux users, there's an excellent option: Panel Cleaner, which automates this process and comes with OCR support.
On first launch, the application needs to download additional ML models for text detection and recognition. Panel Cleaner handles this on its own, so there's no need to search for or configure anything else. If necessary, though, you can use your own OCR engines.
Next, simply add images (or a folder of them) and configure the processing parameters in the left panel. You can set the text detection method, the preprocessor, noise suppression, and much more.
Then you need to run generation for each image. This way, the application separates the text, background, bubbles, and other elements into individual masks.
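To give a rough idea of what a mask does here, below is a minimal sketch in plain Python. This is not Panel Cleaner's actual code: the function name `apply_text_mask`, the tiny grayscale "image", and the mask values are all made up for illustration. The assumption is simply that a mask marks which pixels belong to text, and that cleaning means painting those pixels over with the bubble's background color.

```python
def apply_text_mask(image, mask, fill=255):
    """Return a copy of `image` with every masked pixel replaced by `fill`.

    `image` is a list of rows of grayscale values (0-255).
    `mask` has the same shape and holds True where text was detected.
    Both structures are hypothetical stand-ins for real image data.
    """
    same_height = len(image) == len(mask)
    same_widths = all(len(r) == len(m) for r, m in zip(image, mask))
    if not (same_height and same_widths):
        raise ValueError("image and mask must have the same shape")
    return [
        [fill if masked else pixel for pixel, masked in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


# A tiny 3x4 "speech bubble": 255 is white background, 0 is a black text stroke.
bubble = [
    [255, 0, 0, 255],
    [255, 0, 255, 255],
    [255, 255, 255, 255],
]

# Hypothetical text mask: True marks the pixels flagged as text.
text_mask = [
    [False, True, True, False],
    [False, True, False, False],
    [False, False, False, False],
]

cleaned = apply_text_mask(bubble, text_mask)
```

Keeping the mask separate from the image means the original page is never modified, and the same mask could be reused or adjusted before the cleaned version is saved.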
It's not possible to edit text within the application, but you can save the final result with clean bubbles and recognized text in a separate folder. You can see the cleaning result in the images below.
As you can see, the text in the speech bubbles was removed almost perfectly. Naturally, it didn't work for text overlaid directly on the artwork.
However, I couldn't get English text recognized with the default engine and default settings.