[FoRK] image processing for OCR

Eugen Leitl <eugen at leitl.org> on Mon Jul 3 07:12:52 PDT 2006

On Mon, Jul 03, 2006 at 11:54:43PM +1000, Damien Morton wrote:

> The algorithm is called a flood fill. 
> http://student.kuleuven.be/~m0216922/CG/floodfill.html
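
For reference, a minimal sketch of a 4-connected flood fill in Python. It is not
taken from the linked page; the function name and the list-of-lists image
representation are assumptions made for illustration. It also shows why a
single missing border pixel is a problem: the fill leaks out of the region.

```python
from collections import deque

def flood_fill(grid, start, new_value):
    """Fill the 4-connected region containing `start` with `new_value`, in place.

    `grid` is a list of lists of pixel values; returns the number of cells changed.
    """
    rows, cols = len(grid), len(grid[0])
    r0, c0 = start
    old_value = grid[r0][c0]
    if old_value == new_value:
        return 0  # nothing to do, and avoids an endless loop
    queue = deque([(r0, c0)])
    grid[r0][c0] = new_value
    changed = 1
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours (up, down, left, right).
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == old_value:
                grid[nr][nc] = new_value
                queue.append((nr, nc))
                changed += 1
    return changed

# A closed outline (1 = ink) and the same outline with one border pixel missing.
closed = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
broken = [row[:] for row in closed]
broken[1][2] = 0  # knock a one-pixel hole into the top border

print(flood_fill(closed, (2, 2), 2))  # only the interior pixel is filled
print(flood_fill(broken, (2, 2), 2))  # the fill escapes through the gap
```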

I'm familiar with it, and unfortunately it wouldn't do.

> Though with what you are doing, it's a bit different because the areas
> you want to fill aren't known in advance, and I imagine a fair portion of
> the regions you would want to fill are broken regions - i.e. with at
> least some pixels missing somewhere along their borders.
>
> You could do a blur followed by its inverse (sharpen), which would close
> small gaps, but I fear that with fine detailed text, you would create
> more problems than you would solve that way (i.e. by closing gaps that
> _should_ be there).
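
A rough sketch of that trade-off, under stated assumptions: a 3-tap box blur
along each scanline followed by re-thresholding stands in for "blur, then
sharpen" (a true inverse filter is not implemented here). The function names
and the 0.5 threshold are made up for illustration. Morphological closing
(dilate, then erode) is the more usual tool for this job and has the same
weakness.

```python
def blur_row(row):
    """3-tap box blur of one scanline; pixels outside the row count as 0."""
    padded = [0] + list(row) + [0]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3
            for i in range(len(row))]

def close_gaps(image, threshold=0.5):
    """Blur each scanline, then re-binarize (1 = ink, 0 = background)."""
    return [[1 if v > threshold else 0 for v in blur_row(row)]
            for row in image]

# A stroke with a one-pixel dropout: the dropout gets repaired.
print(close_gaps([[1, 1, 1, 0, 1, 1, 1]]))

# Two separate strokes one pixel apart: they get merged, which is the
# "closing gaps that should be there" failure.
print(close_gaps([[1, 1, 0, 1, 1]]))

# A two-pixel gap survives this particular filter.
print(close_gaps([[1, 1, 0, 0, 1, 1]]))
```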

I imagine some digital filter with a mask (the features are small) might
be able to enhance the recognition rate.
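
A minimal sketch of what such a mask filter could look like: plain correlation
of a grayscale image with a small kernel. The function name, the zero padding
at the borders, and the example sharpening mask are all assumptions for
illustration; which mask (if any) would actually help the recognition rate is
an open question.

```python
def apply_mask(image, mask):
    """Correlate a 2-D image with a small odd-sized mask.

    `image` and `mask` are lists of lists of numbers; pixels outside the
    image are treated as 0. Returns a new image of the same size.
    """
    rows, cols = len(image), len(image[0])
    mh, mw = len(mask), len(mask[0])
    oy, ox = mh // 2, mw // 2  # offset of the mask centre
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            for i in range(mh):
                for j in range(mw):
                    rr, cc = r + i - oy, c + j - ox
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += image[rr][cc] * mask[i][j]
            out[r][c] = total
    return out

# A common 3x3 sharpening mask: boosts the centre, subtracts the 4 neighbours.
SHARPEN = [
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0],
]

dot = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
for row in apply_mask(dot, SHARPEN):
    print(row)
```

The output would still need clamping and re-thresholding before being handed
to an OCR engine.
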
> Can you put some of these scans up somewhere for download?

Unfortunately, I don't think I can put up whole pages, lest
I leak too much information about the project (it's perfectly boring,
but corporate customers are pretty irrational about secrecy).

Eugen* Leitl http://leitl.org
ICBM: 48.07100, 11.36820            http://www.ativel.com
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE

