[FoRK] image processing for OCR

Andy Armstrong <andy at hexten.net> on Mon Jul 3 06:07:40 PDT 2006

On 3 Jul 2006, at 13:20, Eugen Leitl wrote:
> FoRK has been awfully quiet lately. Have received no feedback
> on my last posts at all. Are you all deathly busy, or what?
> Nevertheless, here's another question: if you've got a lousy
> scan where only font outlines are present, which digital filter
> would you use to fill in the hollow outline? Already the proper search
> term would be of much help.

Are you looking to automate this or is it a one-off?

I think the easiest way to do it programmatically is to scan each
raster, turning filling on and off each time you cross the outline
pixels. That has the effect of filling the paths using an even-odd
fill rule - which is what you want for text.
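
A minimal sketch of that per-raster toggle, assuming the scan is already a bilevel image held as lists of 0/1 values with 1 meaning outline ink. The name fill_row_naive and the data layout are just for illustration. A run of adjacent ink pixels is counted as a single crossing, so thick vertical strokes are fine.

```python
def fill_row_naive(row):
    """Fill one raster by toggling 'inside' at each run of ink pixels.

    row is a list of 0/1 values (assumed layout), 1 meaning outline ink.
    Returns a new list with the pixels between crossings set to 1.
    """
    out = list(row)
    inside = False
    prev = 0
    for x, pixel in enumerate(row):
        if pixel and not prev:
            # Start of an ink run: treat the whole run as one crossing.
            inside = not inside
        elif not pixel and inside:
            # Background pixel between an odd and an even crossing.
            out[x] = 1
        prev = pixel
    return out


if __name__ == "__main__":
    # A raster that cuts through two vertical strokes of a letter.
    print(fill_row_naive([0, 1, 0, 0, 1, 0]))
```

This is only right for rasters that actually cross the outline. A raster that runs along a horizontal edge, such as [0, 1, 1, 1, 1, 0], sees one ink run, toggles once, and wrongly fills everything to its right.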

The main problem then is handling the special case of the horizontal  
path segments at the tops and bottoms of letters - you only want to  
toggle filling where a path crosses the raster rather than where it  
just runs along the raster for a bit. The fact that your letter  
outlines are probably more than one pixel thick slightly complicates  
detecting that case.
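
One possible way to tell a crossing from a run along the raster, sketched under a strong assumption: the outline is a clean closed loop exactly one pixel thick. An ink run then counts as a crossing only if it touches ink in both the raster above and the raster below; a run touching ink on one side only is the top or bottom of a shape. The helper names (ink_runs, touches, fill_outline) are made up for this example, and thicker outlines would presumably need thinning first or a more careful test.

```python
def ink_runs(row):
    """Yield inclusive (start, end) index pairs for each run of ink pixels."""
    start = None
    for x, pixel in enumerate(row):
        if pixel and start is None:
            start = x
        elif not pixel and start is not None:
            yield start, x - 1
            start = None
    if start is not None:
        yield start, len(row) - 1


def touches(row, start, end):
    """True if row has ink within one pixel of the span start..end."""
    lo = max(start - 1, 0)
    hi = min(end + 1, len(row) - 1)
    return any(row[lo:hi + 1])


def fill_outline(image):
    """Even-odd fill of a one-pixel-thick closed outline (assumed input).

    image is a list of equal-length rows of 0/1 values, 1 meaning ink.
    """
    height = len(image)
    blank = [0] * (len(image[0]) if image else 0)
    filled = [list(row) for row in image]
    for y, row in enumerate(image):
        above = image[y - 1] if y > 0 else blank
        below = image[y + 1] if y + 1 < height else blank
        inside = False
        cursor = 0
        for start, end in ink_runs(row):
            if inside:
                # Fill the gap between the previous run and this one.
                for x in range(cursor, start):
                    filled[y][x] = 1
            # Toggle only where the path passes through this raster.
            if touches(above, start, end) and touches(below, start, end):
                inside = not inside
            cursor = end + 1
    return filled


if __name__ == "__main__":
    square = [
        [0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 1, 0],
        [0, 1, 0, 0, 1, 0],
        [0, 1, 0, 0, 1, 0],
        [0, 1, 1, 1, 1, 0],
        [0, 0, 0, 0, 0, 0],
    ]
    for line in fill_outline(square):
        print("".join("#" if p else "." for p in line))
```

The top and bottom edges of the square touch ink on one side only, so they never toggle filling, while the vertical strokes in between do.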

Is that what you need to do? If so I'll try to provide more detail :)

Andy Armstrong, hexten.net

