User Interface Design
This page contains some information that may help people write new user interfaces for SOCR. It is gleaned from messages I sent to Reggie.
To talk to libSOCR you need to allocate a SOCR_doc document structure that keeps track of each page and the zones that have been drawn on each page. The SOCR_zone type allows creation/deletion/movement of the zones.
As the user selects a series of files, the SOCR_doc::add_page() function is called, and as the user draws zones on each page, the add_zone() function is called. The bitmap is deliberately not stored in the SOCR_doc structure, so that thousands of pages can be processed without holding every image in memory.
The KDE interface allocates memory for an iconic preview of each page. Only when a page is about to be processed is its image loaded into memory and copied into a SOCR_pixmap structure.
To do the OCR, a SOCR_ocr object is created and the SOCR_doc and SOCR_pixmap are passed to it. The resulting OCR’d text is returned as a string.
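/* fragment only: needs <iostream>, <string> and the libSOCR headers */
/* allocate the document structure and add the first page */
SOCR_doc doc;
doc.add_page();
/* add two zones to the first page */
doc.page(0).add_zone(10,11,200,201);
doc.page(0).add_zone(50,81,100,151);
/* add a second page with one zone */
doc.add_page();
doc.page(1).add_zone(1,2,240,101);
/* allocate the image and copy it in pixel by pixel; */
/* w, h and _getpixel() come from the UI's own loaded image */
SOCR_pixmap im;
for (int j = 0; j < h; j++)
    for (int i = 0; i < w; i++)
        im.put(i, j, _getpixel(i, j));
/* do the OCR a page at a time */
SOCR_ocr ocr;
std::string output;
ocr.read(im, doc, output);
std::cout << output;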
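The reason for keeping bitmaps out of the document structure can be sketched with stand-in types. Everything below (Zone, PageEntry, Bitmap, load_bitmap, process_all) is hypothetical and is not part of libSOCR; it only illustrates the idea of remembering file names and zones for every page while holding a single full-size image at a time.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in types for illustration only; these are NOT the libSOCR classes.
struct Zone {
    int a, b, c, d;   // the four integers a UI would pass to add_zone()
};

struct PageEntry {
    std::string file;          // where to find the image when it is needed
    std::vector<Zone> zones;   // zones the user has drawn on this page
};

struct Bitmap {
    int w = 0, h = 0;
    std::vector<unsigned char> pixels;   // 8 bpp greyscale, row-major
};

// Hypothetical loader. A real UI would decode `file` with its own toolkit;
// here it returns a blank white 4x4 image so that the sketch runs.
Bitmap load_bitmap(const std::string& file)
{
    (void)file;
    Bitmap b;
    b.w = 4;
    b.h = 4;
    b.pixels.assign(static_cast<std::size_t>(b.w) * b.h, 255);
    return b;
}

// Walk the pages, loading each image just before it is processed.
// Returns the largest number of bitmap bytes held at any one time.
std::size_t process_all(const std::vector<PageEntry>& pages)
{
    std::size_t peak = 0;
    for (const PageEntry& p : pages) {
        Bitmap bm = load_bitmap(p.file);
        if (bm.pixels.size() > peak)
            peak = bm.pixels.size();
        // ... copy bm into a SOCR_pixmap and run the OCR here ...
    }   // bm is released here, before the next page is loaded
    return peak;
}
```

With this layout the memory needed for full-size images stays at one page's worth, however many PageEntry records are in the list.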
The pixmap structure
The SOCR_pixmap class is a simple wrapper around an 8 bpp greyscale image. If you load a b/w image, copy the values to SOCR_pixmap as 0==black, 255==white. For greyscale pixels use values between 0 and 255. If you use the SOCR_pixmap::put(x,y, r,g,b) method for colour images, the pixel is converted to greyscale using the YIQ conversion.
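For a UI that prefers to do the conversion itself and call the single-value put(), the pixel conventions above can be sketched as follows. The helper names rgb_to_grey and bw_to_grey are hypothetical, and the weights are the standard Y (luma) coefficients of the YIQ colour space; whether libSOCR uses exactly these weights and this rounding is an assumption.

```cpp
#include <cstdint>

// Hypothetical helper, not part of libSOCR: the Y (luma) component of YIQ
// computed from 8-bit RGB with the standard weights 0.299, 0.587, 0.114.
std::uint8_t rgb_to_grey(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    double y = 0.299 * r + 0.587 * g + 0.114 * b;
    return static_cast<std::uint8_t>(y + 0.5);   // round to nearest
}

// Hypothetical helper: map a 1-bit pixel to the 0==black, 255==white
// convention described above.
std::uint8_t bw_to_grey(bool is_black)
{
    return is_black ? 0 : 255;
}
```

The result of either helper is a value in the 0 to 255 range that can be stored as one greyscale pixel.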
Good things a UI will have
Here are a few thoughts about what a simple UI may have. If you are developing a plugin to another application, not all of these may make sense. Write to Stuart with any ideas.