Reference number:	TIN2006-15694-C02-02
Duration:	01/2010 – 12/2012
Funding:	Ministerio de Ciencia e Innovación
Partners:	ICAR
Director:	Josep Lladós
Members:	Marçal Rusiñol, Josep Lladós, Joan Mas Romeu, Jaume Gibert, Gemma Sánchez Albaladejo, Farshad Nourbakhsh, Ernest Valveny, Antonio Clavelli, Alicia Fornés, Albert Gordo, Oriol Ramos Terrades, David Fernández, Agnés Borràs

Private and public organizations currently store huge numbers of documents as digital images. However, for these raw digital images to be really useful, they need to be annotated with informative content. Document Image Analysis and Pattern Recognition techniques are at the heart of current solutions to this problem. These are mainly Optical Character Recognition (OCR) solutions: mature commercial products that perform well on typewritten and structured documents. However, when dealing with difficult, unconstrained documents, commercial OCR products are simply not usable, since in the vast majority of these documents the individual elements cannot be isolated automatically.

OCR is usually not enough…

Given the high error rates involved in “post-editing” solutions, only semi-automatic or computer-assisted alternatives can currently be foreseen. The IKETDIHC project aims to develop innovative technologies to implement such computer-assisted solutions, which will be applied to difficult documents: specifically, ancient documents, handwritten documents, unstructured documents, documents with heterogeneous contents, and handwritten music scores.

Project goals

IKETDIHC is a coordinated project with two subprojects: “Knowledge Extraction from Documents Images with Heterogenous Contents” (KEDIHC) and “Multimodal Interaction for Text Transcription with Adaptive Learning” (MITTRAL). KEDIHC will be devoted to the analysis and annotation of documents with heterogeneous contents.

This process will be carried out in four steps. First, the structure of the document will be analyzed according to its relevant coloured regions. Second, a complete structural description of the document will be built in terms of the spatial relationships between its constituent zones. Third, the document will be categorized according to its structure. Finally, new techniques will be developed to extract semantic knowledge from the graphic parts of the document images.
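As a rough illustration of the second step, a structural description can be thought of as a set of spatial relations between zones. The sketch below is only an assumption about what such a description might look like, not the project's actual method: the `Zone` class, the `relation` function, and the five relation labels are hypothetical names introduced here, and zones are simplified to axis-aligned rectangles.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Zone:
    """Hypothetical axis-aligned page region; y grows downwards as in images."""
    name: str
    x0: int
    y0: int
    x1: int
    y1: int


def relation(a: Zone, b: Zone) -> str:
    """Return a coarse spatial relation of zone a with respect to zone b."""
    if a.y1 <= b.y0:
        return "above"
    if a.y0 >= b.y1:
        return "below"
    if a.x1 <= b.x0:
        return "left_of"
    if a.x0 >= b.x1:
        return "right_of"
    return "overlaps"


def structural_description(zones):
    """Describe a page as (zone, relation, zone) triples over all zone pairs."""
    return [(a.name, relation(a, b), b.name) for a, b in combinations(zones, 2)]


# Toy page layout (illustrative coordinates, not real data).
page = [
    Zone("header", 0, 0, 100, 20),
    Zone("text", 0, 25, 60, 100),
    Zone("figure", 65, 25, 100, 100),
]

for triple in structural_description(page):
    print(triple)
```

Two pages that yield the same set of triples share the same coarse layout, which is one simple way a structure-based categorization could compare documents.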

MITTRAL will develop advanced techniques and multimodal interfaces for the transcription of handwritten document images, following an interactive-predictive approach. Adaptive learning techniques for handwritten text recognition will also be developed. The textual parts of each document will have been identified beforehand by KEDIHC.
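The interactive-predictive idea can be sketched as a loop in which the user validates a prefix of the proposed transcription and the system re-predicts the rest. The toy below is a minimal sketch under strong assumptions: a fixed, ranked list of hypotheses stands in for a real handwriting recognizer, the user is simulated from a known reference, and the names `predict` and `transcribe` are hypothetical.

```python
def predict(prefix, hypotheses):
    """Return the best-ranked hypothesis that starts with the validated prefix."""
    for hyp in hypotheses:
        if hyp[:len(prefix)] == prefix:
            return hyp
    return list(prefix)  # nothing compatible: keep only what the user validated


def transcribe(hypotheses, reference):
    """Simulate a user who corrects the first wrong word of each proposal.

    Returns the final transcription and the number of user corrections.
    """
    prefix, corrections = [], 0
    while True:
        proposal = predict(prefix, hypotheses)
        if proposal == reference:
            return proposal, corrections
        # Length of the longest correct prefix of the current proposal.
        k = 0
        while k < len(proposal) and k < len(reference) and proposal[k] == reference[k]:
            k += 1
        corrections += 1
        if k >= len(reference):
            # Only surplus words remain; the user deletes them in one action.
            return list(reference), corrections
        # The user validates the first k words and types word k + 1.
        prefix = reference[:k + 1]


# Ranked recognizer hypotheses (made-up example sentences).
hypotheses = [
    "the old mill by the river".split(),
    "the old will by the river".split(),
    "the old will of the miller".split(),
]
reference = "the old will of the miller".split()

final, n_corrections = transcribe(hypotheses, reference)
print(" ".join(final), n_corrections)
```

In this toy run the user makes two corrections, whereas post-editing the top-ranked hypothesis alone would require changing three words. Each correction constrains the next prediction, which is the motivation for the interactive-predictive approach.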

IKETDIHC will pay special attention to the way in which end-users provide their input. End-user input will rely mainly on intuitive graphical input devices to speed up operations. These multimodal capabilities will be developed by experts in on-line handwritten text recognition (MITTRAL) and sketching (KEDIHC), and will be integrated into a collaborative platform.

The designed software tool will be periodically evaluated in terms of usability and profitability by Promoter-Observer Entities (EPOs).