The HISTORICAL BOOK ANALYSIS COMPETITION (HBA) is organized in conjunction with the 14th IAPR International Conference on Document Analysis and Recognition (ICDAR’17). The HBA competition will address a thriving topic of major interest to researchers in several fields, including (historical) document image analysis, image processing, pattern recognition and classification.
The HBA competition will provide a large experimental corpus and a thorough evaluation protocol to ensure a consistent comparison of image processing methods for historical document image analysis.
A new dataset, called the HBA dataset, will be released on this occasion. The HBA dataset is composed of 4436 real historical document images. It has been ground-truthed by annotating each foreground pixel, so the ground truth information is currently available at pixel level.
Two nested challenges are proposed in the HBA competition. Firstly, the HBA competition will evaluate how well image analysis methods discriminate textual content from graphical content at pixel level. Secondly, it will assess the ability of the participating methods to separate the textual content according to different text fonts (e.g. lowercase, uppercase, italic, …) at pixel level.
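To make the idea of a pixel-level comparison concrete, the sketch below scores a predicted label map against a ground-truth label map, counting only the pixels annotated as foreground. It is an illustration only and not the official HBA evaluation protocol: the choice of precision, recall and F-measure, the function name `pixel_scores`, and the label encoding (0 for background, 1 for text, 2 for graphics) are all assumptions made for this example. For the second challenge, the same function could be applied with one integer label per text font.

```python
from collections import Counter

BACKGROUND = 0  # assumed label for pixels without annotation


def pixel_scores(ground_truth, prediction):
    """Per-class precision, recall and F-measure over foreground pixels.

    Both arguments are 2-D lists of integer labels with the same shape.
    Only pixels that are foreground in the ground truth are scored.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for gt_row, pred_row in zip(ground_truth, prediction):
        for gt, pred in zip(gt_row, pred_row):
            if gt == BACKGROUND:
                continue  # background pixels are not annotated, skip them
            if gt == pred:
                tp[gt] += 1
            else:
                fn[gt] += 1  # the true class was missed
                if pred != BACKGROUND:
                    fp[pred] += 1  # the predicted class was wrongly assigned

    scores = {}
    for label in sorted(set(tp) | set(fp) | set(fn)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        f_den = precision + recall
        f1 = 2 * precision * recall / f_den if f_den else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Toy 2x3 example with hypothetical labels: 1 = text, 2 = graphics.
ground_truth = [[0, 1, 1],
                [2, 2, 0]]
prediction = [[1, 1, 2],
              [2, 1, 0]]
scores = pixel_scores(ground_truth, prediction)
```

In this toy example each class has one correctly labelled pixel, one missed pixel and one wrongly assigned pixel, so precision, recall and F-measure are all 0.5 for both classes. Restricting the count to annotated foreground pixels mirrors the way the ground truth is described above; a real protocol may weight or aggregate classes differently.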