The Challenge: Digitising Old Texts
Blackletter typefaces, also known as “Gebrochene Schriften” (broken scripts), emerged as early as the 12th century and evolved over the years into a variety of derived styles and typefaces.
Common characteristics of these typefaces include the long s (ſ) and ligatures, i.e. “joined” glyphs for certain letter combinations. Because Fraktur was used so widely, understanding it is essential for studying texts and developing recognition technologies for the period between 1800 and 1938.
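To illustrate why the long s and ligatures matter once a text has been recognised, here is a minimal sketch of one possible post-processing step. It assumes the OCR output preserves these glyphs as Unicode code points; the function name `modernise` is hypothetical and not part of any ABBYY product. It uses Python's standard `unicodedata` module to fold the long s (U+017F) and ligature code points such as ﬁ (U+FB01) into plain modern letters, so that a search for “Wasser” also finds “Waſſer”.

```python
import unicodedata


def modernise(text: str) -> str:
    """Fold the long s and ligature code points into plain modern letters.

    Hypothetical helper for illustration only. NFKC applies Unicode
    compatibility mappings, so U+017F (long s) becomes "s" and ligature
    code points such as U+FB01 become "fi". It also folds other
    compatibility characters, which may not be wanted in a faithful
    transcription.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # "Wa\u017f\u017fer" is "Wasser" spelled with two long s characters.
    print(modernise("Wa\u017f\u017fer"))
```

Typical Fraktur ligatures such as ch, ck and tz have no dedicated code points in standard Unicode, so this step only covers glyphs that the recognition stage emitted as long s or as one of the encoded ligatures.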
*Processed with ABBYY Recognition Server: Gothic/Fraktur enabled/disabled
Summary:
The sample clearly shows that tuned and optimised recognition technologies have to be used when processing historic documents printed in old fonts.
The same, of course, applies when “old” and “modern” fonts are mixed.
New scientific paper on Gothic/Fraktur OCR: University of Zurich, State Archive Zurich and ABBYY Recognition Server 3.0 (scientific publication on the project, PDF, 7 pages)
IMPACT Centre of Competence
… is a new, non-profit organisation with the mission to make the digitisation of historical printed text “better, faster, cheaper”. It will provide tools, services and facilities to further advance the state of the art in the fields of document imaging, language technology and the processing of historical text.