Text Analysis


Our solution for collecting linguistic data:

Vocabulary lists
Document properties
Linguistic statistical data
Register information

Data Cleanup


Our solution for cleansing language data:

Orthographic corrections
Quality metrics
Variant detection

Terminology Compilation


Our solution for terminology compilation:

Term extraction
Terminology evaluation
Terminology consolidation

Content Processing


Our solution for content processing:

Keyword extraction
Statistical classification
Document indexing
Document clustering


Cleaning Up Legacy Product Data Is Feasible

This was the conclusion of the talk Verjüngungskur für sprachliche Altdaten (Rejuvenating Treatment for Legacy Data), presented by Axel Theofilidis at the tekom/tcworld conference 2017. Because of the limitations of early IT systems, legacy product data often shows specific properties: text written in capital letters only, missing umlauts, numerous orthographic errors, and the like. With high-quality linguistic processing tools, 95 % or more of such data can be cleaned and recovered. The small remainder can easily be corrected and adapted manually.
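To make the idea concrete, here is a minimal sketch of one possible approach: a lexicon lookup that restores casing and umlauts in upper-case legacy text. It is not the method presented in the talk. The tiny LEXICON, the function names fold and restore, and the assumption that the legacy system transliterated umlauts as AE/OE/UE and ß as SS are all hypothetical and chosen only for illustration. Tokens that cannot be resolved are returned separately, mirroring the small remainder that is left for manual correction.

```python
# Hypothetical mini-lexicon; a real tool would use a full linguistic lexicon.
LEXICON = ["Gehäuse", "Schraube", "für", "Kühler", "groß"]

# Assumed legacy transliteration of characters the old system could not store.
TRANSLITERATION = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def fold(word: str) -> str:
    """Reduce a word to the form an upper-case-only legacy system would store."""
    folded = word.lower()
    for char, replacement in TRANSLITERATION.items():
        folded = folded.replace(char, replacement)
    return folded.upper()


# Map each folded legacy form back to its correctly spelled lexicon entry.
INDEX = {fold(word): word for word in LEXICON}


def restore(text: str) -> tuple[str, list[str]]:
    """Restore known tokens; collect unknown tokens for manual review."""
    restored, unresolved = [], []
    for token in text.split():
        if token in INDEX:
            restored.append(INDEX[token])
        else:
            restored.append(token)   # leave the token unchanged
            unresolved.append(token)
    return " ".join(restored), unresolved


if __name__ == "__main__":
    print(restore("SCHRAUBE FUER GEHAEUSE M8"))
```

A plain dictionary lookup like this ignores ambiguity, inflected forms, and spelling errors, which is where full linguistic processing would be needed.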

Learn more about how legacy product data can be cleaned using high-quality linguistic processing tools. The results are well suited to modern language technology applications such as authoring memories or translation memories.

Lemmatisation of Luther Bible 2017 supports lemma-based search ...

Deutsche Bibelgesellschaft is publishing a new version of the Luther Bible with the support of IAI Linguistic Content AG.