Text normalization
Text normalization is the process of transforming text into a single consistent form that it may not have had before. It is often performed before the text is processed further, for example before generating synthesized speech, translating it automatically, or storing it in a database.
Examples of text normalization:
- converting all letters to lower or upper case
- removing punctuation
- removing accent marks and other diacritics from letters
- expanding abbreviations
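
The examples above can be combined into a small pipeline. The following Python sketch is only an illustration, not a standard implementation: the function names `normalize` and `strip_diacritics` and the `ABBREVIATIONS` table are hypothetical, and the abbreviation expansions are assumptions (real abbreviations are often ambiguous, for example "St." can mean "street" or "saint"). It treats the diacritics step as removing the marks while keeping the base letter.

```python
import string
import unicodedata

# Hypothetical abbreviation table, for illustration only.
# Real systems need context to pick the right expansion.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street"}


def strip_diacritics(text):
    """Remove combining marks (accents and other diacritics), keeping base letters."""
    # NFD splits a letter such as "é" into "e" plus a combining accent.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(text):
    """Lower-case, expand abbreviations, strip diacritics, remove punctuation."""
    text = text.lower()
    # Expand abbreviations before removing punctuation,
    # because the trailing period is part of the abbreviation.
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    text = " ".join(words)
    text = strip_diacritics(text)
    # Delete ASCII punctuation characters.
    return text.translate(str.maketrans("", "", string.punctuation))


print(normalize("Dr. Müller lives on Café St."))
# doctor muller lives on cafe street
```

The order of the steps matters here: removing punctuation first would turn "Dr." into "dr", which would no longer match the abbreviation table.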