opentag.com

Translation memories (TMs), also known as translation databases, are collections of entries in which a source text is associated with its translation in one or more target languages.

Typically, TMs are used in translation tools: when the translator "opens" a segment, the application searches the database for equivalent source text. The result is a list of matches, usually ranked by a score expressing the percentage of similarity between the source text in the document and the source text in the TM.

  • An exact match (100% match) is a match where there is no difference (or only differences that the tool can handle automatically) between the source text in the document and the source text in the TM.

  • A fuzzy match (less than 100% match) is a match where the source text in the document is very similar to, but not exactly the same as, the source text in the TM. Duplicated exact matches are also often treated as fuzzy matches.
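
As a rough illustration of the lookup described above, the following Python sketch ranks TM entries by their similarity to a new segment. The score comes from `difflib.SequenceMatcher` in the standard library, used here only as a stand-in: real tools use their own matching algorithms. The names `TM`, `similarity`, `lookup` and `match_type`, the sample data, and the 70% threshold are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# A toy TM: source segments mapped to their translations (hypothetical data).
TM = {
    "Click the OK button.": "Cliquez sur le bouton OK.",
    "Click the Cancel button.": "Cliquez sur le bouton Annuler.",
    "Save the file before closing.": "Enregistrez le fichier avant de fermer.",
}

def similarity(a, b):
    """Return a character-based similarity score between 0 and 100."""
    return round(SequenceMatcher(None, a, b).ratio() * 100)

def lookup(segment, tm, threshold=70):
    """Return (score, source, target) tuples at or above threshold, best first."""
    matches = [(similarity(segment, src), src, tgt) for src, tgt in tm.items()]
    matches = [m for m in matches if m[0] >= threshold]
    return sorted(matches, key=lambda m: m[0], reverse=True)

def match_type(score):
    """Classify a score the way the two bullets above do."""
    return "exact" if score == 100 else "fuzzy"

if __name__ == "__main__":
    for score, src, tgt in lookup("Click the Ok button.", TM):
        print(f"{score:3d}% {match_type(score)}: {src} -> {tgt}")
```

The threshold simply drops candidates that are too dissimilar to be useful; the right value depends on the tool and the project.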

An increasingly common feature of TM tools is the use of machine translation (MT) to create fuzzy matches, which allows some level of integration of MT systems without changing the whole translation process.

There are many advantages to using TMs:

  • Translation can go much faster: the translator avoids re-typing existing translations, or has to change only parts of the text.

  • TMs also allow better quality control by offering translation candidates that have already been approved, with the correct terminology.

There are also some drawbacks:

  • If terminology changes between projects, the content of the TM needs to be updated to reflect these changes. In the same way, if the TM is not updated after editing and proofreading, or if revisions are not entered, the uncorrected translations will still be in the TM the next time you use it.

  • Automatically leveraging translations through exact matches (without validating them) can generate incorrect translations, since nothing verifies that the context in which the new segment is used is the same as the context in which the original one was used: this is the difference between true reuse and recycling. Most TM systems are recycling systems.
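
The difference between recycling and reuse can be sketched in a few lines of Python. In this hypothetical example each TM entry also stores the segment that preceded it in the original document, and an exact match is accepted without review only when that context matches too. Treating the previous segment as "the context" is an assumption made for brevity, as are the names `Entry` and `find_exact`.

```python
from typing import NamedTuple, Optional

class Entry(NamedTuple):
    source: str
    target: str
    prev_source: Optional[str]  # segment that preceded `source` originally

def find_exact(entries, segment, prev_segment):
    """Return (target, needs_review) for an exact text match, or None.

    needs_review is False only when the stored context matches as well.
    """
    fallback = None
    for entry in entries:
        if entry.source != segment:
            continue
        if entry.prev_source == prev_segment:
            return entry.target, False   # same text, same context: reuse
        if fallback is None:
            fallback = (entry.target, True)  # same text only: recycling
    return fallback

# Hypothetical data: the same source text translated differently by context.
ENTRIES = [
    Entry("Open", "Ouvrir", "File"),     # a menu command
    Entry("Open", "Ouvert", "Status:"),  # a state
]

if __name__ == "__main__":
    print(find_exact(ENTRIES, "Open", "Status:"))
    print(find_exact(ENTRIES, "Open", "Door"))
```

A match flagged with `needs_review` is still offered to the translator, but it should be validated rather than inserted automatically.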

Translation memory is a powerful technology that can help lower the cost of localization. However, the use of TM needs to be weighed carefully, with all factors taken into account.

To allow better interoperability between different tools, the Translation Memory eXchange format (TMX) has been created.
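
To give an idea of what such an exchange file looks like, the sketch below reads a small hand-written sample that follows the general shape of TMX 1.4: translation units (`tu`) containing one variant (`tuv`) per language, each with a segment (`seg`). The header attribute values are placeholders, the function name `read_tmx` is hypothetical, and inline markup inside segments is flattened to plain text; the TMX specification remains the authority on the format.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-written sample; header values are placeholders for illustration.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0"
          segtype="sentence" o-tmf="example" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Click the OK button.</seg></tuv>
      <tuv xml:lang="fr"><seg>Cliquez sur le bouton OK.</seg></tuv>
    </tu>
  </body>
</tmx>
"""

def read_tmx(text):
    """Return one {language: segment text} dict per translation unit."""
    root = ET.fromstring(text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() flattens any inline elements inside the segment.
                unit[tuv.get(XML_LANG)] = "".join(seg.itertext())
        units.append(unit)
    return units

if __name__ == "__main__":
    for unit in read_tmx(SAMPLE_TMX):
        print(unit)
```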