Doublettes—such a pretty term, yet such a bad concept
Posted by Barbara Inge Karsch on June 10, 2011
Sooner rather than later, terminologists need to think about database maintenance. Initially, with few entries in the database, data integrity is easy to ensure. In fact, the terminologist might remember just about every entry they ever compiled; my Italian colleague, Licia, remembered just about every entry she ever opened in the database. But even the best human brains will eventually ‘run out of memory’ and blunders will happen. One such blunder is the so-called doublette.
According to ISO TR 26162, a doublette is a “terminological entry that describes the same concept as another entry.” Sometimes these entries are also referred to as duplicates or duplicate entries, but the technical term in standards is doublette. It is important to note that homonyms are not necessarily doublettes. Two terms that are spelt the same way and that sit in two separate entries may refer to the same concept and may therefore be doublettes. But they may also justifiably be listed in separate entries because they denote slightly or completely different concepts.
As an example, I deliberately set up doublettes in i-Term, a terminology management system developed by DANTERM: The terms automated teller machine and electronic cash machine can be considered synonyms and should be listed in one terminological entry. Below you can see that automated teller machine and its abbreviated form, ATM, have one definition and definition source, while electronic cash machine and its abbreviated form, cash machine, are listed in a separate entry with a different, yet similar, definition and its own definition source. During database maintenance, these entries should be consolidated into one terminological entry with all its synonyms.
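The consolidation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entry` class, its fields, the entry IDs and the `consolidate` function are hypothetical and do not reflect how i-Term stores or merges entries, and the definition strings are placeholders rather than the real definitions.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    # Hypothetical record layout, not i-Term's data model.
    entry_id: int
    terms: list
    definitions: list = field(default_factory=list)  # (definition, source) pairs

def consolidate(keep, duplicate):
    """Merge a doublette into the entry that is kept."""
    merged_terms = list(keep.terms)
    for term in duplicate.terms:
        # Add each synonym once, preserving the original order.
        if term not in merged_terms:
            merged_terms.append(term)
    # Keep both definitions so a terminologist can pick or rewrite one.
    return Entry(keep.entry_id, merged_terms, keep.definitions + duplicate.definitions)

# Placeholder IDs, definitions and sources, for illustration only.
atm = Entry(1, ["automated teller machine", "ATM"],
            [("placeholder definition A", "placeholder source A")])
cash = Entry(2, ["electronic cash machine", "cash machine"],
             [("placeholder definition B", "placeholder source B")])

merged = consolidate(atm, cash)
print(merged.terms)
```

The sketch deliberately keeps both definitions instead of choosing one automatically: deciding which definition best describes the concept remains the terminologist's job.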
It is much easier to detect homographs that turn out to be doublettes. Rather, it should be easier to avoid them in the first place. After all, every new entry in a database starts with a search for the term denoting the concept; if the term already exists with the same spelling, the search returns a hit. Here are ‘homograph doublettes’ from the Microsoft Language Portal. While we can’t see the ID, the definition shows pretty clearly that the two entries describe the same concept.
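A search for identically spelt terms across entries is easy to automate. The following Python sketch shows one way it might be done; the tuple layout, the entry IDs, the sample terms and the function names are all assumptions made for illustration, not the schema or API of any particular terminology management system.

```python
from collections import defaultdict

# Hypothetical export: (entry_id, language, term).
entries = [
    (101, "en", "automated teller machine"),
    (102, "en", "ATM"),
    (205, "en", "cash machine"),
    (317, "en", "Cash  machine"),
    (318, "en", "server"),
]

def normalize(term):
    # Fold case and collapse whitespace so trivial variants match.
    return " ".join(term.casefold().split())

def homograph_candidates(entries):
    # Map (language, normalized term) to the IDs of the entries containing it.
    index = defaultdict(set)
    for entry_id, lang, term in entries:
        index[(lang, normalize(term))].add(entry_id)
    # Keep only terms that occur in more than one entry.
    return {key: sorted(ids) for key, ids in index.items() if len(ids) > 1}

candidates = homograph_candidates(entries)
print(candidates)
```

Every hit is only a candidate. Because homographs may denote different concepts, a terminologist still has to compare the definitions before deciding whether two entries are doublettes.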
Doublettes happen, particularly in settings where more than one terminologist adds and approves entries in a database. But even if one terminologist approves all new concepts, s/he cannot guarantee that a database remains free of doublettes. The right combination of skills, processes and tool support can help limit the number, though.