BIK Terminology

Solving the terminology puzzle, one posting at a time

  • About
    • Curriculum Vitae
  • Services
  • Portfolio
  • Resources
  • Blog
  • Contact

Why doublettes are bad

June 15, 2011 by Barbara Inge Karsch

One of the main reasons of having a concept-oriented terminology database is that we can set up one definition to represent the concept and can then attach all its designations, including all equivalents in the target language. It helps save cost, drive standardization and increase usability. Doublettes offset these benefits.

The below diagrams are simplifications, of course, but they explain visually why concept orientation is necessary when you are dealing with more than one language in a database. To explain it briefly: once the concept is established through a definition and other concept-related metadata, source and target designators can be researched and documented. Sometimes this research will result in multiple target equivalents when there was only one source designator; sometimes it is just the opposite, where, say, the source languages uses a long and a short form, but the target language only has a long form.

If you had doublettes in your database it not only means that the concept research happened twice and, to a certain level, unsuccessfully. But it also means that designators have to be researched twice and their respective metadata has to be documented twice. The more languages there are, the more expensive that becomes. Rather than having, say, a German terminologist research the concept denoted by automated teller machine, ATM and electronic cash machine, cash machine, etc. two or more times, research takes place once and the German equivalent Bankautomat is attached as equivalent potentially as equivalent for all English synonyms.

Doublettes also make it more difficult to work towards standardized terminology. When you set up a terminological entry including the metadata to guide the consumer of the terminological data in usage, standardization happens even if there are multiple synonyms. Because they are all in one record, the user has, e.g. usage, product, or version information to choose the applicable term for their context. But it is also harder to use, because the reader has to compare two entries to find the guidance.

And lastly, if that information is in two records, it might be harder to discover. Depending on the search functionality, the designator and the language of the designator, the doublettes might display in one search. But chances are that only one is found and taken for the only record on the concept. With increasing data volumes more doublettes will happen, but retrievability is a critical part of usability. And without usability, standardization is even less likely and even more money was wasted.

SHARE THIS:

Doublettes—such a pretty term, yet such a bad concept

June 10, 2011 by Barbara Inge Karsch

Sooner rather than later terminologists need to think about database maintenance. Initially, with few entries in the database, data integrity is easy to warrant: In fact, the terminologist might remember about any entry they ever compiled; my Italian colleague, Licia, remembered just about any entry she ever opened in the database. But even the best human brains will eventually ‘run out of memory’ and blunders will happen. One of these blunders are so called doublettes.

According to ISO TR 26162, a doublette is a “terminological entry that describes the same concept as another entry.” Sometimes these entries are also referred to as duplicates or duplicate entries, but the technical term in standards is doublette. It is important to note that homonyms do not equal doublettes. In other words, two terms that are spelt the same way and that are in two separate entries may refer to the same concept and may therefore be doublettes. But they may also justifiably be listed in separate entries, because they denote slightly or completely different concepts.

As an example, I deliberately set up doublettes in i-Term, a terminology management system developed by DANTERM: The terms automated teller machine and electronic cash machine can be considered synonyms and should be listed in one terminological entry. Below you can see that automated teller machine and its abbreviated form ATM have one definition and definition source, while electronic cash machine and its abbreviated form, cash machine, are listed in a separate entry with another, yet similar definition and its definition source. During database maintenance, these entries should be consolidated into one terminological entry with all its synonyms.

It is much easier to detect homographs that turn out to be doublettes. Rather, it should be easier to avoid them in the first place: after all, every new entry in a database starts with a search of the term denoting the concept; if it already exists with the same spelling, it would be a hit). Here are ‘homograph doublettes’ from the Microsoft Language Portal. While we can’t see the ID, the definition shows pretty clearly that the two entries are describing the same concept.

Doublettes happen, particularly in settings where more than one terminologist adds and approves entries in a database. But even if one terminologist approves all new concepts, s/he cannot guarantee that a database remains free of doublettes. The right combination of skills, processes and tool support can help limit the number, though.

SHARE THIS:

Twitterisms

March 22, 2011 by Barbara Inge Karsch

What do you call a user of the Twitter short messaging service who is liked and admired by other users?

A tweetheart! And how do you use the term? Here is an example of how Belgian tennis player, Kim Clijsters, used it in a tweet from the Yahoo-Eurosport site: “Happy Australia day to all my Aussie tweethearts!” It earned her the Tweet of the Day.

I am not a Twitter user, or tweeter, but the terminology of Twitter has been the subject of many conversations. While this social media has been emerging at an incredible pace, some of the terminology around it is quite well developed. The glossary provided by the Twitter service contains the basics. But it doesn’t list all the good (and bad) terms that have sprung up around the service.

Some of the terms that don’t work so well are impossible to pronounce. The list in this article on About.com contains designations, like Twitpocalypse, which is defined as “the moment when the identification number of individual tweets surpassed the capacity of the most common data type. The Twitpocalypse crashed a number of Twitter clients.” The motivation behind the name is clear, though.

This article* in the quarterly webzine of the Macmillan English Dictionaries, MED Magazine, has a very nice list of twitterisms. I would consider most of them quite well-motivated. If you don’t want to check out the link, here is another example: What group do people belong to whose tweets attract a large number of readers? The twitterati.

*BIK: Unfortunately this article was removed recently.

SHARE THIS:
« Previous Page
Next Page »

Blog Categories

  • Advanced terminology topics
  • Branding
  • Content publisher
  • Events
  • Interesting terms
  • Job posting
  • Process
    • Coining terms
    • Designing a terminology database
    • Maintaining a database
    • Researching terms
    • Selecting terms
    • Setting up entries
    • Standardizing entries
  • Return on investment
  • Skills and qualities
    • Negotiation skills
    • Producing quality
    • Producing quantity
  • Subject matter expert
  • Terminologist
  • Terminology 101
    • Terminology methods
    • Terminology of terminology
    • Terminology principles
  • TermNet
  • Theory
  • Tool
    • iTerm
    • Machine translation
    • Proprietary terminology management systems
      • J.D. Edwards TDB
      • Microsoft Terminology Studio
    • Term extraction tool
      • memoQ
    • Terminology portals
      • BACUS
      • EuroTermBank
      • Irish National Terminology Database
      • Microsoft Language Portal
      • Rikstermbanken
  • Translator
  • Usability

Blog Archives

  • November 2012
  • October 2012
  • September 2012
  • November 2011
  • October 2011
  • September 2011
  • August 2011
  • July 2011
  • June 2011
  • April 2011
  • March 2011
  • February 2011
  • January 2011
  • December 2010
  • November 2010
  • October 2010
  • September 2010
  • August 2010
  • July 2010
  • June 2010
  • May 2010

BIK Terminology

  • About Barbara Inge Karsch
  • Terminology Services
  • Terminology Resources
  • My Terminology Portfolio
  • Let’s Talk Terminology

From the Blog

  • A glossary for MT–terrific! MT on a glossary—horrific!
  • Part-time position for an Arabic terminologist
  • Tidbit from the ATA Conference
  • Bilingual corpora and target terminology research
  • Terminology internship at Eurocopter in France

Find It Here

Follow Me

  • Email
  • LinkedIn
  • Phone
Copyright © 2023 BIK Terminology. All Rights Reserved. Sitemap. Website by sundaradesign.