BIK Terminology

Solving the terminology puzzle, one posting at a time

Bilingual corpora and target terminology research

October 24, 2012 by Barbara Inge Karsch

Here is another question that came up during one of the webinars recently: Do you think it is worth using a concordance tool to help sift the technical terms in order to build up the terminology?

Another really good question! Let me address this by answering when it would NOT be worth using available corpora. I would not use them if I didn’t expect work from the same client or in the same subject area again. I would also not use them if I hated the user experience of the concordance tool. If I have to struggle with a tool, I am probably faster and get a more reliable result by simply researching the concepts from scratch. BUT if the subject matter is clear-cut, you expect more work in that area, and the tool provides a nice interface that allows you to work efficiently, by all means use the existing bilingual corpus as one of your research tools.

And here comes my second qualification: Double-check the target-language equivalents found in bilingual corpora. Your additional research will a) confirm that the target term used in the corpus is correct and b) give you the metadata that you might want to document anyway. I am thinking mostly of context samples. While you could easily use the translated context from your corpus, context written by a native-speaker expert in the target language is more reliable: it shows that the term is correct, that it is actually in use, and how it is used correctly. The beauty of working with a corpus is that you already have candidate terms that you can check. Be prepared to discard them, though, if your research does not confirm them. Ultimately, you want your term base entries to be highly reliable: Do the work once, reuse it many times!

Dear readers!

September 28, 2012 by Barbara Inge Karsch

Thank you very much for your positive feedback while I was busy with things like Windows 8 terminology, teaching at NYU, attending TKE and the ISO meetings in Madrid, and doing webinars. During one of the webinars, we didn’t get around to all questions. I will be addressing some of these here now.



Question:
As you add terminology into your database, you might not remember that you have already entered some word that is a synonym. So, might you not end up with a different ID for 2 synonyms?

Answer: Yes, that is a very common scenario that everyone setting up terminology entries faces: We do our best to enter terms and names in canonical form in order to find them again and to avoid creating duplicates. So, we document, say, operating system and not Operating Systems, or we enter purge, and not to purge or purged in the database. Even if we were good about the form of our terms, we might not remember the meaning of all entries created and thus inadvertently create doublettes in our database. Oftentimes, we create them because we are not aware that one entry is a view onto a concept from one angle and a second entry might present the same concept from another angle, similar to these two pictures of the same flower.

Here are a few thoughts on what might help you avoid duplicate entries:

  • Start out by specifying the subject field in your database. It will help you narrow down the concept for which you are about to create an entry. You might do a search on the subject field and see what concepts you defined at an earlier time. Sometimes that helps trigger your memory.
  • As you narrow down the subject field and glance through some of the existing definitions, you might recognize an existing concept as the one you are about to work on.

If you set up a doublette anyway—and it is bound to happen—you might find it later in one of the following ways and eradicate it:

  • Export your database into a spreadsheet program and do a quick QA on your entries. In a spreadsheet, such as Excel, you can sort each column. If there are true doublettes, you might have started their definitions with the same superordinate, so sorting on the definition column lines them up next to each other.
  • If you don’t have time for QA, I would simply wait until you notice a doublette while you are using your database and take care of it then. The damage in databases with lots of languages attached to a source language entry is bigger, but there are usually also more people working in the system, so errors are identified quickly. For the freelance translator, a doublette here and there is not as costly and it is also eliminated quickly once identified.
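
If you prefer a script to a spreadsheet, the sorting idea from the first bullet can be sketched in a few lines of Python. This is only a rough illustration, not a feature of any particular terminology management system: the column names (entry_id, term, subject_field, definition) and the sample rows are made up, and the first word of the definition is used as a crude stand-in for the superordinate.

```python
import csv
import io
from itertools import groupby

# Hypothetical export: the column names and rows are assumptions for this
# sketch, not the export format of a real terminology management system.
SAMPLE_EXPORT = """entry_id,term,subject_field,definition
101,operating system,computing,Software that manages hardware resources and runs programs.
102,purge,data management,Operation that permanently removes data from a system.
103,OS,computing,Software that controls the hardware and applications of a computer.
104,cloud computing,computing,Type of computing in which resources are provided over the Internet.
"""

def superordinate(row):
    # Assumption: the definition starts with the superordinate concept,
    # so its first word is a rough approximation of it.
    words = row["definition"].split()
    return words[0].lower() if words else ""

def group_key(row):
    # Entries are compared within a subject field, by superordinate.
    return (row["subject_field"].lower(), superordinate(row))

def candidate_doublettes(rows):
    """Return lists of entry IDs that share subject field and superordinate."""
    # Sort so that entries with the same key end up next to each other,
    # just like sorting the columns in a spreadsheet.
    ordered = sorted(rows, key=lambda r: (*group_key(r), r["definition"].lower()))
    groups = []
    for _, group in groupby(ordered, key=group_key):
        members = list(group)
        if len(members) > 1:
            groups.append([m["entry_id"] for m in members])
    return groups

rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
for ids in candidate_doublettes(rows):
    print("Review together:", ", ".join(ids))
```

The script only lines up candidates. Whether two entries really describe the same concept is still a decision for the person reviewing them.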

Developers of terminology management systems might eventually get to a point where maintenance functionality becomes part of the out-of-the-box program. At Microsoft, a colleague worked on an algorithm that helped us identify duplicates. The project was not completed when I left the corporate world, but a first test showed that the noise the program identified was not overwhelming. So, there is hope that with increasing demand for clean terminological and conceptual data such functionality becomes standard in off-the-shelf TMSs. In the meantime, stick with best practices when documenting your terms and names and use the database.

Is “cloud” a technical term (yet)?

October 11, 2011 by Barbara Inge Karsch

We have jargon, we have words, we have phrases…we have terms. Can words become terms? How would that happen? And has “the cloud” arrived as a technical concept yet?

Cloud, as a word, is part of our everyday vocabulary. With the summer over, it’ll again be part of our daily lives in the Pacific Northwest for the next eight months. The Merriam-Webster Learner’s Dictionary offers a good definition. The Learner’s Dictionary is not concerned with technical language, as it is compiled for non-native speakers. So, the definition doesn’t allude to the fact that clouds, in a related sense, are also part of the field of meteorology and therefore part of a language for special purposes (LSP).

When common everyday words are used in technical communication and with specialized meaning, they have become terms through a process called terminologization. Is cloud, as in cloud computing, there yet? Or is it still in this murky area where marketing babel meets technical communication? It certainly was initially.

Here is a great blog post on when cloud was used for the first time. Author John M. Willis asked his Twitter followers Who Coined The Phrase Cloud Computing? and could then trace the first occurrences back to May of 1997 and a patent application for “cloud computing” by NetCentric; then to a 1999 NYT article that referred to a Microsoft “cloud of computers”; and finally to a speech by Google’s Eric Schmidt, which Willis says he would pick as the moment when the cloud metaphor became mainstream.

That was 2006, and “the cloud” may have become part of the tech world’s hype, but it wasn’t a technical term with a solid and clearly delineated definition. As Willis points out, “cloud computing was a collection of related concepts that people recognized, but didn’t really have a good descriptor for, a definition in search of a term, you could say.”

Yes, we had the designator, but did we really have a clear definition? In my mind, everyone defined it differently. For a while, the idea of “the cloud” was batted around mostly by marketing and advertising folks whose job it is to use hip language and create positive connotations. When “the cloud” and other marketing jargon sound like dreams coming true to receptive audiences, they usually spell nightmare to terminologists. The path of a “cloud dream” into technical language is a difficult one. In 2008, I was part of a terminology taskforce within the Windows Server team that tried to nail down what cloud computing was. I believe the final definition wasn’t set when I left in May 2010.

An Azure architect evangelist (See You say Aaaazure, I say Azuuuure…) and I recently analyzed the conceptual area. Although he kept saying that some of the many companies in cloud computing these days “would also include x, y, or z,” x, y and z all turned out to not be “essential characteristics.” And we ended up with the following definition. It is based largely on the one published by Netlingo, but modified to meet more of the criteria of a terminological definition:

“A type of computing in which dynamic, scalable and virtual resources are provided over the Internet and which includes services that provide common business applications online and accessible from a Web browser, while the software and data are stored on servers.”

Wouldn’t it be great if a terminologist could stand by to assist any time a new concept is being created somewhere? Then, we’d have nice definitions and well-formed terms and appellations right away. Since that is utopia, at least it helps to be aware that language is in flux, that marketing language might be deliberately nebulous, and that it might take time before a majority of experts have agreed on what something is and how it is different from other things around it. I think “the cloud” and “cloud computing” have been terminologized and have arrived in technical language.

Copyright © 2023 BIK Terminology. All Rights Reserved.