BIK Terminology—

Solving the terminology puzzle, one posting at a time

  • Author

    Barbara Inge Karsch - Terminology Consulting and Training

  • Images

    Cathedral of Our Lady, Antwerp, by Barbara Inge Karsch

Archive for the ‘Terminology 101’ Category

Guest Blog for Bibliotekarska terminologija

Posted by Barbara Inge Karsch on August 30, 2010

I will soon get back to my regular schedule. In the meantime, here is another little article that I wrote for Ivan Kanič—a librarian and blogger, who writes about terminology issues for librarians in his native Slovenia. It’s been a pleasure working with Ivan on this little project. Check it out!

Posted in Researching terms, Terminologist, Terminology 101

Guest Blog for SDL

Posted by Barbara Inge Karsch on August 23, 2010

Tom Smith from SDL and I did an interview-style blog on the SDL blog site. If you are interested in terminology management and branding, among other things, check it out.

Posted in Branding, Terminology 101

What I like about ISO 704

Posted by Barbara Inge Karsch on August 5, 2010

The body of ISO 704 “Terminology work—Principles and methods” lists a bunch of important information for terminology work. But what stuck in my mind is actually the annexes, most of all Annex B.

In the current version, ISO 704:2009, Annex B is devoted to term-formation methods. In other words, it gives us the most important methods that we have available when creating new terms or appellations in English. It also notes what might be obvious to us, i.e. that these methods differ from language to language. For German, for example, we now have the new Terminologiearbeit – Best Practices, which the German terminology association, DTT, published recently and which is more systematic about this topic than a standard might be.

Comprehensiveness need not be the goal of 704; awareness of these methods is more important. If half the content publishers, PMs, branding or marketing folks that I worked with in the IT world had read those five short pages, it would have done a world of good. Instead, I have heard colleagues mocking terminologists who, when coining new terms, pull out the Duden (the main German-German dictionary, similar to The Webster’s or Le Petit Robert) to apply one of these methods. She who laughs last, laughs best, though: New terms and appellations that are well-motivated—either rooted in existing language or deliberately different—last. Quickly invented garbage causes misunderstandings and costs money.

[Image: Cairn from a recent hike in the Cascades, by Barbara Inge Karsch]

Annex B doesn’t claim to be comprehensive, but it lists the most important methods that can be used in term formation. Here are the three main methods and some examples:

  • The first one is the creation of completely new lexical entities (terms or appellations), also known as neoterms. One way of creating a neoterm is through compounding, where a new designation is formed of two or more elements, for example cloud computing. 
  • Two methods that fall into the category of using existing forms are terminologization (see also How do I identify a term—terminologization) and transdisciplinary borrowing. An example of terminologization is cloud, where the everyday word cloud took on a very specific meaning in the context of computing, while the name of the computer virus Trojan horse was obviously borrowed from Greek mythology.
  • Translingual borrowing results in new terms and appellations that originate in another language. English climbing language, for example, is full of direct loans from a variety of other languages; just think of bergschrund or cairn.

The above are just a few examples to give you an impression of what could be learned by reading Annex B. Incidentally, these are methods. They need to be applied correctly, not randomly. I can already hear it, “but I used transdisciplinary borrowing to come up with this [junk]”. No. Even if your orthopedist uses minimally-invasive arthroscopic surgery to fix your knee, you want him to be sure that you actually need surgery, right? If you need to coin English terms or appellations on a regular basis, Annex B of ISO 704 is worth your while. I also like Annex C. More about that some other time.

Posted in Coining terms, Content publisher, Interesting terms, Terminologist, Terminology methods

Who cares about ISO 704?

Posted by Barbara Inge Karsch on July 29, 2010

The next standard to talk about is ISO 704 “Terminology work—Principles and methods.” It is an interesting one for a variety of reasons. For one, I have more questions than answers.

At the TKE (Terminology and Knowledge Engineering) Conference in Dublin, my esteemed colleagues, Hanne Erdman Thomsen, Sue Ellen Wright, Gerhard Budin and Loïc Depecker will devote a workshop to ‘Accommodating User Needs for ISO 704: Towards a New Revision of the Core International Standard on Terminology Work’. I will have a short time slot to provide input myself and therefore have been re-reviewing ISO 704 over the last few days.

As I am putting my thoughts together, I am wondering: Who knows or uses ISO 704? I would like to invite you to do two things: Click on the little survey below in this posting. And, if you haven’t done so, please tell me about yourself by participating in the survey on the Survey tab. Both surveys are anonymous and might help me understand what this standard could do. If you know the standard and have something to share about it, please leave a comment below. I would be very grateful to get your input.

Because, quite frankly, I am puzzling over this standard. I have read it three times over the past year and every time after a few weeks go by, I have to think about what this standard is actually for. I believe it stems from the fact that it is a bit wordy at the moment. It contains a lot of good information, but the presentation is ineffective.

[Image: Designations]

But now, what can it do for the reader? As its title says, it lays out the various principles underlying terminology management. For example, it tells us what objects, concepts, concept relations and concept systems are. It then goes into definitions and definition writing, before the subject of designations is discussed. Remember this little graphic from What is a Term? As an aside, we often talk about terms when we actually mean designations; in German, we even find the ugly Anglicism Term and its plural Terme.

So, ISO 704 really does do what the title says: It presents us with principles and methods. It just doesn’t seem to stick with me. Yet.

Posted in Events, Terminologist, Terminology methods, Terminology principles

ISO 12620—Why bother

Posted by Barbara Inge Karsch on July 22, 2010

Standards are nice, but they don’t do anything for you or, more importantly, the user of your terminology database, if you are the only one applying them. But how do you get a large virtual team of terminologists or language specialists to agree on and apply standards, such as ISO 12620, to database entries? And first: Why bother climbing such a mountain?

[Image: Machapuchare, by Birgit Karsch]

Imagine you have a large document to author or translate. Your client gave you a dictionary to use. Because you are not sure of the meaning or usage of 50 terms, you look them up. But the dictionary holds you up more than anything: One entry contains a definition, the next one doesn’t; one provides context, but it is in a language you don’t understand; most terms make sense, but several of them are cryptic and the entry doesn’t provide clarity. If your client hadn’t insisted that you use the dictionary, you wouldn’t: It just slows you down.

The objective of a terminology database is to have consistent and correct terminology used in the product, in source as well as in target languages. To support that goal, users must be able to use a database entry quickly and easily—structure really helps here. Furthermore, users must be able to trust the information provided—transparent, clear and consistent entries create trust.

Ideally, you have a centralized team of trained terminologists who know the standards inside out and apply them religiously. If you don’t, select/create a tool that supports standards adherence as much as possible. Some simple examples: If definition is mandatory, automatically enforce it; if the term is a verb, hide the Number field; if the language is English, hide the Gender field. Tools can do a lot, but your team very likely still needs a standard.
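The three simple rules above could be enforced by a tool roughly as follows. This is only a minimal sketch: the TermEntry class, its field names and the two helper functions are hypothetical and are not taken from any actual terminology management system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TermEntry:
    """Hypothetical, heavily simplified terminology entry."""
    term: str
    language: str                 # language code, e.g. "en" or "de"
    part_of_speech: str           # e.g. "noun" or "verb"
    definition: Optional[str] = None


def visible_fields(entry: TermEntry) -> set[str]:
    """Return the grammar fields the entry form should display."""
    fields = {"number", "gender"}
    if entry.part_of_speech == "verb":
        fields.discard("number")  # hide the Number field for verbs
    if entry.language == "en":
        fields.discard("gender")  # hide the Gender field for English
    return fields


def validate(entry: TermEntry) -> list[str]:
    """Return a list of problems; an empty list means the entry may be saved."""
    problems = []
    if not (entry.definition and entry.definition.strip()):
        problems.append("definition is mandatory")
    return problems
```

A German noun entry would show both the Number and the Gender field, an English verb entry neither, and any entry without a definition would be rejected until one is supplied.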

The Microsoft terminology team did. Simply handing a standards document off to the team had not been successful in the past: Nobody could remember it, many entries therefore contained unstructured, if not incorrect, information, and there was no incentive to adhere to standards. A more collaborative effort was called for: Together, in-house terminologists went through data categories one by one. Because we were a virtual team, e-mail was the best form of communication. Each data category was dealt with in one e-mail that contained the definition, a scenario and voting buttons that allowed the team to agree with the meaning or disagree and make a better suggestion. Team members could participate in the voting, but they didn’t have to. However, everyone knew from the beginning that they had to accept the outcome, regardless of whether they participated or not. After the new guide had been published, measurements were carried out and documented in a quarterly report. Terminologists then set their own deadlines for cleaning up entries to comply with the standards.

[Image: Annapurna South, by Birgit Karsch]

ISO 12620 doesn’t just enable data exchange, as we saw in last week’s entry. At J.D. Edwards and Microsoft, it also helped create standards guides. I am sure not every field is filled in correctly; perfection is not the point. But with shrinking budgets and tighter deadlines, a database that could cost millions of dollars must support users as well as possible in their endeavor to create reliable communication. A standards guide based on an international standard is a good tool you can use to climb that mountain.

Posted in Content publisher, Microsoft Terminology Studio, Standardizing entries, Terminologist, Terminology 101

How do I identify a term—standardization

Posted by Barbara Inge Karsch on July 1, 2010

And the final criterion in this blog series on how to identify terms is, in my mind, one of the most important ones—standardization. Standardized usage and spelling makes the life of the product user much easier, and it is fairly clear which key concepts need to be documented in a terminology database for that reason. But are they the same for target terms? And if not, how would we know what must be standardized for, say, Japanese? We don’t—that’s when we rely on process and tools.

Example 1. Before we got to standardizing terminology at J.D. Edwards (JDE), purchase orders could be pushed, committed or sent. And it all meant the same thing. That had several obvious consequences:

  • Loss of productivity by customers: They had to research documentation to find out what would happen if they clicked Push on one form, Send on another or Commit on the third.
  • Loss of productivity by translators: They walked across the hall, which fortunately was possible, to enquire about the difference.
  • Inconsistency in target languages: If some translators did not think that these three terms could stand for the same thing (why would they?), they replicated the inconsistency in their language.
  • Translation memory: Push purchase order, Commit purchase order and Send purchase order each needed to be translated into 21 languages, three times the work, before the translation memory kicked in.

All this results in direct and indirect cost.
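The translation memory effect can be put into numbers with a small sketch. The choice of send as the standardized verb is an arbitrary assumption made for illustration (the posting does not say which verb JDE settled on), and both function names are hypothetical.

```python
# Hypothetical mapping from the three competing verbs to one standardized verb.
# Which verb is preferred is an assumption made for this illustration.
PREFERRED = {"push": "send", "commit": "send", "send": "send"}


def standardize(label: str) -> str:
    """Replace the leading verb of a UI label with its standardized form."""
    verb, _, rest = label.partition(" ")
    key = verb.lower()
    if key not in PREFERRED:
        return label                      # unknown verb: leave the label alone
    preferred = PREFERRED[key]
    if verb[0].isupper():
        preferred = preferred.capitalize()
    return f"{preferred} {rest}".strip()


def translation_units(labels: list[str], languages: int) -> int:
    """Count distinct source strings times target languages."""
    return len(set(labels)) * languages


labels = ["Push purchase order", "Commit purchase order", "Send purchase order"]
before = translation_units(labels, 21)                            # 3 strings x 21 languages
after = translation_units([standardize(l) for l in labels], 21)   # 1 string x 21 languages
```

With three competing strings, 63 translations are needed before the translation memory can reuse anything; with one standardized string, 21 suffice.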

Example 2. The VP of content publishing and translation at JDE used the following example to point out that terms and concepts should not be used at will: reporting code, system code, application, product, module, and product code. While everyone in Accounting had some sort of meaning in their head, the concepts behind them were initially not clearly defined. For example, does a product consist of modules? Or does an application consist of systems? Is a reporting code part of a module or a subunit of a product code? And when a customer buys an application is it the same as a product? So, what happens if Accounting isn’t clear what exactly the customer is buying…

Example 3. Standardization to achieve consistency in the source language is self-evident. But what about the target side? Of course, we would want a team of ten localizers working on different parts of the same product to use the same terminology. One of the most difficult languages to standardize is Japanese. My former colleague and Japanese terminologist at JDE, Demi, explained it as follows:

For Japanese, “[…] we have three writing systems:

  • Chinese characters […]
  • Hiragana […]
  • Katakana […].

We often mix Roman alphabet in our writing system too. […]how to mix the three characters, Chinese, Katakana, Hiragana, plus Roman alphabet, is up to each [person’s] discretion! For translation, it causes a problem of course. We need to come up with a certain agreements and rules.”

The standards and rules that Demi referred to should be reflected in standardized entries in a terminology database and available at the localizers’ fingertips. Now, the tricky part is that, for Japanese, terms representing different concepts than those selected during upfront term selection may need to be standardized. In this case, it is vital that the terminology management system allow requests for entries from downstream contributors, such as the Japanese terminologist or the Japanese localizers. The requests may not make sense to a source terminologist at first glance, so a justification comment speeds up processing of the request.
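To make the script-mixing problem tangible, here is a crude sketch that classifies each character of a Japanese term by writing system and flags spellings whose mix of scripts differs. It relies only on Python's standard unicodedata module; the function names and the sample spellings are illustrative assumptions, and a real check would need far more linguistic knowledge (it cannot, for instance, tell apart two variants written entirely in Katakana).

```python
import unicodedata


def script_of(char: str) -> str:
    """Classify one character by the prefix of its Unicode character name."""
    name = unicodedata.name(char, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "kanji"                    # Chinese characters
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("KATAKANA"):
        return "katakana"                 # includes the prolonged sound mark
    if name.startswith("LATIN"):
        return "latin"                    # Roman alphabet
    return "other"


def script_profile(term: str) -> tuple[str, ...]:
    """Return the sequence of scripts in a term, collapsing repeated runs."""
    profile: list[str] = []
    for char in term:
        script = script_of(char)
        if not profile or profile[-1] != script:
            profile.append(script)
    return tuple(profile)


def inconsistent_spellings(spellings: list[str]) -> bool:
    """True if the spellings of one concept mix the scripts differently."""
    return len({script_profile(s) for s in spellings}) > 1
```

Two spellings of the same word, one with and one without Hiragana between the Chinese characters, would produce different profiles and could be routed to the Japanese terminologist for a decision.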

To sum up this series on how to identify terms for inclusion in a terminology database: We discussed nine criteria: terminologization, specialization, confusability, frequency, distribution, novelty, visibility, system and standardization. Each one of them weighs differently for each term candidate and most of the time several criteria apply. A terminologist, content publisher or translator has to weigh these criteria and make a decision quickly. No two people will come up with the same list upfront. But tools and processes should support downstream requests.

Posted in Content publisher, Selecting terms, Terminologist, Terminology 101, Translator

How do I identify a term—system

Posted by Barbara Inge Karsch on June 30, 2010

Here is one that is forgotten often in fast-paced, high-production environments: system. This at first glance cryptic criterion refers to terms that may not be part of our text or our list of term candidates, but that are part of the conceptual system that makes up the subject matter we are working in. And sometimes, if not to say almost always, it pays off to be systematic.

A very quick excursion into the theory of terminology management: We distinguish between ad-hoc and systematic terminology work.

  • When we work ad-hoc, we don’t care about the surrounding concepts or terms; we focus on solving the terminological problem at hand; for example: I need to know what forecasting is and what it is called in Finnish.
  • When we take a systematic approach, we go deeper into understanding a particular subject. We may start out researching one term (e.g. forecasting) and understand the concept behind it, but then we continue to study its parent, sibling and child concepts; we work in a subject area and examine and document the relationships of the concepts.

In the following example, the terminologist decided to not only set up an entry for forecasting, but to also list different types of forecasting—child or subordinate concepts—and the parent or superordinate concept. The J.D. Edwards terminology tool, TDB, had an add-on that turned the data into visuals, such as the one below. It goes without saying that displays of this nature help, for instance, the Finnish terminologist to find equivalents more easily when s/he knows that besides qualitative forecasting there is also quantitative forecasting, etc.

JDE types of forecasting
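A concept system like this one can be modeled as a simple tree. The following is a minimal sketch, not the data model of TDB; the Concept class, the helper functions and the placeholder for the parent concept (which the posting does not name) are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical node in a concept system: a term plus its child concepts."""
    term: str
    children: list["Concept"] = field(default_factory=list)

    def add(self, child: "Concept") -> "Concept":
        self.children.append(child)
        return child


def outline(concept: Concept, depth: int = 0) -> list[str]:
    """Render the concept system as indented lines, parents before children."""
    lines = ["  " * depth + concept.term]
    for child in concept.children:
        lines.extend(outline(child, depth + 1))
    return lines


def siblings(root: Concept, term: str) -> list[str]:
    """Return the terms that share a parent concept with the given term."""
    names = [child.term for child in root.children]
    if term in names:
        return [name for name in names if name != term]
    for child in root.children:
        found = siblings(child, term)
        if found:
            return found
    return []


def example_system() -> Concept:
    """Build the forecasting example; the parent name is a placeholder."""
    root = Concept("<parent concept>")
    forecasting = root.add(Concept("forecasting"))
    forecasting.add(Concept("qualitative forecasting"))
    forecasting.add(Concept("quantitative forecasting"))
    return root
```

Asking for the siblings of qualitative forecasting returns quantitative forecasting, which is exactly the hint a target-language terminologist needs when looking for equivalents.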

In his Manuel pratique de terminologie, Dubuc suggests that ad-hoc terminology work is a good way to get started with terminology management. Furthermore, he is right in that documenting concepts and their systems takes time and money, both of which are in short supply in many business environments. On the other hand, a more systematic approach will, in my experience, lead to entries that stand the test of time longer, create fewer downstream problems or questions, and need less maintenance. So, investing more time in the initial research and setting up the surrounding concepts while you have the information at hand anyway may very well pay off later. Seasoned terminologists know when to include terms to flesh out a system and when to simply answer an ad-hoc question.

Posted in Advanced terminology topics, Content publisher, J.D. Edwards TDB, Selecting terms, Terminologist, Terminology 101

How do I identify a term—visibility

Posted by Barbara Inge Karsch on June 29, 2010

Yesterday’s example was the term ribbon. While the concept was an innovation at the time and is quite prevalent in software today, the term itself is not necessarily highly visible. Today’s focus will be on the term-selection criterion “visibility”, in other words, on terms that are conspicuous and prevalent.

Look at the following screen prints from products within the Microsoft© Office 2010 suite:

[Image: Microsoft Office ribbon tabs]

Did you find some highly visible terms there? All of them stand for ribbon tabs that are highly standardized to maximize user retention: One term representing the same concept in each of the different products makes it much easier for the user to remember where to find what. Do you think that this was a coordinated effort? I don’t know for sure, as my involvement with Office was limited to Office 2007, but it looks like it. That, too, is terminology management.

Highly visible terms must be correct in both the source and all target languages. Inconsistencies, spelling errors or variations are not only embarrassing; they also lead to less trust by users, especially in markets with high quality expectations. Terminology management working methods can spare you the embarrassment and lead to a trusting relationship with the users.

Posted in Content publisher, Selecting terms, Terminologist, Terminology 101

How do I identify a term—novelty

Posted by Barbara Inge Karsch on June 28, 2010

In the posting for frequency and distribution, the focus was on automated term extraction output. Today’s criterion for term selection will pertain more often to manual term extraction. For consistency’s sake, we call it novelty to go along with all the other nouns (terminologization, specialization, confusability, frequency and distribution). But it simply refers to terms that are new and should be added to a terminology database for that reason.

In the manual term extraction process a writer or editor documents terms while authoring material. They can do this either in a separate list or directly in the terminology database, depending on their working style, the need for immediate availability of the terms, their rights in the terminology tool, etc. Many of the terms documented this way will meet the criterion “novelty.” In a less strict sense of the word, novelties or “new terms” can also be the focus of a term extraction program. These programs can be set up to only extract terms that have not come up or been documented so far. The difference is that the human can evaluate right away which term really stands for an innovative concept, while the machine will only exclude what is already documented elsewhere.
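The machine side of this, excluding what is already documented, amounts to a set difference. A minimal sketch, assuming the extraction output is a plain list of strings and the documented terms are available as a set; the function name is hypothetical and no particular extraction program is implied.

```python
def new_term_candidates(extracted: list[str], documented: set[str]) -> list[str]:
    """Keep extracted terms that are not yet documented, ignoring case.

    The order of first appearance is preserved and duplicates are dropped.
    Whether a remaining candidate really stands for an innovative concept
    is still a decision for a human.
    """
    known = {term.casefold() for term in documented}
    seen: set[str] = set()
    candidates = []
    for term in extracted:
        key = term.casefold()
        if key not in known and key not in seen:
            seen.add(key)
            candidates.append(term)
    return candidates
```

Given the extracted terms ribbon, user, Ribbon and computer, and a database that already documents user and computer, only ribbon remains as a candidate.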

Most of us remember that with Office 2007 the ribbon was introduced. While the name of this new tabbed command bar does not show up in text all that often, it was new and would have been hard to name in other languages had it not been documented in a terminology entry.

If the answer to the question “is this a new term representing a new concept?” is yes, do make an entry in the terminology database. Especially in environments where terminology management has been common practice and there is no need to document legacy terminology, most terms added to the database meet this criterion. Stay tuned for the posting on term selection and visibility.

Posted in Content publisher, Selecting terms, Terminologist, Terminology 101

How do I identify a term—frequency and distribution

Posted by Barbara Inge Karsch on June 27, 2010

A seemingly obvious criterion to select terms for a terminology database is frequency of occurrence. A term extraction program, for example, should tell us how often a term appears in the text mined. Term extraction output or other text-mining solutions might also tell us what the distribution of a term is; in other words, we may be able to find out in how many documents or products a term occurs.
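The difference between the two measures can be shown with a small sketch. It assumes the documents are available as plain strings keyed by name and that a list of term candidates already exists; the function name is hypothetical, and the matching is deliberately naive (whole words, case-insensitive, no handling of plurals or inflection).

```python
import re
from collections import Counter


def frequency_and_distribution(
    documents: dict[str, str], candidates: list[str]
) -> tuple[Counter, Counter]:
    """Count total occurrences (frequency) and documents containing
    the term (distribution) for each candidate."""
    frequency: Counter = Counter()
    distribution: Counter = Counter()
    for text in documents.values():
        for term in candidates:
            pattern = r"\b" + re.escape(term) + r"\b"
            hits = len(re.findall(pattern, text, flags=re.IGNORECASE))
            if hits:
                frequency[term] += hits      # every occurrence counts
                distribution[term] += 1      # each document counts once
    return frequency, distribution
```

A term that appears twice in one document and once in another has a frequency of three and a distribution of two.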

When sifting through term candidates in term-mining output, we very likely have to scope quite a bit, because we can’t spend weeks on making perfect term selections. As we know by now, frequency is not the only term selection criterion, but it can help us particularly in large projects. Here are options and their pros and cons:

| Option | Pros | Cons | Recommendation |
|---|---|---|---|
| Ignore frequency and evaluate all term candidates | More precise selection because nothing is excluded | High time investment | Good for small lists; never completely ignore frequency, as it can still tell us something about the importance of a term |
| Exclude all terms that occur less than x number of times | Number of term candidates is smaller | Potential to miss critical terms | Good for larger lists and when a critical percentage of terms was already extracted manually |
| Exclude all terms that occur more than y number of times | Number of term candidates is smaller | Potential to miss critical terms | Good for large lists from which existing database or other non-critical terms or words were not excluded |
| Only go through terms that occur more than x and less than y | Number of terms can be reduced significantly | High potential to miss critical terms | Good when both critical terms are already extracted and no stop word list was used |

If a term occurs often in a project, it is probably either very important or so generic that it shouldn’t be included. If you run a term extraction process, generic words should not be part of the resulting list; they should be part of a stop-word list.
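The scoping options from the table, combined with a stop-word list, can be sketched as one small filter. The thresholds correspond to x and y in the table; the function name, the parameter names and the sample counts in the note below are assumptions made for illustration, and the bounds are treated as inclusive here.

```python
from typing import Optional


def filter_by_frequency(
    frequency: dict[str, int],
    min_count: Optional[int] = None,       # "x": drop terms occurring less often
    max_count: Optional[int] = None,       # "y": drop terms occurring more often
    stop_words: frozenset[str] = frozenset(),
) -> list[str]:
    """Return the term candidates that survive scoping, sorted alphabetically."""
    kept = []
    for term, count in frequency.items():
        if term.casefold() in stop_words:
            continue                       # generic words never reach the list
        if min_count is not None and count < min_count:
            continue
        if max_count is not None and count > max_count:
            continue
        kept.append(term)
    return sorted(kept)
```

With made-up counts such as the (500), user (120), purchase order (40), ribbon (3) and bergschrund (1), a band of 2 to 100 plus a stop-word list containing the keeps purchase order and ribbon. It also drops user and bergschrund, which illustrates the risk of missing critical terms named in the table.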

Certain term mining solutions or lookup tools also indicate in which project or in which version and product a particular term is used. In other words, they give us information about the distribution of a term. But high distribution, just like high frequency, may be characteristic of terms that are very well known and do not need to be documented. For example, at Microsoft it would seem useless to include terms such as computer or user just because they occur frequently and are widely distributed. There are other reasons to include them, though. By the same token, a widely distributed and highly frequent term that is somewhat mysterious should be included in the terminology database, as many users might need to look it up and the return on investment is there.

To summarize, frequency and distribution are important term selection criteria. They must be looked at in combination with other criteria, though, to make sense. One criterion to consider could be novelty, which we will examine in the following entry.

Posted in Content publisher, Selecting terms, Term extraction tool, Terminologist, Terminology 101

 