BIK Terminology—

Solving the terminology puzzle, one posting at a time

  • Author

    Barbara Inge Karsch - Terminology Consulting and Training

  • Images

    Cathedral of Our Lady, Antwerp, by Barbara Inge Karsch


Archive for the ‘Terminologist’ Category

Terms—A translator’s perspective vs. a terminologist’s perspective

Posted by Barbara Inge Karsch on September 30, 2010

Any translator can do a terminologist’s work. The best translators compile lists of terms, equivalents, maybe a piece of context or even a definition before or at least while they are translating. So, theoretically the above statement is correct. But let’s take another look at the focus of a translator and the focus of a terminologist with regard to terms.

Although a term can be at the same time a unit of translation and a term described and defined in a terminology database, translators and terminologists treat that unit differently. A translator works in context and arrives at a target solution that is correct for that particular text. Based on Saussure, Juan Sager calls terms in a translation text “instances of parole” or “language in use” (Routledge Encyclopedia of Translation Studies).

In Quasi dasselbe mit anderen Worten, Umberto Eco says “in light of [all the] meanings made available by a dictionary entry and its applicable encyclopedic information, the translator must choose the most probable, reasonable and relevant sense for the context in question and this possible world” (translation by BIK). That means that the translator cannot simply copy what he finds in a dictionary or terminology database; he actually has to be, as Robin Bonthrone put it years ago, “switched on.” If that wasn’t a condition, machine translation would have long since taken over.

That context then becomes part of the translated text, which in our scenario of technical translation, usually becomes part of a translation memory (TM). And it also becomes part of a product. As part of the product, the term is now part of history, as it were. As part of the TM, the term may be reused for the next version of the product, and it may also serve as reference material to others. But a translation memory does not equate to managed terminology. Strings in TMs contain terminology, but TMs are generally static and hardly ever managed.

In applied terminology, the starting point might be the term in the translation environment above. But a terminologist must research and understand the term not only in one particular context, but in as many as it takes to uniquely identify its meaning. Once that meaning has been identified, the terminologist creates a terminological entry. According to Sager, terminologists use the term, the “instance of parole”, to get to langue, i.e. the abstract system behind the linguistic sign. The entry is part of the terminological system in the database and can now be applied back in parole, in more than one situation or context, to more than one product or company. Therefore, it must be comprehensible to people other than the terminologist, and it must reflect the understanding and knowledge of the subject matter expert (see also Terminology by Maria Theresa Cabré).

While both translators and terminologists research terms, the product of their work is different. The translator is responsible for the delivery of a correct target language text with correct technical terms (parole or language in use). The terminologist is responsible for the creation of a correct and complete terminological entry in a database (langue or the abstract system underlying speech acts). That entry may over time be used for many different products and versions inside or outside the company; the entry may become obsolete or even incorrect and the terminologist may need to modify it or add a new entry to the database accordingly. Monetary compensation (as described in What do we do with terms?), method and goal of translators and terminologists are different. Therefore, translators translate, terminologists research and document.

[This posting is based on an article published in the Journal of Internationalisation and Localisation, which can be downloaded for free.]

Posted in Advanced terminology topics, Researching terms, Terminologist, Theory, Translator

Quantity AND Quality

Posted by Barbara Inge Karsch on September 16, 2010

In If quantity matters, what about quality? I promised to shed some light on how to achieve quantity without skimping on quality. In knowledge management, it boils down to solid processes supported by reliable and appropriate tools and executed by skilled people. Let me drill down on some aspects of setting up processes and tools to support quantity and quality.

If you cannot afford to build up an encyclopedia for your company (and who can?), select metadata carefully. The number and types of data categories (DCs), as discussed in The Year of Standards, can make a big difference. That is not to say use less. Use the right ones for your environment.

Along those lines, hide data categories or values where they don’t make sense. For example, don’t display Grammatical Gender when Language=English; invariably a terminologist will accidentally select a gender, and even if only a few users wonder why that is or notice the error but can’t find a way to alert you to it, too much time is wasted. Similarly, hide Grammatical Number when Part of Speech=Verb, and so on.
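The hiding rules can be expressed as a small lookup. Here is a minimal Python sketch, not taken from any particular tool; the field list and the function name `visible_fields` are assumptions made for illustration.

```python
# Hypothetical data categories for a term entry; the names are illustrative only.
ALL_FIELDS = ["Term", "Language", "Part of Speech",
              "Grammatical Gender", "Grammatical Number", "Definition"]


def visible_fields(language, part_of_speech):
    """Return the fields to display, hiding those that make no sense for the entry."""
    hidden = set()
    if language == "English":
        hidden.add("Grammatical Gender")   # rule from the text: no Gender for English
    if part_of_speech == "Verb":
        hidden.add("Grammatical Number")   # rule from the text: no Number for verbs
    return [f for f in ALL_FIELDS if f not in hidden]


print(visible_fields("English", "Verb"))
print(visible_fields("German", "Noun"))
```

A field that is never displayed cannot be filled in by accident, which is the point of the rule.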

Plan dependent data, such as product and version, carefully. For example, if versions for all your products are numbered the same way (e.g. 1, 2, 3, …), it might be easiest to have two related tables. If most of your versions have very different version names, you could have one table that lists product and version together (e.g. Windows 95, Windows 2000, Windows XP, …); it makes information retrieval slightly simpler, especially for non-expert users. Or maybe you cannot afford or don’t need to manage down to the version level because you are in a highly dynamic environment.
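To make the two layouts concrete, here is a rough sketch using Python's built-in sqlite3 module. The table names, column names and the product name ExampleProduct are assumptions for illustration; a real terminology database would carry far more data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Design A: two related tables, for products that share a numbering scheme
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE version (
        id INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES product(id),
        number TEXT NOT NULL
    );
    -- Design B: one flat table, for products whose versions have distinct names
    CREATE TABLE product_version (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
""")

conn.execute("INSERT INTO product (id, name) VALUES (1, 'ExampleProduct')")
conn.executemany("INSERT INTO version (product_id, number) VALUES (1, ?)",
                 [("1",), ("2",), ("3",)])
conn.executemany("INSERT INTO product_version (label) VALUES (?)",
                 [("Windows 95",), ("Windows 2000",), ("Windows XP",)])

# Design A needs a join to show product and version together ...
design_a = conn.execute(
    "SELECT p.name || ' ' || v.number FROM product p "
    "JOIN version v ON v.product_id = p.id ORDER BY v.id").fetchall()

# ... while Design B is a single lookup.
design_b = conn.execute(
    "SELECT label FROM product_version ORDER BY id").fetchall()

print(design_a)
print(design_b)
```

The flat table is easier to query, which matches the remark about non-expert users; the related tables avoid repeating the product name for every version.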

Enforce mandatory data when a terminologist releases (approves or fails) an entry. If you have decided that five out of your ten DCs are mandatory, let the tool help terminologists by not letting them get away with a shortcut or an oversight.
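A release check of this kind could look roughly like the following Python sketch. The five mandatory categories, the `release` function and the error class are hypothetical; they only illustrate the idea that the tool, not the terminologist, guards completeness.

```python
# Hypothetical set of mandatory data categories (five, as in the example above).
MANDATORY = {"Term", "Language", "Part of Speech", "Definition", "Source"}


class IncompleteEntryError(ValueError):
    """Raised when an entry is released without all mandatory data."""


def is_filled(value):
    """Treat None and whitespace-only strings as missing."""
    return value is not None and str(value).strip() != ""


def release(entry, status):
    """Set the release status ('approved' or 'failed') only if the entry is complete."""
    if status not in ("approved", "failed"):
        raise ValueError(f"unknown release status: {status!r}")
    missing = sorted(f for f in MANDATORY if not is_filled(entry.get(f)))
    if missing:
        raise IncompleteEntryError("missing mandatory data: " + ", ".join(missing))
    return {**entry, "Status": status}


draft = {"Term": "forecasting", "Language": "English", "Part of Speech": "Noun"}
try:
    release(draft, "approved")
except IncompleteEntryError as err:
    print(err)  # names the mandatory fields that are still empty
```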

It is obviously not an easy task to anticipate what you need in your environment. But well-designed tools and processes support high quality AND quantity and therefore boost your return on investment.

On a personal note, Anton is exhausted with anticipation of our big upcoming event: He will be the ring bearer in our wedding this weekend.

Posted in Advanced terminology topics, Designing a terminology database, Producing quality, Producing quantity, Return on investment, Setting up entries, Terminologist, Tool

If quantity matters, what about quality?

Posted by Barbara Inge Karsch on September 9, 2010

Linguistic quality is one of the persistent puzzles in our industry, as it is such an elusive concept. It doesn’t have to be, though. But if only quantity matters to you, you are on your way to ruining your company’s linguistic assets.

Because terminology management is not an end in itself, let’s start with the quality objective that users of a prescriptive terminology database are after. Most users access terminological data for support with monolingual, multilingual, manual or automated authoring processes. The outcomes of these processes are texts of some nature. The ultimate quality goal that terminology management supports with regard to these texts could be defined as “the text must contain correct terms used consistently.” In fact, Sue Ellen Wright “concludes that the terminology that makes up the text comprises that aspect of the text that poses the greatest risk for failure.” (Handbook of Terminology Management)

In order to get to this quality goal, other quality goals must precede it. For one, the database must contain correct terminological entries; and second, there must be integrity between the different entries, i.e. entries in the database must not contradict each other.

In order to attain these two goals, others must be met in their turn: The data values within the entries must contain correct information. And the entries must be complete, i.e. no mandatory data is missing. I call this the mandate to release only correct and complete entries (of course, a prescriptive database may contain pre-released entries that don’t meet these criteria yet).

Let’s see what that means for terminologists who are responsible for setting up, approving or releasing a correct and complete entry. They need to be able to:

  • Do research.
  • Transfer the result of the research into the data categories correctly.
  • Assure integrity between entries.
  • Approve only entries that have all the mandatory data.
  • Fill in an optional data category, when necessary.

Let’s leave aside for a moment that we are all human and that we will botch the occasional entry. Can you imagine if instead of doing the above, terminologists were told not to worry about quality? From now on, they would:

  • Stop at 50% research or don’t validate the data already present in the entry.
  • Fill in only some of the mandatory fields.
  • Choose the entry language randomly.
  • Add three or four different designations to the Term field.
  • ….

Do you think that we could meet our number 1 goal of correct and consistent terminology in texts? No. Instead a text in the source language would contain inconsistencies, spelling variations, and probably errors. Translations performed by translators would contain the same, possibly worse problems. Machine translations would be consistent, but they would consistently contain multiple target terms for one source term, etc. The translation memory would propagate issues to other texts within the same product, the next version of the product, to texts for other products, and so on. Some writers and translators would not use the terminology database anymore, which means that fewer errors are challenged and fixed. Others would argue that they must use the database; after all, it is prescriptive.

Unreliable entries are poison in the system. With a lax attitude towards quality, you can do more harm than good. Does that mean that you have to invest hours and hours in your entries? Absolutely not. We’ll get to some measures in a later posting. But if you can’t afford correct and complete entries, don’t waste your money on terminology management.

Posted in Advanced terminology topics, Producing quality, Producing quantity, Return on investment, Setting up entries, Terminologist, Terminology methods, Terminology principles

Guest Blog for Bibliotekarska terminologija

Posted by Barbara Inge Karsch on August 30, 2010

I will soon get back to my regular schedule. In the meantime, here is another little article that I wrote for Ivan Kanič—a librarian and blogger, who writes about terminology issues for librarians in his native Slovenia. It’s been a pleasure working with Ivan on this little project. Check it out!

Posted in Researching terms, Terminologist, Terminology 101

What I like about ISO 704

Posted by Barbara Inge Karsch on August 5, 2010

The body of ISO 704 “Terminology work—Principles and methods” lists a bunch of important information for terminology work. But what stuck in my mind is actually the annexes, most of all Annex B.

In the current version 704:2009, Annex B is devoted to term-formation methods. In other words, it gives us the most important methods that we have available when creating new terms or appellations in English. It also notes what might be obvious to us, i.e. that these methods differ from language to language. For German, for example, we now have the new Terminologiearbeit – Best Practices which the German terminology association, DTT, published recently and which is more systematic about this topic than a standard might be.

Comprehensiveness need not be the goal of 704; awareness of these methods is more important. If half the content publishers, PMs, branding or marketing folks that I worked with in the IT world had read those five short pages, it would have done a world of good. Instead, I have heard colleagues mocking terminologists who, when coining new terms, pull out the Duden (the main German-German dictionary, similar to The Webster’s or Le Petit Robert) to apply one of these methods. She who laughs last, laughs best, though: New terms and appellations that are well-motivated—either rooted in existing language or deliberately different—last. Quickly invented garbage causes misunderstandings and costs money.

Annex B doesn’t claim to be comprehensive, but it lists the most important methods that can be used in term formation. Here are the three main methods and some examples:

  • The first one is the creation of completely new lexical entities (terms or appellations), also known as neoterms. One way of creating a neoterm is through compounding, where a new designation is formed of two or more elements, for example cloud computing. 
  • Two methods that fall into the category of using existing forms are terminologization (see also How do I identify a term—terminologization) and transdisciplinary borrowing. An example of terminologization is cloud, where the everyday word cloud took on a very specific meaning in the context of computing, while the name of the computer virus Trojan horse was obviously borrowed from Greek mythology.
  • Translingual borrowing results in new terms and appellations that originate in another language. English climbing language, for example, is full of direct loans from a variety of other languages; just think of bergschrund or cairn.

The above are just a few examples to give you an impression of what could be learned by reading Annex B. Incidentally, these are methods. They need to be applied correctly, not randomly. I can already hear it, “but I used transdisciplinary borrowing to come up with this [junk]”. No. Even if your orthopedist uses minimally-invasive arthroscopic surgery to fix your knee, you want him to be sure that you actually need surgery, right? If you need to coin English terms or appellations on a regular basis, Annex B of ISO 704 is worth your while. I also like Annex C. More about that some other time.

Posted in Coining terms, Content publisher, Interesting terms, Terminologist, Terminology methods

Who cares about ISO 704?

Posted by Barbara Inge Karsch on July 29, 2010

The next standard to talk about is ISO 704 “Terminology work—Principles and methods.” It is an interesting one for a variety of reasons. For one, I have more questions than answers.

At the TKE (Terminology and Knowledge Engineering) Conference in Dublin, my esteemed colleagues, Hanne Erdman Thomsen, Sue Ellen Wright, Gerhard Budin and Loïc Depecker will devote a workshop to ‘Accommodating User Needs for ISO 704: Towards a New Revision of the Core International Standard on Terminology Work’. I will have a short time slot to provide input myself and therefore have been re-reviewing ISO 704 over the last few days.

As I am putting my thoughts together, I was wondering: Who knows or uses ISO 704? I would like to invite you to do two things: Click on the little survey below in this posting. And, if you haven’t done so, please tell me about yourself by participating in the survey on the Survey tab. Both surveys are anonymous and might help me understand what this standard could do. If you know the standard and have something to share about it, please leave a comment below. I would be very grateful to get your input.

Because, quite frankly, I am puzzling over this standard. I have read it three times over the past year and every time after a few weeks go by, I have to think about what this standard is actually for. I believe it stems from the fact that it is a bit wordy at the moment. It contains a lot of good information, but the presentation is ineffective.

[Figure: Designations]

But now, what can it do for the reader? As its title says, it lays out the various principles underlying terminology management. For example, it tells us what objects, concepts, concept relations and concept systems are. It then goes into definitions and definition writing, before the subject of designations is discussed. Remember, this little graphic from What is a Term? As an aside, we talk about terms many times when we actually mean designations; in German, we even find the ugly Anglicism Term and its plural Terme.

So, ISO 704 really does do what the title says, it presents us with principles and methods. It just doesn’t seem to stick with me. Yet.

Posted in Events, Terminologist, Terminology methods, Terminology principles

ISO 12620—Why bother

Posted by Barbara Inge Karsch on July 22, 2010

Standards are nice, but they don’t do anything for you or, more importantly, the user of your terminology database, if you are the only one applying them. But how do you get a large virtual team of terminologists or language specialists to agree on and apply standards, such as ISO 12620, to database entries? And first: Why bother climbing such a mountain?

Imagine you have a large document to author or translate. Your client gave you a dictionary to use. Because you are not sure of the meaning or usage of 50 terms, you look them up. But the dictionary holds you up more than anything: One entry contains a definition, the next one doesn’t; one provides context, but it is in a language you don’t understand; most terms make sense, but several of them are cryptic and the entry doesn’t provide clarity. If your client hadn’t insisted that you use the dictionary, you wouldn’t: It just slows you down.

The objective of a terminology database is to have consistent and correct terminology used in the product, in source as well as in target languages. To support that goal, users must be able to use a database entry quickly and easily—structure really helps here. Furthermore, users must be able to trust the information provided—transparent, clear and consistent entries create trust.

Ideally, you have a centralized team of trained terminologists who know the standards inside out and apply them religiously. If you don’t, select/create a tool that supports standards adherence as much as possible. Some simple examples: If definition is mandatory, automatically enforce it; if the term is a verb, hide the Number field; if the language is English, hide the Gender field. Tools can do a lot, but your team very likely still needs a standard.

The Microsoft terminology team did. Simply handing a standards document off to the team had not been successful in the past: nobody could remember it, many entries therefore contained unstructured, if not incorrect, information, and there was no incentive to adhere to standards. A more collaborative effort was called for: Together, in-house terminologists went through data categories one by one. Because we were a virtual team, e-mail was the best form of communication. Each data category was dealt with in one e-mail that contained the definition, a scenario and voting buttons that allowed the team to agree with the meaning or disagree and make a better suggestion. Team members could participate in the voting, but they didn’t have to. However, everyone knew from the beginning that they had to accept the outcome, regardless of whether they participated or not. After the new guide had been published, measurements were carried out and documented in a quarterly report. Terminologists then set their own deadlines for cleaning up entries to comply with the standards.

ISO 12620 doesn’t just enable data exchange, as we saw in last week’s entry. At J.D. Edwards and Microsoft, it also helped create standards guides. I am sure not every field is filled in correctly; perfection is not the point. But with shrinking budgets and tighter deadlines, a database that could cost millions of dollars must support the user as best as possible in their endeavor to create reliable communication. A standards guide based on an international standard is a good tool you can use to climb that mountain.

Posted in Content publisher, Microsoft Terminology Studio, Standardizing entries, Terminologist, Terminology 101

The Year of Standards

Posted by Barbara Inge Karsch on July 16, 2010

The Localization Industry Standards Association (LISA) reminded us in their recent Globalization Insider that they had declared 2010 the ‘Year of Standards.’ It resonates with me because socializing standards was one of the objectives that I set for this blog. Standards and standardization are the essence of terminology management, and yet practitioners either don’t know of standards, don’t have time to read them, or think they can do without them. In the following weeks, as the ISO Technical Committee 37 ("Terminology and other language and content resources") is gearing up for the annual meeting in Dublin, I’d like to focus on standards. Let’s start with ISO 12620.

ISO 12620:1999 (Computer applications in terminology—Data categories—Part 2: Data category registry) provides standardized data categories (DCs) for terminology databases; a data category is the name of the database field, as it were, its definition, and its ID. Did everyone notice that terminology can now be downloaded from the Microsoft Language Portal? One of the reasons why you can download the terminology today and use it in your own terminology database is ISO 12620. The availability of such a tremendous asset is a major argument in favor of standards.

I remember when my manager at J.D. Edwards slapped 12620 on the table and we started the selection process for TDB. It can be quite overwhelming. But I turned into a big fan of 12620 very quickly: It allowed us to design a database that met our needs at J.D. Edwards.

When I joined Microsoft in 2004, my colleagues had already selected data categories for a MultiTerm database. Since I was familiar with 12620, it did not take much time to be at home in the new database. We reviewed and simplified the DCs over the years, because certain data categories chosen initially were not used often enough to warrant their existence. One example is ‘animacy,’ which is defined in 12620 as “[t]he characteristic of a word indicating that in a given discourse community, its referent is considered to be alive or to possess a quality of volition or consciousness”…most of the things documented in Term Studio are dead and have no will or consciousness. But we could simply remove ‘animacy’, while it would have been difficult or costly to integrate a new data category late in the game. If you are designing a terminology database, err on the side of being more comprehensive. Because we relied on 12620, it was easy when earlier in 2010 we prepared for making data exportable into a TBX format (ISO 30042). The alignment was already there, and communication with the vendor, an expert in TBX, was easy.

ISO 12620:1999 has since been retired and was succeeded by ISO 12620:2009, which “provides guidelines […] for creating, selecting and maintaining data categories, as well as an interchange format for representing them.” The data categories themselves were moved into the ISOcat “Data Category Registry” open to use by anyone.

ISO 12620 or now the Data Category Registry allows terminology database designers to apply tried and true standards rather than reinventing the wheel. As all standards, they enable quick adoption by those familiar with them and they enable data sharing (e.g. in large term banks, such as the EuroTermBank). If you are not familiar with standards, read A Standards Primer written by Christine Bucher for LISA. It is a fantastic overview that helps navigate the standardization maze.

Posted in Advanced terminology topics, Designing a terminology database, EuroTermBank, J.D. Edwards TDB, Microsoft Language Portal, Microsoft Terminology Studio, Terminologist

How do I identify a term—standardization

Posted by Barbara Inge Karsch on July 1, 2010

And the final criterion in this blog series on how to identify terms is, in my mind, one of the most important ones—standardization. Standardized usage and spelling makes the life of the product user much easier, and it is fairly clear which key concepts need to be documented in a terminology database for that reason. But are they the same for target terms? And if not, how would we know what must be standardized for, say, Japanese? We don’t—that’s when we rely on process and tools.

Example 1. Before we got to standardizing terminology at J.D. Edwards (JDE), purchase orders could be pushed, committed or sent. And it all meant the same thing. That had several obvious consequences:

  • Loss of productivity by customers: They had to research documentation to find out what would happen if they clicked Push on one form, Send on another or Commit on the third.
  • Loss of productivity by translators: They walked across the hall, which fortunately was possible, to enquire about the difference.
  • Inconsistency in target languages: If some translators did not think that these three terms could stand for the same thing (why would they?), they replicated the inconsistency in their language.
  • Translation memory: Push purchase order, Commit purchase order and Send purchase order needed to be translated three times by 21 languages before the translation memory kicked in.

All this results in direct and indirect cost.
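The translation memory point is simple arithmetic. The short Python sketch below works it through, assuming each variant has to be translated once per language before the memory can offer a match.

```python
variants = ["Push purchase order", "Commit purchase order", "Send purchase order"]
languages = 21

translations_without_standard = len(variants) * languages  # each variant, each language
translations_with_standard = 1 * languages                 # one standardized string
extra = translations_without_standard - translations_with_standard

print(translations_without_standard, translations_with_standard, extra)  # 63 21 42
```

So a single non-standardized command accounts for 42 avoidable translations, and that is before any queries or downstream fixes.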

Example 2. The VP of content publishing and translation at JDE used the following example to point out that terms and concepts should not be used at will: reporting code, system code, application, product, module, and product code. While everyone in Accounting had some sort of meaning in their head, the concepts behind them were initially not clearly defined. For example, does a product consist of modules? Or does an application consist of systems? Is a reporting code part of a module or a subunit of a product code? And when a customer buys an application is it the same as a product? So, what happens if Accounting isn’t clear what exactly the customer is buying…

Example 3. Standardization to achieve consistency in the source language is self-evident. But what about the target side? Of course, we would want a team of ten localizers working on different parts of the same product to use the same terminology. One of the most difficult languages to standardize is Japanese. My former colleague and Japanese terminologist at JDE, Demi, explained it as follows:

For Japanese, “[…] we have three writing systems:

  • Chinese characters […]
  • Hiragana […]
  • Katakana […].

We often mix Roman alphabet in our writing system too. […]how to mix the three characters, Chinese, Katakana, Hiragana, plus Roman alphabet, is up to each [person’s] discretion! For translation, it causes a problem of course. We need to come up with a certain agreements and rules.”

The standards and rules that Demi referred to should be reflected in standardized entries in a terminology database and available at the localizers’ fingertips. Now, the tricky part is that, for Japanese, terms representing different concepts than those selected during upfront term selection may need to be standardized. In this case, it is vital that the terminology management system allow requests for entries from downstream contributors, such as the Japanese terminologist or the Japanese localizers. The requests may not make sense to a source terminologist at first glance, so a justification comment speeds up processing of the request.

To sum up this series on how to identify terms for inclusion in a terminology database: We discussed nine criteria: terminologization, specialization, confusability, frequency, distribution, novelty, visibility, system and standardization. Each one of them weighs differently for each term candidate and most of the time several criteria apply. A terminologist, content publisher or translator has to weigh these criteria and make a decision quickly. No two people will come up with the same list upfront. But tools and processes should support downstream requests.

Posted in Content publisher, Selecting terms, Terminologist, Terminology 101, Translator

How do I identify a term—system

Posted by Barbara Inge Karsch on June 30, 2010

Here is one that is forgotten often in fast-paced, high-production environments: system. This at first glance cryptic criterion refers to terms that may not be part of our text or our list of term candidates, but that are part of the conceptual system that makes up the subject matter we are working in. And sometimes, if not to say almost always, it pays off to be systematic.

A very quick excursion into the theory of terminology management: We distinguish between ad-hoc and systematic terminology work.

  • When we work ad-hoc, we don’t care about the surrounding concepts or terms; we focus on solving the terminological problem at hand; for example: I need to know what forecasting is and what it is called in Finnish.
  • When we take a systematic approach, we go deeper into understanding a particular subject. We may start out researching one term (e.g. forecasting) and understand the concept behind it, but then we continue to study its parent, sibling and child concepts; we work in a subject area and examine and document the relationships of the concepts.

In the following example, the terminologist decided to not only set up an entry for forecasting, but to also list different types of forecasting—child or subordinate concepts—and the parent or superordinate concept. The J.D. Edwards terminology tool, TDB, had an add-on that turned the data into visuals, such as the one below. It goes without saying that displays of this nature help, for instance, the Finnish terminologist to find equivalents more easily when s/he knows that besides qualitative forecasting there is also quantitative forecasting, etc.

[Figure: JDE types of forecasting]
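A concept system of this kind is essentially a tree. The following Python sketch shows one way to hold parent, sibling and child relations; the `Concept` class is an assumption for illustration and says nothing about how TDB stored its data. Only the concepts named in the text are used, and the parent of forecasting is left unset because the text does not name it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Concept:
    name: str
    parent: Optional["Concept"] = field(default=None, repr=False)
    children: List["Concept"] = field(default_factory=list)

    def add_child(self, name):
        """Create a subordinate concept and link it in both directions."""
        child = Concept(name, parent=self)
        self.children.append(child)
        return child

    def siblings(self):
        """Return the coordinate concepts that share this concept's parent."""
        if self.parent is None:
            return []
        return [c for c in self.parent.children if c is not self]


forecasting = Concept("forecasting")
qualitative = forecasting.add_child("qualitative forecasting")
quantitative = forecasting.add_child("quantitative forecasting")

print([c.name for c in qualitative.siblings()])  # ['quantitative forecasting']
```

A terminologist looking for a Finnish equivalent of one sibling can see the other at a glance, which is the benefit the visual display provided.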

In his Manuel pratique de terminologie, Dubuc suggests that ad-hoc terminology work is a good way to get started with terminology management. Furthermore, he is right in that documenting concepts and their systems takes time and money, both of which are in short supply in many business environments. On the other hand, a more systematic approach will, in my experience, lead to entries that stand the test of time longer, create fewer downstream problems or questions, and need less maintenance. So, investing more time in the initial research and setting up the surrounding concepts while you have the information at hand anyway may very well pay off later. Seasoned terminologists know when to include terms to flesh out a system and when to simply answer an ad-hoc question.

Posted in Advanced terminology topics, Content publisher, J.D. Edwards TDB, Selecting terms, Terminologist, Terminology 101

 