Here is another question that came up during one of the webinars recently: Do you think it is worth using a concordance tool to help sift through the technical terms in order to build up the terminology?
Another really good question! Let me address this by answering when it would NOT be worth using available corpora. I would not use them if I didn’t expect work from the same client or in the same subject area again. I would also not use them if I hated the user experience of the concordance tool. If I have to struggle with a tool, I am probably faster and attain a more reliable result by simply researching the concepts from scratch. BUT if the subject matter is clear-cut, you expect more work in that area and the tool provides a nice interface that allows you to work efficiently, by all means use the existing bilingual corpus as one of your research tools.
And here comes my second qualification: Double-check the target-language equivalents found in bilingual corpora. Your additional research will a) confirm that the target term used in the corpus was correct and b) give you the metadata that you might want to document anyway. I am thinking mostly of context samples. While you could easily use the translated context from your corpus, context written by a native-speaker expert in the target language gives you higher reliability: it shows that the term is correct and in actual use, and it shows how the term is used correctly. The beauty of working with a corpus is that you already have candidate terms that you can check. Be prepared to discard them, though, if your research does not confirm them. Ultimately, you want your term base entries to be highly reliable: Do the work once, reuse it many times!
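To make the look-up-then-verify idea a bit more tangible, here is a minimal Python sketch of a bilingual concordance search. It assumes the corpus is already available as aligned source/target segment pairs; the sample segments, the function name `concordance` and the candidate terms are all made up for illustration, and a real concordance tool does much more (tokenization, inflected forms, fuzzy matching).

```python
from collections import Counter

# Hypothetical aligned corpus: (source segment, target segment) pairs.
CORPUS = [
    ("Tighten the flush valve by hand.", "Ziehen Sie das Spülventil von Hand an."),
    ("The flush valve must be replaced.", "Das Spülventil muss ersetzt werden."),
    ("Check the flush valve for leaks.", "Prüfen Sie das Ablaufventil auf Undichtigkeit."),
]


def concordance(corpus, source_term, candidates):
    """Return the segment pairs whose source contains source_term, plus a
    count of how often each candidate target term occurs in those pairs."""
    needle = source_term.lower()
    hits = [(src, tgt) for src, tgt in corpus if needle in src.lower()]
    counts = Counter()
    for _, tgt in hits:
        for candidate in candidates:
            if candidate.lower() in tgt.lower():
                counts[candidate] += 1
    return hits, counts


hits, counts = concordance(CORPUS, "flush valve", ["Spülventil", "Ablaufventil"])
for src, tgt in hits:
    print(src, "|", tgt)
print(counts.most_common())
```

If the counts are split between candidates, as in this invented sample, that is the signal to do the additional research: the corpus only tells you what was used, not which equivalent is correct.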
Terminology is for translators! Why should I, as a fill-in-the-blank expert, worry about terminology? Oh, but we are in marketing, not in translation! Excuses, excuses. When you wait until your terminology hits the translator, it is too late. Besides, it is not true that folks in the content supply chain don’t deal with terminology management. Most of them just don’t deal with it consciously. Some do it very effectively.
But there are links in that chain who very, very actively deal with terminology. Only three out of 23 car sales people I interviewed at the Canadian International AutoShow in February, for instance, were stumped by the question “what is terminology”. All others had very good definitions, explanations and synonyms handy. What’s more, almost all of them pointed out the effect of terminology choices on their customers. They knew muuuch more about terminology issues than most people in the content supply chain are willing to admit. Some of them were just not that happy with the terminology that came down the pipeline to them!
Terminology is very deliberately used by marketing and branding departments to achieve brand recognition and ultimately to sell. Here is a commercial that uses presumed synonymy to introduce essential concepts of a product and reach potential buyers on different levels:
It brings in terms from other subject areas to introduce what could be an unknown technical term: “clipper shavers” vs. “twin blades.”
Using presumed synonymy as a technique allows the marketing experts to have a likeable bungler explain what is implied to be a technically excellent product, all with the tag line “Hard to describe, easy to use.”
It is not as over-the-top as the Turbo Encabulator that has my students rolling on the floor even at 9 PM. But it shows how clued into terminology methods some branding folks really are. So, if you are part of the content supply chain and think you have nothing to do with terminology principles and methods, think again. Your competition is using them while you are still denying they exist.
Terminology in a commercial
*For more on Kryptonite see the Wikipedia entry. What I find interesting is that the commercial refers to it, even though it stands for a weakness. The makers of the commercial rely on the association with Superman being so strong, powerful and positive that the target audience completely forgets what Kryptonite stands for.
BIK: Thanks to Ben W. for pointing out a much more logical explanation, which eluded me in the final minutes of writing the above: The direct association with Kryptonite is that with a powerful material. And who wouldn’t want something that is stronger even than Superman.
One of the main reasons we have doublettes in our databases is that we often don’t get around to doing proper terminological analysis. I was just witness to and assistant in a prime example of a team doing this analysis at the meetings of ISO TC 37.
ISO TC 37 is the technical committee for “Terminology and other language and content resources.” It is the standards body responsible for standards such as ISO 12620 (now retired, as discussed in an earlier posting), 704 (as discussed here) or soon 26162 (already quoted here). This year, the four subcommittees (SCs) and their respective working groups (WGs) met in Seoul, South Korea, from June 12 through 17.
One of these working groups had considerable trouble coming to an agreement on various aspects of a standard. Most of us know how hard it is to get subject matter experts (or language people!) to agree on something. Imagine a multi-cultural group of experts who are tasked with producing an international standard and who have native languages other than English, the language of discussion! The convener, my colleague and a seasoned terminologist, Nelida Chan, recognized that the predicament could be alleviated by some terminology work, more precisely by thorough terminological analysis.
First, she gave a short overview of the basics of terminology work, as outlined in ISO 704 Terminology work – Principles and methods. Then the group agreed on the subject field and listed it on a white board. Any of the concepts up for discussion had to be in reference to this subject field; if the discussion drifted off into general language, the reminder to focus on the subject field was right on the board.
The group knew that they had to define and name three different concepts that they had been struggling with, although lots of research had been done; so we put three boxes on the board as well. We then discussed, agreed on and added the superordinate to each box, which was the same in each case. We also discussed what distinguished each box from the other two. Furthermore, we found examples of the concepts and added what turned out to be subordinates right into the appropriate box. Not until then did we give the concepts names. And now, naming was easy.
After this exercise, we had a definition, composed of the superordinate and its distinguishing characteristics as well as terms for the concepts. Not only did the group agree on the terms and their meanings, the data can now also be stored in the ISO terminology database. Without doublettes.
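For readers who think in data structures, the whiteboard exercise can be sketched roughly as follows in Python. This is only an illustration under my own assumptions: the class name `ConceptBox`, the helper `are_delimited`, and the seat/chair/stool/bench concepts are invented here and are not the concepts the working group discussed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConceptBox:
    """One box on the board: superordinate first, the name comes last."""
    superordinate: str
    characteristics: List[str]
    examples: List[str] = field(default_factory=list)
    term: Optional[str] = None  # assigned only after the analysis is done

    def definition(self) -> str:
        # Definition = superordinate + distinguishing characteristics.
        return f"{self.superordinate} that {' and '.join(self.characteristics)}"


def are_delimited(boxes: List[ConceptBox]) -> bool:
    """True if all boxes share one superordinate and no two boxes have the
    same set of distinguishing characteristics."""
    same_parent = len({box.superordinate for box in boxes}) == 1
    signatures = [frozenset(box.characteristics) for box in boxes]
    return same_parent and len(set(signatures)) == len(signatures)


# Hypothetical concepts, purely for illustration.
chair = ConceptBox("seat", ["has a back", "is designed for one person"])
stool = ConceptBox("seat", ["has no back", "is designed for one person"])
bench = ConceptBox("seat", ["is designed for several people"])

boxes = [chair, stool, bench]
if are_delimited(boxes):
    # Naming comes last, once the concepts are clearly delimited.
    chair.term, stool.term, bench.term = "chair", "stool", "bench"

print(chair.term, "=", chair.definition())
```

The point of the sketch is the order of operations: the `term` field stays empty until the superordinate and the distinguishing characteristics are settled.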
Granted, as terminologists we don’t often have the luxury of having 15 experts in one room for a discussion. But sometimes we do: I remember discussing terms and appellations for new gaming concepts in Windows Vista with marketing folks in a conference room at the Microsoft subsidiary in Munich. Even if we don’t have all experts in shouting distance, we can proceed in a similar fashion and collect the information from virtual teams and other resources in our daily work. It may take a little bit to become fluent in the process, but terminological analysis helps us avoid doublettes and pays off in the long run.
Two years after the then-new cloud-computing technology by Microsoft was named Windows Azure, Microsoft employees and partners are still wondering how to pronounce the name. Is that a good thing for product branding? Probably not.
Naming is a big part of terminology management. In her presentation for the last DTT symposium, Beate Früh, language service manager at Geberit International AG, a European producer of sanitary technology, described very well how she and her team support engineers in finding the right names, terms or labels for new products or parts (for examples see the adjacent image or the slide deck in German). One of the keys: The team comes in early in the process to help engineers find the best possible terms.
What are the best possible terms or appellations? Obviously, each language has its own rules on term formation, as discussed in What I like about ISO 704. But here are the main criteria that good terminology should meet, phrased as a checklist, again courtesy of ISO 704:
Transparency: Can the reader understand what the concept is about by looking at the term?
Consistency: Is the new term or appellation consistent with the naming in the subject field? Or, if it introduces new aspects, does it do so deliberately and only when necessary?
Appropriateness: Are the connotations evoked by the designation intentional? And do they follow “established patterns of meaning within the language community”?
Linguistic economy: Is the term or appellation as short as possible, so as to avoid arbitrary abbreviations by users?
Derivability and compoundability: Is it easy to form other terms, e.g. compounds, with the new term?
Linguistic correctness: Does the new designation conform to morphological, morphosyntactic, and phonological norms of the language?
Preference for native language: Is the new term or appellation borrowed from another language? If so, could it be replaced by a native-language designation?
Why would it take a terminologist to name things correctly? In the software industry, we used to say that programmers became programmers because they wanted to deal with 0s and 1s, not with words and terms. Similarly, product engineers are probably better at designing, developing, or testing devices than at naming them. What’s more, they don’t necessarily think about what happens downstream, let alone set up entries in a terminology database.
Participants in the Life Science Roundtable at LocWorld yesterday in Seattle illustrated the need to deliberately choose terms and appellations early in the process, document them as well as their target-language equivalents and then use them consistently: After a device has gone through the regulatory process, even linguistic changes are extremely difficult, if not impossible, to make. Tough luck then if a name doesn’t work very well in one or more of the other 25 target markets.
At Microsoft, most product names are run through a process called a globalization review. Marketing experts work with native-language terminologists on evaluating whether the above criteria are met. Some names obviously don’t get submitted. So, Aaaazure, Azzzzure…let’s call the whole thing off? No. But since I am now married to an “Azure evangelist”, I hope that the concept behind the appellation is really solid and makes up for the trouble we have with its pronunciation.
Linguistic quality is one of the persistent puzzles in our industry, as it is such an elusive concept. It doesn’t have to be, though. But if only quantity matters to you, you are on your way to ruining your company’s linguistic assets.
Because terminology management is not an end in itself, let’s start with the quality objective that users of a prescriptive terminology database are after. Most users access terminological data for support with monolingual, multilingual, manual or automated authoring processes. The outcomes of these processes are texts of some nature. The ultimate quality goal that terminology management supports with regard to these texts could be defined as “the text must contain correct terms used consistently.” In fact, Sue Ellen Wright “concludes that the terminology that makes up the text comprises that aspect of the text that poses the greatest risk for failure.” (Handbook of Terminology Management)
In order to get to this quality goal, other quality goals must precede it. For one, the database must contain correct terminological entries; and second, there must be integrity between the different entries, i.e. entries in the database must not contradict each other.
In order to attain these two goals, others must be met in their turn: The data values within the entries must contain correct information. And the entries must be complete, i.e. no mandatory data is missing. I call this the mandate to release only correct and complete entries (of course, a prescriptive database may contain pre-released entries that don’t meet these criteria yet).
Let’s see what that means for terminologists who are responsible for setting up, approving or releasing a correct and complete entry. They need to be able to:
Transfer the result of the research into the data categories correctly.
Assure integrity between entries.
Approve only entries that have all the mandatory data.
Fill in an optional data category, when necessary.
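Parts of this list can be supported by simple automated checks. The following Python sketch is an illustration only: the field names, the set of mandatory data categories in `MANDATORY` and the sample entries are my assumptions, since every term base defines its own data model. The checks flag candidates for a terminologist to look at; they do not replace the research.

```python
# Assumed mandatory data categories; every term base defines its own set.
MANDATORY = ("term", "language", "definition", "source")


def missing_fields(entry):
    """Return the mandatory data categories that are absent or empty."""
    return [name for name in MANDATORY if not (entry.get(name) or "").strip()]


def is_releasable(entry):
    """An entry may be released only if no mandatory data is missing."""
    return not missing_fields(entry)


def doublette_candidates(entries):
    """Group entry IDs by (language, normalized term). Several IDs under one
    key may be doublettes, or legitimate homographs for different concepts,
    so a human still has to decide."""
    groups = {}
    for entry in entries:
        key = (entry.get("language") or "", (entry.get("term") or "").strip().lower())
        groups.setdefault(key, []).append(entry["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical sample entries.
entries = [
    {"id": 1, "term": "cloud computing", "language": "en",
     "definition": "sample definition text", "source": "sample source"},
    {"id": 2, "term": "Cloud Computing", "language": "en",
     "definition": "", "source": "sample source"},
    {"id": 3, "term": "Trojan horse", "language": "en",
     "definition": "sample definition text", "source": "sample source"},
]

for entry in entries:
    print(entry["id"], "releasable:", is_releasable(entry), "missing:", missing_fields(entry))
print(doublette_candidates(entries))
```

Checks like these cover completeness and one aspect of integrity between entries. Whether the data values themselves are correct remains a matter of research.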
Let’s leave aside for a moment that we are all human and that we will botch the occasional entry. Can you imagine if instead of doing the above, terminologists were told not to worry about quality? From now on, they would:
Stop at 50% of the research or skip validating the data already present in the entry.
Fill in only some of the mandatory fields.
Choose the entry language randomly.
Add three or four different designations to the Term field.
Do you think that we could meet our number 1 goal of correct and consistent terminology in texts? No. Instead, a text in the source language would contain inconsistencies, spelling variations, and probably errors. Translations performed by translators would contain the same, possibly worse, problems. Machine translations would be consistent, but they would consistently contain multiple target terms for one source term, etc. The translation memory would propagate issues to other texts within the same product, the next version of the product, to texts for other products, and so on. Some writers and translators would not use the terminology database anymore, which means that fewer errors would be challenged and fixed. Others would argue that they must use the database; after all, it is prescriptive.
Unreliable entries are poison in the system. With a lax attitude towards quality, you can do more harm than good. Does that mean that you have to invest hours and hours in your entries? Absolutely not. We’ll get to some measures in a later posting. But if you can’t afford correct and complete entries, don’t waste your money on terminology management.
The body of ISO 704 “Terminology work—Principles and methods” lists a bunch of important information for terminology work. But what stuck in my mind are actually the annexes, most of all Annex B.
In the current version 704:2009, Annex B is devoted to term-formation methods. In other words, it gives us the most important methods that we have available when creating new terms or appellations in English. It also notes what might be obvious to us, i.e. that these methods differ from language to language. For German, for example, we now have the new Terminologiearbeit – Best Practices which the German terminology association, DTT, published recently and which is more systematic about this topic than a standard might be.
Comprehensiveness need not be the goal of 704; awareness of these methods is more important. If half the content publishers, PMs, branding or marketing folks that I worked with in the IT world had read those five short pages, it would have done a world of good. Instead, I have heard colleagues mocking terminologists who, when coining new terms, pull out the Duden (the main German-German dictionary, similar to The Webster’s or Le Petit Robert) to apply one of these methods. She who laughs last, laughs best, though: New terms and appellations that are well-motivated—either rooted in existing language or deliberately different—last. Quickly invented garbage causes misunderstandings and costs money.
Annex B doesn’t claim to be comprehensive, but it lists the most important methods that can be used in term formation. Here are the three main methods and some examples:
The first one is the creation of completely new lexical entities (terms or appellations), also known as neoterms. One way of creating a neoterm is through compounding, where a new designation is formed of two or more elements, for example cloud computing.
Two methods that fall into the category of using existing forms are terminologization (see also How do I identify a term—terminologization) and transdisciplinary borrowing. An example of terminologization is cloud, where the everyday word cloud took on a very specific meaning in the context of computing, while the name of the computer virus Trojan horse was obviously borrowed from Greek mythology.
Translingual borrowing results in new terms and appellations that originate in another language. English climbing language, for example, is full of direct loans from a variety of other languages; just think of bergschrund, cairn or scree.
The above are just a few examples to give you an impression of what could be learned by reading Annex B. Incidentally, these are methods. They need to be applied correctly, not randomly. I can already hear it, “but I used transdisciplinary borrowing to come up with this [junk]”. No. Even if your orthopedist uses minimally invasive arthroscopic surgery to fix your knee, you want him to be sure that you actually need surgery, right? If you need to coin English terms or appellations on a regular basis, Annex B of ISO 704 is worth your while. I also like Annex C. More about that some other time.
As I am putting my thoughts together, I am wondering: Who knows or uses ISO 704? I would like to invite you to do two things: Click on the little survey below in this posting. And, if you haven’t done so, please tell me about yourself by participating in the survey on the Survey tab. Both surveys are anonymous and might help me understand what this standard could do. If you know the standard and have something to share about it, please leave a comment below. I would be very grateful to get your input.
Because, quite frankly, I am puzzling over this standard. I have read it three times over the past year and every time after a few weeks go by, I have to think about what this standard is actually for. I believe it stems from the fact that it is a bit wordy at the moment. It contains a lot of good information, but the presentation is ineffective.
But now, what can it do for the reader? As its title says, it lays out the various principles underlying terminology management. For example, it tells us what objects, concepts, concept relations and concept systems are. It then goes into definitions and definition writing, before the subject of designations is discussed. Remember this little graphic from What is a Term? As an aside, we talk about terms many times when we actually mean designations; in German, we even find the ugly Anglicism Term and its plural Terme.
So, ISO 704 really does do what the title says: it presents us with principles and methods. It just doesn’t seem to stick with me. Yet.