Mar 18, 2014

The curious case of crappy XML in memoQ

Recently one of my collaboration partners sent me a distressed e-mail asking about a rather odd XML file he received. This one proved to be a little different than the ordinary filter adaptation challenge.

The problem, as it was explained to me, seemed to involve dealing with the trashed special characters in the German source text:


"Special" characters in German - äöüß - were all rendered as entities, which makes them difficult to read and screws up terminology and translation memory matches among other things. Entities are simply coded entries, which in this case all begin with an ampersand (&) and end with a semicolon. They are used to represent characters that may not be part of a particular text encoding system.
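To make the idea of entities concrete, here is a minimal Python sketch of what decoding them looks like outside of memoQ. The sample sentence is made up for illustration; the entity names themselves (&auml;, &ouml;, &uuml;, &szlig;) are the standard HTML named character references, and html.unescape is part of the Python standard library.

```python
import html

# Hypothetical sample segment, not taken from the actual client file.
raw = "Gr&ouml;&szlig;e und Gewicht f&uuml;r M&auml;rkte"

# html.unescape replaces named and numeric character references
# with the characters they stand for.
decoded = html.unescape(raw)

print(decoded)
```

This only shows what the coded entries stand for; it says nothing about how memoQ handles them internally.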

At first I thought the problem was simply a matter of adjusting the settings of the XML filter. So I selected Import with options... in my memoQ project and had a look at what my possibilities were. The fact that the filter settings dialog had an Entities tab seemed like a good start.


This proved to be a complete dead end. None of the various options I tried in the dialog cleared up the imported garbage. So I resolved to create a set of "custom" entities to handle with the XML filter, and used the translation grid filter of memoQ to make an inventory of these.

Filtering for source text content in the memoQ translation grid
That's when I noticed that the translatable data in this crappy faux XML file was actually HTML text. So I thought perhaps the cascading filters feature of memoQ might help.

Using all the defaults, I found that the HTML was fixed nicely with tags, but I did not want the tags that were created for non-breaking spaces (&nbsp;):


So I had another look at the settings of the cascaded HTML filter:


I noticed that if the option to import non-breaking spaces as entities is unmarked (it is selected by default), these are imported quite properly:

 
Now the text of some 600 lines was much easier to work with - an ordinary, readable document with a few protected HTML tags.
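For readers who like to see the principle behind a filter cascade, the rough Python sketch below does the same thing in two stages: first an XML parse to pull out the element content, then an HTML parse of that content to separate the tags from the readable text. Everything about the sample is an assumption: the element names (rows, row, text), the sample content and the helper names are invented for illustration, and I am assuming the HTML sits escaped inside the XML element. This is not how memoQ implements its filters, just an illustration of the idea.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical file content: HTML markup stored as escaped text inside XML.
SAMPLE = (
    '<rows><row id="1"><text>'
    "&lt;p&gt;Gr&amp;ouml;&amp;szlig;e:&amp;nbsp;10 cm&lt;/p&gt;"
    "</text></row></rows>"
)


class SegmentParser(HTMLParser):
    """Second stage: collect HTML tags and readable text separately."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &ouml; and &nbsp;
        # before the text reaches handle_data.
        super().__init__(convert_charrefs=True)
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(f"<{tag}>")

    def handle_endtag(self, tag):
        self.tags.append(f"</{tag}>")

    def handle_data(self, data):
        self.text.append(data)


def cascade(xml_source):
    """First stage: XML parse; second stage: HTML parse of each text node."""
    root = ET.fromstring(xml_source)
    results = []
    for node in root.iter("text"):
        parser = SegmentParser()
        parser.feed(node.text or "")
        parser.close()  # flush any text still buffered by the parser
        results.append(("".join(parser.text), parser.tags))
    return results


segments = cascade(SAMPLE)
for text, tags in segments:
    print(repr(text), tags)
```

Note that the non-breaking space comes through as an ordinary character (U+00A0) in the text rather than as a tag, which is roughly the effect of unmarking the entity option in the cascaded HTML filter.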

I'll be the first one to admit that the solution here is not obvious; in fact, apparently one of my favorite Kilgray experts took a very different and complex path using external tools that I simply do not understand. There are many ways to skin a cat, most of them painful - at least for the cat.

As I go through and update various sections of my memoQ tips guide, I'll probably expand the chapters on cascading filters and XML to discuss this case. But I haven't quite figured out a simple way to prepare the average user for dealing with a situation like this where the problem is not obvious. One thing is clear however - it pays to look at the whole file in order to recognize where a different approach may be called for.

Maybe a decision matrix or tree would do the trick, but probably not for many people. In this case the file did not have a well-designed XML structure, and that contributed to the confusion. My colleague is an experienced translator with good research skills, and he scoured the memoQ Help and the Kilgray Knowledgebase in vain for guidance. Our work as translators poses many challenges. Some of these are old, familiar ones, repackaged in new and confusing ways, as in this case. So we must learn to look beyond mere features and instead observe carefully what we are confronted with, using that wit which distinguishes us from the dumb machines which the dummies fear might replace them.

Mar 16, 2014

thebogword stands for tradition! Like droit du seigneur.

After the unfair parodical attacks by piratical parrots like Yours Truly, thebogword, in consultation with the MoJ, Capita and other agents of social reform and control, has taken necessary steps to silence unruly laughter at the expense of one of Crony Capital's shiniest knights in the service of language.

New clauses in thebogword'Z Online Contract ensure that freelancers will toe the line and show due deference or face painful bowel movements and other consequences as pins are inserted in voodoo dolls bearing their images:
7.14. For the duration of this Agreement and for a period of three years thereafter, you agree that you shall not publish or participate in any online or print media in which the content is abusive and/or defamatory and/or a parody of us and/or our officers and/or our employees and /or their families and/or in which you impersonate us and/or our officers and/or our employees and/or their families.
7.15: You acknowledge that a breach of the provisions in clause 7.14 would cause us irreparable injury for which we would not have an adequate remedy at law. In the event of a breach, you agree that we shall be entitled to injunctive relief in addition to any other remedies we may have at law or in equity.
Note carefully the words
"... you agree that you shall not ... participate in any online ... media in which the content is ... a parody of us and/or our officers and/or our employees ..."
You have been warned. If you are a current or future contractor of thebogword, alias thepigturd, cosa nostra or The Big Word, you must not participate in this medium or any other in which unsanctified content appears. Violators will face appropriate divine retribution. You must particularly refrain from posting any comments on this blog in agreement or dispute, even in the defense of porcine persons, natural or legal, and their right to leave their digestive end products in places of their choice in language professions. Nor are you permitted to participate in online venues such as Facebook, ProZ.com, Translators Café or elsewhere, which have been or shall be identified as containing content in violation of Section 7.14 of the Online Contract.

In so doing you also grant your implicit consent pursuant to the spirit of Section 7.14 OC i.c.w. the will of the Privy Council of thepigturd, to wit that you grant your full and voluntary consent in accordance with capital Principles of Capital and in acknowledgement of the social engineering contributions of theturd and its frustrated, underpaid Executive Excellency, and surrender cheerfully from this day forth your virgin person or a suitable substitute, male or female, not to exceed eleven years in age, to serve the Higher Corporate Power of theturd in ways necessary for consideration over and above the value added in HAMPsTr processes and other instruments of its profit and control.

These measures are the just and necessary sequelae of the continued unfairness of no remedy at law for a sense of humor not shared by those of the porcine persuasion. Further steps will also include another 15% reduction in your language service fees to fund the next executive bonus required to soothe a bruised ego.

Mar 11, 2014

Across: The Great Divide



A leading figure in the international translation technology world once remarked to me that when translation agency principals get together they gripe about SDL, but when tool vendors get together, Across is the subject of complaint. I can only imagine that to be because translation agencies are usually smart enough to stay away from Across altogether and know little of the horrors awaiting behind its virtual barbed wire. The philosophy and implementation of Across is like a virtual gulag for translators and data; adopt this ill-considered solution and let the software's developing perpetrator, Nero, fiddle while your business burns.

In the 1980s I worked at the research center of a major international enterprise and saw the terrible economic consequences of a proprietary laboratory information management system (LIMS) which led to the loss of years' worth of data because the data could not be migrated after the software provider failed. So I was absolutely astounded several years ago to hear an Across representative at LocWorld in Berlin defend the company's "unique selling point" of incompatibility as a "security measure" that corporate clients appreciate. Not smart corporate clients with a future, certainly. Without even trying hard I can come up with half a dozen means of five-finger discounting the "language assets" stored on an Across server; what I can't do is suggest convenient ways for someone burdened with an Across server to work with the majority of linguists who refuse to have anything to do with what one agency owner described as "the only CAT tool that pretty much guarantees you under 2k words per day."

I looked at Across myself some time ago and was shocked by its appalling ergonomics, which complicate the work of translators to an extreme degree compared to popular and interoperable solutions like SDL Trados Studio, WordFast, OmegaT, memoQ and others. Across is a trap which offers its users no real advantage and a host of liabilities for data management and work planning. The "advantages" are entirely for the tool provider because of client lock-in and the inability of Across users to migrate easily to a better solution which allows cooperation with a wider spectrum of qualified service providers instead of merely those desperate enough to sacrifice themselves for the crusts to be had with this bottom-tier solution.

Although I'm known for my personal preference for memoQ for the kinds of work I do, I am familiar with quite a range of tools, and I can endorse any tool with reasonable ergonomics produced by a stable team with good support and a commitment to interoperable data standards. With a clear conscience I might support the use of WordFast, OmegaT, SDL Trados, Ontram, STAR Transit, memSource, memoQ and others depending on the particular needs of a situation, because I know that the client will not be locked in and will have viable options to work with qualified service providers who may have optimized workflows involving other tools. Unfortunately, this is not the case with Across.

A recommendation of Across by me would be a deeply hostile and unethical act, in principle a statement that I wish the client to be locked in to an inefficient, costly platform that will give the advantage to competitors with more flexible means of work. And fortunately, I cannot think of anyone deserving of such a harsh sentence as Across.

Creator Antony Stanley. This file is licensed under the Creative Commons Attribution-Share Alike 2.0 Generic license.

[Update from the 2015 Jaba Partner Summit:
There may be important changes ahead. There may be cracks in this virtual Berlin Wall and perhaps a hope for the future for some of those now so cruelly shackled by the deluded "security" philosophy of Across. Time will tell, and I will be very pleased if action is taken to implement the words I heard yesterday from an Across representative.
]

Mar 8, 2014

The carnival is over. The MpT emperor still has no clothes.

The late Miguel Llorens once commented about David Grunwald, machine pseudo-translation (MpT) developer and advocate and owner of GTS Global Translations:
"... I disagree with Mr. Grunwald about most things. His ideas about translation as a commodity are depressing and I wouldn’t work for him unless something with a bit more dignity—such as “circus freak”—weren’t a viable career option (for whatever reason)."
However, Miguel also respected the man's efforts to expose the sleazy scam of a Canadian translation technology company called Ortsbo a few years ago. I also find many of Mr. Grunwald's views troubling, particularly statements that "one translator is easily replaceable with another" based upon a long string of unsupportable suppositions. His company's blog often contains interesting and useful insights into events, actors and issues of interest to translators but his obsession with machine pseudo-translation (MpT) and fanatical devotion to the ideology of commoditization over the years wore me out, and I prefer not to spend my energies contemplating the campaigns of one who seems to be on a personal mission as a mental battering ram directed against individuals who are professional language service providers.

So I was surprised when I found his recent guest post on the TAUS blog, which is too often a semi-coherent organ for the hucksters in the MpT carnival. It's more or less what I've expected to hear for a while and many of his points can be clearly picked out of arguments that Kirti Vashee and others make (and which are often overlooked or contradicted by their sources on other occasions). But Mr. Grunwald's piece strikes me as the clearest, most comprehensive and honest statement of current art that I've heard from the MpT camp so far.

I'm not quite the enemy of automation and machine pseudo-translation that some take me for. I am simply against lies, liars and (un)professional abuse in forms such as the human-assisted machine pseudo-translation (HAMPsTr) processes that so many piratical organizations and their enablers push. There are clear cases where automated translation processes offer value, but damned few of these have anything to do with my fields and level of work, and the attempts of SDL and other organizations to pretend otherwise are dishonest and/or deluded at best.

Read the TAUS post and think about it. You might wonder why an MpT advocate would make such unambiguous admissions. Well, unlike some in that camp, Mr. Grunwald never struck me as dishonest, merely as one who inhabited a stratum of the barrel where translators are perhaps indeed interchangeable. He clearly has a good mind, a sense of ethics which seems sound enough in most respects and perhaps a little taste for shaking things up. But as he points out, the money has gone elsewhere now.
"The VCs have rendered their decision: MT is out, human translation is in. In the last 2-3 year a number of venture capital companies have poured millions into companies that develop human translation automation platforms."
And
"Post-edited MT is not as good as from-scratch. Everyone has heard the ‘you get 2 out of 3’ saying. When you deliver post-edited translations, it will be cheap and fast, but will not be (as) good."
The whole MT carnival for years has reminded me of The Great Y2K Scam aka The Last Hurrah of Cobol Programmers. Grab the cash as fast as you can as long as the suckers leave it on the table. Things did not change much a few years ago in a technical sense when MpT became all the rage in the bottom tiers. What did change was the perception that there was money to be made, a fix of VC heroin to dream of language automation at least, and those in the pay of special interests began to brand skeptics like Mr. Llorens as "haters and naysayers" and worse.

Now the money has gone away; it's time to wake up and face reality - or the latest deceptions.

Mar 5, 2014

Spanish/English language service education online: Q&A with Judy Jenner

When I heard a few years ago that colleague Judy Jenner, who lives in Nevada (USA), had begun teaching courses at a California university, I pitied her for the commute. Then I heard that it was all done online, and while I remember some good programming courses I took that way ages ago, I was curious how this would apply to translation and interpreting. Since then I have met other professionals bringing their experience to university education this way and seen excellent online education in the Portuguese university system, so when Judy mentioned a few days ago that she would be offering two courses again this spring, I asked to see her syllabus to get a better understanding of her approach to education. That led to more questions... and answers.

(KSL) Judy, thank you for sharing the syllabi for your two courses offered this term through UC San Diego Extension. I have a great personal interest in how effective distance education is conducted, and the syllabus organization makes it clear that these are structured courses with clear objectives for those interested in Spanish/English language service careers. Could you tell me how you became involved with the UCSD Extension program?
(JJ) My pleasure. The university’s Extension program contacted me a few years ago and asked me to serve on their advisory committee for the certificate in English/Spanish translation and interpretation. I gladly accepted and helped them shape their program a bit. Following that pro bono work, the university asked me to teach a class. I was initially a bit hesitant, as I really enjoy giving one-day workshops for fellow professionals, but wasn’t sure about teaching beginners. However, I immediately fell in love with teaching and have gradually added two more classes. The positive feedback from students has been overwhelming, and I really enjoy helping educate the next generation of translators and interpreters.
(KSL) What is the typical profile of a student in your course? For whom would these courses not be suited? Do you have students outside the US?
(JJ) Anecdotally, I’d say that the vast majority of my students already have a bachelor’s degree and many hold a graduate degree as well. I’ve had many students looking for a career change and others who are young and looking for an education in T&I from a well-known university. I’ve even had a college professor of Spanish take my class, and he really enjoyed it. I also get many heritage Spanish speakers. These students are first-generation Americans with Hispanic parents who grew up in the US and spoke Spanish at home but have no formal education in Spanish. Many students do think that being bilingual means that they are automatically a translator or interpreter, and I go to great lengths to clear up that misconception (and many others). Some students have called my class “reality check,” and I think that describes it well. It’s my job to tell students about the realities of our profession without sugar-coating anything, and if a student finishes my class thinking she’s not yet qualified to sell her language services, then I think that’s also a perfectly good outcome. I think it’s crucial to prepare students for the realities of the marketplace, which T&I programs have traditionally not done. I also do have a lot of students from outside the US, including Spain, Bolivia, Argentina, Italy, Switzerland, etc. I love reading about my students’ lives and their adventures. I have a student who is currently teaching English in Russia.
(KSL) The syllabi mention the "Blackboard" system. What is this exactly? Are you familiar with other course management systems, and if so, how do these compare? Are your lectures live or recorded, and how are assignments handled?
(JJ) Blackboard is a powerful online learning system that’s widely used by leading universities around the world. It’s entirely web-based and houses all the lectures, discussion groups, grades and exams and quizzes. I’ve worked with other online learning systems, and I think Blackboard is probably the best system, as it is quite intuitive and user-friendly. However, just like every piece of software, it has some flaws. The lectures consist of pre-recorded PowerPoint presentations with audio that include exercises, lots of graphics and pictures, etc. Students have weekly deadlines that they must meet, and everything is handled through the Blackboard system.
(KSL) Are these courses part of a degree or certificate program? What other related offerings does UCSD Extension have?
(JJ) Yes, these courses are part of the online certificate in translation/interpretation. There are two certificates: one for translation (http://extension.ucsd.edu/programs/index.cfm?vAction=certDetail&vCertificateID=174) and one for translation and interpretation (http://extension.ucsd.edu/programs/index.cfm?vAction=certDetail&vCertificateID=83). Most of the classes, if not all of them, are offered online. Other classes are held at the La Jolla campus (which is quite gorgeous). However, students can just take individual classes as they wish and don’t necessarily have to work towards the certificate.
(KSL) What are the advantages and disadvantages you find in this type of instruction? Do you feel there are particularly important aspects in organizing such a course successfully?
(JJ) Here in the US, it’s quite a challenge to get a university education in translation and interpreting, which certainly isn’t the case in Europe. However, in recent years, many universities have started online programs, which helps fill a void. While these are not full degree programs, at least these are certificate programs that teach students the basics. The more people we can educate about our industry and its challenges and rewards, the better. However, I’d like to point out that I think it’s essential to look for a good university – preferably a bricks-and-mortar university that has added an online component rather than an online-only program, as the quality of instruction at some of these programs can be sub-par at best. Online instruction removes the geographical barrier and lets students do the work on their schedule. I really don’t think my translation students are missing out on anything they would get in a traditional classroom. I still answer questions in a very timely fashion, interact with students on a daily basis, and grade all their work. For the interpretation class, I think online education might be a bit better because few universities in the US have the simultaneous interpreting labs that are needed to properly practice simultaneous interpretation. Without these labs – and I have been to many workshops and classes where we have had to do this – the instructor just plays an audio file and all students interpret at the same time and record their performances using their iPhone (or similar). This is obviously not ideal as it gets quite noisy. With the online interpreting exercises, I read dozens of prepared speeches every class and students get to interpret them at home with a headset in a quiet environment. They can interpret the same files over and over and get practice that way. 
Of course they don’t get immediate feedback because I am not there with them, but in my experience, even when I attend week-long training sessions (I went to the highly regarded Monterey Institute of International Studies last year), you also get limited feedback, as class sizes tend to be large. Of course there are drawbacks as well because there’s nothing quite like meeting students in person. Luckily, I have met many of them at conferences, which is lovely. Unfortunately, I usually do not write letters of recommendation – I feel like I really need to spend time with someone before I can recommend them for graduate school, a job, etc.
(KSL) What is the deadline to sign up for your next courses?
(JJ) Introduction to Translation (five weeks) starts again on April 1, and you can sign up until April 1, although the class does tend to fill up. Introduction to Interpretation starts on May 6, and my brand-new class, Branding and Marketing for Translators and Interpreters, starts April 1 as well.
An indexed, recorded course lecture in the Blackboard environment. A lot easier to follow than typical "webinar" formats.
*********** 

Judy Jenner is a court-certified interpreter (Spanish and German) and legal and business translator in Las Vegas, NV. She runs Twin Translations with her twin sister, Dagmar. They are the authors of the industry book “The Entrepreneurial Linguist: The Business-School Approach to Freelance Translation,” which has sold more than 3,000 copies and is required reading at universities around the world. Judy pens the monthly Entrepreneurial Linguist column for the American Translators Association’s Chronicle and is a regular contributor to the Institute of Translation and Interpreting’s Bulletin. Judy was born in Austria, grew up in Mexico City and has lived in the US since she was a teenager. She is the immediate past president of the Nevada Interpreters and Translators Association and a frequent keynote speaker at conferences and workshops in many countries, including Brazil, Ireland, the UK, Austria and Germany. In addition to her translation and interpreting work, Judy is an adjunct in UC San Diego Extension’s online certificate program in translation and interpretation. She blogs at www.translationtimes.com and she’s on Twitter (@language_news).

Mar 4, 2014

March translators' meetup in Berlin, 6 March 2014



Dear all,

Here is the invitation to the next translators' meetup on:

            Thursday, 6 March 2014, from 8:00 p.m.

As last time, we are heading to the Reuterkiez, namely to:

            Chelany
            Friedelstraße 41
            12047 Berlin (Neukölln)
            U-Bahn: Schönleinstraße

At Chelany you can expect Pakistani (one could also say Indian) cuisine, with chicken dishes and pakoras (deep-fried vegetables), for example.


See you the day after tomorrow!
Andreas