Dec 31, 2009

Trados Classic as the fulcrum of collaboration

ful·crum (ˈfu̇l-krəm, ˈfəl-), n.
plural: fulcrums or ful·cra \-krə\
etymology: Late Latin, from Latin, bedpost, from fulcire to prop
1 (a): prop; specifically : the support about which a lever turns (b): one that supplies capability for action
2 : a part of an animal that serves as a hinge or support (also applies, given what a beast Trados is!)

Users of other translation environment tools often become irritated when Trados is referred to as a "standard". It is certainly not one in an official sense; no national or international body has recommended compliance with its formats and protocols. Yet as an early entrant to the field of machine-assisted human translation tools backed by ruthless marketing, for many it became a de facto standard, and in seeking to gain acceptance on the market, vendors of many better alternative tools adopted Trados file formats or enabled their users to work with them in some way.

I suppose that in ten years' time formats like TTX and the marked-up "bilingual" format of files processed with the Trados macros in Microsoft Word or some equivalent will be rare or non-existent. However, with other tools such as Wordfast Classic or Anaphraseus using this markup as a primary format and MemoQ and Déjà Vu X supporting it as an exchange format for compatibility and collaboration, the engine driving the persistence of Trados formats is no longer entirely of the original provider's making. So for a number of years yet, I think it will be important to understand the role that Trados data formats can play in information exchange and collaboration in translation projects. My comments below in this post can be understood primarily as tips and instructions for people who use the old Trados (version 8.3 or earlier) as a tool for a project where there is the intent to outsource some of that project's content to translators who use some tool other than Trados. Please note that I am describing only scenarios with which I am familiar; if there are important differences that apply to different tools, I encourage more knowledgeable persons to enlighten me and others in the comments.

The first thing an outsourcer must understand and decide is whether the "compatibility" that a translator offers by working in a tool other than Trados and exporting "Trados-compatible" files (bilingual files, TMs and terminology resources) is really sufficient. The answers to this question can vary a lot.
  • If you insist on a Trados project using a Trados TM server (TM Anywhere technology), especially where multiple translators must be coordinated, you are probably aware that there is really no substitute for the translator working in Trados itself in some way. None of the major TEnT vendors' servers are accessible by clients from other providers as far as I know. So here the translator will have to bend his or her knee and kiss the pope's ring or just skip the job. If you as an outsourcer are willing to accept compromises or the translator will be working alone (so you can perhaps provide a TM export and the translator won't miss out on contributions from others in a project), then keep reading for more options. I have a client in Switzerland that likes translators to work off their Trados server online. Although I have the technical ability to do this, I refuse to work this way due to years of bad experience with "enterprise technologies" from Trados. (TeamWorks put such a bad taste in my mouth that it would take an Act of God to make me willing to use such things from that source again.) So this client kindly provides TM exports with a password for access. For "confidentiality reasons" and because of contractual obligations to the end client (I am told) they cannot give me the password so I can use this database. However, this is a virtual fig leaf, because they know that I can find out the TWB password in seconds using TMPwdRec.exe from Kakeeware. It works with Trados versions through 8.3 (despite the fact that the highest version mentioned on the linked web page is 6.5).
  • If your source file is fairly complex or maximum leverage (i.e. highest quality matching) is very important in later projects, then you want to have your files pre-processed with Trados. How depends on your needs and workflows. As many who have moved from processing MS Word files with the macros in Word to translation of these files in TagEditor have learned, what should be a 100% match from the TWB TM often is not; the same issue will often be found if you get a TMX file from OmegaT, MemoQ or some other tool. Your translator may or may not have a copy of Trados to do this, but if you have special segmentation definitions or other unusual circumstances, you might want to prepare the files yourself. Also make it clear to the translator whether it is "allowed" to change your segmentation. I always assumed that combining bad segments and occasional adjacent ones in order to create more sensible and/or better content and avoid nonsense in the TM was desirable until I encountered a colleague in Colorado whose projects often demand such extreme levels of automation that he expects translators to change absolutely nothing with the default segmentation. This extreme attitude is an unfortunate byproduct of working with primitive technologies like Trados; with TM-driven segmentation like you'll find in MemoQ this is no longer an issue, as the best match can be created dynamically. Since my colleague is a smarter guy than I am, I assume he's already moved on to something better or will do so at some point.
    If you will prepare the Trados files for your translator, so-called presegmentation of the files will generally be necessary (unless the translator uses something like Wordfast Classic or Anaphraseus, which more or less follow the old Trados macro rules for segmenting as they work). Find out whether the translator wants files where the target segments are populated with an exact copy of the source (necessary at the current time for OmegaT as I understand it) or whether fuzzy match content - where present - should be written to the target. The latter option is best if the translator's software can handle it, and it will usually save the most time. If your translator does not have a licensed copy of Trados, you should also be kind enough to export the TM, usually to TMX 1.4 instead of the Trados TXT format, so that the translator can use it for concordance lookups. Instructions on how to perform this presegmentation procedure using the Workbench Translate function and particular settings for copying the source to target on no match will be found in my old published instructions for processing Trados RTF and Word projects with Déjà Vu or the information on handling TTX files with Déjà Vu. The preparation is generally the same for software other than DVX. (Both instruction sets are long overdue for an upgrade, but are still OK for orientation purposes. There is also a lot more information to be found in online forums and the Yahoogroups lists for various tools.)
  • If all you care about is getting a good translation and having something for your Trados TM that will usually give you reasonable matches and enable concordance use, then TMX or bilingual exports from tools like DVX or MemoQ are generally more than adequate. In fact, in some cases, this is the only way that an outsourcer can access Trados project content in Trados. I have one customer, a great EN>DE translator and Class A editor who subcontracts a lot of her DE>EN work to us. She works with Trados and expects to receive TM material for her Workbench TMs as part of the deliveries. However, many of her projects, including the MS Word files, require the use of TagEditor, and she has an old version of Trados which cannot handle most of these files in TagEditor. So we do the work in DVX or MemoQ and send her a bilingual RTF or MS Word file to clean, even if the source format is InDesign, Excel, PowerPoint, XML or something else. She can access the information in her concordance and she's happy. I like the true joke that, in many instances, third-party tools are more compatible with Trados than Trados itself. I have seen many examples of this. Even die-hard Trados users would be well-advised to keep licensed or unlicensed versions of a tool like MemoQ around to iron out such circumstances or to send a translation of an InDesign file to a translator with Trados who refuses to use TagEditor. Or to deal with cases where there are a lot of numbers and dates to fix, as noted in an earlier post.
  • Exchange of terminology data poses its own challenges at times and should probably be handled in a separate post. However, in my experience, there are few outsourcers who make sophisticated use of MultiTerm, and terminologies are usually maintained and exchanged in another format. There are, however, a number of good methods for receiving and sending terms, and the only scenario I see as an insurmountable problem for tools other than Trados is one where dynamic availability of terms via an online server is important. Those cases are rare.
Except for the first case cited - translation projects that use an online Trados server - outsourcers really can be confident that "Trados jobs" done with a third-party tool really are 100% compatible with their processes. This is especially the case if an actual copy of Trados (licensed or demo) is used for pre- and post-processing steps. With regard to segmentation issues, there are possibilities for "improvement" in some cases, which have been discussed for TTX files in an earlier post ("Crossing Segment Boundaries").
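
For anyone who has never looked inside one of the "uncleaned" bilingual files mentioned above, here is a minimal Python sketch of how source, match value and target can be pulled out of the text. It assumes the commonly seen delimiter pattern `{0>source<}match{>target<0}`; the function name and the sample sentences are my own inventions for illustration, and real files carry additional formatting (hidden text, styles) that this ignores.

```python
import re

# Assumed delimiter pattern of an uncleaned Trados bilingual segment:
#   {0>source text<}match percent{>target text<0}
SEGMENT = re.compile(r"\{0>(.*?)<\}(\d+)\{>(.*?)<0\}", re.DOTALL)


def extract_segments(text):
    """Return a list of (source, match_percent, target) tuples."""
    return [(src, int(match), tgt) for src, match, tgt in SEGMENT.findall(text)]


# Invented sample content, not taken from any real project
sample = (
    "{0>Guten Morgen.<}100{>Good morning.<0} "
    "{0>Wie geht es?<}0{>How are you?<0}"
)

for source, match, target in extract_segments(sample):
    print(match, source, "->", target)
```

Something like this is handy for a quick sanity check of a delivery (are all segments present, are any targets empty?) before the file goes to the outsourcer for cleaning.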

As with any project, it's also important to remember to provide copies of the source material as a PDF where possible, so that the formatting and the purpose of mysterious tags/codes can be understood better and errors avoided. But this is true for any project, not just ones involving a mix and match of tools. Yet I am continually surprised by how many experienced project managers and translation consumers forget this basic principle.

Dec 29, 2009

Translating Trados TTX files with MemoQ

Quite some time ago, I summarized the techniques for translating TTX files with Atril's Déjà Vu X and published the information as a PDF file, which is available as a free download from one of my web sites (by clicking the link earlier in this sentence, for example). Now I would like to present how this task might be approached using Kilgray's MemoQ.

First of all, let's consider why you might want to do this at all. I think a typical situation might be where your customer insists on a TTX file as the deliverable translation. Another case might be where the files to be translated need to be pre-processed with TagEditor to reduce import time (like when MS Word files are heavily laden with graphics). I often use TagEditor to pre-process jobs I translate with Déjà Vu, and while MemoQ's import capabilities are generally better than those of DVX (and MemoQ sometimes handles files that TagEditor can't), sometimes everything just runs faster and better if I make a TTX to import into MemoQ.

You can argue about TMX compatibility with the customer all you like, but in many cases too much leverage is lost for future work if you deliver anything except a properly translated TTX file. So don't argue, just do it. In most cases you will want to pre-segment the file. The procedure for doing this is described in Steps 0 & 1 of the Trados TTX in DVX instructions previously mentioned. If you do not have access to a Trados license to use a TM provided and get the best leverage, you should ask your client or a colleague with a Trados license to assist you in the presegmentation (the latter only if the client's confidentiality rules permit).

When you are ready to import the TTX file into MemoQ, there are two options in the Project Wizard or in the Project Manager window: "Add document" and "Add document as...". The former is an import routine that simply brings in the segmented content. The second option opens the document import settings window.

Selecting the option to import unsegmented content in the document import settings window will cause numbers and dates, which are usually skipped by Trados, to be imported into the MemoQ project. I don't know of any other software that will do this currently. This is very helpful for technical or financial documents with tables of numbers to be corrected. It is not easy to find all this content in the TagEditor environment, so in this regard, this aspect of quality assurance is a lot easier for a Trados project if it is done in MemoQ.

Quick minds may have realized at this point that with this second import option, all the content of an unsegmented TTX file can be imported. While this is indeed possible, it's usually not a great idea, and it may upset the customer in many cases. This is because the TTX file cannot be "cleaned" to transfer the data into the Trados translation memory. However, a target file can be saved from it. If for some reason you translate a TTX without segmenting it, the TM information is transferable to the client as a bilingual file (Trados-compatible Word document) or a TMX export from the MemoQ TM, but this is a rotten idea for all but the simplest files, because the segment leverage of the content imported into the Trados TM will probably be awful. You are much better off getting someone to segment the original TTX for you and "retranslating" it from the TM in MemoQ. The only time I would translate an unsegmented TTX myself is if I am using that format for expedience in the case of a huge Word file full of graphics or something similar.
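
If you are not sure whether a TTX you have been handed is already segmented, a quick look at its XML will tell you. The following Python sketch counts translation units; it assumes that segmented content is wrapped in `Tu` elements with two `Tuv` children (source and target), which is how the TTX files I have seen are built. The function name and the tiny samples are invented for illustration.

```python
import xml.etree.ElementTree as ET


def count_segments(ttx_xml):
    """Return (number of Tu elements, number of Tu elements with two Tuv children)."""
    root = ET.fromstring(ttx_xml)
    units = list(root.iter("Tu"))
    bilingual = sum(1 for tu in units if len(tu.findall("Tuv")) == 2)
    return len(units), bilingual


# Invented miniature samples; real TTX files contain far more markup
segmented = """<TRADOStag Version="2.0">
  <Body><Raw>
    <Tu MatchPercent="0"><Tuv Lang="DE-DE">Hallo Welt.</Tuv><Tuv Lang="EN-US">Hallo Welt.</Tuv></Tu>
    12.345,67
  </Raw></Body>
</TRADOStag>"""

unsegmented = """<TRADOStag Version="2.0">
  <Body><Raw>Hallo Welt. 12.345,67</Raw></Body>
</TRADOStag>"""

print(count_segments(segmented))
print(count_segments(unsegmented))
```

Note that the number in the first sample sits outside any `Tu` element; that is the kind of content the unsegmented import option is meant to catch. For a file on disk, `ET.parse(path).getroot()` can be used in place of `ET.fromstring`.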

After the TTX has been imported into MemoQ, if you want to clear the target cells, then right-click anywhere in the translation work area and choose Clear Translations... from the context menu. If you want to clear only a specific range, select the beginning of that range, then shift-click at the end of the range to select all the cells in between. In either case, there are a number of options for what should be cleared (all translations, just unconfirmed segments, etc.). The way this option is implemented is less dangerous than the analogous function in DVX, where I have to remember to filter what I want to keep before clearing target cells. Filter functions can, of course, be applied in MemoQ too.

When you are done translating the (segmented) TTX file in MemoQ, your output is an "uncleaned" TTX file that contains both the source content and your translation. If you have a copy of TagEditor and the original file available, you can save a copy of the translated file in its original format by using the command File > Save Target As... in TagEditor. If you don't have the original file, you can't save a target file - your customer will have to do that.

If your customer has a translation memory relevant to your project, it should be exported from Trados Workbench in TMX 1.4 format and imported into a MemoQ database for concordance use. Please note that it is better to use these databases for the pretranslation/presegmentation step than to presegment the Trados file against an empty database (basically copying the source content 1:1 to the target) and then translate in MemoQ using the migrated TM content from Trados; the leverage will generally be higher (i.e. more and better matches).
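
To give an idea of what such a TMX export contains and what a concordance lookup does with it, here is a rough Python sketch. It relies only on the basic TMX structure (`tu` elements holding `tuv` elements with an `xml:lang` attribute and a `seg` child); the function name, the two-letter language matching and the sample content are my own assumptions for illustration, and this is of course not how MemoQ or any other tool implements concordance internally.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def concordance(tmx_xml, term, src_lang="de"):
    """Return (source, target) pairs whose source segment contains term."""
    hits = []
    root = ET.fromstring(tmx_xml)
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            # itertext() also collects text inside inline tags
            segs[lang[:2]] = "".join(seg.itertext()) if seg is not None else ""
        source = segs.get(src_lang, "")
        if term.lower() in source.lower():
            targets = [text for lang, text in segs.items() if lang != src_lang]
            hits.append((source, targets[0] if targets else ""))
    return hits


# Invented sample TMX, not from any real TM
sample = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="de-DE" datatype="plaintext"/>
  <body>
    <tu><tuv xml:lang="de-DE"><seg>Die Pumpe ist defekt.</seg></tuv>
        <tuv xml:lang="en-US"><seg>The pump is defective.</seg></tuv></tu>
    <tu><tuv xml:lang="de-DE"><seg>Bitte warten.</seg></tuv>
        <tuv xml:lang="en-US"><seg>Please wait.</seg></tuv></tu>
  </body>
</tmx>"""

for source, target in concordance(sample, "pumpe"):
    print(source, "->", target)
```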

This procedure is safe and 100% compatible with Trados. It can also be performed with the unlicensed version of MemoQ (MemoQ4Free) with the restrictions that apply to that product (only one file, no import to the TM, just to the termbase).

Dec 25, 2009

The new Déjà Vu

As the developers choose to classify it, it's only a new build of the current version 7.5 of Déjà Vu X. It was expected quite some time ago but was delayed for the testing and refinement of file filters among other issues.

I downloaded the update file (about 35 MB) from Atril's web site and ran the installation right away. As I expected, there were problems with the dongle drivers immediately thereafter, so that I was confronted with a dialog informing me that DVX would only run in demo mode. This is a common problem, which was dealt with as usual by rebooting and running the program to reinstall the drivers (path: C:\Program Files\ATRIL\Deja Vu X\Dongle\setupdrv.exe). Afterward, when I launched the application, I was pleased that my other settings, including recent projects, were all intact.

According to Atril's version history, the changes in the new build versus Build 303 are:

  • Added new filter for working with XLIFF files, including SDL Trados Studio 2009 SDLXLIFF
  • Added support for FrameMaker v9.0 in FrameMaker MIF filter
  • Added support for InDesign CS4 in InDesign INX filter
  • Microsoft Windows 7 officially supported
  • Microsoft Office 2010 (current at Beta 2) officially supported
  • Improvements in the RTF filter, including reduced extraneous codes and improved performance
  • Fixed issues with curly brackets in PO filter
  • Improvements in the XML filter, including better handling of large CDATA sections and improved performance
  • Improvements and fixes in the MIF filter, including better handling of index entries, footnotes and text insets
  • Various fixes to the SDL IDT filter
  • Fixed issues with exporting satellites
  • Improvements in number handling (particularly in Propagate) and case conversions
  • Fixed issues with filter on selection
  • Fixed various issues with mouse/keyboard focus
  • Fixed various issues with TMX import/export
  • Fixed various issues with MultiTerm import
  • Fixed various issues with TM import/export
  • Fixed various issues with TD import/export
  • Fixed issues with alternate portion handling when working with a separate edit area
  • Fixed issues with search and replace in TM and project

The points highlighted in red are ones that have particularly concerned me in my work; others will have a greater interest in other points, of course. I particularly look forward to seeing if the improvements in the RTF filter will eliminate the need to run Dave Turner's CodeZapper macro on almost every RTF or DOC file I translate with Déjà Vu. Also, the fact that all attempts to import MultiTerm data in the past year have failed has been very irritating. I look forward to testing the performance of the new InDesign filter; in prior versions the filter was vastly inferior to the one in SDL Trados TagEditor, and the best I worked with in most cases was Kilgray's filter for MemoQ.

As has always been the case so far, this upgrade is free to all registered users of DVX. Free upgrades forever aren't the smartest business model if you are trying to cover the cost of ongoing development and support, so I hope that changes at some point so we might see more frequent improvements to what is still in many respects the best translation environment tool (TEnT) option available for freelance translators and small agencies. When asked what I recommend these days, it's a hard call for me. Most of the time I recommend MemoQ now, because of the advanced features, momentum and support that product has as well as its affordable server capabilities, but for quite a number of project types I do frequently, Déjà Vu X remains a critical element. Right now it's very hard to state the best technical choice without knowing a lot about the asker's project mix, so my recommendation is usually based largely on support now. In that respect the team of Atril and PowerLing still has a lot of lost ground to recover.

Dec 22, 2009

The 2008 BDÜ rate survey

Last year's rate survey published by the German translators association BDÜ caused a bit of a stir; it was the first time that such information had been collected and published, and some looked at the published rates as being unrealistically high, while others considered them laughably low. In other words, the rate debates it inspired were no different than any others I've heard or read elsewhere.

This year's publication format is much nicer than the last one. The numbwits involved in last year's booklet were so afraid the data might be copied that they printed it on red paper. This made it very hard to read, and I was quite annoyed at the strain my eyes experienced when trying to read some of the interesting information at the back. This year, good sense prevailed, and the booklet was printed black on white with occasional bits of grey shading, the purpose of which escapes me (maybe it's explained somewhere - I haven't read everything yet).

What interested me was whether the reported rates were significantly different from last year. On the whole I would say that they are not. The number of respondents is slightly lower in most categories than last year (for my DE>EN pair at least), some rates are a bit higher, some a bit lower. No dramatic change in any direction, and I doubt that the changes found are statistically significant. So it would seem that, for German to English at least, the number of panicked translators slashing rates in anticipation of the End of the World (aka "the crisis") is balanced by those of us partying our way to Armageddon by raising rates (something it would seem daft not to do given the huge increases in utility costs and food in the past year).

This year's data included not only the averages but also the median values to give a better picture of the distributions. Personally I would like to see the raw data or at least some standard deviations. Then I can aim to become a "six sigma translator" :-)
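
Why the median matters is easy to show with a few lines of Python. The rates below are invented purely for illustration and are not BDÜ figures; the point is only that one high outlier pulls the average up while the median stays put, and that a standard deviation gives a feel for the spread.

```python
import statistics

# Invented hourly rates in euros, for illustration only (NOT survey data)
rates = [35, 40, 45, 50, 55, 60, 120]

mean = statistics.mean(rates)      # pulled upward by the single 120 outlier
median = statistics.median(rates)  # middle value of the sorted list
spread = statistics.stdev(rates)   # sample standard deviation

print(f"mean {mean:.2f}, median {median}, stdev {spread:.2f}")
```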

Here are some of the current rates reported for the DE<>EN combinations:

For those who want these data as word rates, go query Ms. Muzzi's Fee Wizard or do some word counts on a few of your own documents and figure it out. There are word rate data published by the BDÜ as well, but the number of respondents was much lower (5 for court work), so the data are less indicative I think. It's important to remember that these are average data from an ordinary range of presumably qualified translators. Although the BDÜ does require various types of translation credentials for membership, there are plenty of credentialed translators who would make much better gardeners and pastry chefs or something else. Anything but language service providers.

Averages and median values for hourly rates published for German to English range from 40 to a bit over 60 euros for all the categories. English to German is a bit less, probably due to the competition in Germany.

What's the situation for other language combinations? There is data for language pairs that do not include German, such as FR<>EN. The number of respondents for those combinations is low, but the numbers they report would probably cause some translators in the US and UK to respond with disbelief. On the whole, however, the data reported by the BDÜ is plausible and fits what I have seen in recent years. There is a huge range of rates in the real world, and it's as much a matter of marketing and customer service as it is linguistic skill.

Once again, the full booklet with all rate tables and other useful information can be ordered from the association at www.bdue.de. It should be noted that the booklet is in German; the tables above are my translations of an excerpt of the information.

Dec 19, 2009

Translation tool interoperability: Achieving more without the war

A Call to Armistice

Those involved for years with the language services industry have become accustomed to arguments about the best translation environment tools or related programs. To someone familiar with the IT scene for over three decades, these discussions have a very recognizable tone, one often found in the pitched battles between acolytes of IBM, DEC, Sun, Apple and a long list of software and hardware providers too numerous to list in a hefty telephone book. A bit of quiet reflection and a bit more well-grounded understanding then and now, however, lead to the same conclusion: there is no ideal, universal solution to be found anywhere. Some solutions are better on the average for most situations than others, but even the worst tools on offer probably have some scenario which they handle better than any others. Aside from human stubbornness and greed, a good reason why there are so many solutions available for computer-aided translation technology and other IT technologies is that nothing does everything well.

The IT departments of companies came to this realization long ago, not only for practical reasons, but also due to budget constraints. After IT had matured somewhat as a discipline, it was no longer cool to buy the whole package from Big Blue or another source if the mix-and-match approach could produce a better solution for less money. This led to the situation we have today of alliances between vendors co-promoting each other's products and ensuring reasonable degrees of compatibility and interfacing.

Providers of tools to the language services industry have by necessity worked with some common standards, albeit imperfectly in many cases, and have provided compatible or at least semi-compatible solutions for working with file formats from competitors. However, a true commitment to interoperability has not been apparent up to now; such work has typically been presented as a necessary evil if a client expects deliverables in a format out of the ordinary for the translator's preferred tool. There are, however, situations in which parts of a project simply work better with a certain tool and other parts are better done with other software. Sometimes these processing advantages for certain operations are great enough to justify the purchase of software licenses which one does not intend to use to the full extent. What these advantages are in specific situations will be described in later articles, and the judgment of whether they justify learning new software and possibly spending money is left to the reader, who is presumably a competent adult able to make independent decisions and accept responsibility for errors. Each project has its own unique set of criteria, and I make no claim that the techniques discussed will lead to an acceptable solution in every case. The information will be presented as food for thought, which may be of benefit in some circumstances, and discussion and amendment are encouraged.

I think the time is long overdue to call an end to CAT fights and encourage courtesy and cooperation between solution providers. I would go as far as to call for common interface standards for server communication to allow translators to do client projects on remote servers using the TEnT client of their choice. Unrealistic? I don't think so. Solutions like that are not uncommon in mature areas of IT. We need more than browser interfaces. I think that even with the discipline of common communication and exchange interfaces for TEnT data, there is a lot of value to be added by the individual providers such as Atril, Kilgray, SDL and a host of others as they optimize ergonomics and improve data management features.

MemoQ Fest 2010

Earlier this year, Kilgray held the first MemoQ Fest in Budapest after a year of rapid development and progress with the company's flagship application that brought it from the status of an interesting but somewhat impractical newcomer to the TEnT scene to a serious contender for a championship title. Before attending the event last April, I had tested MemoQ off and on for about a year, but I had not been satisfied that it would work for me. Then along came version 3.5 shortly before the conference, and an introduction to it as well as success stories from corporate, agency and freelance users finally pushed me to use the latest version for serious work, not just tests. And for the last seven months I have done just that.

On the whole I am very satisfied with MemoQ as readers of this blog have probably noted. There are some behaviors of the editor module which drive me nuts, and I am not very happy with the program's performance tuning when I use large databases, and I miss certain basic features that I have taken for granted with Déjà Vu X, but MemoQ has brought many unique capabilities to my business which were lacking in the tools I used otherwise (mostly DVX, Trados and Star Transit). These include filters for various formats that often proved better than those in SDL Trados or DVX, the ability to import unsegmented content from TTX files (or even do a TTX without segmentation), reasonably stable bilingual Trados exports (though perhaps in need of a few rules, like blocking or converting hard returns within segments), seriously cool server capabilities, relatively fast, trouble-free data import and export, TM-driven segmentation and more. Oh yes, and my favorite, if trivial, feature: the ability to customize and save keyboard configurations, so the ergonomics of my MemoQ installation are much like my DVX. No disorientation like I used to experience all the time when switching between DVX and Trados.

MemoQ Fest 2009 opened up more than just technical possibilities for me. It gave me several days of opportunity for private discussions with ordinary users, who shared personal stories of the excellent support they had received from the Kilgray team long before I ever heard of those guys. And of course we all had a lot of fun. Hard not to in a city as beautiful as Budapest when you get to spend the days around nice people with good attitudes and ideas.

So when I got the e-mail from Sandor Papp of Kilgray announcing the call for papers and registration for MemoQ Fest 2010, the decision to attend was pretty much a no-brainer. Especially given the impending release of MemoQ version 4. A rather awful personal schedule in the last few weeks has prevented me from attending the various overview webinars so far, but I know enough about the plans for version 4 to know that it will only improve my opinion of this software and the team behind it. The development and release schedule has in fact slipped a bit compared to announcements earlier this year. But not by much, really, and not without significant advance notice and good justifications. (At the same time various competitors have either released crap on schedule that should have been kept in development for another 6 months or passed a promised release deadline and said nothing at all.)

I'm looking forward to a few fun days with the Kilgray team and its fan base... uh, I mean customers... and the opportunity to learn a lot more about how to get the most out of my personal software investment as well as how clients of mine might benefit from developments in MemoQ Server technology. Since it's a particular interest of mine, I'm also considering a brief talk on tool interoperability for those who must in one way or another integrate MemoQ with other tools such as Trados, Star Transit or DVX in a project. With that in mind, I'll probably publish snippets of relevant information here or on my Facebook page (depending on which format proves more practical for organizing) as preparation. Stuff like the old instruction sets I wrote for editing Trados bilingual files with DVX, etc. If there are any particular things on your wish lists in this regard, let me know.