Translation Tribulations: 10/1/15

Oct 29, 2015

Revised target document workflows in SDL Trados Studio 2015 vs. memoQ

Yesterday I had an unexpected opportunity to see the new SDL implementation of the feature Kilgray introduced to memoQ two years ago, in which a revised target document (or some portion thereof) is re-imported to a translation project for purposes of updating the translation memory. Since my involvement with the concept and specification of this feature in memoQ, I have been expecting the competition to follow suit, since in principle at least, this is a useful feature which nearly everyone can use in several common scenarios.

The way in which SDL Trados Studio 2015 handles project updates with edited target documents appears very different from what memoQ does, so that one might easily think that the functions are different. And this is one of those rare instances where I have to give SDL credit for a smoother, more streamlined procedure less likely to cause confusion and frustration among users.

The positive difference starts with the choice of terminology in the command interface. SDL refers to a "target document" rather than a "monolingual document" - I think this is less ambiguous and less likely to confuse an average user. The fact that these updates are perhaps not supported for bilingual formats in memoQ is one of those nerdy details which will not interest most people, especially given that there is a stable, established update process for project updates using bilingual documents.

When the reviewed file to import is selected, the user has the option to go to the aligner and correct possible matching errors for the revised target document (desirable if, for example, edits might cause the segmentation to change), but the default is to go straight back to the working window for translation and editing, with the changes already shown in tracked changes mode. Very nice.

In memoQ, the trip through the aligner is mandatory, but for simple changes, this is usually not needed, so I like the fact that Studio 2015 offers this as an option. And in memoQ, several extra steps are needed to show the changes in tracked mode (redlined markup), with confusing traps in the interface along the way. In a recent blog post, I described how Kilgray's emphasis on commands and terms relevant only to server projects, with the usual tracked changes options a translator would want buried under the "Custom" command, causes many users to conclude that tracked changes simply do not work in memoQ, which is not true at all. You just have to run the evil interface gauntlet to get there.

Does this mean I think everyone should dump memoQ and start using SDL Trados Studio 2015? Heck no. There are many processes involved in successful translation work, and switching from one tool to another based on a single feature or just a few features is not particularly clever, no matter which way you go. (Except for "away from Across", which is always a good idea.) I am very pleased and encouraged by SDL's different approach to this feature, because it shows once again the importance of competition and different approaches to a problem. Ultimately, ergonomics and user experiences should determine the further development of a feature. In my opinion, memoQ usually has the edge here, but not always, and this is a case where improvements to this innovative feature which first appeared in memoQ could very well be inspired by SDL.

Oct 27, 2015

Beware the document Reimport trap in memoQ!

In between sneezes and hot shots of gingered lime tea I saw the Skype icon on my Windows task bar change to indicate a message. A distress call from a financial translator friend who had just received a new version of the Q3 report she was translating. memoQ has excellent version management features, which include a document-based pretranslation (X-Translate), which allows one to use a current or previous version of a translation to identify unchanged sections which have already been translated when the client sends a new version. This avoids potential confusion with undesired matches coming out of any of the many translation memories or LiveDocs corpora which might be attached to a project.
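The principle behind a document-based pretranslation can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how memoQ implements X-Translate: the function name `x_translate` and the representation of a document as a list of (source, target) pairs are assumptions made for this example, and the sample sentences are invented.

```python
from typing import List, Optional, Tuple

Segment = Tuple[str, Optional[str]]  # (source text, target text or None)


def x_translate(previous: List[Segment], new_sources: List[str]) -> List[Segment]:
    """Carry translations over from a previous document version.

    Hypothetical helper: only segments whose source text is identical
    in the old version are filled; everything else stays untranslated.
    """
    # The look-up table is built from the previous version only,
    # so no translation memory or corpus can contribute stray matches.
    known = {src: tgt for src, tgt in previous if tgt}
    return [(src, known.get(src)) for src in new_sources]


old_version = [
    ("Umsatz im dritten Quartal", "Revenue in the third quarter"),
    ("Ausblick", "Outlook"),
]
new_version = ["Umsatz im dritten Quartal", "Ausblick für 2016"]

for source, target in x_translate(old_version, new_version):
    print(source, "->", target if target else "[needs translation]")
```

Because the look-up uses nothing but the earlier version of the same document, the unchanged first segment is filled and the changed second one is left open for translation.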

This time, however, memoQ seemed to be getting weird on her, with error messages referring to ZIP archives and password protection. Her customer's file was not password protected, and as far as she knew, there was no ZIP archive anywhere in sight. She was dealing with "ordinary Word files". I have no idea what those are, but I hear about them often enough, and that is often where the trouble starts.

Last July I was teaching a week-long introductory course on memoQ in Lisbon, and when I wanted to show the course participants how this X-Translate feature worked, everyone ran into unexpected problems. When the feature was first introduced in memoQ, I noticed that the updates would work in any format. A translation which starts out as a script in a word processing file might later be updated as a set of presentation slides, and memoQ's document-based pretranslation did an excellent job of enabling me to focus quickly on the new material. It still does, but since the early days, some advocate of unintelligent programming decided that the filter used for the Reimport function to bring in the updated source text should assume that the source format was unchanged from the previous version rather than simply offer an appropriate filter for the current format. One must specify the filter to be used for an updated version if this assumption is not correct (as I also explained in my book New Beginnings with memoQ shortly after noticing this).

I can probably guess why this was done. With certain filters, the filter to use is not obvious from the extension (the multilingual delimited text filter, for example, if it is needed), or there may be a custom configuration of an "obvious" filter needed. In these cases, the assumption of using the last filter settings makes a lot of sense. However, if there is a change of format, where it is clear that the new filter should not apply, then some action should be taken other than a virtual assault on the user with mysterious error messages.
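The kind of logic I have in mind might look something like the following sketch. Everything in it is hypothetical: the function `choose_reimport_filter`, the filter names in the table and the decision rule itself are assumptions for illustration and have nothing to do with memoQ's actual code or filter identifiers.

```python
from pathlib import Path
from typing import Optional

# Illustrative defaults only; these are placeholder names, not memoQ filters.
DEFAULT_FILTERS = {
    ".docx": "word-2007-filter",
    ".doc": "word-binary-filter",
    ".rtf": "rtf-filter",
    ".pptx": "powerpoint-filter",
}


def choose_reimport_filter(
    previous_file: str, new_file: str, previous_filter: str
) -> Optional[str]:
    """Pick a filter for a new document version (hypothetical rule).

    Same extension: keep the previous (possibly customized) filter.
    Different extension: fall back to a default for the new format.
    Returns None when no default is known, i.e. the user should be asked.
    """
    old_ext = Path(previous_file).suffix.lower()
    new_ext = Path(new_file).suffix.lower()
    if old_ext == new_ext:
        return previous_filter
    return DEFAULT_FILTERS.get(new_ext)


print(choose_reimport_filter("q3_report.docx", "q3_report_v2.doc", "custom-docx"))
```

The point of the design is that the "reuse the last settings" assumption is kept for the cases where it makes sense, while a change of format leads either to a sensible default or to a plain question for the user rather than an error message.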

In the case of my financial translator friend, the update came as a DOC file, where the original had been DOCX. Geeks who have nothing better to learn with their time might know that DOCX files are actually renamed ZIP files, so at least the confusing error message above was "truthful" in a sense.
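For those who would rather check than guess, the difference between these formats can be seen in the first few bytes of a file, whatever its name claims. The following is a minimal sketch using only the Python standard library; the function name `sniff_word_format` is my own invention, and the check only identifies the container type, not whether the file is really a Word document.

```python
import zipfile

# First eight bytes of an OLE2 compound file, the container used by legacy DOC.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_word_format(path: str) -> str:
    """Guess the container format of a file from its content, not its name."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(OLE2_MAGIC):
        return "OLE2 compound file (e.g. legacy DOC)"
    if head.startswith(b"{\\rtf"):
        return "RTF"
    if zipfile.is_zipfile(path):
        return "ZIP container (e.g. DOCX)"
    return "unknown"


# Usage (the path is a placeholder):
# print(sniff_word_format("q3_report_v2.doc"))
```

A file that reports as an OLE2 compound file when the previous version was a ZIP container is exactly the situation in which the filter for the new version has to be chosen by hand.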

I see this sort of "switch hitting" with Microsoft Word file formats of various generations or changes from RTF to DOC or DOCX rather often. But in the case of importing new document versions, these changes mean trouble for memoQ if the user does not notice the difference, and given that the majority of working translators I have encountered who use Windows operating systems never fix the default system setting which hides the extensions of known file types, the chances that your average mortal wordworker will figure out this problem are just about zilch.

Armed with new insight into the problem, my friend was able to import the new document version successfully by specifying the appropriate filter manually and then use X-Translate to get her previous translation applied to sections of source text which had not changed (so that inappropriate 100% matches from a TM or LiveDocs corpus could be avoided). But for the future, I hope that Kilgray will apply a little more intelligent logic to the selection of filters for the document Reimport function of memoQ.

Oct 25, 2015

European Commission Workshop - Contracts for translation services

What the Linguistic Sausage Producers don't want you to know:

Did you know that tenders for work with the European Commission are not just for the big Wortwurstläden but can be submitted by individual translators who are EU citizens - and that these individuals have equal standing before the Directorate General for Translation? The DGT does not differentiate and many of its best external contractors are individuals, either self-employed persons or dynamic teams of two or three professionals.

The DGT uses taxpayers’ money and must be transparent, with fair and equal treatment for each candidate. Reading their specifications may appear daunting at first, but taking a closer look is worthwhile! Questions may be submitted and are answered during the weeks when the call for tender is open; this can be done in three languages, almost in real time, with all questions and replies made public on the DGT web site.

Quality pays and they will pay for quality: decisions are based on a quality/price ratio of 70/30, in favor of quality. For each job done, a quality note with feedback is sent to facilitate ongoing improvement.

But to get this far, you must first submit a persuasive offer to the selection board.

On November 28, 2015 from noon to 4 pm, IAPTI's UK chapter is hosting a workshop in Manchester (UK) to inform you of what it takes to tender and win at Europe's highest public level for translation. Profit from this important business event at yet another iconic venue! Registration information is available here.

The beautiful Manchester Central Library, venue for the EC tender workshop!

*******

The speaker: Monica Garcia-Soriano started her EU career as a lawyer linguist 24 years ago at the Court of Justice in Luxembourg. She later joined the Spanish Translation Unit at the European Commission in Brussels and for the last 8 years she has been in charge of procurement at the Commission's External Translation Unit.

Oct 15, 2015

The Invisible Hand of memoQ LiveDocs - making "broken" corpora work

Last month I published a post describing the "rules" for document visibility in the list of documents for a memoQ LiveDocs corpus. Further study has revealed that this is only part of the real story and is somewhat misleading.

I (wrongly) assumed that, in a LiveDocs corpus, if a document was visible in the list its content was available in concordance searches or the Translation Results pane, and if it was not shown in the list of documents for the corpus in the project, its content would not be available in the concordance or Translation Results pane. Both assumptions proved wrong in particular cases.

In the most recent versions of memoQ, for corpora created and indexed in those versions, all documents in a corpus shown in the list will be available in the concordance search and the Translation Results pane as expected. And the rules for what is currently shown in the list are described accurately in my previous post on this topic. However,

if there are documents in the corpus which share the same main language (as EN-US and EN-UK both share the main language, English) but are not shown in the list, these will still be used for matching in the memoQ Concordance and Translation Results and
if the corpus was created in an older version of memoQ (such as memoQ 2013R2), documents shown in the list of a corpus may in fact not show up in a Concordance search or in the Translation Results.

This second behavior - documents shown in the list but their content not appearing in searches - has been described to me recently by several people, but it could not be reproduced at first, so I thought they must be mistaken, and statements that "sometimes it works and sometimes it doesn't" made these pronouncements seem even more suspect. Except that they happen to be true and I now (sort of) understand why.

Prior to publishing my post to describe the rules governing the display of documents for a LiveDocs corpus in a project, I had been part of a somewhat confusing discussion with one of my favorite Kilgray experts, who mentioned monolingual "stub" documents a number of times as a possible solution to content availability in a corpus, but when I tried to test his suggestion and saw that the list of documents on display in the corpus had not expanded to include content I knew was there, I thought he was wrong. But actually, he was right; we were talking about two different things - visibility of a document versus availability of its content.

For purposes of this discussion, a stub document is a small file with content of no importance, added only to create the desired behavior in memoQ LiveDocs. It might be a little text file - "stubby.txt" - with any nonsense in it.

I went back to my test projects and corpora used to prepare the last article and found that in fact for the main languages in a project all the content was available from the corpora, regardless of whether the relevant documents were displayed in the list. In the case of a corpus not offered in the list for a project because of sublanguage mismatches in the source and target, adding a stub document with either a generic setting (DE, EN, PT, etc.) or sublanguage-specific setting for the source language or the correct sublanguage setting for the target (DE-CH, EN-US, etc.) made all the corpus content for the main languages available instantly. (In the project, documents added will have the project language settings; use the Resource Console for any other language settings you want.)
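The distinction between a main language and a sublanguage can be put in code terms as follows. This is merely a sketch of how such language codes relate to one another, with hypothetical helper names (`main_language`, `same_main_language`); it is not a model of memoQ's internal matching, which I can only observe from the outside.

```python
def main_language(code: str) -> str:
    """Return the main language of a code such as 'DE-CH' or 'EN' (here: 'DE', 'EN')."""
    return code.split("-")[0].upper()


def same_main_language(code_a: str, code_b: str) -> bool:
    """True if two language codes share the main language, whatever the variant."""
    return main_language(code_a) == main_language(code_b)


# Hypothetical project source language and some corpus document languages.
project_source = "DE-DE"
corpus_documents = {"contract.docx": "DE-CH", "stubby.txt": "DE", "relatorio.pdf": "PT-PT"}

for name, language in corpus_documents.items():
    print(name, language, same_main_language(project_source, language))
```

In my tests, the content which became available was that of the documents sharing the main languages of the project, which in this invented example would be the first two entries.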

Content of a test corpus before a stub document was added. Viewed in the Resource Console.

The test corpus with the document list shown in my project; only the stub document is displayed, but all the indexed content shown above is also available in the Concordance and Translation Results.

It is unfortunate that in the current versions of memoQ the document list for a corpus in a project may not correspond to its actual content for the main languages. Not only does this preclude accessing a document's content without a match or a search, it also means that binary documents (such as one of the PDF files shown in the list) cannot be opened from within the project. I hope this bug will be fixed soon.

Since a few of my friends, colleagues and clients were concerned about odd behavior involving older corpora, I decided to have a look at those as well. Kilgray Support had made a general recommendation of rebuilding these corpora or had at least suggested that problems might occur, so I was expecting something.

And I found it. Test corpora created in the older version of memoQ (2013 R2) behaved in a way similar to my tests with memoQ 2015 - although the "display rules" for documents in the list differed as I described in my previous blog post, the content of "hidden" documents was available in Concordance searches and in the Translation Results pane. But....

When I accessed these corpora created in memoQ 2013 R2 using memoQ 2015, even if I could see documents (for example, a monolingual source document with a generic setting), the content was available in neither the Concordance nor the Translation Results until I added an appropriate stub document under memoQ 2015. Then suddenly the index worked under memoQ 2015 and I could access all the content, regardless of whether the documents were displayed in the list. If I deleted the stub document, the content became inaccessible again.

So what should we do to make sure that all the content of our memoQ corpora is available for searches in the Concordance or matches in the Translation Results?

If you always work out of the same main source language (which in my case would be German or "DE", regardless of whether the variant is from Germany, Austria or Switzerland), then add a generic language stub document for your source language to all corpora - old and new - under memoQ 2015 and all will be well.

If your corpora will be used bidirectionally, then add a generic stub for both the source and target to those corpora or add a "bilingual stub" with generic settings for both languages. This will ensure that the content remains available if you want to use the corpora later in projects with the source and target reversed.

Although it's hard to understand the principles governing what is displayed, when and why, following the advice in the two preceding paragraphs will at least eliminate the problem of content not being available for pretranslation, concordance searches and translation grid matches. And the mystery of inconsistent behavior for older corpora appears to be solved. The cases where these older corpora have "worked" - i.e. their content has been accessible in the Concordance, etc. - are cases where new documents were added to them under recent versions of memoQ. If you just keep adding to your corpora, doing so particularly from a project with generic language settings, you'll not have to bother with stub documents and your content will be accessible.

And if Kilgray deals with that list bug so we actually see all the documents in a corpus which share the main languages of a project, including the binary ones, then I think the confusion among users will be reduced considerably.

Oct 9, 2015

Words in music: what vocabulary and languages tell us about leading musicians

Corpus linguistics has been a passion of mine since an article published by three colleagues about a decade ago showed me the possibilities of "mining" collections of text in a subject area to determine its critical vocabulary, what sorts of words belong together in specialist language, etc. The public webinar offered tomorrow by the International Association of Professional Translators and Interpreters (IAPTI) is a new take on this familiar subject, exploring the use of vocabulary by popular musicians and relating it to things like mastery of multiple languages or commercial success. A familiar subject, sort of, but nonetheless something completely different.

It is this kind of reimagination of things we "already" know which can open our minds to new possibilities of many kinds which are available now, but which nobody expects and therefore nobody sees. So I will enjoy hearing what Mr. Jewalikar has to say at 3 pm UTC tomorrow, October 10th, and look forward to the new ideas this may stir up in my head. And if you would like to join us in the presentation, you can register by sending a short e-mail request to info.request@iapti.org; there is no charge to participate.

This event is the sort of professional information with fresh perspectives and a clear focus on the needs and interests of individual professionals - not the linguistic sausagemakers and exploiters - that I have come to expect from this organization. While others sell out to the commercial interests of the bulk market bog to turn language professionals and aspiring professionals into "input" for their post-edited machine sausage of words, IAPTI and a very few other groups keep the focus on stimulating content of real professional worth.

Track changes in memoQ: misunderstandings and navigation

Although tracked changes have been part of memoQ since the distant days of memoQ 5.0, many users are still confused about how to use these features and how to navigate marked changes in a translation.

The confusion starts with the menu for activating the tracked changes, which in recent versions of memoQ is found on the Review ribbon. What most people do not realize is that the first two options - Against Last Received Version and Against Last Delivered Version - are not relevant to the usual workflows of an individual translator working in a local project created on his or her computer. Often I have caught myself selecting the option Against Last Delivered Version to show the tracked changes, wanting to compare against the last version I delivered to my client by exporting and e-mailing the document and forgetting that this option refers to the actual Deliver function in a server project.

If I am working locally in my own projects, the only track changes option that is relevant is Custom, with which I can show comparisons to specific minor versions:

In the present example, I've selected a comparison with a "snapshot" I made before an editing session. A snapshot creates a record of the status of a translation at a given time and makes rollbacks possible. Use the submenu of the Versions icon on the Documents ribbon to make a snapshot of your translation:

Once the tracking of changes for the current translation compared to a previous minor version has been activated, the relevant changes will be marked in red in the translation grid. If changes have been made to the source text (correcting OCR errors, for example, by editing the source text with F2), these will be shown as well.
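What such a comparison between a snapshot and the current text amounts to can be shown with Python's standard `difflib` module. This is a sketch of the general idea of a word-level redline, not of memoQ's comparison method; the function name `mark_changes` and the bracket notation for deletions and insertions are my own choices for the example.

```python
import difflib


def mark_changes(old: str, new: str) -> str:
    """Return `new` with word-level changes against `old` marked up.

    Deleted words appear as [-...-], inserted words as {+...+}.
    """
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words)
    parts = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op in ("delete", "replace"):
            parts.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if op in ("insert", "replace"):
            parts.append("{+" + " ".join(new_words[j1:j2]) + "+}")
        if op == "equal":
            parts.append(" ".join(new_words[j1:j2]))
    return " ".join(parts)


snapshot = "The report was published"
current = "The report was released"
print(mark_changes(snapshot, current))
```

The snapshot plays the role of the earlier minor version here: without some earlier state to compare against, there is nothing to mark.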

Changes can be rejected by choosing Revert To Earlier Version on the Review ribbon, in the context menu (right-click) or with the corresponding keyboard shortcut. Alternatively, a version of a target text not shown in the markup can be recalled with the Row History and restored by copying it from the dialog (Ctrl+C), pasting it into the target cell and editing out any extraneous information.

But how can one navigate many tracked changes in a larger document? Many users think this is not possible, though in fact it's rather simple with memoQ's filtering features.

Clicking the filter icon above the target text column opens a dialog to specify filter criteria for the working view. On the Status tab under Other properties... the option Change tracked can be selected to show only those segments with tracked changes.

Alternatively, the Go to next segment settings (Shift+Ctrl+G) can be configured in the same way with Change tracked on the Status tab, so choosing Go to next (Ctrl+G) or confirming a segment (if the option Automatically jump after confirmation is selected in the Go to next segment settings dialog) will take you to the next segment with tracked changes.