Dec 29, 2018

memoQ Terminology Extraction and Management

Recent versions of memoQ (8.4+) have seen quite a few significant improvements in recording and managing important terminology in translation and review projects. These include:
  • Easier inclusion of context examples for use (though this means that term information like source should be placed in the definition field so it is not accidentally lost)
  • Microsoft Excel import/export capabilities which include forbidden terminology marking with red text - very handy for term review workflows with colleagues and clients!
  • Improved stopword list management generally, and the inclusion of new basic stopword lists for Spanish, Hungarian, Portuguese and Russian
  • Prefix merging and hiding for extracted terms
  • Improved features for graphics in term entries - more formats and better portability
Since the introduction of direct keyboard shortcuts for writing to the first nine ranked term bases in a memoQ project (as part of the keyboard shortcuts overhaul in version 7.8), memoQ has offered perhaps the most powerful and flexible integrated term management capabilities of any translation environment despite some persistent shortcomings in its somewhat dated and rigid term model. But although I appreciate the ability of some other tools to create customized data structures that may better reflect sophisticated needs, nothing I have seen beats the ease of use and simple power of memoQ-managed terminology in practical, everyday project use.

An important part of that use throughout my nearly two decades of activity as a commercial translator has been the ability to examine collections of documents - including but not limited to those I am supposed to translate - to identify significant subject matter terminology in order to clarify these expressions with clients or coordinate their consistent translations with members of a project team. The introduction of the terminology extraction features in memoQ version 5 long ago was a significant boost to my personal productivity, but that prototype module remained unimproved for quite a long time, posing significant usability barriers for the average user.

Within the past year, those barriers have largely fallen, though sometimes in ways that may not be immediately obvious. And now practical examples to make the exploration of terminology more accessible to everyone have good ground in which to take root. So in two recent webinars, I shared my approach - in German and in English - to how I apply terminology extraction in various client projects or to assist colleagues. The German talk included some of the general advice on term management in memoQ which I shared in my talk last spring, Getting on Better Terms with memoQ. That talk included a discussion of term extraction (aka "term mining"), but more details are available here:


Due to unforeseen circumstances, I didn't make it to the office (where my notes were) to deliver the talk, so I forgot to show the convenience of access to the memoQ concordance search of translation memories and LiveDocs corpora during term extraction, which often greatly facilitates the identification of possible translations for a term candidate in an extraction session. This was covered in the German talk.

All my recent webinar recordings - and shorter videos, like playing multiple term bases in memoQ to best advantage - are best viewed directly on YouTube rather than in the embedded frames on my blog pages. This is because all of them since earlier in 2018 include time indexes that make it easier to navigate the content and review specific points rather than listen to long stretches of video and search for a long time to find some little thing. This is really quite a simple thing to do, as I pointed out in a blog post earlier this year, and it's really a shame that more of the often useful video content produced by individuals, associations and commercial companies to help translators is not indexed this way to make it more useful for learning.

There is still work to be done to improve term management and extraction in memoQ, of course. Some low-hanging fruit here might be expanded access to the memoQ web search feature in the term extraction as well as in other modules; this need can, of course, be covered very well by excellent third-party tools such as Michael Farrell's IntelliWebSearch. And the memoQ Concordance search is long overdue for an overhaul to allow proper filtering of concordance hits (by source, metadata, etc.), more targeted exploration of collocation proximities and more. But my observations of the progress made by the memoQ planning and development team in the past year give me confidence that many good things are ahead, and perhaps not so far away.

Dec 28, 2018

SDLTM to TMX conversion update (without SDL Trados Studio)

From time to time I hear from colleagues or read social media posts from people who have been given particular SDL reference resources to use (like an SDLTM file for translation memory or an SDLTB termbase file). But without a license to the corresponding SDL software, it can be troubling to deal with these formats and convert them into files which are easily imported into other environments, such as memoQ, WordFast, OmegaT or whatever.

There are a number of solutions to this problem published on YouTube, and memoQ Translation Technologies Ltd. (mQtech, the solution artists formerly known as "Kilgray") even has a nice help page in their knowledgebase, but all the solutions I have seen so far are either a bit dated, or they omit some information which might help to avoid problems.

The best free solution I have seen is the one described on the mQtech knowledgebase page, but the current description still leaves a few potential stumbling blocks.

There are two pieces to the solution:

Dec 22, 2018

Terminology extraction with memoQ and much more: the video

For about a year now, memoQ Translation Technologies Ltd. (formerly "Kilgray"; "mQtech" for short) has been introducing various helpful enhancements to the terminology options in the memoQ software, including some that simplify the exploration of terminology in large documents, translation memories and corpora and that also make it easier to manage the findings.

For me personally, the improved options for stopword lists and the ability to add context examples to term entries very easily are of the greatest importance. But the new export and import functions to and from Microsoft Excel, in conjunction with red text for forbidden terminology, are also very helpful for client-related processes (although the red text feature in memoQ versions 8.5 to 8.7.3 is still awaiting a fix for import!).

Since German-language information on using the more powerful aspects of memoQ is relatively thin on the ground, and some of the responsible association people remain rather under the influence of SDL*, I have taken the liberty of giving my German-speaking colleagues the following video on the extraction, use and maintenance of terminology (from yesterday's webinar) as a Christmas present:


So Merry Christmas, and happy working with what is currently the best CAT tool - after an appropriate break for the holidays, of course! Mulled wine is a must.


* They say that Mr. Putin took these relations as the model for his own with Donald Trump :-)

German commands for "Hey memoQ"


With memoQ version 8.7, memoQ Translation Technologies Ltd. (formerly "Kilgray"; "mQtech" below) introduced free, integrated speech recognition that promises a substantial gain in efficiency for work in many languages. For work in German, let it be clear from the outset: Dragon NaturallySpeaking (DNS) is and will remain the better choice for the foreseeable future. The same applies to all languages supported by DNS (in the current version 15): German, English, Spanish, French, Italian and Dutch.

But for the Slavic languages, Nordic languages, other Romance languages, Arabic and many more, other solutions are needed if one wants to work with speech recognition. About four years ago, when I began exploring such solutions, they were hardly intended for "exotic" languages like Russian or European Portuguese as part of translation work; today there are many half-decent options, which now include "Hey memoQ". We are still waiting for solutions at the level of DNS for the other languages, and we will surely wait a long time before good recognition quality with easily extendable vocabulary and flexible, configurable commands for system control is generally available for languages like Danish or Hindi. At present we are not even that far with English, if one considers, for example, the dictation function on mobile phones. Speech recognition without a vocabulary that users can extend themselves is and remains a technology on crutches.

But the crutches in Hey memoQ are not bad to begin with for an emerging technology. The app released with version 8.7 of memoQ is, in my opinion, still "beta" (what else can one say when control commands are configured by default only for English?), but given the state of currently affordable technology, the solution introduced by mQtech is the best in its class. It even offers a serviceable workaround for the problem of non-extendable vocabulary, namely the ability to insert by voice the first nine hits from the results list for terminology, corpus searches, non-translatables etc. into the target text. If you do sensible terminology work anyway and equip a memoQ glossary with the necessary special terms, you can already work quite well. (And anyone who may need an introduction to the statistically based capture of frequent terms from a document or a collection of documents can find information here.)

Hey memoQ also has other unique features, including a switch of the recognition language when the cursor is placed in the text field for the other working language. So if, for example, I am dictating English as the target text but want to correct a typo in the German source text or perhaps filter the entire text for a particular word in the source text, the language understood by Hey memoQ switches from English to German when I merely click on the source text side. It works this way for every supported language pair. Not bad.

Anyone already grumbling that this solution, currently based on Apple iOS, is not available for the popular Android phones simply does not grasp the reality of software development or product development. Even before mQtech began developing this solution, I had examined the possible application programming interfaces (APIs) myself for personal reasons, and with most of them the command control found in Hey memoQ was not available. In most cases there was only the transfer of a spoken and transcribed text. But we already have that. With myEcho, for example. Or the solution for Chrome speech recognition in any Windows or Linux program. What we urgently need is not yesterday's beer. We need forward-looking prototypes that drive the development of the industry's usual technologies such as memoQ, SDL Trados Studio, WordFast and others in a better direction, and Hey memoQ does just that. So high praise to the memoQ team and its German head of development :-)

But even with a German head of development, time pressure is sometimes such that no configured control commands are released for the time being with the first release version, probably because this actually takes more effort than most people would believe. In any language. Anyone who wants to dictate Polish, for example, and wants not only to have the spoken phrases transcribed into the text field but also to edit the text or execute filter commands or concordance searches by voice must first set up Polish commands in the program. And there one often runs unexpectedly into the limits and quirks of the particular recognition technology. A chosen phrase may, for example, resemble a very common phrase, so that this other text is often written when one actually wanted a command to be executed. So unusual but recognizable texts are often the best choice for command texts. My tested command texts for German are shown below as a screenshot. As you will notice right away, the configuration dialog is not yet localized for the German user interface. In coming versions that will be the case, of course. But whether the editing commands for Greek will ever come preconfigured from Hungary, I cannot guess. You can already configure them yourself today, though, if you have patience.

Also worth noting: iOS speech recognition requires good Internet bandwidth, since the recognition server is in the cloud. Data protection, data protection, yes, yes. Please spare me the lecture and let this technology develop further first. The questions about data protection were already answered adequately some years ago by German representatives of the company Nuance, and even the crazy US authorities have approved the use of such technology internally. But in Germany the world turns differently, and that's fine :-) Incidentally, I have more success when I speak in short, even dramatic phrases, and not in long, wordy sentences. A quite different approach is needed than with DNS, for example. Anyone who speaks too fast also quickly notices that words are omitted. That has nothing to do with Hey memoQ; it is part of the state of the art in iOS speech recognition as well as in some other technologies of this kind.

And now the view of my self-configured Hey memoQ control commands for German. Anyone who doesn't like my choice of words can pick something better, test it (I hope) and then share it with all German-speaking colleagues in the comments below.


I researched the iOS commands for punctuation and more on the basis of the Mac OS commands published by Apple; in individual cases there are slight differences (i.e. one has to experiment a little to hit upon the right command - if it actually exists), but this gives a good start for special characters etc., as I explained recently in an English blog post. mQtech cannot be blamed for missing information when not even the iOS maker Apple provides the complete and correct list. But with plenty of time the cake will surely bake well!

Dec 11, 2018

Your language in Hey memoQ: recognition information for speech

There are quite a number of issues facing memoQ users who wish to make use of the new speech recognition feature – Hey memoQ – released recently with memoQ version 8.7. Some of these are of a temporary nature (workarounds and efforts to deal with bugs or shortcomings in the current release which can reasonably be expected to change soon), others – like basic information on commands for iOS dictation and what options have been implemented for your language – might not be so easy to work out. My own research in this area for English, German and Portuguese has revealed a lot of errors in some of the information sources, so often I have to take what I find and try it out in chat dictation, e-mail messages or the Notes app (my favorite record-keeping tool for such things) on the iOS device. This is the "baseline" for evaluating how Hey memoQ should transcribe text in a given language.

But where do you find this information? One of the best ways might be a Google Advanced Search on Apple's support site. Like this one, for example:


The same search (or another) can be made by adding the site specification after your search terms in an ordinary Google search:


The results lists from these searches reveal quite a number of relevant articles about iOS dictation in English. And by hacking the URLs on certain pages and substituting the language code desired, one can get to the information page on commands available for that language. Examples include:
All the same page, with slightly modified URLs.

The Mac OS information pages are also a source of information on possible iOS commands that one might not find so easily otherwise. An English page with a lot of information on punctuation and symbols is here: https://support.apple.com/en-us/HT202584

The same information (if available) for other languages is found just by tweaking the URL:

and so on. Some guidance on Apple's choice of codes for language variants is here, but I often end up getting to where I want to go by guesswork. The Microsoft Azure page for speech API support might be more helpful to figure out how to tweak the Apple Support URLs.
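The URL tweak described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the English page address is the one given earlier, but the locale codes tried here (de-de, pt-pt, pt-br) are guesses of the kind mentioned above, and there is no guarantee that Apple publishes the page for any given code. The function name `localized_url` is made up for this example.

```python
import re

# English page cited in this post; the locale segment is "en-us".
BASE_URL = "https://support.apple.com/en-us/HT202584"


def localized_url(url: str, locale: str) -> str:
    """Swap the locale segment (e.g. 'en-us') of an Apple Support URL.

    URLs that do not match the expected pattern are returned unchanged.
    """
    pattern = r"(support\.apple\.com/)[a-z]{2}-[a-z]{2}(/)"
    replacement = rf"\g<1>{locale.lower()}\g<2>"
    return re.sub(pattern, replacement, url, count=1)


# Candidate codes are assumptions; each resulting page still has to be
# opened in a browser to see whether it exists for that language.
for code in ("de-de", "pt-pt", "pt-br"):
    print(localized_url(BASE_URL, code))
```

The script only builds the addresses; checking which ones actually lead to a dictation command page remains a matter of trial and error.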

When you edit the commands list, you should be aware of a few things to avoid errors.
  • The current command lists in the first release may contain errors, such as mistakenly typing "phrase" in angle brackets as shown in the first example above; on editing, the commands that are followed by a phrase do not show the placeholder for that phrase, as you see in the example marked "2".
  • Commands must be entered without quotation marks! Compare the marked examples 1 and 2 above. If quotes are typed when editing a command, this will not be revealed by the appearance of the command; it will look OK but won't work at all until the quote marks are removed by editing.
  • Command creation is an iterative process that may entail a lot of frustrating failures. When I created my German command set, I started by copying some commands used for editing by Dragon NaturallySpeaking, but often the results were better if I chose other words. Sometimes iOS stubbornly insists on transcribing some other common expression, sometimes it just insists on interpreting your command as a word to transcribe. Just be patient and try something else.
The difficulties involved in command development at this stage are surely why only one finished command set (for the English variants) for memoQ-specific commands was released at first. But that makes it all the more important to make command sets "light resources" in memoQ, which can be easily exported and exchanged with others.

At the present stage, I see the need for developing and/or fixing the Hey memoQ app in the following ways:
  • Fix obvious bugs, which include:
      • The apparently non-functional concordance insertions. In general, more voice control would be helpful in the memoQ Concordance.
      • Capitalization errors which may affect a variety of commands, like Roman numerals, ALL CAPS, title capitalization (if the first word of the title is not at the start of the segment), etc.
      • Dodgy responses to the commands to insert spaces, where it is often necessary to say the command twice and get stuck with two spaces, because a single command never responds properly by inserting a space. Why is that needed? Well, otherwise you have to type a space on the keyboard if you are going to use a Translation Results insertion command to insert specialized terminology, auto-translation rule results, etc. into your text.
  • Address some potentially complicated issues, like considering what to do about source language text handling if there is no iOS support for the source language or the translator cannot dictate commands effectively in that language. I can manage in German or Portuguese, but I would be really screwed these days if I had to give commands in Russian or Japanese.
  • Expand dictation functionality in environments like the QA resolution lists, term entry dialog, alignment editor and other editors.
  • Look for simple ideas that could maximize returns for programming effort invested, like the "Press" command in Dragon NaturallySpeaking, which enables me to insert tags, for example, by saying "Press F9". This would eliminate the need for some commands (like confirmation and all the Translation Results insertion commands) and open up a host of possibilities by making keyboard shortcuts in any context controllable by voice. I've been thinking a lot about that since talking to a colleague with some pretty tough physical disabilities recently.
Overall, I think that Hey memoQ represents a great start in making speech recognition available in a useful way in a desktop translation environment tool and making the case for more extensive investments in speech recognition technology to improve accessibility and ergonomics for working translators.

Of course, speech recognition brings with it a number of different challenges for reviewing work: mistakes (or "dictos" as they are sometimes called, a riff on keyboard "typos") are often harder to catch, especially if one is reviewing directly after translating and the memory of intended text is perhaps fresh enough to override in perception what the eye actually sees. So maybe before long we'll see an integrated read-back feature in memoQ, which could also benefit people who don't work with speech recognition. 

Since I began using speech recognition a lot for my work (to cope with occasionally unbearable pain from gout), I have had to adopt the habit of reading everything out loud after I translate, because I have found this to be the best way to catch my errors or to recognize where the text could use a rhetorical makeover. (The read-back function of Dragon NaturallySpeaking in English is a nightmare, randomly confusing definite and indefinite articles, but other tools might be usable now for external review and should probably be applied to target columns in an exported RTF bilingual file to facilitate re-import of corrections to the memoQ environment, though the monolingual review feature for importing edited target text files and keeping project resources up-to-date is also a good option.)

As I have worked with the first release of Hey memoQ, I have noticed quite a few little details where small refinements or extensions to the app could help my workflow. And the same will be true, I am sure, for most others who use this tool. It is particularly important at this stage that those of us who are using and/or testing this early version communicate with the development team (in the form of e-mail to memoQ Support - support@memoq.com - with suggestions or observations). This will be the fastest way to see improvements, I think.

In the future, I would be surprised if applications like this did not develop to cover other input methods (besides an iOS device like an iPhone or iPad). But I think it's important to focus on taking this initial platform as far as it can go so that we can all see the working functionality that is missing, and so that we are ready as the APIs for relevant operating systems develop further to support speech recognition (especially the Holy Grail for many of us, trainable vocabulary like we have in Dragon NaturallySpeaking and a very few other applications). Some of what we are looking for may be in the Nuance software development kits (SDKs) for speech recognition, which I suggested using some years ago because they offer customizable vocabularies at higher levels of licensing, but this would represent a much greater and more speculative investment in an area of technology that is still subject to a lot of misunderstanding and misrepresentation.

Dec 10, 2018

"Hey memoQ" command tests

In my last post on the release of memoQ 8.7 with its new, integrated speech recognition feature I included a link to a long, boring video record of my first tests of the speech recognition facility, most of which consisted of testing various spoken iOS commands to generate text symbols, change capitalization, etc. I tested some of the integrated commands that are specific to memoQ, but not in an organized way really.

In a new testing video, I attempt to show all the memoQ-specific spoken command types and how the commands are affected by the environment (in this case I mean whether the cursor is on the target text side or the source text side or in some other place in the concordance, for example).

Most of the spoken commands work rather well, except for insertion from the concordance, which I could not get to work at all. When the cursor is in a source text cell, commands currently have to be given in the source text language, which is sure to prove interesting for people who don't speak their source language with a clean accent. Right now it's even more interesting, because English is the only language with a ready-made command list; other languages have to "roll their own" for now, which is a bit of a trial-and-error thing. I don't even want to think how this is going to work if the source language isn't supported at all; I think some thought has to be given to how to use commands with source text. I assume if it's copied to the target side it will be difficult to select unless, with butchered pronunciation, the text also happens to make sense in the target language.


It's best to watch this video on YouTube (start it, then click "YouTube" at the bottom of the running video). There you'll find a time code index in the description (after you click SEE MORE) which will enable you to navigate to specific commands or other things shown in the test video.

My ongoing work with Hey memoQ makes it clear that what I call "mixed mode" (dictation with concurrent use of the keyboard) is the best and (actually) necessary way to use this feature. The style for successful dictation is also quite different from the style I need to use with Dragon NaturallySpeaking for best results. I have to discipline myself to speak more in short phrases, less in longer ones, much less in long sentences, which may cause some text to be dropped.

There is also an issue with Translation Results insertions and the lack of spaces before them; the command to insert a space ("spacebar" in English) is dodgy, so I usually have to speak it twice and end up with a superfluous space. The video shows my workaround for this in one part: I speak a filler word (in one case I tried "dummy" which was rendered as "dumb he") and then select it later and insert an entry from the Translation Results pane over the selected text. This is in fact how we can deal with specialist terminology not recognized by the current speech dictionary until it becomes possible to train new words some day.

The sound in the video (spoken commands) is also of variable quality; with some commands I had to turn my head toward the iPhone on its little tripod next to my laptop, which caused the pickup of that speech to be bad on the built-in microphone on the laptop's screen. So this isn't a Hollywood-class recording; it's simply a slightly edited record of some of my tests to give other memoQ users some idea of what they can expect from the feature right now.

Those who will be dictating in supported languages other than English need some patience right now. It's not always easy coming up with commands that will be recognized easily but which are unlikely to occur as words to be transcribed in typical dictation work. During the beta test of Hey memoQ I used some bizarre and unusual German words which just happened to be recognized. I'm developing a set of more normal-sounding commands right now, but it's a work in progress.

The difficulties I am encountering making up new command phrases (or changing the English ones in some cases) simply reinforce my belief that these command lists should be made into portable light resources as soon as possible.

I am organizing summary tables of the memoQ-specific commands and useful iOS commands for symbols, capitals, spacing, etc. comparing their performance in other iOS apps with what we see right now in Hey memoQ.

Update: the summary file for English is available here. I will post links here for any other languages I can prepare later.

Migrating memoQ with Mac Parallels


A recurring complaint among memoQ users is the perceived (and actual) difficulty of moving all one's resources to a new computer when it's time to retire the old one. My favorite strategy is to create a single backup file for all my projects and restore that on a new machine, though this still leaves some details to clean up like moving light resources that don't happen to be included in any of those projects. Some people like to make a dummy project and attach all the heavy resources to that, but this is really only a viable option if you work in a single language pair. There are other strategies, of course, some better than others, given the particular situation.

Recently, a colleague who prefers to work on Apple Macintosh computers with Windows running in a Parallels virtual machine (VM) shared her new approach to migration. Apparently it's working well. And I suspect that the same approach could be used with a Windows VM under Windows as I used to do a lot in the days I tested more unstable software and needed to quarantine potential disasters.

Rather than re-install everything on the new machine, she simply
  1. installed Parallels on the new Mac,
  2. copied the VM file from the old Mac to the new one and
  3. copied resource folders to an identical path on the new Mac.
That's all. The fact that this is a virtual environment makes it all easier. So any memoQ user who runs the software using some virtual machine (VMware, Parallels, etc.) could do the same I suppose. Or any user with any operating system who runs Windows on a virtual machine.

That should have occurred to me earlier. Years ago I used to run several virtual machines with ancient versions of Windows to access old CD-based dictionaries that could not run under newer OS versions, but I've fallen out of that habit in recent years as the emphasis of my translation work shifted to other fields.

Dec 7, 2018

Integrated iOS speech recognition in memoQ 8.7

Today, memoQ Translation Technologies (the artists formerly known as "Kilgray") officially released their iOS dictation app along with memoQ version 8.7, making that popular translation environment tool the first on the desktop to offer free integrated speech recognition and control.


My initial tests of the release version are encouraging. Some bugs with capitalization which I identified with the beta test haven't been fixed yet, and some special characters which work fine in the iOS Notes app don't work at all, but on the whole it's a rather good start. The control commands implemented for memoQ work far better than I expected at this stage. I've got a very boring, clumsy (and unlisted) video of my initial function tests here if anyone cares to look.

Before long, I'll release a few command cheat sheets I've compiled for English (update: it's HERE), German and Portuguese, which show which iOS dictation functions are implemented so far in Hey memoQ and which don't perform as expected. There are no comprehensive lists of these commands, and even the ones that claim to cover everything have gaps and errors, which one can only sort out by trial and error. This isn't an issue with the memoQ development team for the most part, but rather with Apple's chaotic documentation.

The initial release only has a full set of commands implemented in English. Those who want to use control commands for navigating, selecting, inserting, etc. will have to enter their own localized commands for now, and this too involves some trial and error to come up with a good working set. And I hope that before long the development team will implement the language-specific command sets as shareable light resources. That will make it much easier to get all the available languages sorted out properly for productive work.

I am very happy with what I see at the start. Here are a few highlights of the current state of Hey memoQ dictation:
  • Bilingual dictation, with source language dictation active when the cursor is on the source side and target language dictation active when the cursor is on the target side. Switching languages in my usual dictation tool - Dragon NaturallySpeaking - is a total pain in the butt.
  • No trainable vocabulary at present (an iOS API limitation), but this is balanced in a useful way by commands like "insert first" through "insert ninth", which enable direct insertion of the first nine items in the Translation Results pane. Thus if you maintain good termbases, the "no train" pain is minimized. And you can always work in "mixed mode" as I usually do, typing what is not convenient to speak and using keyboard shortcuts for commands not yet supported by voice control, like tag insertion.
  • Microphones connected (physically or via Bluetooth) with the iPhone or iPad work well if you don't want to use the integrated microphone in the iOS device. My Apple earphones worked great in a brief test.
Some users are a bit miffed that they can't work directly with microphones connected to the computer or with Android devices, but at the present time, the iOS dictation API is the best option for the development team to explore integrated speech functions which include program control. That won't work with Chrome speech recognition, for example. As other APIs improve, we can probably expect some new options for memoQ dictation.

Moreover, with the release of iOS 12, I think many older devices (which are cheap on eBay or probably free from friends who don't use them) are now viable tools for Hey memoQ dictation. (Update: I found a list of iPhone and iPad devices compatible with iOS 12 here.)

Just for fun, I tested whether Hey memoQ and Dragon NaturallySpeaking interfere with one another. They don't, it seems. I switched back and forth from one to the other with no trouble. During the app's beta phase, I did not expect that I would take Hey memoQ as a serious alternative to DNS for English dictation, but with the current set of commands implemented, I can already work with greater comfort than expected, and I may in fact use this free tool quite a bit. And I think my friends working into Portuguese, Russian and other languages not supported by DNS will find Hey memoQ a better option than other dictation solutions I've seen so far.

This is just the beginning. But it's a damned good start really, and I expect very good things ahead from memoQ's development team. And I'm sure that, once again, SDL and others will follow the leader :-)

And last, but not least, here's an update to show how to connect the Hey memoQ app on your iOS device to memoQ 8.7+ on your computer to get started with dictation in translation:


Dec 5, 2018

Bilingual Excel to LiveDocs corpora: video

A bit over three years ago, I published a blog post describing a simple way to move EUR-Lex data into a memoQ LiveDocs corpus so that the content can be used for matching and concordance work in translation and editing. The particular advantage of a LiveDocs corpus over a translation memory is that the corpus lets users read the document context for concordance hits, which a translation memory does not.

A key step in the import process is to bring the bilingual content in an Excel file into a memoQ project as a "translation document" and then send that content to LiveDocs from the translation files list. Direct import to a LiveDocs corpus of bilingual content using the multilingual delimited text import filter is still not possible despite years of asking the development team for memoQ to implement this.

This is irritating, though in the case of a EUR-Lex alignment which may be slightly out of sync and need fixing, it is perhaps all for the best. And in some other situations, where the content may be incomplete and require filtering (in a View) before sending it to the corpus, it also makes sense to bring the file in as a translation document first to use the many tools available for selecting and modifying content. However, in many situations, it's simply a nuisance that the files cannot be sent directly to a LiveDocs corpus.

In any case, I've now done a short (silent) video to make the steps involved in this import process a little clearer:


Dec 4, 2018

New URL search for IATE terminology

There was some consternation recently among translators who use the EU's IATE (Interactive Terminology for Europe) terminology database with web search tools integrated in their translation environments such as OmegaT, SDL Trados Studio or memoQ. Quite a number of colleagues emphasized the need for URL parameter searching, and the IATE development team has now responded by implementing exactly that.

Here is how the new URL parameters might look in the memoQ Web Search settings, for example:


The basic format uses three important parameters: term (the expression to look for), sl (the source language) and tl (the target language). So a search for the French term for the German word Eisen (iron) would look like:
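
For anyone who wants to generate such search links in a script, here is a minimal Python sketch of how the three parameters combine into a query string. The parameter names term, sl and tl come from the description above; the base address in the code and the function name build_iate_url are my own assumptions for illustration, so check the address against IATE's own documentation before relying on it.

```python
from urllib.parse import urlencode, quote

# Assumption: illustrative base address only; verify it against IATE's documentation.
ASSUMED_IATE_BASE = "https://iate.europa.eu/search/byUrl"


def build_iate_url(term, sl, tl, base=ASSUMED_IATE_BASE):
    """Build a search URL from a term, a source language and a target language.

    term: the expression to look for
    sl:   source language code, e.g. "de"
    tl:   target language code, e.g. "fr"
    """
    # quote_via=quote percent-encodes spaces and non-ASCII characters,
    # so multi-word terms and umlauts survive in the URL.
    query = urlencode({"term": term, "sl": sl, "tl": tl}, quote_via=quote)
    return f"{base}?{query}"


# Look up the French term for the German word "Eisen" (iron).
print(build_iate_url("Eisen", "de", "fr"))
```

Percent-encoding the term matters mostly for multi-word expressions and characters outside plain ASCII; a single word like Eisen passes through unchanged.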




Thanks to Zsolt Varga of the memoQ team for notifying users of the IATE search upgrade via social media!

Optimizing memoQ terminology extraction

On December 28, 2018 from 2:00 to 3:30 pm Lisbon time (3:00 to 4:30 pm CET, 9:00 to 10:30 am EST), I'll be giving a talk on terminology extraction in the latest version of memoQ. Recent versions of this tool have included many improvements to its terminology features, and it's time for an update on how to get the most out of the term extraction features of memoQ among other things.

Topics to be covered include the creation of new stopword lists or the extension of existing ones, customer-, project- or topic-specific stopword lists, criteria for corpora, term mining strategies and the subsequent maintenance and use of term bases in projects. Participants will be equipped with all the information needed to use this memoQ feature confidently, reliably and profitably in their professional work.

The webinar is free, but registration is required. To register, go to:
https://zoom.us/meeting/register/cfd1a47cd5c54114d746f627e8486654

The same presentation (more or less) will be held in German on December 21 at the same time for those who prefer to hear and discuss the topic in that language.

Dec 3, 2018

Terminology extraction with memoQ: the latest possibilities


On December 21 from 3:00 to 4:30 pm CET, another German-language memoQ training session will take place online. Topic: optimizing terminology extraction. The talk offers an overview of the options for working efficiently with the terminology extraction module in memoQ. From the creation of new stopword lists or the extension of existing ones, customer-, project- or topic-specific stopword lists, corpus criteria and extraction strategies through to the subsequent maintenance of term bases and their use in projects, you will be equipped with the information needed to use this feature confidently, reliably and profitably in your professional work. Participation is free, but registration is required: https://zoom.us/meeting/register/d68e024c63ad506f7c24e00bf0acd2b8 A talk with the same content in English will take place one week later (on December 28, 2018): https://zoom.us/meeting/register/cfd1a47cd5c54114d746f627e8486654