Showing posts with label regex. Show all posts

Nov 26, 2023

memoQ term base "roundup" chat

 

The last of the planned office hour discussions for the self-guided online course "memoQuickies Resource Camp" will be held on November 27th at 17:00 CET (8:00 PST). For those not already registered for the Zoom chats, the link to do that is HERE. If you have done so previously, the access URL is the same. The chat is open to everyone, regardless of whether they are registered or not in the online course.

We'll begin with open Q&A time on any of the course sections in the term base unit or any other presentations of the subject matter by me on this blog or my YouTube channel. Afterward, I will show my general method for editing or updating term base content via Microsoft Excel exports brought into the memoQ working grid, which facilitates certain kinds of changes or actions involving regular expression use. This goes beyond the possibilities of the integrated memoQ term base editor, which will also be presented briefly.

A recording of the talk will be made available later in the course structure.

In December the course will move on to discussions of QA profiles and other aspects of quality assurance in memoQ, with some live discussion possibilities to be announced. All course material will remain online for access until the end of March. More information on the memoQuickies Resource Camp can be found HERE.

Oct 5, 2023

What's wrong with my segmentation (in translation)?

The fifth open office hours session for the self-guided online course "memoQuickies Resource Camp" discussed segmentation problems with documents imported to translation environments such as memoQ, Trados Studio, Phrase, Cafetran Espresso, etc. and various ways that these issues might be identified so that they can be corrected.

Segmentation problems waste enormous amounts of time, and bad segmentation rules are a plague on the translation and localization service community. Unfortunately, nearly all the rules I have seen, for all working environments, simply suck sewage. memoQ's rules usually suck less, but still....

This week's talk presented, among other things, some methods for identifying segmentation trouble spots quickly and easily with the use of special regular expressions describing common patterns followed by texts with troubled segmentation. And a Regex Assistant library has been provided (and will be updated during the course period) to help with all of this.
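To give a rough idea of what such screening patterns can look like outside memoQ, here is a minimal Python sketch. The two patterns and the function name `suspicious_segments` are assumptions made for illustration; they are not taken from the Regex Assistant library mentioned above, and the lowercase check covers only unaccented ASCII letters.

```python
import re

# Hypothetical screening patterns for illustration only.
# A segment that starts with a lowercase letter often continues a sentence
# that was split too early (ASCII letters only in this sketch).
STARTS_LOWERCASE = re.compile(r"^\s*[a-z]")
# A segment that ends with a common abbreviation was probably cut at its period.
ENDS_WITH_ABBREVIATION = re.compile(r"\b(?:e\.g|i\.e|etc|No|Dr|Mr|Mrs|vs)\.\s*$")


def suspicious_segments(segments):
    """Return (index, reason) pairs for segments that look wrongly split."""
    hits = []
    for i, seg in enumerate(segments):
        if STARTS_LOWERCASE.search(seg):
            hits.append((i, "starts with lowercase"))
        if ENDS_WITH_ABBREVIATION.search(seg):
            hits.append((i, "ends with abbreviation"))
    return hits


if __name__ == "__main__":
    sample = ["See section No.", "5 of the contract.",
              "The fee is due, e.g.", "within 30 days."]
    for index, reason in suspicious_segments(sample):
        print(index, reason, repr(sample[index]))
```

A screen like this will also flag some legitimate segments (a sentence that really does end in "etc.", for example), so it is a way to find places worth a look rather than a verdict.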

The video and related course pages will remain completely open to the public, with downloads available, at least through the end of 2023. After that the pages and resources may be taken down for updates and reorganization in other courses.

The video recording of the lecture "What's wrong with my segmentation?" can be accessed on YouTube (embedded below) or course participants can access the page to download it by clicking the "segmentation rules" icon at the top of this article.


An important part of checking the performance of your segmentation rules and possibly improving them is to have a good sampling of test data. One of my favorite sources for this is the European Community archives at the DGT, where EU legislation and other important information is available in a parallel corpus of all the official languages of the Community.

I have downloaded part of the 2022 DGT distribution and prepared a number of monolingual and bilingual corpora (about 2.6 million words, approximately 150,000 TUs) in EU languages and translation pairs. Moreover, information on my method has been published so that others can reproduce it for the languages that interest them.

Sep 19, 2023

Flirting with a Fiverr & more

If you always want to get paid...

Payment practices are a perpetual pain in Trashlation World. What professional translator or interpreter has not, at some point, faced difficulty getting paid for work delivered? Or in my case, consultant, independent solution developer and instructor, since I retired from translation three months ago and no longer accept such tasks in the increasingly thankless environment where they are requested.

Net never has become the modus operandi of too many wankers in the NMT-AI-MOUSE Fanboy and -gurl Klub, and when B of A, Barclays, Santander or some other clan of thieves fails to provide the desired credit for the Incredible Journey to Ruin, there will always be those AI Artists Formerly Known as Trashlators who understand that in matters of money, all that really matters is mindset.

Fuck you. Pay me. Well, imagine getting paid! Isn't that exciting? Fuck you. Pay me. I'm more excited by the structure of your fucking kneecaps and how fragile it is... SET YOUR MIND TO PAY ME.

But wait, Paulie, there may be a better way!

A perpetually solvent friend who owns a couple of German service companies once shared his secret: when in doubt, demand payment in advance. When there is no doubt, demand double. But what if the prospect just walks away? Offer them a peck on the cheek and hold the door for them in gratitude for the grief they are about to save you.

Times are hard. But payment practices are, alas, too often limp. Like a little mushroom past its sell-by date and full of mold and other things best not named.

I'm personally fortunate not to deal with many deadbeats; avoiding business with Italian and American companies certainly helps. Well, I have a soft spot for the Portuguese, but let's not go there.

I have another problem. I hate administrative work even more than I hate not getting paid, and since I acquired a retired surgeon as a billing assistant and, at about the same time, took on a new role as a trainer for incoherent billing software like SAGE, not getting paid has been a source of surprising pleasure. But our five dogs still demand food, and if it doesn't come in bags and come on time, well... there is that extra weight I'm carrying, the Portuguese pit bull is fond of reminding me.

And I am sure that many clients and friends and friends with the misfortune to be clients keep a special dartboard with my face on it for those times I get around to writing the bill after a year or so.

Fuck you. Pay me. Well fuck you. Write the fucking bill. Yeah, right, you tell me again how all that works here in Portugal where the tax laws are so screwy that almost none of the invoicing tools typically used by translators or companies involved with language services (regardless of whether they actually provide any) in other countries are compliant with Portuguese tax law, so I often feel myself well and truly fucked.

Enter the performance platforms such as Fiverr, which I am beginning to consider for certain recurring requests where I have asked Dios for a better intake process in which people tell me exactly what they need, provide the means of doing that, including the money to pay for the electricity to drive the tools I use to reach their goal and, well, just let me get on with it and make something nice and bless them for a change.

I started toying with platformed pre-payment about four years ago, when I started using Teachable as a way to demonstrate to memoQ and others how professional tools instruction could be improved and content could be shaped in more useful ways. I may not have been successful in convincing others to Do the Right Thing, but now that I have more time on my hands and have resigned myself to take a shot at that myself and maybe actually charge for all that knowledge that so many people mint money with, I am really, really glad that the Portuguese tax mysteries are handled in a way that does not involve me at all and is completely correct. No more constant special requests for special invoices for special people in special countries and special claims on my non-existent admin time.

So, that's the news, I guess. Next time you need a training video, a dash of regex for your project soup, a magical mysterious import filter for Formats Unknown or the like, there's a process. And it's not "fuck you". That's between consenting buyers and the platforms from which they draw their services....



Sep 16, 2023

The memoQ Regex Assistant Revisited at 15:00 CET on 21 September 2023!

 


Eleven months ago I was supposed to talk about terminology in a three-hour evening class taught by one of my friends at Universidade Nova de Lisboa, but I was so excited about the progress of the quality team I was training at one of my agency clients in Portugal that I twisted what was expected to be my usual straightforward 90-minute lecture on term base best practices in memoQ into an unusual take on the role regular expressions might play in terminology management.

That was a weird one, for sure, but the potential I saw was very real. I was defining "terminology" in rather broad terms to include not only more efficient term base management based on problem patterns but also translation memory clean-up, better filtering and find/replace operations in the working grid and QA.

"What the heck has all that got to do with scary ol' REGEX?" you're probably thinking.

Well, all this was triggered by memoQ's recent release of the One Ring Thing we've been needing to unify our memoQ management processes for the routine use of regex by The Rest of Us. The Regex Assistant. The Nazgul of Trados World are surely jealous.

That quality team I was training, my friends at Linguaemundi, is headed by Inês Lucas, about whom I had heard many good things for years from her enthusiastic professors at university but whom I hadn't actually met until she was hired by the agency a few years ago. At the recent memoQ Fest in Budapest, she explained how changing the approach to regex mastery from the struggles of syntax to organizing and applying packaged solutions in well-engineered processes significantly upgraded their work capacities and reduced stress levels. When you cut the nerdy crap and focus on understanding what solutions are called for in particular tasks, everything gets much easier.

I was so amazed to see people who had struggled for years to learn regex well enough for simple tasks suddenly become solution powerhouses that I put together (rather spontaneously) a series of three online 90-minute workshops, which were repeated a month later. And new refinements to these methods come each time the ideas are presented.

The raw recordings of those six workshops are included in the current online course ("memoQuickies Resource Camp"), but one - the first of six - is publicly available on YouTube, where you can have a look.

However, in the memoQuickies Resource Camp, a self-guided course that is serving as a platform for me to organize and distribute the best resources from my 14 years as a memoQ user, solution provider and trainer before I retire, I'll be taking another more streamlined pass at teaching some of the best possibilities for using memoQ Regex Assistant resource libraries. The webinars offered in most weeks of the course are simply an overview of the current topic emphasized in the course and also serve as a Q&A platform and a means of offering some different perspectives on information from the self-guided units. Recordings are always added to the course for later viewing.

This coming Thursday at 15:00 Central European Time, I'll give a brief overview of the Regex Assistant much like the public YouTube video does and answer any questions that attendees might have. Further information and an event notice can be found here on LinkedIn.

You can join the webinar with this link. The meeting ID is 878 3540 2561, and the passcode is 385434

Mar 2, 2023

memoQ Regex Assistant workshops re-run

The series of three workshops on the use of regex resources in memoQ, with a particular emphasis on the integrated Regex Assistant library, has been updated and will be offered again on March 9, 16 and 23 from 3:00 pm to 4:30 pm Lisbon time (4:00 pm to 5:30 pm CET, 10:00 am to 11:30 am EST).

You can register here to attend any or all of the three sessions:
https://us02web.zoom.us/meeting/register/tZEpde-sqTkvGtCdMsrBl825tFrpDQ98FkAI

This is an evolving course, with the content continuously adapted in response to new questions, workflow challenges and process research as well as interoperability studies with other tools. Participants in the last series asked quite a number of interesting things during and after the talks, and their questions provided excellent material for new examples and approaches, and I hope for the same experience in this round.

The memoQ Regex Assistant is a unique library tool introduced in its current form in memoQ version 9.9. The little bit of public discussion there has been about this tool is quite misleading. Contrary to the "pitch" from memoQ employees and nerdy fans in the user base, this isn't really a tool for learning regular expressions. There are much better means for doing that. And I have strong personal objections to the idiotic statements I hear so often that "everyone should learn some regex". What utter nonsense.

What everyone should do is take advantage of the power regular expressions offer to simplify time-consuming tasks of translation, review, quality assurance and more to ensure accuracy and consistency in language resources and translations. The Regex Assistant helps with this by providing a platform where useful "expressions" can be collected and organized with readable names, labels and descriptions in any language. These libraries can be sorted, exchanged with other users and applied for filtering, find and replace operations, QA checks, segmentation improvements, structured translation of dates, currency expressions, bibliographic information, legal citations and more, or exported and converted to formats for easy use in other tools such as Trados Studio, Phrase/Memsource, Transtools+ and more. All without the need to learn any regular expression syntax!

HTML created from a memoQ Regex Assistant library export
An exported Regex Assistant library converted to a readable format by XSLT

My objective is not to teach regex syntax. It is to empower users to take more control of their work environment and save time and frustration for their teams and enjoy more life beyond the wordface. To help with that, I provide some usable examples in a follow-up mail after each session: resources that you can use in your own work and share freely with colleagues.

And in this next round of workshops, available for purchase, there will be some additional high value resources to help achieve better outcomes for work in particular language pairs and particular specialties, such as financial translations. These complex resources were developed over a period of years, sometimes at great cost. In the last session I'll be getting "down and dirty and a little nerdy" to show you my way of maintaining complex resources like these auto-translation rules and others in a very effective, sustainable way that enables you to adapt quickly to changing requirements and style guides.

Sign up free to join the fun here.

Jan 12, 2023

memoQ&A: The Regex Assistant in Practice

Note: there will be another series of workshops for this subject matter in March. Details are HERE. Register once to attend any or all of the sessions.

Users of memoQ version 9.9 or later have a powerful library tool available with which they can organize solutions or solution elements that use regular expressions and apply these without the complications of learning regex syntax. In each session, we'll look at different ways in which these portable libraries can be used, with a particular emphasis on solving common problems faced by translators and reviewers. Materials will also be made available to participants for later study and practice.

There will be three sessions of 90 minutes each on three consecutive Thursdays: January 19, January 26 and February 2 at 11:00 a.m. Lisbon time (i.e. noon CET). The first session will introduce the Regex Assistant library and its basic functions for organizing and exchanging information and then move on to specific examples of using the library to deal with common problems encountered in translation and review work. Particular emphasis in the first session will be on filtering and Find/Replace operations. 

The two later sessions will continue to explore filtering and options for making changes to texts and tags, and we will also take a tour of possibilities for using regular expression resources (from the library!) in other parts of memoQ such as the Regex Tagger, QA checks or auto-translation rules. As time permits, examples or requests from participants can also be explored. 

Those interested in joining the free sessions can register here.

Update: a recording of the first session is available here: https://youtu.be/KKR5aH5oGH8

To get a little taste of what's to come, have a look at this video created by a colleague last year:

Jun 5, 2021

Get better dates with memoQ

There's hope for all those incels stuck with RWS Trados Studio, Memsource, Wordfast, OmegaT and a host of other horrors if they are willing to make a change....


Not surprisingly, dates can be a real nuisance to translate and check, depending on the client's specifications. Target language specifications that include the use of elements such as non-breaking spaces can be particularly troublesome. But even apparently simple tasks like writing all dates in the target language as DDMonth-AbbreviatedYYYY or the like can go wrong far more than expected, and skilled reviewers easily get caught up in the flow of the text and overlook details of format (and often even correct content) in dates. This proved to be a shock to one LSP client who found hundreds of overlooked date errors in a large volume of recently reviewed text.

What's the solution to these inevitable human errors? Proper automation of the monkey work so professionals can concentrate on what they are good at: a fluent text that accurately reflects the intent of the original.


In the case of the QA horror of checking translations from four source languages to ensure that the dog's breakfast of date input formats (which also included day+month and month+year entries) had been rendered correctly, regardless of capitalization, memoQ enabled a simple auto-translation ruleset to be created (a few hours' work, including testing and documentation); this was then attached to a project using a QA profile configured to check only against the enabled auto-translation rules, and BOOM! after about a minute, all the date errors in something like 100,000 translated words were revealed. The only false positives found were a few instances where times were written after the date, and the rule can be updated easily to avoid this issue.

I do a lot of date rule development for many languages, and I've published some of this in simplified forms on this blog. But interesting new tricks come up all the time. And I've found it useful when developing rules for others, who usually understand little about writing proper specifications that capture all the likely source input, to create special screening rules like the one shown in the first screenshot, which can be used to examine an entire large TM imported to the memoQ working grid, and see how the input and target texts vary. I used that expression on two large TMs in a view, and in just a few seconds, my laptop screen showed me all renderings of every English date in those TMs into Portuguese. While researching target formats for some new rules I also found quite a number of errors in the TM which I could have corrected had I cared to.

Having rules like this available in the translation phase can prevent quite a few errors to start with. I've found too many cases of dates in March translated as May and overlooked by both the translator and the reviewers. memoQ is - as far as I know - the only tool which will offer such conversions in a results table from which they can be inserted just like any other "terminology hit".
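The kind of check that catches a "March rendered as May" slip can be sketched outside memoQ in a few lines of Python. This is only an illustration under stated assumptions: it handles just one English format ("March 5, 2021") and one Portuguese format ("5 de março de 2021"), and the names `english_dates`, `portuguese_dates` and `date_mismatch` are hypothetical. It is not memoQ's auto-translation rule format and not any ruleset described here.

```python
import re

# Month tables for an assumed English-to-Portuguese check.
EN_MONTHS = {name: i for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}
PT_MONTHS = {name: i for i, name in enumerate(
    ["janeiro", "fevereiro", "março", "abril", "maio", "junho", "julho",
     "agosto", "setembro", "outubro", "novembro", "dezembro"], start=1)}

# One format per language only; real source texts vary far more than this.
EN_DATE = re.compile(
    r"\b(" + "|".join(EN_MONTHS) + r")\s+(\d{1,2}),\s*(\d{4})\b", re.IGNORECASE)
PT_DATE = re.compile(
    r"\b(\d{1,2})\s+de\s+(" + "|".join(PT_MONTHS) + r")\s+de\s+(\d{4})\b",
    re.IGNORECASE)


def english_dates(text):
    """Return (year, month, day) tuples for dates like 'March 5, 2021'."""
    return [(int(y), EN_MONTHS[m.lower()], int(d))
            for m, d, y in EN_DATE.findall(text)]


def portuguese_dates(text):
    """Return (year, month, day) tuples for dates like '5 de março de 2021'."""
    return [(int(y), PT_MONTHS[m.lower()], int(d))
            for d, m, y in PT_DATE.findall(text)]


def date_mismatch(source, target):
    """True if the dates found in source and target do not agree."""
    return sorted(english_dates(source)) != sorted(portuguese_dates(target))


if __name__ == "__main__":
    print(date_mismatch("Signed on March 5, 2021.",
                        "Assinado em 5 de maio de 2021."))
```

Comparing normalized (year, month, day) values rather than strings means the check does not care how the target date is capitalized or spaced, only whether it names the same day.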

Other tools like RWS TradoZe Studio will allow you to use regular expressions for quality assurance (checking the text), but I'm not aware of any tool other than memoQ which allows you not only to include these checks in customized QA profiles but which can also provide them for on-the-fly review from a library of named expressions. That's what memoQ now does in version 9.8 (scheduled for release in June, within a few weeks) as shown in the first screenshot above.

This new rules library feature of memoQ makes it possible for the first time for users who have a life which does not involve wasting brain cells learning to program regular expressions to use the efforts of those who actually like that sort of thing and do it well. So with this tool, anyone can easily check for things like date errors and a lot more without knowing a single bit of regex syntax. That's some progress :-)

Stuff like dates just keeps getting better for memoQ users, leaving them more time for life and better stuff... like other kinds of dates.

If this kind of thing interests you, I think my friend Marek Pawelec may be teaching an in-person course in regular expressions for memoQ in July of this year, and he, I and others (including memoQ's Business Services unit) are available to help you with turnkey solutions to project challenges like those described here.

Jun 3, 2021

A Hebrew abbreviations "hint base" points the way for other languages

Years ago I published a guideline for how to create something like a term base for memoQ that can handle the irregularities one might find in the way German attorneys on tight deadlines might type the many abbreviations they use in crazy ways. The memoQ term base model can't cope with punctuation and many special characters, so it's basically impossible to use it to map something like "US-$" to the standard currency code "USD". But regular expressions in an auto-translation rule can do that, of course.
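As a rough illustration of why a regular expression copes where a plain term list cannot, here is a minimal Python sketch that maps a few typed variants of the dollar marker to "USD". The variants covered and the name `normalize_usd` are assumptions for the example; this is ordinary Python, not the syntax of a memoQ auto-translation rule.

```python
import re

# Assumed variants: "US-$", "US $" and "US$", directly before an amount.
# The lookahead keeps the pattern from firing where no number follows.
USD_VARIANT = re.compile(r"\bUS[-\s]?\$\s?(?=\d)")


def normalize_usd(text):
    """Replace typed variants of the US dollar marker with the code 'USD'."""
    return USD_VARIANT.sub("USD ", text)


if __name__ == "__main__":
    for sample in ["US-$ 500", "US$500", "US $1,200"]:
        print(sample, "->", normalize_usd(sample))
```

The hyphen and the dollar sign are exactly the characters the post describes as a problem for term base matching; in a pattern they are simply optional or literal parts of the expression.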

The same principle can be used simply to map abbreviations to their full expression so the translator can decode the abbreviation and decide how to render it. Here's an example of that in Hebrew:

This can, of course, be done in other languages, but the fellow who had this idea and asked me about it happens to be a Hebrew translator working into several target languages. I'm tempted to adapt one of my German abbreviation sets to map to the full German expression in the target to serve as an aid to translators who might not be as familiar with the abbreviations as I am and who are also not bound strictly to a particular target language expression. A cheat sheet, basically, or a "hint base" if there is such a thing.

The code for this is particularly simple. Here's a quick look at the resource in an external editor:


The basic "engine" is just a list (#abbreviations#). And the resource was created quickly using search and replace on a list of over 600 abbreviations in an Excel spreadsheet.
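The list-driven idea can be sketched in Python as follows. This is an analogy, not the memoQ resource itself: the tab-separated text stands in for rows copied from a spreadsheet, the three German abbreviations are sample data chosen for the example, and the function names are hypothetical.

```python
import re

# Sample rows as they might be copied from a two-column spreadsheet
# (abbreviation, full expression); illustrative data only.
PAIRS_TSV = "i.V.m.\tin Verbindung mit\nz.B.\tzum Beispiel\ngem.\tgemäß\n"


def load_pairs(tsv):
    """Turn tab-separated lines into an abbreviation -> expansion mapping."""
    return dict(line.split("\t", 1) for line in tsv.splitlines() if line.strip())


def build_pattern(pairs):
    """Build one alternation from the list, escaping periods and the like."""
    # Longest first, so a longer abbreviation wins over a shorter prefix.
    alternatives = sorted(pairs, key=len, reverse=True)
    return re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(a) for a in alternatives) + r")")


def hints(text, pairs):
    """Return (abbreviation, expansion) for each abbreviation found in text."""
    pattern = build_pattern(pairs)
    return [(m.group(0), pairs[m.group(0)]) for m in pattern.finditer(text)]


if __name__ == "__main__":
    pairs = load_pairs(PAIRS_TSV)
    for abbreviation, expansion in hints("Anspruch gem. § 5 i.V.m. § 7", pairs):
        print(abbreviation, "=", expansion)
```

The pattern is generated from the list rather than written by hand, which is why a list of several hundred pairs costs no more effort than a list of three.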

In the awful memoQ rule editor it looks like this:


Those who know Hebrew may note that some periods are out of place. I'm not an RTL expert, so I had a few issues with punctuation migrating as I moved data from one format to another, but someone familiar with issues like that can fix things without much ado. This was just a quick prototype to demonstrate feasibility. And a few minutes of search and replace work in a text editor beats entering more than 600 pairs manually in the built-in editor for memoQ. It would be nice if that damned editor included list import features that would read Excel files directly!

As with other auto-translation rules, certain characters may need to be represented by entities or uuencoding. The simple rule shown above can also be made more robust by dealing with variable punctuation, for example. Complexity can always be added. 

Many thanks to the translator colleague who shared his challenge and gave me something fun to do after a grueling day of mapping many messed-up date formats from a lot of different source languages I mostly don't know :-)

Jun 4, 2019

Regular expressions in memoQ demystified - THE workshop!

Next week in Utrecht there will be a unique workshop to enhance your productivity with memoQ, as you learn how to develop rules for automated formatting and QA of patterned expressions, such as dates, currency expressions, unusual or custom text formats and more. THIS knowledge is one of those "secret weapons" that I deploy to help the most sophisticated financial and legal translators I know save countless hours of mind-numbing donkey work doing QA on things like legal references and expressions involving currency (such as EUR 3 million vs. €3m, etc.) or creating those references in the first place and inserting them in the translation with a simple keystroke.
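For a sense of what a check on currency expressions involves, here is a small Python sketch that reduces "EUR 3 million" and "€3m" to the same number so the two can be compared. The unit table, the pattern and the name `euro_amounts` are assumptions for illustration; the sketch handles only euro amounts with a decimal point and is not workshop material or memoQ rule syntax.

```python
import re
from decimal import Decimal

# Assumed unit words and short forms; extend as a style guide requires.
MULTIPLIERS = {"": 1, "k": 10**3, "thousand": 10**3,
               "m": 10**6, "million": 10**6,
               "bn": 10**9, "billion": 10**9}

# Longer unit names come first in the alternation so "million" wins over "m".
AMOUNT = re.compile(
    r"(?:EUR\s?|€\s?)(\d+(?:\.\d+)?)\s?(million|billion|thousand|bn|m|k)?\b",
    re.IGNORECASE)


def euro_amounts(text):
    """Return the euro amounts found in text as plain numbers."""
    return [Decimal(number) * MULTIPLIERS[(unit or "").lower()]
            for number, unit in AMOUNT.findall(text)]


if __name__ == "__main__":
    print(euro_amounts("EUR 3 million") == euro_amounts("€3m"))
```

Once both sides are plain numbers, a mismatch between source and target is a simple inequality, whatever notation each side happens to use.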

The course instructor, Marek Pawelec, is one of my personal resources when I am in over my head on technical problems or when I need to be very sure that a client of mine gets the right help in time. He has a rare gift of taking subject matter which many find baffling and presenting it in a way that makes it accessible to most any educated adult.

Because of the scope of this subject matter and the importance of proper follow-up and support while learning it, the workshop will be held over two days - June 10 and 11 (Monday and Tuesday) - from 10 am to 4 pm each day, which will give plenty of time to learn the basics and move on to apply your new technical skills to common and not-so-common technical challenges in translation projects where memoQ is involved.

Trust me on this one: we are talking about critical process secrets to save massive amounts of time and do better work on things like annual reports, court briefs and more. Or creating projects for text formats that seem impossible to work with at first glance. THIS is where the money is in an increasingly competitive market.

Information to register now can be found on the Facebook event page for the workshop or on the relevant Regex Workshop page for the host, the All Round Translator education cooperative in the Netherlands.

May 21, 2018

Best Practices in Translation Technology: summer course in Lisbon July 16-21

As usual each year, the summer school at Universidade Nova de Lisboa is offering quite a variety of inexpensive, excellent intensive courses, including some for the practice of translation. This year includes a reprise of last year's Best Practices in Translation Technology from July 16th to 21st, with some different topics and approaches.

Centre for English, Translation and Anglo-Portuguese Studies

The course will be taught by the same team as last year – yours truly, Marco Neves and David Hardisty – and cover the following areas:
  • Good translation workflows.
  • Using voice recognition in translation.
  • Using machine translation in a humane, intelligent way.
  • Using checklists to improve communication in translation.
  • Using glossaries, bilingual texts and other references in multiplatform environments.
  • Good practices for using terminology and reference texts in the target language.
  • Planning and creating lists for auto-translation rules and the basics of regular expressions for filters.

Some knowledge of the memoQ translation environment and translation experience are required.

The course is offered in the evening from 6 pm to 10 pm Monday (July 16th) through Friday (July 20th), with a Saturday (July 21st) session for review and exams from 9 am to 2 pm. This allows free days to explore Lisbon and the surrounding region and get to know Portugal and its culture.

Tuition costs for the general public are €130 for the 25 hours of instruction. The university certainly can't be accused of price-gouging :-) Summer course registration instructions are here (currently available only in Portuguese; I'm not sure if/when an English version will be available, but the instructors can be contacted for assistance if necessary).

Two other courses offered this summer at Uni Nova with similar schedules and cost are: Introduction to memoQ (taught by David and Marco – a good place to get a solid grounding in memoQ prior to the Best Practices course) from  July 9–14, 2018 and Translation Project Management Tools from September 3–8, 2018.

All courses are taught in English and Portuguese in a mix suitable for the participants in the individual courses.

Mar 23, 2017

First month with SDL Trados 2017


A month ago, when I announced the Great Leap Forward from my rather neglected SDL Trados 2014 license to the latest, presumably greatest version, SDL Trados 2017, after seeing how wet the largely untested release of memoQ 8 (aka Adriatic) has proved to be, there was some surprise, as well as smiles and frowns from various quarters. It's been a busy month, and I am still testing options for effective workflow migration and exchange (useful in any case given how often memoQ users work together with those who prefer SDL tools) as well as discussing the good and bad experiences of friends, colleagues and clients who use SDL Trados Studio 2017.

As can be expected, this product has more than a bit of a bleeding edge character, though on the whole it does seem to be a little more stable and less buggy than memoQ Adriatic so far, with fewer what the Hell were they smoking moments. However....

I was a little concerned at the report from a colleague in Lisbon that the integration of the plug-in for SDL Trados Studio access to Kilgray Language Terminal and memoQ Server translation memories doesn't work with SDL Trados 2017 after functioning so well in SDL Trados 2014 and 2015. Despite the stupid inter-company politics between SDL and Kilgray, which hindered the approval of the plug-in so that a warning dialog appeared each time it was loaded in SDL Trados Studio (bad form by the boys in Maidenhead), it was a great tool for users of SDL Trados Studio and memoQ to share TMs in small team projects. I was very happy with how it worked with SDL Trados Studio 2014, and I am very disappointed to see that API changes in the latest version have bunged things up so that Kilgray will have more work to re-enable this useful means of collaboration. I hope that SDL will see fit to be less petty and more cooperative with the upcoming "fixed" plug-in! It is in their interest to do so, as this makes it easier for SDL Trados users to stick to their favorite tool while working on jobs for or with those who prefer memoQ as their resource. Better work ergonomics for everyone and no BS with CAT wars.

I was pleased to see that SDL Trados Studio has added AutoCorrect facilities recently. And they seem to work reasonably well in English and mostly in German, though there was a strange quirk which hamstrung the "correct as you type" feature. That setting took a while to "stick" somehow when I tested it first with German. It was fine for Portuguese too. However, Ukrainian and Arab colleagues can't get it to work for some reason. I did not believe this at first until a colleague in Egypt showed me live via shared screens in Skype how the autocorrection simply failed to activate. Perhaps this is an issue with languages that don't use the Roman alphabet, so I suppose colleagues in Russia, Serbia, Japan and elsewhere may be tearing some hair out over this one. It doesn't affect me directly, but it looks like a pretty serious bug that ought to be addressed ASAP.

SDL generally kicks some butt with regex facilities in SDL Trados Studio; customer service guru Paul Filkin has written a lot about these features on his Multifarious blog, and most advanced users of the platform make heavy use of regular expressions in filters and QA rules. For a long time, memoQ users could only look on in envy at all the excellent possibilities before Kilgray belatedly added more regex options to its work environment. However, there are a few raw rubs remaining.

My Arabic translator friend pinged me recently to ask if I was aware of the "regex trouble" in the latest Studio version. He made heavy use of these features for Arabic and English work in some rather amazing, creative and inspiring ways (I had not imagined) in earlier versions of SDL Trados Studio, and some of these features are rather broken at present in SDL Trados 2017. He gave me a very useful tutorial (which I had planned to beg him for anyway soon) in the use of regex in SDL Trados Studio for basic filtering, advanced filtering and QA checks. Overall I was very impressed with the possibilities, but the failure of some regular expressions which worked well in the advanced filters to work at all in the basic filter or in QA rulesets was very disturbing. We argued a little about what the basis of the problem could be in the software programming, but it is a major problem which limits the functionality of SDL's latest software severely and should cause advanced users and LSPs to wait and watch for the fix before upgrading to the latest version. The persistence of such a major flaw in such an important area as quality assurance some 6 months after release is frankly shocking. I hope this will be addressed very soon so that I can migrate and upgrade some of my favorite QA routines from memoQ.

Last but not least is an irritating bug in an auxiliary feature for what has always been one of my favorite terminology tools, MultiTerm. It was the first Trados product many years ago, and despite many quirks over the decades, it remains one of the best. Face it: the memoQ terminology model is OK for most practical uses, but for maintaining high quality corporate terminologies tracking many important attributes it is hopeless garbage. Most other CAT tool terminology databases and glossaries are far worse. MultiTerm sets the standard today still for affordable, flexible, powerful terminology management. For 17 years I have used this excellent platform for my best terminologies for my best clients and delighted in its output management options (even when they can be a pain in the butt to configure properly).

When I want to access my high value MultiTerm resources while translating in memoQ or working in web pages or MS Word, I use the convenient MultiTerm widget to access the data. However, I am very disappointed to find that recent versions do not display the attributes for terms when the widget is used for lookup. Damn. That makes the results just as annoying as the lobotomized MultiTerm/TBX imports into memoQ. I really hope that SDL fixes this flaw ASAP and remains on top of the terminology game with MultiTerm and its lookup tools as a valuable resource even for translators who hate Trados Studio and won't use it.

Overall I am seeing a lot of nice things in SDL Trados Studio 2017, and I would say it is probably more mature and stable than memoQ 8 at this point. But it really is just a late-stage beta release, and more fixes are needed before I can trust it for routine production work. We are all better off for now to stick with the prior versions of both SDL Trados Studio and memoQ.

Mar 4, 2017

Documenting auto-translation rule development for memoQ

In a recent article, I described my simple method of recording examples of structured information like dates, financial expressions or legal references to help developers plan auto-translation rules (or other features using regular expressions, such as Regex Tagger rules) in memoQ and other applications. These are a sort of simplified performance specification - a table of examples showing how the rules should "perform", what they should do: what patterned source language expressions are to be transformed into particular structured expressions in the target language.

The need for proper documentation of such efforts does not end there, however. It is very important, especially for more complex sets of rules, that there be clear documentation of the purpose and logic of the rules developed, and that this documentation be present
  • in the rules themselves (as comments) and
  • in external documents to be used as references for troubleshooting, maintenance and further development.
Auto-translation rules and other resources using regular expressions should not be scripted and maintained for the long run in memoQ itself or in any other environment which does not allow thorough commenting of the regular expressions used. Without comments, it is simply too easy to destroy functioning rules by forgetting why they were written a certain way once-upon-a-time, and an environment able to use comments also allows old rules to be "commented out" (disabled, but still available for reference or later re-use) while new versions are tested. That is basically impossible with memoQ's internal resource editors at the present time. And to make matters worse, if auto-translation rules are edited inside memoQ, their order changes, sometimes with dire consequences if functionality depends on the rule order. Try sorting out problems like that in a set of 70 or so rules.

Excerpt from a large set of currency format rules with extensive comments. These comments are stripped when the rules are imported into memoQ, so all maintenance should be done externally in a tool like Notepad++.
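For readers who want to see the principle of commented rules outside memoQ, here is a minimal sketch using Python's `re.VERBOSE` mode, which allows comments inside the expression itself. This is not memoQ's rule file format; the currency pattern, the name `CURRENCY_RULE` and the helper `to_english` are assumptions made up for illustration. It also shows an old variant "commented out" but kept for reference.

```python
import re

# Hypothetical rule for German amounts such as "EUR 2,3 Mio."; the pattern and
# the output format are illustrative assumptions, not taken from a real rule set.
CURRENCY_RULE = re.compile(
    r"""
    (?:EUR|€)             # currency code or symbol
    \s?                   # optional space (\s also matches a non-breaking space)
    (\d+),(\d+)           # groups 1 and 2: digits before and after the decimal comma
    \s?                   # optional space before the multiplier
    (?:Mio\.?|Millionen)  # multiplier written as Mio, Mio. or Millionen
    # Disabled older variant, kept for reference:
    # (?:Mio\.)           # required the period, so it missed Mio without one
    """,
    re.VERBOSE,
)


def to_english(text: str) -> str:
    """Rewrite every matched amount in the form 'EUR 2.3 million'."""
    return CURRENCY_RULE.sub(r"EUR \1.\2 million", text)


print(to_english("Umsatz: EUR 2,3 Mio."))
print(to_english("Umsatz: €2,3 Millionen"))
```

The point is only that each part of the expression carries its own explanation, so a later revision does not have to guess why the period after "Mio" was made optional.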
As I began to revise and improve old rules that I created years ago for dates and currency expressions, I found that it was helpful to create a record of what changes I had made - and why I made them - and keep this information in a tabular form for easy reference and re-use.
Click to access a PDF sample of my rule development record (2 pages)
The graphic above is one example of how I maintain my personal records of some work developing regular expressions. I usually include
  • descriptions of all information recorded
  • a specific example on which I will base the general rule
  • a simple ("fragile") version of the rule part (source input and target output) with only the most essential elements; this is not error-tolerant, but it is the easiest to understand and the first place to look if something isn't working as I would like it to
  • more robust variations which take into account differences in spacing, punctuation, etc. or include things like non-breaking spaces that might be desired in the output (this can get cluttered and hard to read)
  • color-marking for easier identification of some elements
  • comments about why things are written as they are or about possible improvements or problems
This record is a template of sorts from which rules can be assembled very quickly or rules can be re-purposed for other languages or formats in a way that is easy to follow and catch mistakes. Such records are also helpful if the rules are to be shared with other developers or maintained by someone else.

My example is certainly not the final word in project documentation for such efforts; it is simply part of a set of personal tools to help me work more efficiently with the limited time I have. Professional development and consulting organizations often have far more extensive and detailed systems of project documentation; when I was part of one such shop nearly 20 years ago, my (downloadable) 2-page example might easily have filled twenty pages of very important-looking professional technobabble. Life's too short for shit like that anymore.

But if you value your time as a developer or your investment as one who hires others to develop such useful rules, it pays big dividends in most cases to demand some sort of clear, systematic and accurate record of how your special rules, filters, etc. were developed so that they can be maintained and improved in the future.

Feb 27, 2017

Planning special rules for structured "expressions" and multi-word abbreviations

Translators and editors often deal with what I'll call "structured expressions" or "patterned data" in many forms, which include:
  • long and short dates (2016-01-13; 1/13/16; 13.01.2016; January 13, 2016; 13th January 2016; etc.)
  • time expressions (14:35; 2:35 pm; 2:35 PM; 2:35 p.m.; etc.)
  • currency expressions (EUR 2.3 million; € 2,300,000; €2.3m; etc.)
  • legal references (Section 14a paragraph 3 line 2; section 14a (3) line 2; etc.)
  • bibliographical references for chapters, pages, margin notes, etc.
  • and much more.
There is also a wealth of abbreviations for multiple word expressions in some categories of text; favorites in German include:
  • in Verbindung mit (variously written as i.V.m., i. V. m., iVm or some typoed hybrid of the aforementioned with spaces and periods included or forgotten depending on the authors' preferences and degree of care)
  • im Sinne des (i.S.d., i. S. d., iSd, etc.)
These can be devilishly hard to check efficiently for consistency or other quality factors in a long text, and for the translation, there is often no single "right" way to format the target text equivalents, with many individual preferences to be found with translation buyers. Even with a good style guide (all too rare anyway), these issues can be challenging time-wasters.

Translation assistance tools such as Apsic Xbench, SDL Trados Studio and others, even memoQ, have various approaches to making life easier for a translator or editor faced with these challenges. Unfortunately for most people, these approaches usually involve the use of "regular expressions" or "regex" as nerds affectionately call it. Not an easy thing even for many hardcore techies!

On past occasions when I have written about the use of regex in translation tools, I have usually stated clearly that the best approach for the best, most reliable results is to have the regex "rules" for handling the text developed by a knowledgeable third party. The experts who deal with this stuff routinely can often reduce a task that would take a semi-skilled person like myself hours or even days to the time for a coffee break, and even if a task takes a while and runs up a bit of a bill, it's much more likely to be done right the first or second time.

But... there's a catch usually. Most of these regex fire-eaters are not skilled in mind reading, many are not translators, and even those familiar with translation challenges might not be familiar with your working languages or your particular subject areas and their possibly unique challenges. So effective communication is really, really important (it always is, of course, but here even more so if you are dealing with a verbally challenged, monolingual math freak who might be your local expert for regex).

Even for areas I know reasonably well and languages I more or less master, I am often frustrated by help requests from colleagues and clients who need special rulesets developed for a client's preferences for date and currency information, because the request is not clear in its scope and detail, and many important cases are left out, so the end result is not fully satisfactory.

Over the years and with a lot of back and forth (sometimes inside my own head with yours truly as my nightmare of a "client"), I have developed a system of simple documentation for planning and testing rules to help translate and quality check patterned information or multi-word abbreviations. This system provides an easy structure for non-techies (or even hardcore techies) to organize the help request for most efficient handling. Here is an example of part of such a planning sheet for a recent project involving Arabic:


When the time comes to test, just copy the source text column into a separate file, add whatever variations you want to the examples to test your accommodation of typos, etc. and then load that file as a "translation text" for testing in your working environment. If you have the same information for another, overlapping language pair, such as German and English, it is easy to couple that to make a ruleset which maps multiple source languages to a target language. An example of such a result is a memoQ auto-translation ruleset for mapping long dates and month-plus-day dates from German, English, French and Spanish into Portuguese which can be obtained here.

This simple, tabular approach to data collection to plan regular expression rules has made me a lot more efficient at such tasks and facilitated the re-use of data to make new rulesets for clients and colleagues (or myself) as needs arise. The liberal commenting of examples can be very helpful; information to include which could affect rule structure might involve capitalization, location in a sentence, variations or differences in particular contexts, etc.
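The same idea can be tried outside a translation environment. The following is a minimal sketch in Python, assuming a made-up planning sheet for German long dates and a made-up rule (`DATE_RULE`, `apply_rule` and the sheet rows are all hypothetical): each row holds a source example, the desired target and a comment, and the check simply lists every row where the rule does not produce the desired result.

```python
import re

# Hypothetical German-to-English month mapping for the illustration.
MONTHS = {
    "Januar": "January", "Februar": "February", "März": "March",
    "April": "April", "Mai": "May", "Juni": "June", "Juli": "July",
    "August": "August", "September": "September", "Oktober": "October",
    "November": "November", "Dezember": "December",
}

# Hypothetical rule: day, period, optional whitespace, month name, year.
DATE_RULE = re.compile(
    r"\b(\d{1,2})\.\s*(" + "|".join(MONTHS) + r")\s+(\d{4})\b"
)


def apply_rule(text: str) -> str:
    """Rewrite German long dates as 'January 13, 2016'."""
    return DATE_RULE.sub(
        lambda m: f"{MONTHS[m.group(2)]} {int(m.group(1))}, {m.group(3)}", text
    )


# Planning sheet: (source example, desired target, comment).
PLANNING_SHEET = [
    ("13. Januar 2016", "January 13, 2016", "standard long date"),
    ("13.Januar 2016", "January 13, 2016", "typo: missing space"),
    ("3. März 2017", "March 3, 2017", "single-digit day"),
    ("01. Mai 2016", "May 1, 2016", "leading zero dropped in target"),
]

failures = [
    (source, wanted, apply_rule(source))
    for source, wanted, _comment in PLANNING_SHEET
    if apply_rule(source) != wanted
]
print(failures or "all planning sheet rows pass")
```

Adding a new row for every odd variant met in real texts keeps the sheet useful as both a specification and a regression test.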

For my own work, rulesets include a series for dates, currency and legal reference formats from German to English for generic and client-specific use for US and UK English. With the help of these tabular planning sheets, I can adapt any of these quickly for most other languages.

For tracking the development of rules and their improvement history I have another set of templates which I use for systematic planning and identification of areas to improve. That will be discussed on another occasion.

Feb 20, 2017

Building a regex-savvy "termbase" in memoQ


For years I have been frustrated by and dissatisfied with how abbreviations are handled in the current memoQ termbase model. The crux of the problem is the handling of the periods in the expressions. This can be seen with termbase entries like the following, for example:


If the abbreviation "Art." appears in the source text, only the second source entry - the one without the period - will give a match result in memoQ. The first entry is simply ignored.

An additional problem which one would face, even if the terminal period character in the term did not pose a problem, is that authors are often notoriously variable in the way they write abbreviations. Take, for example, the abbreviation for the German expression "in Verbindung mit", usually written as "i.V.m."

In recent legal translation work, I have encountered this expression written as above, but also as "i. V. m." (with spaces), "iVm" (no spaces no periods) and sloppily typed variations like "iV.m" or "i. V.m." What's a poor wordworker to do?

The answer came to me while refining a set of auto-translation rules for bibliography formatting and legal references. These, too, can suffer from similar troubles: "page 7" might be abbreviated as "p. 7", but in the sloppy chaos of source texts poorly edited one might find "p.7", "p 7", "p7" or even variations with the letter capitalized, like "P.7". If you are translating nearly 1000 references in a bibliography, robust shortcuts are very helpful and save a lot of time, and if those shortcuts are based on memoQ auto-translation rules, they can also be used in a QA profile to ensure that every bit matches correctly.

As the screen capture from a memoQ Facebook group above suggests, the way to go about this is to identify which parts of the expression might vary with different deliberate and accidental typing. These are usually spaces and periods in the case of abbreviations; sometimes, particularly with German legal abbreviations, capitalization and dashes may play roles as well. (I tore my hair out not long ago trying to understand an Austrian legal text referring to two laws, which differed in their three-letter abbreviations only by a dash inserted after the first letter of one.)

In regular expressions, the question mark character means "zero or one" of whatever character precedes the question mark. So if I want a rule that acts in the case of one or no periods, I put a question mark after the period character. And because in the language of regular expressions, a period is shorthand for any character, if I want to talk about an actual period ("."), I have to precede that character by a backslash ("\."). In the technical jargon of Nerdworld that is known as "escaping the period" and there is no escaping such syntax if you want a regular expression rule about periods, period.

Spaces (normal or non-breaking ones) are represented by an escaped lowercase "s": "\s". So a matching rule for the English abbreviation "e.g." which catches a lot of typing variations might be

e\.?\s?g\.?

And in German, the target replacement rule might be

z.B.

Of course, if a typist is sloppy, there might be more than one space, or a comma might be typed accidentally instead of a period (the keys are adjacent, and if your screen is as dirty as mine gets sometimes, your eyes might not notice); capitalization might also differ accidentally or based on context. The regular expressions for matching can be adapted to handle all these cases if need be.
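As a quick way to experiment with such a matching rule, here is a minimal sketch in Python. Python's `re` module is not the engine memoQ uses, but the constructs involved here (`\.`, `\s`, `?`, `*`, character classes) behave the same way. The "tolerant" variant is only one possible adaptation for extra spaces, commas typed for periods and capitalization; it is an assumption for illustration, not a rule from my sets.

```python
import re

# The matching rule as given in the text.
BASIC = re.compile(r"e\.?\s?g\.?")

# One possible, more tolerant variant (an illustrative assumption):
# either case, a comma accepted in place of a period, any number of spaces.
TOLERANT = re.compile(r"[eE][.,]?\s*[gG][.,]?")

samples = ["e.g.", "e. g.", "eg", "e.g", "E.g.", "e,g.", "e.  g."]

for sample in samples:
    print(
        repr(sample),
        "basic:", bool(BASIC.fullmatch(sample)),
        "tolerant:", bool(TOLERANT.fullmatch(sample)),
    )
```

Note that `fullmatch` is used so that each sample must match as a whole; in running text, a pattern this loose would also need word boundaries to avoid hitting the "eg" in words like "egg".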

Rules of this type are not particularly difficult to construct, but refining them to accommodate all the variations you are likely to encounter may require an expert hand. Thus, as I have suggested before, the average user should focus on documenting all the possible source variations clearly in a table which includes the desired target equivalents, and this table should be given to an expert (Kilgray support, a qualified consultant like Marek Pawelec or a technical programmer familiar with regular expressions and their use in memoQ). Trust me, this will save a lot of frayed nerves and probably significant time and money as well.

So now I am building a few memoQ auto-translation rulesets which are essentially fault-tolerant abbreviation glossaries. These, together with the similar rulesets for formatting bibliographical references and references to sections, paragraphs, lines, margin notes, etc. in laws, have been very helpful in reducing the time spent translating messy legal source texts, and the accuracy of the work has been improved significantly. Give it a try for your translation challenges!
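To give a rough idea of what a "fault-tolerant abbreviation glossary" does, here is a minimal sketch in Python. It is not a memoQ rule set: the helper `tolerant_pattern`, the `GLOSSARY` entries and the English renderings are assumptions chosen for illustration. Each abbreviation is stored as its bare letters, and the pattern allows an optional period and optional whitespace after each letter.

```python
import re


def tolerant_pattern(letters: str) -> re.Pattern:
    """Build a pattern allowing an optional period and spaces after each letter."""
    body = r"\.?\s*".join(re.escape(ch) for ch in letters)
    # \b and (?!\w) keep the pattern from matching inside longer words.
    return re.compile(rf"\b{body}\.?(?!\w)")


# Hypothetical glossary: bare letters of the abbreviation -> target text.
GLOSSARY = {
    "iVm": "in conjunction with",
    "iSd": "within the meaning of",
}

RULES = [(tolerant_pattern(abbr), target) for abbr, target in GLOSSARY.items()]


def translate_abbreviations(text: str) -> str:
    """Replace every tolerated spelling of each abbreviation with its target."""
    for pattern, target in RULES:
        text = pattern.sub(target, text)
    return text


for variant in ["i.V.m.", "i. V. m.", "iVm", "iV.m", "i. V.m."]:
    print(repr(variant), "->", translate_abbreviations(f"§ 5 {variant} § 7"))
```

One known weakness of this sketch: the final period is consumed even when it also ends the sentence, which is exactly the kind of case worth recording in a planning table.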

Jan 12, 2017

The ART of all-round translation....


There is a certain mythology that in Ye Goode Olde Days, life was simpler and more generalist and a whole lot easier. I suspect that is mostly bunk. The stresses and pressures were different, but probably no less when considered objectively. I remember trying to help my wife, a sometime English to German translator, find clients in the early 1990s, and back then if you weren't local, the clients mostly did not want to know. And don't get me started on the time and effort of terminology research for my own translations then and in the decades before.

But I think it is fair to say that today, even the specialist must be a JOAT of sorts, at least when it comes to the bag of technological and project management tricks to subdue the unruly projects that many of us often face. Colleagues Dorota Pawlak and Ellen Singer recognized the difficulties faced by many language specialists in acquiring some of the specialist and non-linguistic skills needed to cope with particular work challenges and designed a program of quarterly, half-day small workshops to provide just the environment needed to cultivate this new knowledge and establish bonds with others in the same endeavor.

Upcoming workshops I find particularly interesting include:

Transcreation with Alessandra Martelli on February 4, 2017 in Leiden and

no kidding, the regex workshop on April Fool's Day 2017 with my favorite tech guru, the brilliant but articulate Marek Pawelec, a first-rate teacher who can make even nasty stuff like regular expressions seem simple for the rest of us. And as I have pointed out in various articles, this knowledge can be extremely useful for those who work with tools like SDL Trados Studio, memoQ, Xbench and more.

I encourage you to have a look at the ART project site and see what else is on the menu; it seems to me that they have the right approach for those looking for a good start in interesting new areas.

And keep up to date with them on Twitter....





Dec 28, 2016

Go Figure (with memoQ!)

When translating patents, legal briefs, reports, manuals and many other kinds of documents I inevitably encounter figure references to photographs and illustrations in the text as well as the labeled captions for these. In this morning's translation of a petition in a nullity suit, one such reference takes the form in Verbindung mit Figur 1,  but it might just as well appear as

Fig. 1
Fig 1
Abb. 1
or
Abbildung 1

in this or some other text; in documents with multiple and/or sloppy authors I might even find a mix of all these in the same text.

As I value consistency in writing even when the client might not care, I try to translate all of these to the same form in English where it makes sense to do so. That might be Figure 1 or Fig. 1 depending on the situation and the styleguide stipulated for the project.

But when I finish the 10,000 or so words for this job and need to do my final check before sending it to the client, I expect to be a little tired, and I want to use my attention and energy to focus on the accuracy and reading comfort of my translation. In doing so I tend to miss little details like the occurrence of "Fig. 1" on page 32 as opposed to "Figure 1" on the other 40 pages. That is why I use the QA feature of memoQ to check the consistency with which I have translated the figure references as well as other matters such as the accurate use of special terminology for the project.

The specific feature I use here for quality assurance is an auto-translation rule set (aka "autotranslatables"), which is highlighted and selected in the screenshot of the project's settings.

As I have stated many times before, autotranslatables should be used, but not created by the average translator. Aside from the fact that the regular expressions involved are not particularly easy even for most of the nerds among us, there are a lot of little subtleties that make the difference between a well-functioning rule set and annoying garbage, and even the "experts" struggle with this for sophisticated rules.

But the present example of Figure mapping is a comparatively simple case which can illustrate the principles and some of the "risks" to mere mortals.



My rule set for mapping figures from many German forms to a particular English form consists of a single rule.

All of the possibilities that I expect in German are compiled in a list, along with the English expression for each, and this translation pair list is named #figurelist# and is found on the corresponding dialog tab in the memoQ rule set editor for autotranslatables. (I usually edit rules externally in Notepad++ where I can comment them liberally, but in this case I felt no need to do so.) This named list is used as a variable in the regular expression for the rule to describe a source text match.

(#figurelist#)\.?\s+?\b(\d+)\b

Jeepers. That regex for the source text looks complicated, doesn't it? Wouldn't (#figurelist#) \d+ be just as good? After all, it seems to work just fine. Well, except that the list would need a few extra entries to account for abbreviations with and without periods.

No. "(#figurelist#) \d+" is total, incompetent crap. Here are some reasons why:
  • It is more efficient to express the possibility of a period after the text for "Figure" with the regex "\.?",  because you'll never have to worry about abbreviations with or without periods in your lists. Mine will get longer, as I'll probably expand these rules to cover Portuguese as well and use the same rule for both Portuguese and German sources.
  • There may or may not be a space or even extra spaces after the Figure expression. Simply typing a standard space after the (#figurelist#) group means that it must be present and it must be an ordinary space to match. If it's missing or someone typed a non-breaking space (a reasonable thing to do to keep both parts of "Figure 1" on the same line), the rule will not work! Using \s to match any kind of whitespace is the right way to go. Strictly speaking, \s+? as written in the rule means one or more whitespace characters (matched lazily); to allow 0 to n spaces after "Fig." or whatever, the quantifier would have to be \s* instead.
  • If you test the "simple" crappy regex, you'll also find that "Abb. 14" gives two results: Figure 1 and Figure 14. That is because the rule does not stipulate that the second part must be a whole "word", so the substring match with the first character also gives a result. Bad, bad, bad. The chaos that this sort of mistake can cause with more complex rules like currency expressions used in important financial translations is frightening.
The regex for the result also appears more complex than it should be, but there is a reason behind that as well. Instead of the simple $1 $2 (first group followed by a space followed by the second group), I specified output with a non-breaking space, because it looks rather unfortunate to have a line wrap in the middle of the expression for a figure. One sees that a lot, because it's a nuisance to remember to type non-breaking spaces all the time on the keyboard. This rule can also be used to check the use of the non-breaking space; an ordinary space will generate a warning when the memoQ QA profile is run with the autotranslatables check activated.
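The difference between the two source patterns, and the non-breaking space in the output, can be tried out with a minimal sketch in Python. This is not how memoQ processes rules: the dictionary `FIGURE_LIST` merely stands in for the #figurelist# translation pair list, its entries are assumptions, and Python's `re` module is a different engine from memoQ's, although these particular constructs behave alike.

```python
import re

# Stand-in for the #figurelist# translation pair list (entries are assumptions).
FIGURE_LIST = {"Abbildung": "Figure", "Abb": "Figure", "Figur": "Figure", "Fig": "Figure"}

# Longest entries first, so "Abbildung" is tried before "Abb".
_alternatives = "|".join(sorted(FIGURE_LIST, key=len, reverse=True))

# The "simple" pattern: one ordinary space, no optional period, no word boundary.
FRAGILE = re.compile(rf"({_alternatives}) (\d+)")

# The pattern from the text: optional period, whitespace of any kind, whole number.
ROBUST = re.compile(rf"({_alternatives})\.?\s+?\b(\d+)\b")

NBSP = "\u00a0"  # non-breaking space used in the output


def map_figures(text: str) -> str:
    """Map all figure references to 'Figure', a non-breaking space and the number."""
    return ROBUST.sub(
        lambda m: f"{FIGURE_LIST[m.group(1)]}{NBSP}{m.group(2)}", text
    )


for source in ["Abb. 14", "Fig 2", f"Abb.{NBSP}14", "in Verbindung mit Figur 1"]:
    print(repr(source), bool(FRAGILE.search(source)), repr(map_figures(source)))
```

Of these four test strings, the fragile pattern finds only "Fig 2" and "Figur 1"; the period and the non-breaking space defeat it in the other two.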

There are many ways in which regular expression rule sets can enhance the user experience and the quality of translation results when working in memoQ. It is not hard to use these rules, but it is beyond most users to create and maintain their own rule sets. Therefore
  • Kilgray should include more useful examples of rule sets (in addition to the very helpful number rules) in future releases of memoQ
  • The average user should ask the help of Kilgray Support for simple rules they need (in most cases this would fall under the usual commitment of paid support and maintenance for the year)
  • memoQ users should work with Kilgray's Professional Services department or other competent consultants to devise robust rule sets to boost their translation and quality assurance productivity. Beware of casual advice found in forums or social media; much of it does not consider issues like the problems described above despite the aggressive insistence one might see for a particular "solution". Truly, you get what you pay for :-)

Post scriptum:
An yet ye hack by night and sun, the work of regex be never done.
Of course something was forgotten in the example here. The myriad styles and customs of source text authors will inevitably offer up challenging variants to break your well-crafted rules. Today's is a text full of figure references like Abbildung 4.12, which would refer to the twelfth figure in the fourth chapter. For this the modified rule might be 

(#figurelist#)\.?\s+?(\b\d+\.?\d+?\b) 

Or perhaps not quite. Try it and you'll see a few problems. This is just another example of why it is good to make use of professional resources to help you with these challenges and to have a systematic way of recording and elaborating them. I'll explain more about such an effective system for planning and documentation in a future article. I've noticed that the "experts" in the translation field often care little for the usual standards of project specification, perhaps because they are sick and tired of translation projects with so many specification documents for those who know better.
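One of the problems can be reproduced with a minimal sketch in Python (again a different engine from memoQ's, with a plain alternation standing in for #figurelist#). Because the number part is written as \d+\.?\d+?, it needs at least two digits, so a plain "Abbildung 4" no longer matches. The `ALTERNATIVE` pattern is only one possible repair, offered as an assumption and not as the rule I ended up using.

```python
import re

FIGURES = r"(Abbildung|Figur|Abb|Fig)"  # stand-in for (#figurelist#)

# The modified rule from the text.
MODIFIED = re.compile(FIGURES + r"\.?\s+?(\b\d+\.?\d+?\b)")

# One possible alternative (an assumption): the ".12" part is optional as a unit.
ALTERNATIVE = re.compile(FIGURES + r"\.?\s+?(\b\d+(?:\.\d+)?\b)")


def number_part(pattern: re.Pattern, text: str):
    """Return the captured figure number, or None if the rule does not match."""
    match = pattern.search(text)
    return match.group(2) if match else None


for source in ["Abbildung 4.12", "Abbildung 14", "Abbildung 4", "siehe Abb. 4."]:
    print(
        repr(source),
        "modified:", number_part(MODIFIED, source),
        "alternative:", number_part(ALTERNATIVE, source),
    )
```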

Dec 13, 2016

The irregularities of regular expressions in #memoQ


Sometime back in the time-distant swamps where memoQ evolved, regex mysteriously became part of the software's virtual genes. It was unclear, exactly, which third-party engine or bacterial life form had been its source, and solution developers were often at a loss to know which advanced syntax would work or not unless they tried (and very often failed).

Many of us begged and pleaded for some kind of definitive documentation of allowed syntax for memoQ's regular expressions, which are an important feature for filtering (in recent versions), segmentation rules, special text import filters, autotranslatables rules and probably a few other things I've forgotten. But begging, threats - even bribery - led to no useful reference information, just some useless suggestions to read beginner's tutorials for other dialects somewhere on the Web.

Then, quite by accident, I learned yesterday that Kilgray uses the engine in Microsoft's .NET framework. Doh. Who'da thunk? Now, at last, I can get some definitive syntax information to help me solve more sophisticated problems for legal reference formats and other challenges in my translations with memoQ.

Even with accurate syntax guidance (at last!!!), regex development with memoQ is often not a simple matter. The integrated editors are often useless, especially for things like complex autotranslatables, where the bad feature of changing the order of rules after an edit can kill a ruleset. (It was long claimed by Kilgray Support that rule order does not matter, which is patently untrue. They simply did not look at the right test cases.)

Good code of any kind should usually be documented to facilitate maintenance. This is simply not possible with the editors for regex integrated in memoQ. So instead, I do all my rule-writing work in an external editor (such as Notepad++), where I can add extensive <!-- comments so I know what the heck I did when I have to revise the rules later --> and import the rulesets for testing into a memoQ project with appropriate test data included as "translation" documents. The hardest part of this workflow is remembering to enable the imported ruleset I want to test under Project home>Settings>Auto-translation rules; often I forget and think I really screwed up until I go back to the settings and mark the checkbox by the rules to test. Keep a lot of carb sources at your desk when you do regex work. Your brain will need them.

A lot of memoQ users think that regex is irrelevant to their working lives, but for hardcore financial and legal translators at least, this is an entirely mistaken idea. Correctly constructed rules can save much time and a lot of frayed nerves dealing with citations, dates, currency expressions and more, and the rules also decrease QA time while increasing accuracy.

I have quite a number of custom rulesets I have put together for my work and for some colleagues and clients. Regex is hard shit, no matter what anyone tells you. I have programmed computers in a host of languages since 1970 more or less and used to be known for a good memory for syntax rules, but I find regex so non-intuitive at anything more than a very basic level that if I use it only a few times a year, I have to re-learn it nearly every time. That's no fun. So the key to mastering regex is not to learn it. The massahs usually don't know sheet about workin' the fields, but if they are going to survive in this competitive world, they'll know which specialist to put on the job and reward him or her appropriately. Get to know a competent consulting specialist for memoQ regex, like colleague Marek Pawelec, and let that person's expertise save you many hours of typing and QA, not to mention undetected errors.

Kilgray also established a Professional Services department at last not long ago, and that team can also help you with these and other problems for optimizing the use of translation technologies. This is very often a better option than using consultants primarily focused on SDL solutions who do a bit of memoQ on the side, because even the best of these are often not really aware of the best approaches to use, and the consequences of this are sometimes dire. Are they at the memoQ wordface nearly every day, dealing with a wide range of challenges that push the technical envelope of the software to its limits? Or would they really rather do a beginner's workshop for SDL Trados Studio 2017 and show you all the cool features that memoQ has had for years and they probably never learned very well anyway? If it's not the first case, caveat emptor no matter the source.

Mar 17, 2016

Dynamic filtering with regular expressions in memoQ


Regular expressions (aka regex) are not a tool for everyone, though this is something that the nerdily inclined often fail to appreciate. For average users, a plain language query interface, perhaps with more limited options, is generally more accessible and used. However, sometimes it's nice to have such "shortcuts" available to select particular structures in a text for translation or editing, and the many people who complained for years that Kilgray did not provide a dynamic regex filter for the working translation grid - a feature of SDL Trados Studio for quite a while now - did have a point worth addressing in development. Now that has happened, though still a bit incompletely when considered in the full scope of memoQ's usual features for selecting text.

memoQ uses regex in a number of its modules, and Kilgray has several webinars which describe these applications, though they require some stamina to watch, and I expect that most people will become hopelessly confused if they try to take in more than one area of application in a single sitting. The uses of regex for segmentation rules, tagging, autotranslatables and text filtering on document import (with the Regex Text Filter) are very different in their approach, even though the underlying syntax of the regex is the same. However, all of these applications allow the configured rules to be saved and re-used, so one could ask an expert to create the settings needed and provide these in a resource file, and many users do exactly that. Thus as long as one understands that regex can be used for a particular problem, the details can be hired out.

This new application of regex for dynamic filtering, introduced in recent builds of memoQ 2015, is a little different (at present). Although the Find/Replace dialog will "remember" regex syntax in its dropdown menu of recent expressions, there is no way to store these expressions permanently, and they must be entered manually each time they are needed. This means that, for now, the average user will have to collect useful expressions like a tourist might scribble phrases in a notebook to use on holiday in a foreign country, and those with a little more sense of adventure might find themselves with a hovercraft full of eels and wonder why.

One such phrase might be the example in the screenshot above. I was translating some financial statements with several formats present for digits in account numbers, dates and monetary expressions. In order to work more systematically with these various formats, I used several different regular expressions to sort and separate them. In the example I was looking for instances where at least four digits were written together in a source segment. That isn't terribly selective, but most of these occurrences in my documents were account numbers, so filtering on them narrowed down the text considerably and allowed me to work a little faster. Other expressions were used to QA date formats and monetary expressions more specifically.
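
For readers who want to try the idea outside memoQ first, here is a minimal sketch in Python of a filter for "at least four digits written together". The pattern \d{4,} is my assumption of a typical way to write this; it is not copied from the screenshot. The sample segments and the function name filter_segments are hypothetical. memoQ itself uses the .NET regex flavor, where this simple pattern should behave the same way.

```python
import re

# Assumed pattern: a run of at least four consecutive digits.
FOUR_PLUS_DIGITS = re.compile(r"\d{4,}")


def filter_segments(segments, pattern=FOUR_PLUS_DIGITS):
    """Return only the segments whose text contains a match for the pattern."""
    return [s for s in segments if pattern.search(s)]


# Hypothetical source segments from a financial statement.
segments = [
    "Konto 48200 Verbindlichkeiten",
    "Stand zum 31.12.2015",
    "Summe 1.250,00 EUR",
    "Rückstellungen",
]

for segment in filter_segments(segments):
    print(segment)
```

The account number matches, but so does the year in the date, which shows why such a filter is not terribly selective. The monetary amount is skipped because its digits are broken up by separators.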

In the working grid for translation and editing, regular expressions can be used in one or both of the filter fields for the source and target text when the checkbox in the toolbar at the right is marked. Alternatively, the regular expressions option in the Find/Replace dialog can be used.


It is somewhat disappointing that regex cannot be used to create static views at the present time. While marking can be used in the Find dialog to enable one to go back and forth between the filter criteria and other configurations of the working grid, there is no way to make a permanent "record" of the filtered segments. For quite a few years, I have wished for the possibility to save the results of my filtering in the working grid in some sort of view, but I was always able at least to recreate the filtering criteria in the dialog to create a memoQ View, which could then be opened at any time or exported in various formats for clients and project collaborators. However, at the moment that is not possible with regex filtering. (There are workarounds involving a change in segment status, but these are often inconvenient in a project in progress.)

The addition of regex filtering to the working grid in memoQ is a welcome feature for many, which I hope will be expanded by Kilgray in the future to achieve more of its potential. But to take advantage of this potential in any way, the average user will indeed need a "phrase book" of sorts, and an efficient way of managing useful collected regex snippets (and naming them for easier re-use in searches and filtering) would be very desirable. If these "regex phrase books" for dynamic filtering and view creation could be saved as shareable light resources, it would be possible to build many useful collections to help users at all levels with translation, editing and quality assurance tasks.
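
Until something like that exists, a personal "phrase book" can be kept outside the tool. The following Python sketch is purely illustrative and has nothing to do with any actual memoQ resource format: the snippet names, the patterns and the JSON layout are all my own assumptions. It shows named expressions stored as shareable text and looked up by name for filtering.

```python
import json
import re

# Hypothetical phrase book: descriptive names mapped to regex snippets.
PHRASE_BOOK = {
    "four_or_more_digits": r"\d{4,}",
    "date_dd_mm_yyyy": r"\b\d{2}\.\d{2}\.\d{4}\b",
}


def dump_phrase_book(book):
    """Serialize the phrase book to JSON text that can be shared with others."""
    return json.dumps(book, indent=2, sort_keys=True)


def load_phrase_book(text):
    """Read a phrase book back from JSON text."""
    return json.loads(text)


def filter_by_name(segments, book, name):
    """Filter segments with the expression stored under the given name."""
    pattern = re.compile(book[name])
    return [s for s in segments if pattern.search(s)]


shared = dump_phrase_book(PHRASE_BOOK)
book = load_phrase_book(shared)
dated = filter_by_name(
    ["Stand zum 31.12.2015", "Konto 48200"], book, "date_dd_mm_yyyy"
)
print(dated)
```

The expression found by name can then be pasted into the filter field or the Find/Replace dialog by hand, which is still the only way to get it into memoQ at present.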