ABBYY Aligner
Thread poster: Kirstine Rennie
Kirstine Rennie
Local time: 19:42
Jan 15, 2015

Hello,

My translation firm is looking to align a very large amount of work. I had already posted a question regarding this in the Across forum, assuming that it would have to be done in a CAT tool.

Another member responded saying that he thought it would be easier/quicker in something like ABBYY Aligner. My question is: has anyone ever used ABBYY Aligner and, if so, what are the advantages of using it instead of aligning translations in a CAT tool? Is it significantly faster?

Any help would be much appreciated!

Kirstine


 
Michael Beijer  Identity Verified
United Kingdom
Local time: 19:42
Member (2009)
Dutch to English
+ ...
Just typed, and lost a very long post :-( Jan 15, 2015

No time to retype it all.

I suggest first having a look at the free LF Aligner: http://sourceforge.net/projects/aligner/
You might also want to consult András Farkas: http://www.farkastranslations.com/alignment.php (the expert on aligning large amounts of text)

Also have a look at AlignFactory (probably the best commercial aligner on the market today): http://www.terminotix.com/index.asp?content=brand&brand=1&lang=en

Always test a sample with several aligners. Each text is different, and there is no aligner that will do a good job on all text types.

Michael


 
FarkasAndras  Identity Verified
Local time: 20:42
English to Hungarian
+ ...
Gah Jan 15, 2015

I also typed out a long post that the proz software destroyed instead of posting. Serves me right for typing in the browser instead of a text editor.
Anyway, how much material are we talking about (thousands of segment pairs or tens of thousands, maybe hundreds of thousands or millions?) and how good a result do you want (is 95% correct pairing good enough, or do you want 100.00% correct)? Are you okay with discarding subpar documents or sections of texts or do you want every last sentence extracted?
Your choices depend on these factors. In any case, you need an aligner with a good autoaligner algorithm. Most CAT tools' aligners fail at this hurdle. Then what sort of a review/edit you do after autoalignment depends on your needs.

[Edited at 2015-01-15 15:00 GMT]


 
Mikhail Zavidin
Local time: 22:42
English to Russian
+ ...
intelligent and fast Jan 15, 2015

For my language pair, it seemed quite intelligent and fast.
I had a feeling that it caught the meaning of the sentence when aligning each pair. However, there were mistakes in the alignment.
To be frank, I haven't seen a better aligner so far, though I can't say that I have used a lot of them.
You can now try the ABBYY Aligner 2.0 trial, though it works for only 15 days and has some restrictions.
The version I have used is 1.0.


 
Samuel Murray  Identity Verified
Netherlands
Local time: 20:42
Member (2006)
English to Afrikaans
+ ...
@Kirstine Jan 15, 2015

kirstinerennie wrote:
My translation firm is looking to align a very large amount of work. I had already posted a question regarding this in the Across forum, assuming that it would have to be done in a CAT tool.


No, it typically can't be done "in" a CAT tool, although some CAT tools are accompanied by alignment programs. For very large amounts of work, the aligners that I have seen that come with CAT tools may not be suitable, though.

LF Aligner is a freeware option that tends to get good reviews. I haven't really used it myself, though. It is mainly a non-GUI aligner; the latest version does have a GUI, though it's not the most user-friendly one that I have seen.

If you want high quality TMs that you can trust 100%, then you'd have to check the alignment manually, and that is when it becomes necessary to have an alignment GUI that is easy to use.

Another member responded saying that he thought it would be easier/quicker in something like ABBYY Aligner.


The product brief looks impressive, but you won't be able to tell whether it is "good" with your very large amount of work, because the trial version is limited to 1000 segments (or 50). For EUR 100 it is not cheap. The installer for the trial version is 300 MB (ouch!).

==Added:

Okay, I tested it. It is a "smart" aligner, which is a good thing, but it misses a crucial function: the ability to insert blank cells. It can move cells up and down only if there is a blank cell above or below that cell, but if there is no blank cell, and there is a misalignment, then you can't fix it using the keyboard shortcuts, but must fix it by manually copy/pasting text from one cell into another. That is *bad*.

There is a batch function (not tested) but I just dragged and dropped two files into it (EN and AF). It doesn't support AF, but recognised the AF file as NL, which is good enough. It only merges cells if you select them, and unfortunately the shortcut for merging is in an odd position, but it's not the end of the world. There is no simple shortcut for moving between whole cells, but the down and up arrows move between cells easily. It does not seem possible to delete a cell without deleting the entire row. Ctrl+Z works! PgDn and PgUp move through the file (also a good thing).

The screenshot in the product brief showed that the program will mark possible misalignments in colour, but it didn't do it for me.

[Edited at 2015-01-15 15:45 GMT]


 
Kirstine Rennie
Local time: 19:42
TOPIC STARTER
Thank you Jan 16, 2015

Hi everyone,

Thanks so much for all your helpful replies, very much appreciated.

We are currently looking at all the options as it's a very big alignment project (around 2 million words!).

Thanks again,

Kirstine


 
FarkasAndras  Identity Verified
Local time: 20:42
English to Hungarian
+ ...
2M words Jan 16, 2015

kirstinerennie wrote:

Hi everyone,

Thanks so much for all your helpful replies, very much appreciated.

We are currently looking at all the options as it's a very big alignment project (around 2 million words!).

Thanks again,

Kirstine


That's a pretty big alignment project. If 2M words is in one language, that'll probably work out to about 200K segment pairs. That's already in the size range where I'd normally do an autoalignment with only a partial manual review (more a series of spot checks looking for potential quality improvement tricks than an actual review, re-running the autoalignment if I find something system-level and fixable). Still, a full manual review is not outside the realm of possibility. It's just a huge job. I reckon I have probably done manual review on 100K+ segments so far, for my personal use, for a hobby project (aligning public domain literary works) and for paying clients (translators who needed TMs made from translated documents)... But then I'm probably quite unusual in terms of my tolerance for certain types of monotonous work and being able to review/fix alignments quickly.
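For anyone who wants to redo the sizing for their own project, here is a rough sketch of the arithmetic. The figure of about 10 words per segment is only what the 2M words to 200K pairs estimate implies, and the function names, sample size and seed are made up for illustration; adjust them to your own texts.

```python
import random
from typing import List


def estimate_segment_pairs(word_count: int, avg_words_per_segment: float = 10.0) -> int:
    """Rough number of segment pairs for a given source-language word count.

    avg_words_per_segment is an assumption (about 10 is what the
    2M words -> 200K pairs estimate implies), not a measured value.
    """
    return round(word_count / avg_words_per_segment)


def spot_check_indices(total_pairs: int, sample_size: int, seed: int = 0) -> List[int]:
    """Pick a reproducible random sample of pair indices for a partial manual review."""
    rng = random.Random(seed)
    k = min(sample_size, total_pairs)
    return sorted(rng.sample(range(total_pairs), k))


pairs = estimate_segment_pairs(2_000_000)
to_review = spot_check_indices(pairs, 500)
print(pairs, len(to_review))
```

A fixed seed means the same sample can be re-checked after re-running the autoalignment, provided the segmentation has not shifted.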

As to tools, I personally would use LF Aligner. But then I wrote it so I'm obviously partial. Alignfactory and ABBYY seem to get good reviews, although if it works as Samuel describes I would say ABBYY is out.

[Edited at 2015-01-16 20:26 GMT]


 
2nl (X)  Identity Verified
Netherlands
Local time: 20:42
Transit Alignment tool Jan 17, 2015

The Transit Alignment tool offers interesting ways to improve the alignment result, e.g. by use of your dictionaries.

Quick Start: https://transitnxt.wordpress.com/2013/11/27/aligning-files-in-transit-nxt/

Full manual: http://tinyurl.com/qf56j5q

Use internal word list
Transit NXT uses an internal word list to assess the probability of the source and target segments being correctly matched.
The alignments are saved in the file align.adc under config\global in your Transit NXT installation folder.
If Transit NXT finds that the source-language segment contains an entry from the internal word list, it searches for the translation of the term in the target-language segment.

Use project dictionaries
Transit NXT uses the current TermStar dictionary to assess the probability of the source and target segments being correctly matched.
If Transit NXT finds that the source-language segment contains a term that has been added to the current dictionary, it searches for the translation of the term in the target-language segment.

Resource files mode (with comparison of markup segments)
Transit NXT compares markup segments during alignment, instead of text segments.
Use this option when aligning files with string IDs, perhaps for localisation projects.
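
To illustrate the general idea of a word-list or dictionary check (this is not Transit NXT's actual implementation, which the manual does not spell out), a minimal sketch might score a candidate pair by the share of glossary terms in the source segment whose translation also appears in the target segment. The function name, the glossary format and the plain substring matching are all assumptions for the example.

```python
from typing import Dict


def dictionary_score(source: str, target: str, glossary: Dict[str, str]) -> float:
    """Share of glossary terms found in `source` whose translation is in `target`.

    Returns 0.0 when no glossary term occurs in the source segment.
    Matching is a naive case-insensitive substring test (an assumption;
    a real aligner would tokenise and handle inflected forms).
    """
    src_lower, tgt_lower = source.lower(), target.lower()
    hits = [term for term in glossary if term.lower() in src_lower]
    if not hits:
        return 0.0
    matched = sum(1 for term in hits if glossary[term].lower() in tgt_lower)
    return matched / len(hits)


glossary = {"invoice": "factuur", "payment": "betaling"}
print(dictionary_score("The invoice is due on payment.",
                       "De factuur is verschuldigd bij betaling.", glossary))
```

A higher score makes it more plausible that the two segments belong together; an aligner would combine this with other signals rather than rely on it alone.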


 
FarkasAndras  Identity Verified
Local time: 20:42
English to Hungarian
+ ...
standard feature Jan 17, 2015

2nl wrote:

Use internal word list
Transit NXT uses an internal word list to assess the probability of the source and target segments being correctly matched.
The alignments are saved in the file align.adc under config\global in your Transit NXT installation folder.
If Transit NXT finds that the source-language segment contains an entry from the internal word list, it searches for the translation of the term in the target-language segment.

Use project dictionaries
Transit NXT uses the current TermStar dictionary to assess the probability of the source and target segments being correctly matched.
If Transit NXT finds that the source-language segment contains a term that has been added to the current dictionary, it searches for the translation of the term in the target-language segment.

That's been a standard feature of many autoalignment algorithms for many, many years. It's kind of an obvious method to use, so it's certainly not a selling point for any one algorithm. In many cases it's a drawback.
Alignment history lesson coming up, skip if uninterested: many different efforts were made to get away from this dictionary-based method in order to be able to align texts in language pairs where no good dictionary is immediately available (for the text pair in question, in the right format, to the person running the alignment). Perhaps the most widely used one is the Gale-Church algorithm developed in 1993. It is based on segment length: longer segments tend to correspond to longer segments, and shorter ones to shorter ones. If you go through the whole text trying to equalize the segment lengths, things start to fall into place. Some algorithms try to find identical strings in the two texts to use as anchors (e.g. proper names), some run the texts through an MT engine or deploy other tricks. Quite a few use a combination of methods. Hunalign, which testing has shown to be one of the best algorithms, uses a combination of the dictionary method and the Gale & Church algorithm. (It can actually run Gale & Church to get a rough alignment, automatically extract a dictionary from the aligned texts and then do a second alignment run with the freshly made dictionary.) LF Aligner uses hunalign as its alignment engine and comes with dictionaries for a wide range of language pairs. You can also add your own dictionary.
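
To make the length-based idea concrete, here is a toy sketch in Python. It is not the published Gale & Church model (which uses a probabilistic cost) and it is not what hunalign does internally; the cost function and the penalties for non-1:1 pairings are simplified, made-up values, just to show how dynamic programming over segment lengths lets things "fall into place".

```python
from typing import List, Optional, Tuple

# Allowed pairings: (source segments used, target segments used, penalty).
# The penalties are illustrative guesses, not the published Gale & Church priors.
BEADS = [(1, 1, 0.0), (1, 0, 3.0), (0, 1, 3.0), (2, 1, 1.0), (1, 2, 1.0)]


def length_cost(src_len: int, tgt_len: int) -> float:
    """Simplified mismatch score: relative difference in character length."""
    if src_len == 0 and tgt_len == 0:
        return 0.0
    return abs(src_len - tgt_len) / max(src_len, tgt_len)


def align_by_length(src: List[str], tgt: List[str]) -> List[Tuple[List[str], List[str]]]:
    """Find the cheapest sequence of pairings using only segment lengths."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back: List[List[Optional[Tuple[int, int]]]] = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj, penalty in BEADS:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                c = cost[i][j] + penalty + length_cost(s_len, t_len)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)
    # Walk back from the end to recover the chosen pairings.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    pairs.reverse()
    return pairs


en = ["Hello.", "This is a much longer sentence about alignment.", "Bye."]
nl = ["Hallo.", "Dit is een veel langere zin", "over uitlijning.", "Dag."]
for source_part, target_part in align_by_length(en, nl):
    print(source_part, "<->", target_part)
```

In this example the long English sentence gets paired with the two Dutch fragments because merging them gives the closest length match. Real aligners refine this with dictionaries, anchors and so on, as described above.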


2nl wrote:
Resource files mode (with comparison of markup segments)
Transit NXT compares markup segments during alignment, instead of text segments.
Use this option when aligning files with string IDs, perhaps for localisation projects.

That's a neat trick, which is also employed by multiple other aligners. LF Aligner doesn't do this (I tried to integrate an open source alignment engine that does this but couldn't get the alignment engine to work and abandoned the idea). In the case of XML files or similar, it could be very useful if it's implemented well. Its usefulness is limited to specific file types, though. E.g. it might 'work' with HTML files in that it might correctly pair up paragraphs... but most autoaligners will do that anyway. The really important bit is correctly pairing up sentences, and HTML markup probably won't help you do that at all.
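
For resource files with string IDs, the pairing itself is almost trivial once the files are parsed. A hedged sketch, assuming each file has already been read into a dict of string ID to text (the parsing step, the function name and the sample IDs are hypothetical, and this is not how Transit NXT or any specific aligner implements it):

```python
from typing import Dict, List, Tuple


def pair_by_string_id(
    source: Dict[str, str], target: Dict[str, str]
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Pair up texts that share a string ID; report IDs present in only one file."""
    pairs = [(source[key], target[key]) for key in source if key in target]
    unmatched = sorted(set(source) ^ set(target))
    return pairs, unmatched


en = {"btn.ok": "OK", "btn.cancel": "Cancel", "msg.saved": "File saved."}
hu = {"btn.ok": "OK", "btn.cancel": "Mégse", "msg.deleted": "Fájl törölve."}
pairs, unmatched = pair_by_string_id(en, hu)
print(pairs)
print(unmatched)
```

This only gets you string-level pairs; if a string contains several sentences, you still need a sentence-level aligner on top.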


 
Victor Lage de Araujo MD IFCAP MSc
Brazil
Local time: 16:42
Member (2018)
English to Portuguese
+ ...
ABBYY Aligner Jan 23, 2018

Samuel Murray wrote:
Okay, I tested it. It is a "smart" aligner, which is a good thing, but it misses a crucial function: the ability to insert blank cells. It can move cells up and down only if there is a blank cell above or below that cell, but if there is no blank cell, and there is a misalignment, then you can't fix it using the keyboard shortcuts, but must fix it by manually copy/pasting text from one cell into another. That is *bad*.

There is a batch function (not tested) but I just dragged and dropped two files into it (EN and AF). It doesn't support AF, but recognised the AF file as NL, which is good enough. It only merges cells if you select them, and unfortunately the shortcut for merging is in an odd position, but it's not the end of the world. There is no simple shortcut for moving between whole cells, but the down and up arrows move between cells easily. It does not seem possible to delete a cell without deleting the entire row. Ctrl+Z works! PgDn and PgUp move through the file (also a good thing).

The screenshot in the product brief showed that the program will mark possible misalignments in colour, but it didn't do it for me.

[Edited at 2015-01-15 15:45 GMT]

I liked ABBYY Aligner; it is now at version 2. It is simple and straightforward for TMX maintenance.

Just select any segment and split it to get an extra source-target text line. Alternatively, split the last line, copy and paste any text (source and target), and have ABBYY align from that text (it will automatically generate segments).

What I like in ABBYY Aligner 2 is that it allows me to view the whole text, have full control of alignment, manually correct both source and target, then keep it as an ABBYY project (custom ABBYY format) or else export either to bilingual RTF or to TMX.

The registered version allows for bigger files, but even it has some limitations on file size (check the help file for that; I do not have that data at hand, sorry). You can always split your work into "work1", "work2" etc. if it gets too big.


 
Stepan Konev  Identity Verified
Russian Federation
Local time: 22:42
English to Russian
Use 'Split segment' command Jan 24, 2018

victorlage wrote:
it misses a crucial function: the ability to insert blank cells.


You can run the Split Segment command (Ctrl+Enter) at any point in any segment. If you put the cursor at the end of a sentence, this will simply insert an empty pair of cells below.


There is no simple shortcut for moving between whole cells


Alt+arrows?

P.S. @victorlage,
Sorry... I did not notice that it was a quote from another user.

[Edited at 2018-01-24 06:09 GMT]


 
Susan Welsh  Identity Verified
United States
Local time: 15:42
Russian to English
+ ...
ABBYY Aligner Jan 24, 2018

I have it, but found that it requires a lot of manual adjustment. On the rare occasions when I need to align something, I use LF Aligner. But I never do huge jobs such as yours, so my experience is not that relevant.

 

