Transit NXT translation memories w/ different number of source/target segments
Thread poster: Gary Hess
Gary Hess  Identity Verified
German to English
Mar 16, 2023

I am trying to create a custom QA tool for my own use and have been analyzing some .DEU and .ENG files. Sometimes the segment counts of the source and target files do not match, e.g. the source file has 30 segments and the target file has 31. I assume that another translator split one segment into two during translation, which would explain the discrepancy.

I have a technical question: How does Transit NXT know which segments belong to one another? I have looked at the XML quite a bit, but I can't figure it out yet.

BTW: I loaded a pair of mismatched .DEU and .ENG files into Xbench, but Xbench doesn't align the mismatched segments correctly either.

Thanks,
Gary


 
wotswot  Identity Verified
France
Member (2011)
French to English
Misaligned language pairs Mar 16, 2023

What I do is open the two files in two separate windows of a powerful text editor (like Notepad++), place them side by side, then find and delete the offending segment.
Segment lines begin with <Seg SegID=n> (where n is a number) and end with </Seg>.


 
wotswot  Identity Verified
France
Member (2011)
French to English
Follow-up to my previous message Mar 16, 2023

Segment lines begin with Seg SegID=n (where n is the segment's number) and end with /Seg

[Edited at 2023-03-16 16:19 GMT]
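
For anyone who wants to script this rather than eyeball it in an editor, here is a minimal sketch that pulls the segment numbers out of a file's text. It assumes each segment opens with a tag of the form `<Seg SegID="n" ...>`, which is a guess based on the description above; the exact attribute quoting and the file encoding are not confirmed in this thread, and `segment_ids` is just an illustrative name.

```python
import re

# Assumption: a segment opens with <Seg ... SegID="n" ...>; quotes are optional
# here because the exact quoting used by Transit is not confirmed in the thread.
SEG_OPEN = re.compile(r'<Seg\b[^>]*?\bSegID\s*=\s*"?(\d+)"?', re.IGNORECASE)


def segment_ids(text):
    """Return the SegID numbers in the order they appear in the text."""
    return [int(m.group(1)) for m in SEG_OPEN.finditer(text)]


# Hypothetical sample content, not taken from a real Transit file.
sample = '<Seg SegID="1">Hallo</Seg>\n<Seg SegID="2">Welt</Seg>\n'
print(segment_ids(sample))
```

Running this on the text of the .DEU file and of the .ENG file gives two lists whose lengths can be compared directly.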


 
Gerald Dennett  Identity Verified
United Kingdom
German to English
Re-align Mar 16, 2023

You need to perform an alignment on the offending pair of files. Otherwise the pair will be ignored in any TM.
Gerald


 
Gary Hess  Identity Verified
German to English
TOPIC STARTER
How to do this automatically? Mar 16, 2023

I should have said that I want to write a program to recognize and interpret the mismatch automatically. I can edit the file manually, but there must be something inside the XML that points to the correct alignment. I want to figure out how Transit NXT manages this misalignment.
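
A rough sketch of what such an automatic check could look like, under one big assumption that nobody in this thread has confirmed: that a source segment and its translation carry the same SegID in both files. If that holds, comparing the two ID lists shows which segment has no counterpart. The function name `compare_ids` and the sample ID lists are made up for illustration.

```python
def compare_ids(source_ids, target_ids):
    """Report SegIDs that occur in only one of the two files.

    Assumption: paired segments share the same SegID in the source
    and target files (not confirmed for Transit NXT).
    """
    src, tgt = set(source_ids), set(target_ids)
    return {
        "only_in_source": sorted(src - tgt),
        "only_in_target": sorted(tgt - src),
        "count_mismatch": len(source_ids) != len(target_ids),
    }


# Hypothetical ID lists: the target file has one extra segment.
report = compare_ids([1, 2, 3], [1, 2, 3, 4])
print(report)
# {'only_in_source': [], 'only_in_target': [4], 'count_mismatch': True}
```

If the IDs turn out not to be shared across the two files, this check only tells you that the counts differ, not where.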

 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Reverse engineering Mar 17, 2023

Gary Hess wrote:

I should have said that I want to write a program to recognize and interpret the mismatch automatically. I can edit the file manually, but there must be something inside the XML that points to the correct alignment. I want to figure out how Transit NXT manages this misalignment.


Have you already tried creating a project with a single segment and splitting that segment, to see what happens in the XML? Silly question perhaps, since you seem to know how to write code...


 
Gary Hess  Identity Verified
German to English
TOPIC STARTER
Maybe it's really an error... Mar 17, 2023

I tried your idea on a project (joining and splitting some segments to look at the results). The number of segments is actually never mismatched after these steps. So maybe the files in question do indeed have an error.

Thanks for all the suggestions!


 

