Identifying/highlighting internal repetitions in a Word document
Thread poster: RobinB

RobinB United States Local time: 10:37 German to English
Hi, CAT tools (e.g. Trados) generally have an analysis function that indicates how many segments in a Word document repeat internally (absolute and percentage segment repetitions). However, all you get here is an indication that segments repeat, not *where* they repeat. We haven't been able to find a way to indicate where segments repeat in a large document, e.g. by highlighting or similar, either in CAT tools or other utilities such as DeltaView. Does anybody have an idea how to solve this problem? Are there any add-ons or utilities out there that can mark repeated segments or sentences in a document? TIA, Robin

Endre Both Germany Local time: 17:37 English to German DVX allows for highlighting and/or text colouring in Word | Sep 1, 2006 |
With Atril's DéjàVu X, you can export a document with colour indication (background highlight or font colour) of different segment types: duplicates, 100% matches, fuzzy matches, etc. I think this is limited to Word or RTF documents, though. Endre

Maybe with the Word function "replace"? | Sep 1, 2006 |
Hi Robin, I came across this possibility myself quite recently. You type the word, sentence, segment or whatever you are looking for, and then under "Replace with" (Extras->Suchen->Ersetzen->Ersetzen durch in the German version of Word) you repeat the same word or sentence and add, say, an asterisk. Then, by searching for the asterisk, you find the repetitions themselves. But you will still have to highlight them yourself...

At least within DVX they are marked | Sep 1, 2006 |
When you have imported the job into DVX and do the word count (with "Count duplicate rows" activated), all of the duplicated rows are marked by a colour bar on the left. (I think the default colour is grey, but I have changed my colour to magenta red). I know that it is possible to select just the duplicate rows in the DVX interface (so I can see instantly the 504 duplicate rows in a 682 row project I finished earlier today).

It would therefore be possible to show the duplicates in the "External view", which is a table export format showing the source and target text in 2 columns. I'm not sure off-hand whether they are colour-coded by default in this export format, but if not, there are tricks to do so (e.g. select just the duplicates, mark them all as "locked", and they are then colour coded as such). This is not, of course, a colour-coded version of the original file in the original layout, but it does show the duplicates in a format that can be examined in Word.

To handle the "External view" format comfortably, it is easiest if you splash out on the "Workgroup" edition, which will set you back about 2500 euros. It is normally possible to get just about the same result with the "Professional" edition (about 900 euros), but it sometimes needs a couple of extra steps to do so. Of course, you could go for the 30 day free demo (no functional limitations, you just have to get an activation code from Atril, which may take a couple of days if you catch them at a busy period).
RobinB United States Local time: 10:37 German to English TOPIC STARTER Word replace? | Sep 1, 2006 |
Hi Christel, Thanks for your suggestion, but it's predicated on your knowing what you're looking for. What I'm confronted with is a 180-page document that Trados tells me has around 10% internal repetitions. I don't know what's repeated, or where, and that's what I need to know... But thanks again. Robin

RobinB United States Local time: 10:37 German to English TOPIC STARTER
Endre, Thanks. So what you're saying is that DVX will highlight repetitions in a "virgin" document, i.e. one for which there are otherwise zero TM hits (because there are no 100% or fuzzy matches in any memory). Is that right? Robin

Endre Both Germany Local time: 17:37 English to German Virgin or (pre)translated documents | Sep 1, 2006 |
RobinB wrote:
So what you're saying is that DVX will highlight repetitions in a "virgin" document, i.e. one for which there are otherwise zero TM hits (because there are no 100% or fuzzy matches in any memory). Is that right?

That's possible, yes – and it is the easiest task of all; you can get DVX to export the text with colour information after pretranslation or translation as well. Depending on your goals, you may need to do some SQL tweaking. But just marking and exporting internal repetitions is very easy -- and by "exporting" I mean export of the source document with its original formatting, not DVX's "External View" feature (a two-column export without formatting), which Victor correctly pointed out as another alternative. Endre
[Edited at 2006-09-01 14:06]
RobinB wrote:
Hi Christel, Thanks for your suggestion, but it's predicated on your knowing what you're looking for. What I'm confronted with is a 180-page document that Trados tells me has around 10% internal repetitions. I don't know what's repeated, or where, and that's what I need to know... But thanks again. Robin

I understood you knew what the repetitions are but wanted to know where exactly they are...

Heinrich Pesch Finland Local time: 18:37 Member (2003) Finnish to German + ... Write a macro | Sep 1, 2006 |
In Word, a macro could search the document for strings taken from a text file (exported from a TM) and mark the strings it finds somehow. Some agencies have such tools: they send Word files in which known sentences are marked with strikethrough. But I have no idea what tool they use. Such a macro would be easy to write if one knew the technique. Regards Heinrich

Pre-translate with different color | Sep 1, 2006 |
If you use Trados you could do the following:
* Export all repetitions.
* Create a new empty TM and import the repetition file.
* Pre-translate a copy of the document with that TM and a specific color for the 100% matches.
This will give you a color overview of the repetitions. Hope this helps. Best regards, Cecilia
[Edited at 2006-09-01 21:07]

Lucica Abil (X) Romania Local time: 18:37 Italian to Romanian

RobinB United States Local time: 10:37 German to English TOPIC STARTER Trados workaround | Sep 1, 2006 |
Cecilia, Many thanks for the tip - it's a workaround, certainly, but it does seem to do the job reasonably effectively. I'll keep looking for a non-CAT solution, though (Word add-in or separate tool), preferably something that produces a report similar to a DeltaView file comparison. Thanks again, Robin

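For anyone reading along who wants to try the non-CAT route, here is a minimal Python sketch of the kind of report discussed above: it splits plain text into rough segments, finds the ones that occur more than once, and lists where each one occurs. Nobody in the thread proposed this; it is only an illustration. It assumes the document has already been saved as plain text, and its punctuation-based splitting is far cruder than the segmentation rules of Trados or DVX, so the counts will not match a CAT analysis exactly. All function names (`split_segments`, `normalize`, `find_repetitions`, `repetition_report`) are hypothetical.

```python
import re
from collections import defaultdict


def split_segments(text):
    """Split text into rough sentence-like segments (naive, punctuation-based)."""
    # Assumption: a segment ends at ., ! or ? followed by whitespace, or at a line break.
    parts = re.split(r"(?<=[.!?])\s+|\n+", text)
    return [p.strip() for p in parts if p.strip()]


def normalize(segment):
    """Collapse whitespace and ignore case so trivial variations still match."""
    return re.sub(r"\s+", " ", segment).strip().lower()


def find_repetitions(text):
    """Map each repeated (normalized) segment to the 1-based positions where it occurs."""
    positions = defaultdict(list)
    for index, segment in enumerate(split_segments(text), start=1):
        positions[normalize(segment)].append(index)
    # Keep only segments that occur at least twice.
    return {seg: pos for seg, pos in positions.items() if len(pos) > 1}


def repetition_report(text):
    """Return one report line per repeated segment, ordered by first occurrence."""
    repeated = find_repetitions(text)
    lines = []
    for seg, pos in sorted(repeated.items(), key=lambda item: item[1][0]):
        where = ", ".join(str(p) for p in pos)
        lines.append(f"{len(pos)}x at segments {where}: {seg}")
    return lines


if __name__ == "__main__":
    # Made-up sample text, purely for illustration.
    sample = (
        "The Company prepares its accounts annually. "
        "Revenue is recognised on delivery. "
        "The Company prepares its accounts annually.\n"
        "Costs are expensed as incurred. "
        "Revenue is recognised on delivery."
    )
    for line in repetition_report(sample):
        print(line)
```

The positions are segment numbers, not page or paragraph numbers, so mapping them back onto the Word layout (or highlighting them there) would still need a macro or another tool.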
RobinB United States Local time: 10:37 German to English TOPIC STARTER
Victor, Endre, Thanks for the info on DVX - looks interesting, but I think an excessive investment of time (and subsequently money) for what I think should (at least in theory) be a relatively simple routine. I'll keep looking for a Word add-in or other tool that will do the job. Robin

RobinB United States Local time: 10:37 German to English TOPIC STARTER
Heinrich, Thanks for the suggestion, though I'm not sure that Word macros are *ever* easy. And the problem with the file I'm dealing with at the moment is that it is entirely virgin as far as TM is concerned, i.e. there *is* no TM. Robin

RobinB United States Local time: 10:37 German to English TOPIC STARTER
Lucia, I've tried out TextSTAT in the past, and it's certainly not useful for what I need. As a statistical analysis tool, I'd classify it as a toy system, as it doesn't even come anywhere close to what much older systems (such as System Quirk from the mid-1990s) have to offer in terms of functionality and granularity of analysis. But thanks anyway. Robin