Is anyone else having problems with MemoQ's handling of html or xml?

Forum: MemoQ support
Topic: Is anyone else having problems with MemoQ's handling of html or xml?
Poster: Thomas T. Frost

I used MemoQ 2015 to translate a page on my own website, and I have other pages to translate. However, after exporting the first translation, I noticed a number of problems with MQ's handling of html:

1. Code formatting destroyed

Html and xml code have two levels of formatting:
1. The formatting rendered to the end user by (typically) a browser.
2. The formatting of the code itself, made by the programmer so that he or she can keep an overview of the code. Line breaks and indentation are used for this.

Html and xml source code is written and maintained line by line. Editors typically display only the first 80 characters or so of a line, although some have 'wrap line' options. Hence, many programmers try to keep their lines short enough to be seen on the screen.

MQ does not respect any of the level 2 formatting (that of the code itself). Here are some examples:

Example 1

Source:

Export:

Example 2

Source:

Export:

Example 3

Source:

Export:

The carefully organised formatting intended to make the code easy to maintain is destroyed.

Another problem appears when one translates the texts: MQ outputs a line as long as necessary to hold all the text within a given set of tags. Such a line cannot be displayed in full on the programmer's screen, and he or she has to insert line breaks manually to be able to read the text.
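The long-line problem is easy to check for after export. The following is a minimal sketch, assuming an 80-character limit; the function name `overlong_lines` and the two sample fragments are hypothetical and are not taken from the actual files.

```python
def overlong_lines(text, limit=80):
    """Return (line_number, length) pairs for lines longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


# Hypothetical fragments: a hand-formatted source and a one-line export.
source = "<p>\n  Short line one.\n  Short line two.\n</p>\n"
export = "<p>" + "Translated text. " * 10 + "</p>\n"

print(overlong_lines(source))  # lines in the source that exceed the limit
print(overlong_lines(export))  # lines in the export that exceed the limit
```

Running this on each source file and its export shows which lines would have to be re-wrapped by hand.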

The default html filter has the following option checked:
"Break segment at preserved newline characters: Check this check box to make memoQ start a new segment whenever it encounters a newline character in the HTML text, so that the newline character will be preserved in all cases."

But that's not how MQ behaves.

I reported this to Kilgray support on 25 July. So far, they haven't even admitted that it's a bug.

2. Html symbols not preserved

In example 3 above, one can see that the html symbol (character entity) for the copyright sign has been replaced with the literal character ©. That is indeed how it should be rendered, but keeping such characters as symbols in the code can prevent poor rendering by software at the other end that does not respect all standards. In any case, it is not the translator's or the CAT tool's job to decide to change the symbols. There is an option to export all characters that can be represented as symbols as symbols, but that also converts characters such as é into symbols, not just those characters that were originally symbols.
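The difference between the two behaviours can be shown with Python's standard `html` module. This is only an illustration of the effect described, not of how MQ works internally; the input string and the helper `encode_all` are hypothetical.

```python
import html
from html.entities import codepoint2name


def encode_all(text):
    """Turn every non-ASCII character that has a named entity into that entity."""
    return "".join(
        f"&{codepoint2name[ord(ch)]};"
        if ord(ch) > 127 and ord(ch) in codepoint2name
        else ch
        for ch in text
    )


# Hypothetical source text: only the copyright sign is written as a symbol.
original = "&copy; 2015 Café"

decoded = html.unescape(original)  # every symbol becomes a literal character
blanket = encode_all(decoded)      # every eligible character becomes a symbol

print(decoded)
print(blanket)
print(blanket == original)  # False: neither behaviour reproduces the source
```

Neither output matches the source, because the information about which characters were originally written as symbols is lost once they have been decoded.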

Kilgray has not acknowledged that this is undesirable either.

3. Code page changed

I noticed that even though the source had a code page declaration, MQ changed it to UTF-8 in the export.

Their default filter does not have the following option checked:

"Use this codepage even if there is a different declaration in the file: Check this check box to enforce the import codepage selection. Use this when you suspect that the encoding declaration in the HTML file is incorrect or inconsistent. This check box is not checked by default."

So this looks like another bug, but they haven't admitted that yet.

UTF-8 may well be a better choice; it's just not the CAT tool's role to decide that.
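Whether an export has kept the declared code page can be checked with a short script. This is a rough sketch that assumes the declaration sits in a `meta` tag; the function names and the sample bytes are hypothetical, and a real check might need to look at xml declarations as well.

```python
import re

# Matches charset=... inside a meta tag, in either the old or the short form.
_CHARSET_RE = re.compile(
    rb"""<meta[^>]*charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)


def declared_charset(raw):
    """Return the charset declared in a meta tag (lower case), or None."""
    match = _CHARSET_RE.search(raw)
    return match.group(1).decode("ascii").lower() if match else None


def same_declared_charset(source_raw, export_raw):
    """True if source and export declare the same charset."""
    return declared_charset(source_raw) == declared_charset(export_raw)


# Hypothetical file contents, read as bytes.
source_raw = b'<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'
export_raw = b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'

print(declared_charset(source_raw), declared_charset(export_raw))
print(same_declared_charset(source_raw, export_raw))
```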

4. Tag syntax changed

By default, MQ exports old-style html empty tags as XML-style empty tags. That is correct syntax for xml but not html.

However, unchecking the option

"Enforce empty tags: Check this check box to treat old-style tags as empty tags. Normally, these would be imported as opening tags, but with this setting memoQ will import them as XML-style empty tags in all cases – so you won't get rogue XML warnings when confirming segments in the document."

fixes this even though the explanation is totally cryptic and incomprehensible.
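For files that have already been exported with the option checked, the change can be reversed with a small script. This is a sketch only: it assumes the change in question is from old-style tags such as `<br>` to XML-style tags such as `<br />`, which the option text suggests. The tag list and the function name `to_html_style` are my own choices.

```python
import re

# Elements that never have a closing tag in html (a partial, assumed list).
VOID_TAGS = ("area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param")

_SELF_CLOSING = re.compile(
    r"<(" + "|".join(VOID_TAGS) + r")\b([^<>]*?)\s*/>", re.IGNORECASE
)


def to_html_style(markup):
    """Rewrite XML-style empty tags (<br />) as old-style html tags (<br>)."""
    return _SELF_CLOSING.sub(r"<\1\2>", markup)


# Hypothetical exported fragment.
exported = 'Line one<br />Line two<img src="a.png" alt="" />'
print(to_html_style(exported))
```

Tags that are not in the list are left untouched, so this should not be run on real xml files.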

Am I the only one bothered by this, and does any other CAT tool do these things properly?

Fortunately, this happened on my own website, but I would have been very unhappy as a paying client to receive the mess MQ created in return, and I would require the translator to clean it up and re-establish the original formatting. Depending on the number of files, this clean-up task could take several hours. It took me 1-2 hours to clean up just one page on my own site.

What this means is that MQ is useless for html and xml, as I could not return such a shambles to a client.

Has anyone else had such problems?

How do other CAT tools handle html?

[Edited at 2015-08-10 13:47 GMT]
