replacing '--' to facilitate translation (was Re: translation HowTo - is there such an animal)
Øyvind A. Holm
sunny at sunbase.org
Tue Jul 19 11:48:25 CDT 2005
On 2005-07-18 09:41:30 Lorenz wrote:
> Lorenz wrote:
> > in src/nb/METHOD you write about working around problems with '--'
> > within xml comments.
>
> I still can't find any mentioning of this problem in any xml source I
> could find, but libxslt complains about it for sure.
Yes, xmllint(1) too. I suppose it’s because it’s legal to include "<"
and ">" in the comments, so a bad combination of those would terminate
the comment and mess up the XML.
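For what it's worth, the XML 1.0 recommendation (section 2.5, Comments) states that the string "--" must not occur within comments, which would explain the complaints from both tools. A minimal way to reproduce the rejection, sketched here in Python with the standard library's expat-based parser rather than libxslt or xmllint (the helper name is_well_formed is made up for this illustration):

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A single hyphen inside a comment is fine.
print(is_well_formed("<para><!-- svn log -v --></para>"))         # True
# "<" and ">" inside a comment are accepted as well.
print(is_well_formed("<para><!-- a < b > c --></para>"))          # True
# A double hyphen inside the comment body is rejected.
print(is_well_formed("<para><!-- svn log --verbose --></para>"))  # False
```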
> So we need a solution, though I would like to eliminate the script
> solution (replacing/reinstating '--').
>
> I thought about replacing '--' by '&ndash;&ndash;', and convince the
> authors of the English version to do so too.
Heh… In fact, that has been suggested already. :)
http://red-bean.com/pipermail/svnbook-dev/2005-June/000755.html
It could maybe work, but there are a couple of drawbacks with this
method. Firstly, it would burden the main authors, who would have to
remember to write those entities every time, and it would clutter the
original English text, for example things like line 345 in
src/en/book/ch04.xml.
And if the English svnbook poets choose to comment out something in the
middle of the text, those <!-- --> dashes would also have to be
“entitised” this way. Either the comment would have to be removed, or
the escaping would have to be done in the translated files only, leaving
the comment in the English files intact. This could involve some manual
work, or maybe result in still having the script solution around. Of
course, this is only a problem if the comment is inside a paragraph;
otherwise you can avoid commenting out their comment.
Anyway, it would need to be something other than &ndash;, as that is the
same as the "–" (U+2013) character, which would lead to syntax errors.
A special double-dash entity could be defined in the book, though.
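A rough sketch of what such an entity could look like, checked with Python's standard parser. The entity name "dd" and the internal DTD subset are assumptions made for this example only; the book itself would presumably declare the entity wherever its other entities live. The point is that the reference expands to "--" in running text, while inside a comment it stays as the literal string "&dd;", which contains no double hyphen:

```python
import xml.etree.ElementTree as ET

# "dd" is a hypothetical entity name, declared here in an internal DTD subset.
DOC = """<!DOCTYPE para [
  <!ENTITY dd "--">
]>
<para>svn log &dd;verbose <!-- svn log &dd;verbose --></para>"""

# Parsing succeeds because the comment body no longer contains "--".
para = ET.fromstring(DOC)

# In the running text the entity has been expanded by the parser.
print(para.text.strip())
```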
> That would not eliminate the merging/compiling problems at once, at
> least not for translations already running.
>
> Already running translations would have to catch up with this change
> in the English version before they could eliminate the
> replacing/reinstating steps.
> New translations could start with the according revision of the
> English text, or handle the change on the first update/merge.
>
> What do you think?
Personally I don’t think this double-dash thing is any annoyance at all,
because the conversion is fully automatic. Both operations, "make
commitmode" and "make editmode", take 1.5 seconds each to run, and then
the files are ready. Because the "ﳢ" (U+FCE2) character is unique, the
Makefile can do whatever it needs to with those characters. For example,
the "make sync" operation first automatically removes the characters
before the merge, making the English commented-out blocks identical to
the English files to avoid conflicts in those lines. After the merge is
done, the characters are automatically put back into place, making the
XML files valid again. The biggest problem with merging is if conflicts
occur: r1362 and r1465, say no more. :) But revolutions like that are
pretty rare and most conflicts are pretty easy to solve.
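To make the round trip concrete, here is a minimal sketch in Python. It is not the actual Makefile or script from src/nb; the function names, and the choice to put the marker between the two hyphens, are assumptions made for illustration. It also assumes comments are not nested and that a comment body does not end in a hyphen:

```python
import re

MARKER = "\ufce2"  # the unique character mentioned above (U+FCE2)

# Matches one whole comment; group 1 is the text between "<!--" and "-->".
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def protect_comments(xml_text: str) -> str:
    """Break up every "--" inside comment bodies with MARKER (hypothetical helper)."""
    def fix(match: re.Match) -> str:
        # Insert MARKER after each hyphen that is directly followed by another
        # hyphen, so runs such as "---" are handled too.
        body = re.sub(r"-(?=-)", "-" + MARKER, match.group(1))
        return "<!--" + body + "-->"
    return COMMENT_RE.sub(fix, xml_text)


def restore_comments(xml_text: str) -> str:
    """Remove every MARKER again, giving back the original text."""
    return xml_text.replace(MARKER, "")


sample = "<para>svn log --verbose <!-- svn log --verbose --></para>"
protected = protect_comments(sample)
print(protected)                              # only the comment body is changed
print(restore_comments(protected) == sample)  # True
```

Because the marker never occurs in the real text, the restore step can be a plain global removal, which is what makes stripping the characters before a merge and reinserting them afterwards safe.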
So in the end I believe replacing all the "--" in the English text would
lead to more work than just leaving them in place, as all the 2511
occurrences of double dashes would first have to be replaced, and then
the authors and translators would have to stick to these entities every
time a "--" comes around.
Cheers,
Øyvind A. Holm
--
#!/bin/bash
for f in 1 2 3; do
PREF=http://musthave.sunbase.org/Stallman/stallman${f}c
wget $PREF.sub ; mplayer -cache 8192 -sub stallman${f}c.sub $PREF.mpeg
done