[PATCH] Discuss character set restrictions in book

Charles Bailey bailey.charles at gmail.com
Mon Apr 4 08:30:19 CDT 2005

The patch below has sat without comment on the main dev list for about
a month, so I think I haven't made any egregious technical errors. 
Therefore, I'd like to submit it to the book developers for

Charles Bailey
Lists: bailey _dot_ charles _at_ gmail _dot_ com
Other: bailey _at_ newman _dot_ upenn _dot_ edu

On Wed, 23 Feb 2005 16:32:24 -0500, Charles Bailey
<bailey.charles at gmail.com> wrote:
> Attached is a patch to Book Chapter 3 which expands one of the
> sidebars to discuss how Subversion handles text and path names, and
> therefore what is(n't) allowed in these types of input.  I've waffled
> a fair amount over where to put this -- on the one hand, I thought
> it'd be good to have it in a common enough place that new users (like
> me) would see it before finding problems the hard way; on the other,
> it can get to be a fairly picky and/or fragmented topic.  I tried to
> find a middle ground.  The location early in the book's "detailed"
> description of Subversion seemed reasonable, rather than placing it in
> the introductory section, where the detail is less fine, or the developer
> reference, where users might not think to look.  I didn't think any of
> the issues merited a full section on their own, so I tried to combine
> (conflate?) them as smoothly as I could into a sidebar.  The revised
> version does bury the note about "/trunk" a bit, but it gets detailed
> discussion in chapter 4, and the original sidebar was really just a
> pointer to that discussion.

Explain character set restrictions for text and path names.
* docs/book/book/ch03.xml:
  Expand sidebar to discuss character encoding, and hence restrictions
  on legal characters, for text and path names.

Index: book/ch03.xml
--- book/ch03.xml       (revision 13123)
+++ book/ch03.xml       (working copy)
@@ -312,13 +312,55 @@

-      <title>Repository Layout</title>
+      <title>What's in a name?</title>
+      <para>Subversion tries hard not to limit the type of data you
+      can place under version control.  The contents of files and
+      property values are stored and transmitted as binary data, and
+      the <xref linkend="svn-ch-7-sect-2.3.2"/> tells you how
+      to give Subversion a hint that <quote>textual</quote> operations
+      don't make sense for a particular file.  There are a few places,
+      however, where Subversion places restrictions on information it
+      stores.</para>

-      <para>If you're wondering what <literal>trunk</literal> is all
-        about in the above URL, it's part of the way we recommend
-        you lay out your Subversion repository which we'll talk a lot
-        more about in <xref linkend="svn-ch-4"/>.</para>
+      <para>Subversion handles text internally as UTF-8 encoded
+      Unicode.  As a result, certain items which are
+      inherently <quote>textual</quote>, such as property names, path
+      names, and log messages, can only contain legal UTF-8
+      characters.  This also sets a minimum requirement for use of the
+      <literal>svn:mime-type</literal> property: if a file's contents
+      aren't compatible with UTF-8, you should mark it as a binary
+      file.  Otherwise, Subversion will attempt to merge differences
+      using UTF-8, which is likely to leave garbage in the
+      file.</para>

+      <para>In addition, path names are used as XML attribute values
+      in WebDAV exchanges, as well as in some of Subversion's
+      housekeeping files.  This means that path names can only contain
+      legal XML (1.0) characters.  Subversion also prohibits
+      TAB, CR, and LF in path names, so they aren't broken up
+      in diffs, or in the output of commands like
+      <xref linkend="svn-ch-9-sect-1.2-re-log"/> or
+      <xref linkend="svn-ch-9-sect-1.2-re-status"/>.</para>
+      <para>While it may seem like a lot to remember, in practice
+      these limitations are rarely a problem.  As long as your
+      locale settings are compatible with UTF-8, and you don't use
+      control characters in path names, you should have no trouble
+      communicating with Subversion.  The command line client adds an
+      extra bit of help: it will automatically escape illegal
+      characters as needed in URLs you type to create <quote>legally
+      correct</quote> versions for internal use.</para>
+      <para>Experienced users of Subversion have also developed a set
+      of <quote>best practice</quote> conventions for laying out paths
+      in the repository.  While these aren't strict requirements like
+      the syntax described above, they help to organize frequently
+      performed tasks.  The <literal>/trunk</literal> part of the URL
+      above is one of these conventions; we'll talk a lot more about
+      it and related recommendations in <xref
+      linkend="svn-ch-4"/>.</para>

     <para>Although the above example checks out the trunk directory,

## End of patch ##
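
For anyone who wants to try the rules from the sidebar against a
candidate path name, here is a rough sketch in Python.  It is not
taken from Subversion's source and may not match what the libraries
actually do; the function names path_problems() and escape_for_url()
are made up for illustration.  It only encodes the three checks the
sidebar describes (valid UTF-8, legal XML 1.0 characters, no TAB, CR,
or LF), and it uses the generic percent-encoding from Python's
standard library as a stand-in for the client's URL escaping.

```python
import re
from urllib.parse import quote

# Characters outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML10_ILLEGAL = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

# Legal in XML 1.0, but the sidebar says Subversion rejects them in paths.
_FORBIDDEN_WHITESPACE = ("\t", "\r", "\n")


def path_problems(raw):
    """Return a list of reasons the bytes in `raw` break the sidebar's rules.

    Hypothetical helper: an empty list means no problem was found.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Nothing else can be checked until the bytes decode cleanly.
        return ["not valid UTF-8"]

    problems = []
    if any(ch in text for ch in _FORBIDDEN_WHITESPACE):
        problems.append("contains TAB, CR, or LF")
    if _XML10_ILLEGAL.search(text):
        problems.append("contains a character not legal in XML 1.0")
    return problems


def escape_for_url(path):
    """Percent-encode a path for use in a URL, leaving '/' alone.

    Assumption: generic percent-encoding of the UTF-8 bytes, which may
    differ in detail from what the svn client does.
    """
    return quote(path, safe="/")


if __name__ == "__main__":
    for candidate in (b"trunk/README", b"bad\tname", b"a\x01b", b"\xff\xfe"):
        print(repr(candidate), path_problems(candidate))
    print(escape_for_url("trunk/my file.txt"))
```

The checks are kept separate so that a path failing more than one rule
reports each reason; invalid UTF-8 is the exception, since the other
two checks need decoded text to work on.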
