Ben's review of chapter 5

Ben Collins-Sussman sussman at red-bean.com
Sun Feb 25 21:47:25 CST 2007


Wow, this took a long time to read.  :-)


* Broad, sweeping suggestions

  * Looking at the chapter's TOC, the "Administrator's Toolkit"
    section lists svnlook, svnadmin, svndumpfilter.  Some are
    presented in detail, some not.  I think svnsync needs to be added
    to the list -- at least have its own section, a few sentences
    explaining what it is, and then an xref to the 'replication'
    section.  svnsync is definitely a "tool" in the admin's toolkit,
    right?  At least, if svndumpfilter is, then svnsync should be, no?
    Replicating repositories is just as much an admin's job as
    filtering their dumps.

  * Migrating Repository Data Elsewhere

      There are 2 pieces of info missing from this section, that I'd
      like to see:

         - remind users that while a dumpfile is mostly human-readable
           and appears to be plain text, it ain't.  Don't try to run
           text-based tools against it and expect things to work
           smoothly.

         - the 2nd paragraph in the section might sorta lead a newbie
           to think that moving from a 'previous' version of
           subversion to a newer version necessitates a dump and
           load.  Even though our release notes scream that this isn't
           the case, it would be nice for the book to reiterate the
           promise that a 1.X->1.Y upgrade never requires a
           dump/load.  I see newbies assuming this all the time.  :-(
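
      To make the first point concrete, here's a rough Python sketch.
      The record below is a simplified, made-up fragment (not a
      complete or real dumpfile), and make_record / declared_and_actual
      are hypothetical helper names.  It only shows that a dumpfile
      records content lengths in its headers, so a sed-style edit
      that changes the content's size leaves the header wrong:

```python
import re


def make_record(body):
    """Build a simplified, hypothetical node record with byte-count headers."""
    header = (
        b"Node-path: trunk/README\n"
        b"Node-action: add\n"
        b"Text-content-length: %d\n"
        b"Content-length: %d\n"
        b"\n" % (len(body), len(body))
    )
    return header + body


def declared_and_actual(record):
    """Return (length the header claims, length of the body actually present)."""
    headers, _, body = record.partition(b"\n\n")
    match = re.search(rb"^Text-content-length: (\d+)$", headers, re.M)
    return int(match.group(1)), len(body)


record = make_record(b"colour\n")

# Roughly what a naive text substitution over the file would do:
# the body shrinks by one byte, but the header still claims the old size.
edited = record.replace(b"colour", b"color")

print(declared_and_actual(record))
print(declared_and_actual(edited))
```

      The untouched record agrees with its own header; the edited one
      doesn't, and that's the kind of mismatch a later 'svnadmin load'
      is likely to choke on.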

  * Repository Backup

      There's a big piece of info I'd like to see in this section
      too.  For years I've seen users ask whether to use 'svnadmin
      dump' or 'svnadmin hotcopy' to make a full backup, and not
      understand the difference between them, or the main tradeoff
      (speed of execution vs. portability).  Can we have a sidebar on
      this or something?  It's a really common FAQ that needs a
      best-practice recommendation.  (I know that this section doesn't
      even mention the possibility of 'svnadmin dump' for full
      backups, but somehow users still get this idea anyway, as if it
      were the standard way to do things!)
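
      For whatever sidebar we end up with, the two approaches could
      be sketched like this in Python.  The function names and the
      'portable' switch are placeholders I made up; the only real
      pieces are 'svnadmin hotcopy REPOS NEW_REPOS' and 'svnadmin
      dump REPOS', which writes the dumpfile to stdout.  This is an
      illustration of the tradeoff, not a recommended backup script:

```python
import subprocess


def hotcopy_argv(repos, backup_dir):
    """Command line for a hot copy: a ready-to-use copy of the repository."""
    return ["svnadmin", "hotcopy", repos, backup_dir]


def dump_argv(repos):
    """Command line for a full dump: a portable stream written to stdout."""
    return ["svnadmin", "dump", repos]


def run_backup(repos, target, portable=False):
    """Run one of the two backups (requires svnadmin to be installed).

    portable=False: hot copy into the directory `target`.
    portable=True:  dump into the file `target`.
    """
    if portable:
        # The dump goes to stdout, so redirect it into the target file.
        with open(target, "wb") as out:
            subprocess.run(dump_argv(repos), stdout=out, check=True)
    else:
        subprocess.run(hotcopy_argv(repos, target), check=True)
```

      The hot copy is the speed side of the tradeoff, and the dump is
      the portability side.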



* Nitpicky suggestions

  * The Subversion Repository, Defined:

      - The description of the 'dav' directory seems odd to me;  it
        sounds like Apache and mod_dav_svn are separate things that
        need to store housekeeping data.  I suggest just simplifying
        it to either Apache *or* mod_dav_svn.

  * Planning your Repository Organization

     - "Some folks don't like the fact that even though no changes
       have been made to their project lately, the youngest revision
       number for the repository keeps climbing because other projects
       are actively adding new revisions."

       This issue is a personal pet-peeve crusade of mine.  The current
       wording sounds like we're agreeing that "yes, it's potentially
       a problem that HEAD changes even when you're not committing",
       and we don't really believe that, do we?  If we're going to
       bring up this issue at all, maybe we can do it in a way that
       makes it clear that while some people may fear this behavior,
       it's actually a *non*-problem, and not a valid reason to choose
       one vs. many repositories?


     - "tags, which is a directory of branches that are created, and
       perhaps destroyed, but never changed"

       While that's a description of subversion's particular
       implementation of tags, it's not a description of what a tag is
       in general.  In the same sentence, you define branches somewhat
       abstractly, so we should define tags abstractly as well --
       "snapshots of trees" or something.

     - "repository browsing UIs"  -- UI is a bit jargony, maybe just
       "interface"?

  * Repository Data Store Comparison Table

     This is a great table!  But I have a suggestion to make it easier
     to parse.  I notice that the 'feature' column really has two
     halves -- a general category, and a specific sub-feature.  Can we
     split this column hierarchically somehow?  For example, when
     reading the table, I'd like to see "reliability" only once, and
     then "data integrity" and "sensitivity to interruptions" as
     bullets underneath reliability.   Does docbook even allow this?

     Also, yeah, after years of testing, I think it's been proven over
     and over that BDB and FSFS read operations are nearly identical
     in speed.  I'd make the 'performance: checkouts' row say that
     they're effectively equal.

       - "performs any necessary recover" -- err, 'recovery'?

  * Implementing Repository Hooks

     - "run in advance of a high-level repository operation"... I
       think 'high-level' can be struck without losing any meaning.
       It's kind of a meaningless phrase.

  * Berkeley DB Recovery

      "4. Restart the Subversion server" sounds like you might be
      suggesting that they reboot the computer.  Maybe say "restart
      the server process".


  * Repository Replication

       "The svnsync program, which is new to the 1.4.0 release of
       Subversion..."  .... how about just '1.4 release'.  We don't
       use the patch number in any other mentions of svn versions
       elsewhere in the book.

       "All you need is read access to the source repository; commit
       access and revision property modification access to the
       destination repository."  --> Can you either make this two
       separate sentences, or put 'and' between the halves?  The stuff
       after the semicolon doesn't even have a verb right now.

       This whole new section kicks butt!  Nice job!

  * Repository Backup

      "If you are regularly syncronizing a read-only mirror" --> typo,
      left out the 'h' in 'synchronizing'.



