[svnbook commit] r2686 - trunk/src/en/book

cmpilato noreply at red-bean.com
Fri Feb 16 03:35:04 CST 2007


Author: cmpilato
Date: Fri Feb 16 03:35:03 2007
New Revision: 2686

Modified:
   trunk/src/en/book/ch-repository-admin.xml

Log:
Fill in the 'Repository Replication' section.  Yeesh.

Modified: trunk/src/en/book/ch-repository-admin.xml
==============================================================================
--- trunk/src/en/book/ch-repository-admin.xml	(original)
+++ trunk/src/en/book/ch-repository-admin.xml	Fri Feb 16 03:35:03 2007
@@ -2242,7 +2242,374 @@
     <sect2 id="svn.reposadmin.maint.replication">
       <title>Repository Replication</title>
 
-      <para></para>
+      <para>There are several scenarios in which it is quite handy to
+        have a Subversion repository whose version history is exactly
+        the same as some other repository's.  Perhaps the most obvious
+        one is the maintenance of a simple backup repository, used
+        when the primary repository has become inaccessible due to a
+        hardware failure, network outage, or other such annoyance.
+        Other scenarios include deploying mirror repositories to
+        distribute heavy Subversion load across multiple servers, use
+        as a soft-upgrade mechanism, and so on.</para>
+
+      <para>The <command>svnsync</command> program, which is new to
+        the 1.4.0 release of Subversion, provides all the
+        functionality required for maintaining a read-only mirror of a
+        Subversion repository.</para>
+
+      <screen>
+$ svnsync help
+general usage: svnsync SUBCOMMAND DEST_URL  [ARGS &amp; OPTIONS ...]
+Type 'svnsync help &lt;subcommand&gt;' for help on a specific subcommand.
+Type 'svnsync --version' to see the program version and RA modules.
+
+Available subcommands:
+   initialize (init)
+   synchronize (sync)
+   copy-revprops
+   help (?, h)
+$
+</screen>
+
+      <para><command>svnsync</command> works by essentially asking the
+        Subversion server to <quote>replay</quote> revisions, one at a
+        time.  It then uses that revision information to mimic a
+        commit of the same to another repository.  Neither repository
+        needs to be locally accessible to
+        <command>svnsync</command>—its parameters are repository
+        URLs, and it does all its work through Subversion's repository
+        access interfaces.  All you need is read access to the source
+        repository, plus commit access and revision property
+        modification access to the destination repository.</para>
+
+      <note>
+        <para>When using <command>svnsync</command> against a remote
+          source repository, the Subversion server for that repository
+          must be running Subversion version 1.4 or later.</para>
+      </note>
+
+      <para>Assuming you already have a source repository that you'd
+        like to mirror, the next thing you need is an empty target
+        repository which will actually serve as that mirror.  This
+        target repository can use either of the available filesystem
+        data-store back-ends (see <xref
+        linkend="svn.reposadmin.basics.backends" />), but it must not
+        yet have any version history in it.  The protocol via which
+        <command>svnsync</command> communicates revision information
+        is highly sensitive to mismatches between the versioned
+        histories contained in the source and target repositories.
+        For this reason, while <command>svnsync</command> cannot
+        <emphasis>demand</emphasis> that the target repository be
+        read-only,
+        <footnote>
+          <para>In fact, it can't truly be read-only, or
+            <command>svnsync</command> itself would have a tough time
+            copying revision history into it.</para>
+        </footnote>
+        allowing the revision history in the target repository to
+        change by any mechanism other than the mirroring process is a
+        recipe for disaster.</para>
+
+      <warning>
+        <para>Do <emphasis>not</emphasis> modify a mirror repository
+          in such a way as to cause its version history to deviate
+          from that of the repository it mirrors.  The only commits
+          and revision property modifications that ever occur on that
+          mirror repository should be those performed by the
+          <command>svnsync</command> tool.</para>
+      </warning>
+
+      <para>Another requirement of the target repository is that the
+        <command>svnsync</command> process be allowed to modify
+        certain revision properties.  <command>svnsync</command>
+        stores its bookkeeping information in special revision
+        properties on revision 0 of the destination repository.
+        Because <command>svnsync</command> works within the framework
+        of that repository's hook system, the default state of the
+        repository (which is to disallow revision property changes;
+        see <xref linkend="svn.ref.reposhooks.pre-revprop-change" />)
+        is insufficient.  You'll need to explicitly implement the
+        pre-revprop-change hook, and your script must allow
+        <command>svnsync</command> to set and change its special
+        properties.  With those provisions in place, you are ready to
+        start mirroring repository revisions.</para>
+
+      <tip>
+        <para>It's a good idea to implement authorization measures
+          which allow your repository replication process to perform
+          its tasks while preventing other users from modifying the
+          contents of your mirror repository at all.</para>
+      </tip>
+
+      <para>Let's walk through the use of <command>svnsync</command>
+        in a somewhat typical mirroring scenario.  We'll pepper this
+        discourse with practical recommendations which you are free to
+        disregard if they aren't required by or suitable for your
+        environment.</para>
+
+      <para>As a service to the fine developers of our favorite
+        version control system, we will be mirroring the public
+        Subversion source code repository and exposing that mirror
+        publicly on the Internet, hosted on a different machine than
+        the one on which the original Subversion source code
+        repository lives.  This remote host has a global configuration
+        which permits anonymous users to read the contents of
+        repositories on the host, but requires users to authenticate
+        in order to modify those repositories.  (Please forgive us for
+        glossing over the details of Subversion server configuration
+        for the moment—those are covered thoroughly in <xref
+        linkend="svn.serverconfig" />.)  And for no other reason than
+        that it makes for a more interesting example, we'll be driving
+        the replication process from a third machine, the one which
+        we currently find ourselves using.</para>
+
+      <para>First, we'll create the repository which will be our
+        mirror.  This and the next couple of steps do require shell
+        access to the machine on which the mirror repository will
+        live.  Once the repository is all configured, though, we
+        shouldn't need to touch it directly again.</para>
+
+      <screen>
+$ ssh admin@svn.example.com
+svn> svnadmin create /path/to/repositories/svn-mirror
+svn>
+</screen>
+
+      <para>At this point, we have our repository, and due to our
+        server's configuration, that repository is now
+        <quote>live</quote> on the Internet.  Now, because we don't
+        want anything modifying the repository except our replication
+        process, we need a way to distinguish that process from other
+        would-be committers.  To do so, we use a dedicated username
+        for our process.  Only commits and revision property
+        modifications performed by the special username
+        <literal>syncproc</literal> will be allowed.</para>
+
+      <para>We'll use the repository's hook system both to allow the
+        replication process to do what it needs to do, and to enforce
+        that only it is doing those things.  We accomplish this by
+        implementing two of the repository event
+        hooks—pre-revprop-change and start-commit.  Our
+        <filename>pre-revprop-change</filename> hook script is found
+        in <xref
+        linkend="svn.reposadmin.maint.replication.pre-revprop-change"
+        />, and basically verifies that the user attempting the
+        property changes is our <literal>syncproc</literal> user.  If
+        so, the change is allowed; otherwise, it is denied.</para>
+
+      <example id="svn.reposadmin.maint.replication.pre-revprop-change">
+        <title>Mirror repository's pre-revprop-change hook script</title>
+
+        <programlisting>
+#!/bin/sh 
+
+USER="$3"
+
+if [ "$USER" = "syncproc" ]; then exit 0; fi
+
+echo "Only the syncproc user may change revision properties" >&amp;2
+exit 1
+</programlisting>
+      </example>
+
+      <para>That covers revision property changes.  Now we need to
+        ensure that only the <literal>syncproc</literal> user is
+        permitted to commit new revisions to the repository.  We do
+        this using a <filename>start-commit</filename> hook script
+        like the one in <xref
+        linkend="svn.reposadmin.maint.replication.start-commit"
+        />.</para>
+
+      <example id="svn.reposadmin.maint.replication.start-commit">
+        <title>Mirror repository's start-commit hook script</title>
+
+        <programlisting>
+#!/bin/sh 
+
+USER="$2"
+
+if [ "$USER" = "syncproc" ]; then exit 0; fi
+
+echo "Only the syncproc user may commit new revisions" >&amp;2
+exit 1
+</programlisting>
+      </example>
+
+      <para>After installing our hook scripts and ensuring that they
+        are executable by the Subversion server, we're finished with
+        the setup of the mirror repository.  Now, we get to actually
+        do the mirroring.</para>
+
+      <para>The first thing we need to do with
+        <command>svnsync</command> is to register in our target
+        repository the fact that it will be a mirror of the source
+        repository.  We do this using the <command>svnsync
+        initialize</command> subcommand.</para>
+
+      <screen>
+$ svnsync initialize http://svn.example.com/svn-mirror \
+                     http://svn.collab.net/repos/svn \
+                     --username syncproc --password syncpass
+Copied properties for revision 0.
+$
+</screen>
+
+      <para>Our target repository will now remember that it is a
+        mirror of the public Subversion source code repository.
+        Notice that we provided a username and password as arguments
+        to <command>svnsync</command>—that was required by the
+        pre-revprop-change hook on our mirror repository.</para>
+
+      <note>
+        <para>The URLs provided to <command>svnsync</command> must
+          point to the root directories of the target and source
+          repositories, respectively.  The tool does not handle
+          mirroring of repository subtrees.</para>
+      </note>
+
+      <para>And now comes the fun part.  With a single subcommand, we
+        can tell <command>svnsync</command> to copy all the
+        as-yet-unmirrored revisions from the source repository to the
+        target.
+        <footnote>
+          <para>Be forewarned that while it will take only a few
+            seconds for the average reader to parse this paragraph and
+            the sample output which follows it, the actual time
+            required to complete such a mirroring operation is, shall
+            we say, quite a bit longer.</para>
+        </footnote>
+        The <command>svnsync synchronize</command> subcommand will
+        peek into the special revision properties previously stored on
+        the target repository, and determine what repository it is
+        mirroring and that the most recently mirrored revision was
+        revision 0.  Then it will query the source repository and
+        determine what the latest revision in that repository is.
+        Finally, it asks the source repository's server to start
+        replaying all the revisions between 0 and that latest
+        revision.  As <command>svnsync</command> gets the resulting
+        response from the source repository's server, it begins
+        forwarding those revisions to the target repository's server
+        as new commits.</para>
+
+      <screen>
+$ svnsync synchronize http://svn.example.com/svn-mirror \
+                      --username syncproc --password syncpass
+Committed revision 1.
+Copied properties for revision 1.
+Committed revision 2.
+Copied properties for revision 2.
+Committed revision 3.
+Copied properties for revision 3.
+…
+Committed revision 23406.
+Copied properties for revision 23406.
+Committed revision 23407.
+Copied properties for revision 23407.
+Committed revision 23408.
+Copied properties for revision 23408.
+</screen>
+
+      <para>Of particular interest here is that for each mirrored
+        revision, there is first a commit of that revision to the
+        target repository, and then property changes follow.  This is
+        because the initial commit is performed by (and attributed to)
+        the user <literal>syncproc</literal>, and datestamped with the
+        time as of that revision's creation.  Also, Subversion's
+        underlying repository access interfaces don't provide a
+        mechanism for setting arbitrary revision properties as part of
+        a commit.  So <command>svnsync</command> follows up with an
+        immediate series of property modifications which copy all the
+        revision properties found for that revision in the source
+        repository into the target repository.  This also has the
+        effect of fixing the author and datestamp of the revision
+        to match those of the source repository.</para>
+
+      <para>Also noteworthy is that <command>svnsync</command>
+        performs careful bookkeeping that allows it to be safely
+        interrupted and restarted without ruining the integrity of the
+        mirrored data.  If a network glitch occurs while mirroring a
+        repository, simply repeat the <command>svnsync
+        synchronize</command> command and it will happily pick up
+        right where it left off.  In fact, as new revisions appear in
+        the source repository, this is exactly what you do
+        in order to keep your mirror up-to-date.</para>
+
+      <para>There is, however, one bit of inelegance in the process.
+        Because Subversion revision properties can be changed at any
+        time throughout the lifetime of the repository, and don't
+        leave an audit trail that indicates when they were changed,
+        replication processes have to pay special attention to them.
+        If you've already mirrored the first 15 revisions of a
+        repository and someone then changes a revision property on
+        revision 12, <command>svnsync</command> won't know to go back
+        and patch up its copy of revision 12.  You'll need to tell it
+        to do so manually by using (or with some additional tooling
+        around) the <command>svnsync copy-revprops</command>
+        subcommand, which simply re-replicates all the revision
+        properties for a particular revision.</para>
+
+      <screen>
+$ svnsync copy-revprops http://svn.example.com/svn-mirror 12 \
+                        --username syncproc --password syncpass
+Copied properties for revision 12.
+$
+</screen>
+
+      <para>That's repository replication in a nutshell.  You'll
+        likely want some automation around such a process.  For
+        example, while our example was a pull-and-push setup, you
+        might wish to have your primary repository push changes to one
+        or more blessed mirrors as part of its post-commit and
+        post-revprop-change hook implementations.  This would enable
+        the mirror to be up-to-date in as near to realtime as is
+        likely possible.</para>
+
+      <para>Also, while it isn't very commonplace to do so,
+        <command>svnsync</command> does gracefully mirror repositories
+        in which the user as whom it authenticates only has partial
+        read access.  It simply copies only the bits of the repository
+        that it is permitted to see.  Obviously such a mirror is not
+        useful as a backup solution.</para>
+
+      <para>As far as user interaction with repositories and mirrors
+        goes, it <emphasis>is</emphasis> possible to have a single
+        working copy that interacts with both, but you'll have to jump
+        through some hoops to make it happen.  First, you need to
+        ensure that both the primary and mirror repositories have the
+        same repository UUID (which is not the case by default).  You
+        can set the mirror repository's UUID by loading a dump file
+        stub into it which contains the UUID of the primary
+        repository, like so:</para>
+
+      <screen>
+$ cat - &lt;&lt;EOF | svnadmin load --force-uuid dest
+SVN-fs-dump-format-version: 2
+
+UUID: 65390229-12b7-0310-b90b-f21a5aa7ec8e
+EOF
+$
+</screen>
+        
+      <para>Now that the two repositories have the same UUID, you can
+        use <command>svn switch --relocate</command> to point your
+        working copy to whichever of the repositories you wish to
+        operate against, a process which is described in <xref
+        linkend="svn.ref.svn.c.switch" />.  There is a possible danger
+        here, though, in that if the primary and mirror repositories
+        aren't in close synchronization, a working copy up-to-date
+        with and pointing to the primary repository will, if relocated
+        to point to an out-of-date mirror, become confused about the
+        apparent sudden loss of revisions it fully expects to be
+        present.</para>
+
+      <para>Finally, be aware that the revision-based replication
+        provided by <command>svnsync</command> is only
+        that—replication of revisions.  It does not include such
+        things as the hook implementations, repository or server
+        configuration data, uncommitted transactions, or information
+        about user locks on repository paths.  Only information
+        carried by the Subversion repository dump file format is
+        available for replication.</para>
 
     </sect2>
 



