Review of Chapter 6: Server Configuration

Sat Feb 24 18:29:12 CST 2007

On 2/23/07, C. Michael Pilato <cmpilato at red-bean.com> wrote:

> I really, really don't like how the chapter starts off.  Right off the
> bat, "Table 6.1. Network Server Comparison" slams readers with all kinds
> of industry and Subversion buzzwords before they even have a chance to
> get acquainted with what a Subversion server is and provides.  At the
> very least it should be moved down into the "Choosing the Best Server
> Configuration" section.  The list-filled sections "The Apache HTTP
> Server", "The svnserve Server", and "svnserve over SSH" all tell you
> things you might want to know when trying to choose which server to
> implement.  But they precede the section called "Choosing the Best
> Server Configuration".

Just to give you a background on the rationale for the current structure:

For our first edition book, we only had the simple table that lists &
compares server features.  The only text accompanying the table was
something like "here are the facts, now choose whatever server is best
for you."

For years, I've been getting criticism from users@ and #svn about
this.  They say that this table isn't friendly enough.  The typical
feedback was "yeah, the table is really useful, but it still doesn't
tell me which server to use;  I want a human being giving me personal
recommendations, explaining tradeoffs of why I might want (or not
want) to use a particular server."   It sort of ties into our 2nd
edition theme of "recommend best practices."

So my response was to leave the feature comparison table as is, but
then add a bunch of 'human' analysis afterwards.  Why would you want
(or not want) to use a server?  What do we personally recommend for
specific situations?

That said, I'm very much open to your re-structuring suggestions.  :-)

> I think the little bit of text that's in the "Overview" sect1 should be
> worked into the chapter introduction.  There's precious little detail
> there, and we shouldn't need a section called "overview" when that's
> what the chapter introduction is actually for.

But you've lost me here.  You've now twice expressed frustration that
we're not giving enough general information about servers, but I don't
know what this information would be.  At the moment, we present these
bits of info:

  - we're going to explain how to make your repository available over a network
  - svn has an abstract network layer with 2 current implementations
  - one server is apache, which speaks HTTP; the other is svnserve,
    which speaks a custom protocol.

What other information would go into a chapter introduction?  You
suggest that users should "get acquainted with what a Subversion
server is and provides", but doesn't this list cover that?  Are there
other bits of info we should add?  Are you suggesting we describe the
type of RA operations a client can perform against a server?  Are you
suggesting that we describe the categories in the feature-table?
("Subversion servers authenticate users, can authorize which sections
of the repository they're allowed to access, etc."?)

> So, then, "Choosing a
> Server Configuration" should be the first sect1, and it should contain
> the list-filled why/why-not sections (as subsections), followed by the
> server comparison table.

This section -- "Choosing a Server Configuration" -- is currently
written as if you had already read the feature table and why/why-not
lists, and are now looking for human analysis and recommendations to
make sense of the list of facts.  However, I think it would be nearly
impossible to make specific recommendations if the users haven't seen
anything about the features of the servers... it's putting the cart
before the horse.

So how about we meet halfway?  Lemme try this:

  * Expand the introduction a bit more (if I can?!)
  * Show the feature table
  * Present the personal recommendations (based on the feature table)
  * Put the why/why-not lists as subsections of the recommendation
    section, as you suggest.

I'll do this in a forthcoming commit.

>
> Minor nit:  there is a pretty consistent notion of there being two
> servers -- apache and svnserve.  There's two data columns in the
> comparison table, two sect1's to cover each of the servers.  Yet there
> are three list-filled why/why-not sections (BTW, I'm not using
> "list-filled" as an insult, merely as descriptive).  Why does svnserve
> get bifurcated into with- and without-SSL here but not elsewhere?  Would
> it make sense to consistently treat svn:// and svn+ssh:// as peers
> instead of mild deviations under a common concept?
>

This is a recurring theme -- a tension between 'what the world should
be' versus 'how things really are'.  I mean yes, there are only two
actual server binaries.  Yes, if you're a developer, you only think
about these two servers, and the universe is neatly split between
them.  The whole svn+ssh:// thing is just a minor invocation detail
within the svnserve half of the universe.

But from the users' point of view, svn:// and svn+ssh:// are vastly
different things.  They have utterly different authentication and
authorization infrastructures.  The security implications are
different (mock accounts versus system accounts), as are the potential
headaches related to permissions.  When someone comes into #svn and
asks about choosing svn:// versus svn+ssh://, the latter is never
presented as just a "mild deviation" of the former -- it's its own
beast to consider, with its own set of why/why-not tradeoffs.

So while svn:// and svn+ssh:// both happen to be speaking the same
protocol, that's an implementation detail, and the similarity really
ends there.  An administrator needs to consider them as disparate
deployment options... and that's why I've got the why/why-not lists in
3 categories, rather than 2.
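
To make the "2 binaries, 3 use cases" point concrete, here's a rough
Python sketch. It isn't from the book or from Subversion itself; the
names USE_CASES and describe() are made up for illustration, and the
account descriptions are my own shorthand for the distinctions above.

```python
# Hypothetical lookup: URL scheme -> (server binary, account model).
# Each entry corresponds to one of the three why/why-not lists.
USE_CASES = {
    "http": ("apache", "handled by Apache over HTTP"),
    "svn": ("svnserve", "mock accounts private to svnserve"),
    "svn+ssh": ("svnserve", "system accounts, via SSH"),
}


def describe(url):
    """Return (scheme, server, accounts) for a repository URL."""
    scheme, sep, _rest = url.partition("://")
    scheme = scheme.lower()
    if not sep or scheme not in USE_CASES:
        raise ValueError(f"unrecognized repository URL: {url!r}")
    server, accounts = USE_CASES[scheme]
    return scheme, server, accounts


if __name__ == "__main__":
    # example.com and the paths are placeholders, not real repositories.
    for url in ("http://example.com/repos/project",
                "svn://example.com/project",
                "svn+ssh://example.com/var/svn/project"):
        print(describe(url))
```

Note that svn:// and svn+ssh:// map to the same binary but to
different account models, which is the whole argument for treating
them as separate columns.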

Along that line, you're right, it's odd that the why/why-not has 3
categories, and everything else has 2.  But to reconcile that
difference, my instinct is to move everything to 3, not to 2.  I think
it will make things clearer to administrators -- rather than focus on
"2 servers", we focus on "3 use cases".

Maybe I'll just make the feature-comparison table into 3 columns,
rather than 2, so it's clear that the table is about use-cases, not
executables.

> Also, I'd prefer consistent ordering.  Either always mention Apache
> first, or always mention svnserve first.  Don't care which; just want
> the consistency.  (I don't know why.  Consistency has never been a theme
> for me or anything.  /me whistles and looks to and fro nervously.)

OK, will do.

>
> In "Choosing the Best Server Configuration", I'd point out that often
> Apache is a good choice if you already have Apache as part of your
> existing infrastructure, or if other requirements (such as repository
> browsing ala ViewVC) are driving a need for Apache.  In other words, if
> you got Apache already (and already running), why bother setting up
> svnserve, too unless it brings some justifiable benefit?
>

Agreed.

> Why should someone run svnserve as a Windows service instead of as a
> daemon? (I know why, but the book the say.)

Aha!  Thanks, fixed.

> Perhaps the daemon and
> Windows Service modes should be handled in the same section, since they
> are effectively the same high-level concept -- a long-lived svnserve
> process listening for input.  No strong opinion there

Eh, they're all in the 'invocation' section, which seems like enough
of a subdivision to me.  I think it would be too much categorization
to divide the 'invocation' section into short-lived vs. long-lived.

>
> I think the stuff about alternate tunnels (svn+joessh) should either be
> broken into its own section, or the section it is in should be renamed to
> speak generally about tunnels (instead of SSH specifically).
>

Ah, good idea, I've renamed the whole section to "Tunneling over SSH",
since that's what it's really about.

> This might be overkill, but sections like the Apache "Authentication
> Options" which indicate that without additional server configury the
> repository is committable by everyone don't take into account that the
> repos admin might have, as part of the knowledge gained in chapter 5,
> used hook scripts to prevent as much.  You don't need to dive into
> details here, but a nod to that possibility might not hurt.

Done.

>
> "Again, the runtime servers file allows you to automate this challenge
> on a per-host basis."  Again?  Did we mention the servers file
> previously in this section?

Yes, we did.  Look back a few paragraphs.  :-)

>
> Lose "Best practice: " from "Best practice: do you really need
> path-based access control?"

Done.

>
> "consider that the Subversion project itself has always a notion of who
> is allowed to commit where" -- I think there's a missing "had".

Fixed!

>
> "…and that's pretty much all there is to it."  I think this quip can be
> removed unceremoniously to the great benefit of the section in which it
> resides.  No need to provide a wrap-up sentence there at all.
>

Yeah, I hate that quip too.  Why did I write that years ago?  Hm.  Killed.