Thoughts on Chapter 5
C. Michael Pilato
cmpilato at red-bean.com
Sat Feb 24 18:00:00 CST 2007
Brian W. Fitzpatrick wrote:
> OK. This chapter covers a *ton* of data about an arcane subject and
> it's a nice fluid read, but reading the chapter end-to-end, I felt
> like I had to wade through a *ton* of BDB minutiae that 99% of the
> repository admins won't ever have to deal with. I don't have a
> solution in mind for this, but I found it to be distracting and wonder
> if we can't better title sections that are BDB specific so that FSFS
> admins don't have to read all the way through just to find out that it
> doesn't apply to them.
Yes, that was something we talked about doing -- I simply forgot to do
> You may need to take some of these comments with a grain of salt as I
> personally don't recommend that people use bdb at all. Aren't we
> going to prescribe one over the other?
Even if we prescribe one, the book should still contain the information
necessary to assist those who chose the other one. I personally would
still stay on this side of an outright prescription of FSFS; but yeah, I
think we should be able to say something like, "These days, most folks
choose FSFS for its flexibility in various deployment scenarios and ease
> "Planning Your Repository Organization":
> - one other reason to have separate repositories is when you have
> completely different types of data in each project: eg, one project
> has source code, and another has 100MB Photoshop files in it.
Really? Why is that? (I can't quickly think of a reason why that would
> - The last example of repository organization is one that I've rarely
> seen used. Shouldn't we recommend that most folks use the 1st example
> for multiple projects in a single repo (i.e., I'm not seeing a lot of
> "prescription" here, but mostly "description")?
I'm not sure how you missed the prescription-ness of that section. And
I do think it useful to point out "the other way" (which yes, does still
> "Choosing a Data Store":
> In the table:
> - "Scalability: repository size": I don't understand what this
> means--does this mean that fsfs repositories take up less space on
> disk or that you can't use it for repositories with tons of data (and
> if it's the latter, I think it's incorrect--Apache uses fsfs).
That could be more clear, yes? I'm pretty sure that when Ben added
this, he was talking about space consumed on disk.
> - "Performance": Isn't BDB < 10% faster than FSFS in checking out the
> latest revision? I thought ghudson mailed stats on this to the list
> that showed it's a negligible difference.
I'll have to dig that up to verify. (Does anyone else on this list have
a pointer to some stats?)
> - We should note that BDB has an extra dependency: BDB itself
> - Also, doesn't FSFS deal better with mixed repository access
> mechanisms (http:// + svn://)? Should we mention this?
Well, it deals better with mixed access by different *OS users*. BDB has no
problem doing http:// + svn:// if httpd and svnserve run as the same
user. But I dunno how to make this fit into a smallish table. :-)
> - Footnote starting "Berkeley DB requires": Maybe mention that *no*
> remote filesystem implementation currently does this right?
Why? It's flatly untrue.
> - BDB & FSFS subsections: Maybe these could be divided into a
> "summary" and "gritty details" part? I really doubt that most admins
> give a hoot that BDB directory mods are O(n^2) and FSFS's are O(n).
Oh, I'm happy to toss that little bit altogether.
> - FSFS subsection: fsfs really isn't "immature" any more, and it's
> been stress tested a lot. I'd say that this paragraph is mostly FUD
> and should go.
Agreed. (Though, it's hard not to remember two relatively recent data
lossage bugs in the backend ... something we've never had with BDB.)
> "Creating the Repository"
> - Maybe move the 1st tip up a little bit?
> - Make the Warning more threatening? We had some dude on the #svn
> channel talking about how he edited one of his rev files (I am *not*
> - 1st footnote: I used to agree that the inability to obliterate a
> rev is a feature, but after talking to dozens of people in various
> roles (open source, closed source, including the BSD dudes), I now
> think that it *is* a missing feature. FreeBSD *can't* afford to do
> something that would require thousands of people to recheckout huge
> working copies (eg the ports tree).
> "removing dead transaction"
> - Isn't this BDB only? I thought these were no-ops in fsfs...
Gosh, I should hope not. It has file-based transactions, too, which
could get left around on disk. svnadmin rmtxns doesn't have any
BDB-specific code in it.
> "repository recovery":
> - This should be specified as BDB specific in the title
> "repository replication":
> - The 'svn>' prompt confused me--I thought it was some sort of weird
> svn shell at first.
Yeah, that's not necessary. I'll drop it.
> - using the username 'syncprop' in your examples is extremely
> confusing--reminds me of properties. Can't we use harry or sally?
I thought I used "syncproc" (as in "synchronization process"). I don't
want to use harry or sally because I go out of my way to recommend that
you set up a custom user for sync stuffs. Oops! I see now that
sometimes I typed "syncprop" by accident. Will fix. Maybe I'll just
make everything use "syncuser", which is more clear.
C. Michael Pilato <cmpilato at red-bean.com>
"The Christian ideal has not been tried and found wanting. It has
been found difficult; and left untried." -- G. K. Chesterton