Thoughts on Chapter 5

Sun Mar 4 15:27:42 CST 2007

C. Michael Pilato wrote:
> C. Michael Pilato wrote:
>> Brian W. Fitzpatrick wrote:
>>> You may need to take some of these comments with a grain of salt as I
>>> personally don't recommend that people use bdb at all.  Aren't we
>>> going to prescribe one over the other?
>> Even if we prescribe one, the book should still contain the information
>> necessary to assist those who chose the other one.  I personally would
>> still stay on this side of an outright prescription of FSFS; but yeah, I
>> think we should be able to say something like, "These days, most folks
>> choose FSFS for its flexibility in various deployment scenarios and ease
>> of administration."
> 
> So, I didn't make a clear prescription, because my conscience won't
> allow me to do so.  The memory of data-losing FSFS bugs as recent as
> Subversion 1.3 is far too clear in my mind.  But I *did* state that
> today both backends should be deemed reliable.  I added a comparison
> table entry "Reliability: data integrity" which points out that newer
> FSFS should be great, and BDB is also great but only if properly
> deployed.  And I also explained why FSFS is pretty much the (correct)
> choice everyone makes today, without falsely slandering Berkeley DB.  To
> my knowledge, there has never been a data lossage bug in Berkeley DB
> that didn't turn out to be a problem with the deployment configuration.

Catching up on my email here.

Several years back with my public svn repository, IIRC on Red Hat 9
using the system's Berkeley DB, I would get a hosed repository once a
month or so, and neither 'svnadmin recover' nor BDB's own recovery
could fix it.  My recovery method was to shut down Apache and rsync
the last hot backup over the repository path that Apache used.
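
A minimal sketch of that recovery in Python, assuming Apache is
controlled with apachectl and the hot backup is a plain directory copy
of the repository.  The function names and any paths passed in are
hypothetical, not the script actually used at the time:

```python
import subprocess


def restore_commands(backup_dir, repo_dir):
    """Return the argv lists for the restore, in the order they should run."""
    # Trailing slashes make rsync copy the contents of backup_dir into
    # repo_dir rather than creating a nested directory.
    src = backup_dir.rstrip("/") + "/"
    dst = repo_dir.rstrip("/") + "/"
    return [
        ["apachectl", "stop"],                  # stop serving the repository
        ["rsync", "-a", "--delete", src, dst],  # overwrite it with the backup
        ["apachectl", "start"],                 # bring the server back up
    ]


def restore(backup_dir, repo_dir, runner=subprocess.run):
    """Run the restore steps, stopping at the first command that fails."""
    for argv in restore_commands(backup_dir, repo_dir):
        # check=True raises CalledProcessError on a non-zero exit status, so
        # Apache is not restarted on top of a half-copied repository.
        runner(argv, check=True)
```

Anything committed after the backup was taken is lost with this
approach, so how often the hot backups run decides how much work is
at risk.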

So while this isn't a data loss bug of the sort that FSFS gets, where
you can lose a single revision, without the hot backups I would have
lost the entire repository, at least with the BDB tools I knew how to
use.  So I would consider it a data loss bug.

This raises the point that the failure scenarios are different.  I never
suffered from an internal data consistency issue with BDB, just the
loss of the entire repository :)  With FSFS, you can have the entire
repository be up, but find out later that you have a bad revision,
hence the recommendation for 'svnadmin verify' runs every so often.
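
A rough sketch of such a periodic check in Python.  The subcommand is
'svnadmin verify', which exits non-zero when it finds a problem; the
function names and repository paths here are hypothetical, and
scheduling (cron or similar) is left out:

```python
import subprocess


def verify_repository(repo_dir, runner=subprocess.run):
    """Return True if 'svnadmin verify' exits with status 0 for repo_dir."""
    result = runner(["svnadmin", "verify", repo_dir])
    return result.returncode == 0


def failed_repositories(repo_dirs, runner=subprocess.run):
    """Return the repositories from repo_dirs that did not verify cleanly."""
    return [d for d in repo_dirs if not verify_repository(d, runner=runner)]
```

Running something like failed_repositories(["/srv/svn/repo"]) from a
nightly job and mailing any non-empty result would catch a bad
revision while a recent backup still exists.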

Regards,
Blair