the importance of inventories

Tom Lord lord at emf.net
Tue Apr 15 13:51:29 CDT 2003



    me:

    >> That's an important use case even for SCM tool designers to
    >> consider:  I export a tree as a source distribution from my SCM
    >> repository to a tar bundle.   Joe picks it up and installs it
    >> somewhere.  


    > From: "Sander Striker" <striker at apache.org>

    > Ok, so where are his inventory tags then?  He installed from a
    > source ball, plain source, no add-ons.

More precisely, whatever add-ons are there to record inventory tags
should be easy to manipulate with few special purpose tools and able
to be imported and exported usefully from various source control
regimens.

We're presuming here that Joe has installed a copy of the programs for
generating and applying changesets.   It's reasonable to say that,
along with those, he gets some tools for managing inventory tags --
but with the qualification that such tools should be small, simple,
and not tied to a particular SCM system.  (As should the changeset
tools generally.)

Where inventory tags are in play, arch currently uses two mechanisms,
as you are probably aware:  implicit tags stored in comments in source
files, and explicit tags stored in .arch-ids subdirectories (for
tagging directories themselves, binary files and other files that you
don't want to carry implicit tags, and symbolic links).
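
As a rough illustration of how the two mechanisms fit together, the
lookup could be sketched in Python as below.  The on-disk details are
assumptions made for the example, not a statement of arch's actual
format: an explicit tag for `foo' is taken to live in
`.arch-ids/foo.id', and an implicit tag is taken to be a line
containing "tag: <id>" near the top of the file.  The name `find_tag'
is hypothetical.

```python
import os
import re

# Assumed convention for this sketch only: an implicit tag is a line
# containing "tag: <id>" somewhere in the first 1024 bytes of the file.
IMPLICIT_TAG = re.compile(rb"tag:\s*(\S+)")


def find_tag(path):
    """Return the inventory tag for `path`, or None if it has none.

    Explicit tags take precedence over implicit ones.
    """
    directory, name = os.path.split(path)
    # Assumed layout: the explicit tag for "foo" lives in ".arch-ids/foo.id".
    explicit = os.path.join(directory, ".arch-ids", name + ".id")
    if os.path.isfile(explicit):
        with open(explicit, "rb") as f:
            return f.read().strip().decode("utf-8", errors="replace")
    # Only regular files can carry an implicit tag in a comment.
    if os.path.isfile(path) and not os.path.islink(path):
        with open(path, "rb") as f:
            match = IMPLICIT_TAG.search(f.read(1024))
        if match:
            return match.group(1).decode("utf-8", errors="replace")
    return None
```

Nothing here touches a repository: the tag is recovered from the tree
alone, which is the property that matters for Joe.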

In arch terms -- I'm talking about factoring out the arch commands
`add', `delete', `move', `inventory', and `tree-lint' -- and making
whatever modifications to those commands are needed to make them
palatable to other SCM systems.

Factoring those commands out of arch is no big deal.   They are not
layered on other elements of arch.

An interesting question, imo, is whether or not those mechanisms can
be put to good use within svn (as I suspect they can).


    > > Later on:  Joe might have modified his tree.   I, meanwhile, need to
    > > extract a changeset from my SCM world to send to Joe, who is going to
    > > apply it without access to my repository.

    > In other words, a patch against a certain revision (the one that was
    > called a release and packaged in a tarball which Joe picked up, etc).

Right ... although the tree that Joe has may no longer be identical to
that release.   If we impose intermediate parties on the chain of
possession leading up to Joe --- he may very well not have access to
the release.
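
To see why path names alone are not enough in that situation, here is
a small Python sketch.  It assumes a changeset entry that names a file
by tag and carries a new path, and a tree inventory that maps tags to
current paths.  Both shapes, the tag strings, and the name
`apply_rename' are hypothetical.

```python
def apply_rename(inventory, tag, new_path):
    """Rename the file identified by `tag`, wherever it currently lives.

    `inventory` maps tag -> current path in the tree (an assumed shape).
    Returns (old_path, new_path); raises KeyError if the tree has no
    file with that tag.
    """
    old_path = inventory[tag]
    inventory[tag] = new_path
    return old_path, new_path


# The release had "src/main.c"; Joe has since moved it to "core/main.c".
joe_tree = {"id-main": "core/main.c", "id-readme": "README"}

# The incoming changeset renames the same logical file to "src/driver.c".
moved = apply_rename(joe_tree, "id-main", "src/driver.c")
```

A rename keyed on the path "src/main.c" would find nothing in Joe's
tree; keyed on the tag, it still lands on the right file.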


    > > So he needs inventory tags.

    > And how is that going to work when he doesn't start with inventory
    > tags in the first place?

?

To be more explicit: in order to design changesets, some notion of
logical file identity that is independent of any repository or SCM
system is needed.   So there's a big hard problem of designing SCM
systems -- but there's a smaller and separable problem of designing
logical file identity as part of designing changesets.   A reasonable
way (the only way I see) to design logical file identity is to define
it as a set of conventions for the layout and content of source trees
and source files.   So when Joe gets his tar bundle, that tar bundle
includes tags.


    > > There are countless variations on that scenario.  (E.g., perhaps Joe is
    > > comparing two trees and sending me back a changeset.)

    > Ah, a patch against a certain revision...

Not "revision" in the SCM sense.   There's no reason to presume that
the two trees Joe is comparing have any identity in an SCM namespace.
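
A sketch of that comparison, in Python, with no repository anywhere in
sight.  Each tree is reduced to an inventory mapping tag to path; how
the inventory is gathered is left out, and the shape of the
inventories, the tag strings, and the name `diff_inventories' are
assumptions for the example.

```python
def diff_inventories(old, new):
    """Compare two inventories, each mapping tag -> path.

    Returns (added, removed, renamed):
      added   - sorted paths whose tags appear only in `new`
      removed - sorted paths whose tags appear only in `old`
      renamed - sorted (old_path, new_path) pairs for tags present in
                both inventories whose path differs
    """
    added = sorted(new[t] for t in new.keys() - old.keys())
    removed = sorted(old[t] for t in old.keys() - new.keys())
    renamed = sorted(
        (old[t], new[t]) for t in old.keys() & new.keys() if old[t] != new[t]
    )
    return added, removed, renamed


tree_a = {"id-1": "README", "id-2": "src/main.c", "id-3": "src/util.c"}
tree_b = {"id-1": "README", "id-2": "src/driver.c", "id-4": "src/net.c"}

added, removed, renamed = diff_inventories(tree_a, tree_b)
```

The move of "src/main.c" to "src/driver.c" is reported as a rename
rather than as an unrelated delete and add, because the two paths
share a tag.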


    > > Given the need for this portable, repository independent notion of
    > > logical file identity, a wise SCM design direction is to think about
    > > layering SCM on-top of the portable file identity -- rather than
    > > trying to make a big monolith where you can't separate out logical
    > > file identity from repository access.

    > I won't bite...  But Thom, please stop these pin pricks.

It's not a pin prick -- it's a recommendation.   It's something people
in svn-land can actually, practically, do.


    > [...] 
    > > That's a separate but important problem.  In addition to a
    > > "changesets" list, we could have a list for "global-revision-names"
    > > designed to be portable between SCM systems.

    > >     > Agree, the initial filename is not unique.  Which leaves either the current
    > >     > filename or a tag (which darcs will never support any more than I'll keep
    > >     > track of arch's repository format ID).
    > > 
    > > Part of the goal here is to work on standards for interoperability.

    > Let's start with a _simple_ tree patch format.  And a simple
    > implementation that can generate/apply those.

    > Currently I get the feeling this list is aiming to implement a
    > full-fledged changeset engine...

Arch's mkpatch/dopatch have a simple format and implementation.  They
need things like syntax changes but that's pretty superficial.

Arch's mkpatch/dopatch are factorable away from arch.   They depend on
almost nothing else in arch -- not in any serious way.

Arch's mkpatch/dopatch are not a "full-fledged changeset engine",
although it's good evidence of the quality of their design that a
"full-fledged changeset engine" can be layered on top of them.

-t




