rev 32 - trunk/gc

jimb at sanpietro.red-bean.com jimb at sanpietro.red-bean.com
Sat Aug 2 16:23:34 CDT 2003


Author: jimb
Date: 2003-08-02 16:23:31 -0500 (Sat, 02 Aug 2003)
New Revision: 32

Added:
   trunk/gc/generic-map.h
Removed:
   trunk/gc/gc-map.h
Log:
Rename gc-map.h to generic-map.h.


Deleted: trunk/gc/gc-map.h
===================================================================
--- trunk/gc/gc-map.h	2003-08-02 21:22:00 UTC (rev 31)
+++ trunk/gc/gc-map.h	2003-08-02 21:23:31 UTC (rev 32)
@@ -1,361 +0,0 @@
-/* map.h --- tracking GC'd memory
-   Jim Blandy <jimb at red-bean.com> --- July 2003  */
-
-#ifndef MINOR_GC_MAP_H
-#define MINOR_GC_MAP_H
-
-/* The GC map is a table mapping every heap object's address onto a
-   gc_page structure describing the page the object lives in.
-   This structure says which generation the objects it contains belong
-   to, and is also where we record doting objects.
-
-   A "doting object" is an object in one generation that points to an
-   object in a younger generation.  A "doting page" is a page on which
-   a doting object starts.  Doting objects can be quite large, and
-   cover many pages, but only the page on which a doting object starts
-   is a doting page.
-
-   Function of the GC Map ============================================
-
-   In more detail, here are the jobs the GC map needs to do:
-
-   - The whole idea of generational garbage collection is to usually
-     collect only part of the heap.  Occasionally, you'll need to do a
-     full collection, but if you can focus your time on portions of
-     the heap that contain more garbage, then that time will be more
-     productive, and free up more memory for the mutator to waste.
-
-     But to restrict collection to a limited portion of the heap, the
-     collector needs to be able to find all pointers from the
-     uncollected portion into the collected portion: these act as
-     additional roots for the partial collection.
-
-     Since, in practice, pointers from older objects to younger
-     objects are rare, we can reduce the amount of bookkeeping needed
-     here by, when collecting generation G, always collecting all
-     generations younger than G as well.  This means we only need to
-     track pointers in objects in older generations to objects in
-     younger generations --- the rare kind.  These are the doting
-     objects.
-
-     How do we track such pointers?  Since a newly allocated object
-     can only be initialized with pointers to existing objects, an
-     object can become a doting object only by mutation.  Thus, every
-     bit of code that mutates a heap object in Minor needs to include
-     a "write barrier": code that allows the GC to check whether a
-     doting object has been created, and record it in the GC map, for
-     the collector to use in finding roots for partial collections.
-
-   - We also need to be able to quickly determine which generation an
-     object belongs to, to recognize when a pointer points out of the
-     portion of the heap we're collecting.
-
-   - When we've finished collecting, we need to be able to find all
-     the pages belonging to now-empty "from" spaces, to free them.
-
-
-   Dynamically vs. Statically Allocated Objects ======================
-
-   There are two ways objects can come into existence:
-
-   - The mutator can allocate them in the usual way, with 'cons',
-     'make-vector', etc.
-
-   - Executable files and shared libraries may contain objects,
-     constructed at compile-time, linked by the system linker, and
-     introduced into memory by the kernel doing an 'exec' or the
-     dynamic linker.
-
-   In the first case, code generated by Minor, or hand-written for
-   Minor, handles the allocation, so it can follow whatever
-   conventions we find useful.
-
-   But in the second case, Minor has only limited control over the
-   allocation.  Minor can ensure that all the heap objects in a
-   particular executable or shared library appear in one contiguous
-   chunk, not interleaved with other sorts of non-heap data --- from C
-   code, say.  But the GC has no way to find out at run time where
-   each executable/shared library's chunk of heap objects is.  (I
-   think we'd need a custom linker script, or some messy stuff based
-   on the C++ static initializer support, but, bleah.)  This means
-   that the GC can't reliably free up such memory for re-use; it can't
-   tell where Minor heap objects end and foreign non-heap objects
-   begin.  That, in turn, means that the GC might as well never
-   relocate such objects, or even bother to collect them at all ---
-   much better to simply ignore them, except to track old->young
-   pointers.
-
-   So, when we allocate fresh pages for a thread to allocate from, we
-   mark them in the GC map as belonging to generation zero, the
-   youngest generation.  And when we allocate pages to hold objects
-   the collector is promoting from one generation to the next, we
-   record the appropriate generation for them as well.  But we assume
-   that all other pages belong to generation seven, the "immortal
-   generation".  Any objects that we find here must have come from
-   executable files or shared libraries.  Other objects are never
-   promoted into the immortal generation --- they come to rest in
-   generation six.
-
-   Note that when we load a .o file ourselves --- say, when we load a
-   module previously compiled by the ahead-of-time compiler --- that's
-   Minor code turning that stream of bytes into objects and
-   procedures, not the kernel or the dynamic linker.  Since the
-   allocation is under our code's control, we can place the .o file's
-   objects (and procedures) in any generation we want.  So loading .o
-   files falls in the first category of allocation, not the second.
-
-
-   Mutators' Interface to the GC Map ===================================
-
-   The mutators' write barrier code does not access the GC map
-   directly.  Instead, mutator threads simply construct store lists
-   --- lists of every object they've ever mutated --- and hand them to
-   the collector when needed.  When a collection starts, the collector
-   records the potentially doting objects mentioned in each thread's
-   store list in the GC map, and then throws the store lists away.
-   This indirect arrangement has the following advantages:
-
-   - Mutators don't need to know about the GC map structure.  It's way
-     too complex to be part of a stable ABI.  The GC map remains
-     strictly internal to the GC.
-
-   - The overhead of the write barrier is the allocation of one pair.
-     The pair is allocated in the new object area, so the cache is
-     always hot.
-
-   - Since this map is only updated and consulted by the GC, it
-     doesn't compete for registers and cache with real mutator code at
-     every store operation; it only gets involved when a GC is about
-     to happen, which trashes both of those things anyway.
-
-   - Since store lists are per-thread, we never have to think about
-     synchronization when building them.
-
-   - Since mutator threads never access the GC map directly, we don't
-     have to worry about synchronization when accessing it, either.  */
-
-
-/* For every 4kb page managed by the garbage collector, we have an
-   instance of the following structure.
-
-   (Since there is an instance of this structure for every page, it
-   needs to be kept small.  8b : 4kb :: 1 : 512.)
-
-   From the "premature optimization is the root of all evil" dept:
-
-   At the machine code level, fetching an (unsigned) bit field
-   turns into:
-   - a memory reference to fetch the word containing the bit field,
-   - a mask, to get rid of bits that don't belong to the field, and
-   - a right shift, to put the bitfield's least significant bit at
-     the right end of the register.
-
-   But note that a lot of fields in this struct are indices within a
-   page, or portions of page addresses.  So the first thing we're
-   going to do with such values is shift them left again, to multiply
-   by 8 (for first_doting_object and last_doting_object) or by 4k (for
-   next_doting_page and next_generation_page).  So the compiler could
-   combine the right shift of the field fetch and the left shift of
-   the multiply into a single operation, net left or net right.
-   
-   We can do even better: if we make sure that the right shift (the
-   bitfield's position within the word) and the left shift (the log2
-   of the factor we need to multiply it by to get a page offset or a
-   page address) are the *same*, then the shifts cancel each other
-   out, and all we need to do is fetch and mask.
-
-   So first_doting_object, last_doting_object, next_doting_page, and
-   next_generation_page are all aligned this way.  Since page
-   addresses and offsets within a page are disjoint portions of an
-   address word, things fit together pretty nicely.  */
-struct gc_page
-{
-  /* The following three fields should all pack into a single 32-bit
-     word.  */
-
-  /* The generation to which the objects in this page belong.  Zero is
-     the youngest generation.  Seven is the "dummy generation", used
-     for memory areas we haven't allocated a separate gc_page
-     arary for yet.  */
-  unsigned generation : 3;
-
-  /* If this is a doting page, this is the offset within this page of
-     the start of the first doting object that begins on this page ---
-     divided by eight.  If this is not a doting page, then
-     last_doting_object == 0 and first_doting_object > 0.
-
-     4k / 8 == 512, so we need nine bits for this field.  To find all
-     the doting pointers, we start here and scan until
-     last_doting_object.  */
-  unsigned first_doting_object : 9;
-
-  /* All the pages that contain doting objects are kept in a
-     singly-linked list; there is one list per generation.  This field
-     is the link in that list: the address of the next such page in
-     this generation, divided by 4k.  For the last page in the chain,
-     this field is zero.  */
-  unsigned next_doting_page : 20;
-
-  /* The following three fields should all pack into a single 32-bit
-     word.  */
-
-  /* Unused bits!  */
-  unsigned : 3;
-
-  /* If this is a doting page, this is the offset within this page of
-     the start of the last doting object that begins on this page ---
-     divided by eight.  If this is not a doting page, then this is
-     zero.  */
-  unsigned last_doting_object : 9;
-
-  /* All the pages in a generation are kept in a singly-linked list.
-     All free pages are kept in a list, too.  This is the link in
-     those lists.  This is the address of the next page in the list
-     --- divided by 4k.  If this is the last page in the list, this is
-     zero.  */
-  unsigned next_generation_page : 20;
-};
-
-
-/* The map of all pages is a two-level tree.  Given a 32-bit address
-   ADDR, the 'struct gc_page' for that page is:
-
-      mn__gc_map[(ADDR >> 22) & 0x3ff][(ADDR >> 12) & 0x3ff]
-
-   In other words, we use the top ten bits of the object's address to
-   index the top-level array, yielding a pointer to a second-level
-   array; then we use the next ten bits to index into that array,
-   yielding a gc_page structure for a particular page.
-
-   Initially, before we've allocated any heap pages at all, every
-   entry in mn__gc_map points to the same second-level array
-   object: mn__gc_immortal_pages.  This creates the appearance of a
-   fully populated tree, with a gc_page struct for every 4k
-   page in the IA-32's 32-bit address space --- even though
-   mn__gc_map and mn__gc_immortal_pages occupy only 1k * 4b +
-   1k * 8b == 12kb.
-
-   As we allocate pages for newly allocated objects, or for to-spaces
-   during collection, we need to record these allocations in the map.
-   Since mn__gc_immortal_pages is (potentially) shared by many
-   top-level array entries, we handle things in a copy-on-write
-   fashion: when the gc_page struct we want to tweak is
-   actually an element of mn__gc_immortal_pages, we allocate a fresh
-   second-level table, initialize it to be a copy of
-   mn__gc_immortal_pages, change the appropriate entry in the
-   top-level array to point to it, and then tweak the appropriate
-   gc_page.  So as the program runs, we dedicate map memory
-   only to the interesting parts, without making any assumptions about
-   where in the address space malloc/mmap will give us pages from.
-
-   As described above, GNU/Linux doesn't tell us which regions of an
-   executable or shared library contain heap objects: we just
-   occasionally find heap references to objects on pages we've never
-   touched before.  So the initial state of an gc_page struct
-   has to be appropriate for such objects.  This tells us several
-   things:
-
-   - The pages' generation should be the immortal generation ---
-     generation seven.
-
-   - The pages (initially) contain no doting objects.  Objects in
-     executables or shared libraries may only point to objects in
-     other executables or shared libraries, since they were linked by
-     the static linker: otherwise, the static linker would have
-     complained about unresolved references.
-
-   - The pages will never be freed.  We don't scavenge objects from
-     executables or shared libraries: we can't be sure where the
-     regions of heap objects start and end, so we couldn't free the
-     area for reuse after the live objects have been copied out of
-     them anyway.  So next_generation_page and first_contiguous don't
-     need to be initialized to anything special.
-
-   Thus, the default gc_page struct looks like this:
-
-    {
-      generation = 7,
-      first_doting_object = 1,
-      next_doting_page = 0,
-      last_doting_object = 0,
-      next_generation_page = 0
-    }
-
-   Every element of mn__gc_immortal_pages looks like that.
-
-   Of course, the mutator may create doting objects in executables and
-   shared libraries, so it's not the case that every executable or
-   shared object page will always look like this.  But initially, this
-   is fine.
-
-   Since the write barrier records the offsets of the first and last
-   doting objects in a page, and the GC never looks outside that
-   range, things will work correctly even if the initial or tail end
-   of a page holds non-heap objects.  So if the linker concatenates
-   the .minor.data section build by the Minor compiler with the .data
-   or .bss section built by the C compiler, for example, things will
-   be fine.
-
-   Problems would arise if non-heap objects were interleaved with heap
-   objects on a page: if first_doting_object happened to end up
-   pointing before some non-heap objects, and last_doting_object
-   happened to end up pointing after them, then the scan for doting
-   pointers would end up sweeping through non-heap objects.
-
-   However, that sort of interleaving can't happen:
-
-   - Within a single executable or shared library, the static linker's
-     normal behavior is to concatenate all the .minor.data sections,
-     without interleaving other sections.  So we don't have to worry
-     about intra-exec or intra-shared library interleavings.
-
-   - The IA-32 ABI requires that ELF load segments be aligned on 4kb
-     page boundaries.  This means that two non-empty data segments
-     can't appear on the same page.  So we don't have to worry about
-     inter-executable or inter-shared library interleavings, either.
-
-   Using the oldest generation, generation 7, as the "immortal"
-   generation means that the collector's test for whether to scavenge
-   an object doesn't need a special case to recognize immortal
-   objects.  The obvious way to write the test, "Is this object's
-   generation less than or equal to the oldest generation we're
-   collecting?" will correctly decline to traverse immortal objects.
-   Since the collector asks this of every object it touches, it's
-   important for this test to be fast.  */
-extern struct gc_page *mn__gc_map[1 << 10];
-
-/* The array of immortal pages.  */
-extern struct gc_page mn__gc_immortal_pages[1 << 10];
-
-
-/* The 'struct gc_page' object for ADDR.  */
-#define GC_PAGE(addr)                                           \
-  (mn__gc_map[((unsigned int) (addr) >> 22) & 0x3ff]            \
-                    [((unsigned int) (addr) >> 12) & 0x3ff])
-
-
-/* A single heap generation.  */
-struct gc_generation
-{
-  /* The base address of the first page in this generation, or zero if
-     the generation contains no pages.  This is invalid in the
-     immortal generation.  */
-  void *first_generation_page;
-
-  /* The base address of the first doting page in this generation.
-     Zero if the generation contains no doting pages.  */
-  void *first_doting_page;
-
-  /* How many collections we've done since the last time we collected
-     any generations older than this.  */
-  int collections;
-};
-
-
-/* The table of all generations.  Generation zero is the youngest
-   generation.  Generation 7 is the immortal generation, for pages in
-   executables and shared libraries (actually, for any page we didn't
-   allocate ourselves).  */
-extern struct gc_generation mn__gc_generations[8];
-
-#endif /* MINOR_GC_MAP_H */

Copied: trunk/gc/generic-map.h (from rev 31, trunk/gc/gc-map.h)





More information about the Minor mailing list