rev 32 - trunk/gc
jimb at sanpietro.red-bean.com
jimb at sanpietro.red-bean.com
Sat Aug 2 16:23:34 CDT 2003
Author: jimb
Date: 2003-08-02 16:23:31 -0500 (Sat, 02 Aug 2003)
New Revision: 32
Added:
trunk/gc/generic-map.h
Removed:
trunk/gc/gc-map.h
Log:
Rename gc-map.h to generic-map.h.
Deleted: trunk/gc/gc-map.h
===================================================================
--- trunk/gc/gc-map.h 2003-08-02 21:22:00 UTC (rev 31)
+++ trunk/gc/gc-map.h 2003-08-02 21:23:31 UTC (rev 32)
@@ -1,361 +0,0 @@
-/* map.h --- tracking GC'd memory
- Jim Blandy <jimb at red-bean.com> --- July 2003 */
-
-#ifndef MINOR_GC_MAP_H
-#define MINOR_GC_MAP_H
-
-/* The GC map is a table mapping every heap object's address onto a
- gc_page structure describing the page the object lives in.
- This structure says which generation the objects it contains belong
- to, and is also where we record doting objects.
-
- A "doting object" is an object in one generation that points to an
- object in a younger generation. A "doting page" is a page on which
- a doting object starts. Doting objects can be quite large, and
- cover many pages, but only the page on which a doting object starts
- is a doting page.
-
- Function of the GC Map ============================================
-
- In more detail, here are the jobs the GC map needs to do:
-
- - The whole idea of generational garbage collection is to usually
- collect only part of the heap. Occasionally, you'll need to do a
- full collection, but if you can focus your time on portions of
- the heap that contain more garbage, then that time will be more
- productive, and free up more memory for the mutator to waste.
-
- But to restrict collection to a limited portion of the heap, the
- collector needs to be able to find all pointers from the
- uncollected portion into the collected portion: these act as
- additional roots for the partial collection.
-
- Since, in practice, pointers from older objects to younger
- objects are rare, we can reduce the amount of bookkeeping needed
- here by, when collecting generation G, always collecting all
- generations younger than G as well. This means we only need to
- track pointers in objects in older generations to objects in
- younger generations --- the rare kind. These are the doting
- objects.
-
- How do we track such pointers? Since a newly allocated object
- can only be initialized with pointers to existing objects, an
- object can become a doting object only by mutation. Thus, every
- bit of code that mutates a heap object in Minor needs to include
- a "write barrier": code that allows the GC to check whether a
- doting object has been created, and record it in the GC map, for
- the collector to use in finding roots for partial collections.
-
- - We also need to be able to quickly determine which generation an
- object belongs to, to recognize when a pointer points out of the
- portion of the heap we're collecting.
-
- - When we've finished collecting, we need to be able to find all
- the pages belonging to now-empty "from" spaces, to free them.
-
-
- Dynamically vs. Statically Allocated Objects ======================
-
- There are two ways objects can come into existence:
-
- - The mutator can allocate them in the usual way, with 'cons',
- 'make-vector', etc.
-
- - Executable files and shared libraries may contain objects,
- constructed at compile-time, linked by the system linker, and
- introduced into memory by the kernel doing an 'exec' or the
- dynamic linker.
-
- In the first case, code generated by Minor, or hand-written for
- Minor, handles the allocation, so it can follow whatever
- conventions we find useful.
-
- But in the second case, Minor has only limited control over the
- allocation. Minor can ensure that all the heap objects in a
- particular executable or shared library appear in one contiguous
- chunk, not interleaved with other sorts of non-heap data --- from C
- code, say. But the GC has no way to find out at run time where
- each executable/shared library's chunk of heap objects is. (I
- think we'd need a custom linker script, or some messy stuff based
- on the C++ static initializer support, but, bleah.) This means
- that the GC can't reliably free up such memory for re-use; it can't
- tell where Minor heap objects end and foreign non-heap objects
- begin. That, in turn, means that the GC might as well never
- relocate such objects, or even bother to collect them at all ---
- much better to simply ignore them, except to track old->young
- pointers.
-
- So, when we allocate fresh pages for a thread to allocate from, we
- mark them in the GC map as belonging to generation zero, the
- youngest generation. And when we allocate pages to hold objects
- the collector is promoting from one generation to the next, we
- record the appropriate generation for them as well. But we assume
- that all other pages belong to generation seven, the "immortal
- generation". Any objects that we find here must have come from
- executable files or shared libraries. Other objects are never
- promoted into the immortal generation --- they come to rest in
- generation six.
-
- Note that when we load a .o file ourselves --- say, when we load a
- module previously compiled by the ahead-of-time compiler --- that's
- Minor code turning that stream of bytes into objects and
- procedures, not the kernel or the dynamic linker. Since the
- allocation is under our code's control, we can place the .o file's
- objects (and procedures) in any generation we want. So loading .o
- files falls in the first category of allocation, not the second.
-
-
- Mutators' Interface to the GC Map ===================================
-
- The mutators' write barrier code does not access the GC map
- directly. Instead, mutator threads simply construct store lists
- --- lists of every object they've ever mutated --- and hand them to
- the collector when needed. When a collection starts, the collector
- records the potentially doting objects mentioned in each thread's
- store list in the GC map, and then throws the store lists away.
- This indirect arrangement has the following advantages:
-
- - Mutators don't need to know about the GC map structure. It's way
- too complex to be part of a stable ABI. The GC map remains
- strictly internal to the GC.
-
- - The overhead of the write barrier is the allocation of one pair.
- The pair is allocated in the new object area, so the cache is
- always hot.
-
- - Since this map is only updated and consulted by the GC, it
- doesn't compete for registers and cache with real mutator code at
- every store operation; it only gets involved when a GC is about
- to happen, which trashes both of those things anyway.
-
- - Since store lists are per-thread, we never have to think about
- synchronization when building them.
-
- - Since mutator threads never access the GC map directly, we don't
- have to worry about synchronization when accessing it, either. */
-
-
-/* For every 4kb page managed by the garbage collector, we have an
- instance of the following structure.
-
- (Since there is an instance of this structure for every page, it
- needs to be kept small. 8b : 4kb :: 1 : 512.)
-
- From the "premature optimization is the root of all evil" dept:
-
- At the machine code level, fetching an (unsigned) bit field
- turns into:
- - a memory reference to fetch the word containing the bit field,
- - a mask, to get rid of bits that don't belong to the field, and
- - a right shift, to put the bitfield's least significant bit at
- the right end of the register.
-
- But note that a lot of fields in this struct are indices within a
- page, or portions of page addresses. So the first thing we're
- going to do with such values is shift them left again, to multiply
- by 8 (for first_doting_object and last_doting_object) or by 4k (for
- next_doting_page and next_generation_page). So the compiler could
- combine the right shift of the field fetch and the left shift of
- the multiply into a single operation, net left or net right.
-
- We can do even better: if we make sure that the right shift (the
- bitfield's position within the word) and the left shift (the log2
- of the factor we need to multiply it by to get a page offset or a
- page address) are the *same*, then the shifts cancel each other
- out, and all we need to do is fetch and mask.
-
- So first_doting_object, last_doting_object, next_doting_page, and
- next_generation_page are all aligned this way. Since page
- addresses and offsets within a page are disjoint portions of an
- address word, things fit together pretty nicely. */
-struct gc_page
-{
- /* The following three fields should all pack into a single 32-bit
- word. */
-
- /* The generation to which the objects in this page belong. Zero is
- the youngest generation. Seven is the "dummy generation", used
- for memory areas we haven't allocated a separate gc_page
- arary for yet. */
- unsigned generation : 3;
-
- /* If this is a doting page, this is the offset within this page of
- the start of the first doting object that begins on this page ---
- divided by eight. If this is not a doting page, then
- last_doting_object == 0 and first_doting_object > 0.
-
- 4k / 8 == 512, so we need nine bits for this field. To find all
- the doting pointers, we start here and scan until
- last_doting_object. */
- unsigned first_doting_object : 9;
-
- /* All the pages that contain doting objects are kept in a
- singly-linked list; there is one list per generation. This field
- is the link in that list: the address of the next such page in
- this generation, divided by 4k. For the last page in the chain,
- this field is zero. */
- unsigned next_doting_page : 20;
-
- /* The following three fields should all pack into a single 32-bit
- word. */
-
- /* Unused bits! */
- unsigned : 3;
-
- /* If this is a doting page, this is the offset within this page of
- the start of the last doting object that begins on this page ---
- divided by eight. If this is not a doting page, then this is
- zero. */
- unsigned last_doting_object : 9;
-
- /* All the pages in a generation are kept in a singly-linked list.
- All free pages are kept in a list, too. This is the link in
- those lists. This is the address of the next page in the list
- --- divided by 4k. If this is the last page in the list, this is
- zero. */
- unsigned next_generation_page : 20;
-};
-
-
-/* The map of all pages is a two-level tree. Given a 32-bit address
- ADDR, the 'struct gc_page' for that page is:
-
- mn__gc_map[(ADDR >> 22) & 0x3ff][(ADDR >> 12) & 0x3ff]
-
- In other words, we use the top ten bits of the object's address to
- index the top-level array, yielding a pointer to a second-level
- array; then we use the next ten bits to index into that array,
- yielding a gc_page structure for a particular page.
-
- Initially, before we've allocated any heap pages at all, every
- entry in mn__gc_map points to the same second-level array
- object: mn__gc_immortal_pages. This creates the appearance of a
- fully populated tree, with a gc_page struct for every 4k
- page in the IA-32's 32-bit address space --- even though
- mn__gc_map and mn__gc_immortal_pages occupy only 1k * 4b +
- 1k * 8b == 12kb.
-
- As we allocate pages for newly allocated objects, or for to-spaces
- during collection, we need to record these allocations in the map.
- Since mn__gc_immortal_pages is (potentially) shared by many
- top-level array entries, we handle things in a copy-on-write
- fashion: when the gc_page struct we want to tweak is
- actually an element of mn__gc_immortal_pages, we allocate a fresh
- second-level table, initialize it to be a copy of
- mn__gc_immortal_pages, change the appropriate entry in the
- top-level array to point to it, and then tweak the appropriate
- gc_page. So as the program runs, we dedicate map memory
- only to the interesting parts, without making any assumptions about
- where in the address space malloc/mmap will give us pages from.
-
- As described above, GNU/Linux doesn't tell us which regions of an
- executable or shared library contain heap objects: we just
- occasionally find heap references to objects on pages we've never
- touched before. So the initial state of an gc_page struct
- has to be appropriate for such objects. This tells us several
- things:
-
- - The pages' generation should be the immortal generation ---
- generation seven.
-
- - The pages (initially) contain no doting objects. Objects in
- executables or shared libraries may only point to objects in
- other executables or shared libraries, since they were linked by
- the static linker: otherwise, the static linker would have
- complained about unresolved references.
-
- - The pages will never be freed. We don't scavenge objects from
- executables or shared libraries: we can't be sure where the
- regions of heap objects start and end, so we couldn't free the
- area for reuse after the live objects have been copied out of
- them anyway. So next_generation_page and first_contiguous don't
- need to be initialized to anything special.
-
- Thus, the default gc_page struct looks like this:
-
- {
- generation = 7,
- first_doting_object = 1,
- next_doting_page = 0,
- last_doting_object = 0,
- next_generation_page = 0
- }
-
- Every element of mn__gc_immortal_pages looks like that.
-
- Of course, the mutator may create doting objects in executables and
- shared libraries, so it's not the case that every executable or
- shared object page will always look like this. But initially, this
- is fine.
-
- Since the write barrier records the offsets of the first and last
- doting objects in a page, and the GC never looks outside that
- range, things will work correctly even if the initial or tail end
- of a page holds non-heap objects. So if the linker concatenates
- the .minor.data section build by the Minor compiler with the .data
- or .bss section built by the C compiler, for example, things will
- be fine.
-
- Problems would arise if non-heap objects were interleaved with heap
- objects on a page: if first_doting_object happened to end up
- pointing before some non-heap objects, and last_doting_object
- happened to end up pointing after them, then the scan for doting
- pointers would end up sweeping through non-heap objects.
-
- However, that sort of interleaving can't happen:
-
- - Within a single executable or shared library, the static linker's
- normal behavior is to concatenate all the .minor.data sections,
- without interleaving other sections. So we don't have to worry
- about intra-exec or intra-shared library interleavings.
-
- - The IA-32 ABI requires that ELF load segments be aligned on 4kb
- page boundaries. This means that two non-empty data segments
- can't appear on the same page. So we don't have to worry about
- inter-executable or inter-shared library interleavings, either.
-
- Using the oldest generation, generation 7, as the "immortal"
- generation means that the collector's test for whether to scavenge
- an object doesn't need a special case to recognize immortal
- objects. The obvious way to write the test, "Is this object's
- generation less than or equal to the oldest generation we're
- collecting?" will correctly decline to traverse immortal objects.
- Since the collector asks this of every object it touches, it's
- important for this test to be fast. */
-extern struct gc_page *mn__gc_map[1 << 10];
-
-/* The array of immortal pages. */
-extern struct gc_page mn__gc_immortal_pages[1 << 10];
-
-
-/* The 'struct gc_page' object for ADDR. */
-#define GC_PAGE(addr) \
- (mn__gc_map[((unsigned int) (addr) >> 22) & 0x3ff] \
- [((unsigned int) (addr) >> 12) & 0x3ff])
-
-
-/* A single heap generation. */
-struct gc_generation
-{
- /* The base address of the first page in this generation, or zero if
- the generation contains no pages. This is invalid in the
- immortal generation. */
- void *first_generation_page;
-
- /* The base address of the first doting page in this generation.
- Zero if the generation contains no doting pages. */
- void *first_doting_page;
-
- /* How many collections we've done since the last time we collected
- any generations older than this. */
- int collections;
-};
-
-
-/* The table of all generations. Generation zero is the youngest
- generation. Generation 7 is the immortal generation, for pages in
- executables and shared libraries (actually, for any page we didn't
- allocate ourselves). */
-extern struct gc_generation mn__gc_generations[8];
-
-#endif /* MINOR_GC_MAP_H */
Copied: trunk/gc/generic-map.h (from rev 31, trunk/gc/gc-map.h)
More information about the Minor
mailing list