[Minor] [minor commit] r669 - trunk/include/minor

Karl Fogel kfogel at red-bean.com
Tue Oct 24 14:07:27 CDT 2006


I found the hash iteration interface a bit confusing (though very clearly
explained).  In the interfaces I'm accustomed to, the 'iterator' object
is distinct from an individual iteration, in other words it's the same
object across all iterations.  That style tends to look like this:

   mn_call_t *C = [...];
   mn_hash_table_t *H = [...];
   mn_hash_iterator_t *i;
   void *key, *value;

   i = mn_hash_iterator (C, H);
   while (MN_HASH_CONTINUE (mn_hash_next (i, &key, &value)))
     {
       /* Could have passed NULL for 'key' or 'value' or both, if
          didn't care about that parameter. */

       [...] ;
     }

   mn_hash_free_iterator (i);

I've invented the macro MN_HASH_CONTINUE() so I can hand wave on what
the possible return values of mn_hash_next() should be :-).

I'm not sure this way is better than what you proposed, however if you're
going to go with your way, you might want to call the iteration objects
"mn_hash_iteration_t" instead of "mn_hash_iterator_t", since the latter
implies a long-lived object that iterates across all the entries.

Independent of the above:

I think the prefix for all Minor hash table functions and types should be
"mn_hash_" consistently, never "mn_hash_table_".  Or the other way
around -- just not a mixture of the two, because with the mixture, the
writer constantly stops to think "Oh, is this one with or without the
'table_' bit?"

Also, some systems permit NULL values in hash tables, which allows
them to be predicate systems without distraction: if the key is there,
predicate is true, if the key is not there, predicate is false, and NULL
is the natural value to use when we don't care a whit about the value.

Of course, this necessitates some API adjustments: you need a function
just for removing an entry from the hash table, your iteration interface
needs an "out-of-band" way to indicate that iteration is over, etc.

I'm not sure there's a right answer here -- I mildly prefer systems that
allow NULL values, but obviously it's never a showstopper -- but in case
you were still considering the question, note that the alternative iteration
interface given above uses a function's return value as the out-of-band
data; so, at least iteration would be ready to handle NULL values.

Finally, regarding your comment beginning "I'm torn: ...", don't be torn! :-)
Your decision to make an opaque, type-independent interface from the start
seems eminently sane to me, anyway.

-K

On 10/24/06, jimb at red-bean.com <jimb at red-bean.com> wrote:
> Author: jimb
> Date: Tue Oct 24 05:25:45 2006
> New Revision: 669
>
> Modified:
>    trunk/include/minor/hash-tables.h
>
> Log:
> First cut at hash table iteration interface.
>
>
> Modified: trunk/include/minor/hash-tables.h
> ==============================================================================
> --- trunk/include/minor/hash-tables.h   (original)
> +++ trunk/include/minor/hash-tables.h   Tue Oct 24 05:25:45 2006
> @@ -48,4 +48,87 @@
>     If SRC and DEST are not both hash tables, abort.  */
>  void mn_hash_table_merge (mn_call_t *, mn_ref_t *dest, mn_ref_t *src);
>
> +
> +/* A hash table iterator represents a subset of entries in a hash
> +   table.  There are five essential operations used with iterators:
> +
> +   - Given a hash table, you can call mn_hash_table_iter to get an
> +     iterator representing all its entries.
> +
> +   - If an iterator represents a non-empty subset of a table's
> +     entries, it has a "current entry".  Given the iterator and the
> +     table, you can get the current entry's key and value with
> +     mn_hash_iter_key and mn_hash_iter_value.  If the iterator is
> +     empty, these functions return NULL.
> +
> +     The hash table implementation gets to pick the "current entry"
> +     from the iterator's subset however it pleases.  However, a given
> +     iterator's current entry doesn't change unless the hash table
> +     itself does.
> +
> +   - Given an iterator I, you can get a new iterator J whose subset is
> +     the same as I, but with the current entry removed.
> +     mn_hash_iter_next does this, and frees I.
> +
> +   - You can free an iterator with mn_free_hash_iter.
> +
> +   This means you can loop over the entries of a hash table H in a
> +   call C with code like this:
> +
> +     mn_hash_iter_t *i;
> +     mn_ref_t *key;
> +
> +     for (i = mn_hash_table_iter (C, H);
> +          key = mn_hash_iter_key (C, H, i);
> +          i = mn_hash_iter_next (C, H, i))
> +       ...;
> +     mn_free_hash_iter (C, H, i);
> +
> +   You could just as well use mn_hash_iter_value in the loop's
> +   condition.
> +
> +   Adding entries to or removing entries from a hash table doesn't
> +   invalidate iterators on that hash table, but it may change the
> +   subsets they refer to arbitrarily, which is just as bad.  (These
> +   operations could cause the hash table to be resized, which scatters
> +   entries about arbitrarily.  As far as I can tell, Python
> +   dictionaries have the same restriction, so I'm assuming it's not
> +   fatal; if that's not so, let me know.)
> +
> +   I'm torn: in the current implementation of hash tables, iterators
> +   are just integers; we cast back and forth between mn_hash_iter_t *
> +   and an integer type, so there's really no storage behind it at all.
> +   mn_free_hash_iter is a no-op.  But that may not always be so, and
> +   we don't want to commit to a particular representation; hash tables
> +   with buckets, for example, would need something more complex.  */
> +
> +/* The type of an iterator over a hash table.  */
> +typedef struct mn_hash_iter_t mn_hash_iter_t;
> +
> +/* Return an iterator representing all entries in TABLE.  If TABLE is
> +   not a hash table, abort.  */
> +mn_hash_iter_t *mn_hash_table_iter (mn_call *, mn_ref_t *table);
> +
> +/* If ITER's subset of TABLE is not empty, return the key / value of
> +   ITER's current entry.  If ITER's subset is entry, return NULL.  If
> +   TABLE is not a hash table, abort.  */
> +mn_ref_t *mn_hash_iter_key (mn_call_t *, mn_ref_t *table,
> +                            mn_hash_iter_t *iter);
> +mn_ref_t *mn_hash_iter_value (mn_call_t *, mn_ref_t *table,
> +                              mn_hash_iter_t *iter);
> +
> +/* Free ITER, and return a new iterator representing the same subset
> +   of TABLE's entries that ITER did, but with ITER's current entry
> +   removed.  */
> +mn_hash_iter_t *mn_hash_iter_next (mn_call_t *, mn_ref_t *table,
> +                                   mn_hash_iter_t *iter);
> +
> +/* Return a new iterator representing the same subset of TABLE as ITER.  */
> +mn_hash_iter_t *mn_copy_hash_iter (mn_call_t *, mn_ref_t *table,
> +                                   mn_hash_iter_t *iter);
> +
> +/* Free ITER.  */
> +void mn_free_hash_iter (mn_call_t *, mn_ref_t *table, mn_hash_iter_t *iter);
> +
> +
>  #endif /* MN_HASH_H */
>
> _______________________________________________
> Minor mailing list
> Minor at red-bean.com
> http://www.red-bean.com/mailman/listinfo/minor
>



More information about the Minor mailing list