Guile Mailing List Archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: avl-trees vs. hashes



Jay Glascoe <jglascoe@jay.giss.nasa.gov> writes:

 > 1.  not storing the key's hash in the entry would be a big win for
 > deletion in certain, relatively common situations.  For example, if the
 > keys and values you use are small (relative to the size of the [big]
 > integer hash object), like small strings, integers, booleans, then
 > 
 > (hash (key . value))
 > 
 > is much larger than
 > 
 > (key . value)
 > 
 > and guile will have a much easier time gc'ing the smaller of the two.

What do you mean, "for deletion"?  Because of the GC, more memory =
more CPU = slower in general (because more memory gets scanned), so
including the hash value makes the program slower in general.

OTOH, all hash operations requiring looking for items might be faster
because you can test for hash equality before testing key equality &
the hash equality test might be faster.

 > 2.  not storing the hash would be a big loss for resize since it's
 > certainly faster (err, at least I hope it is) to do this
 > 
 > index = (scm_num2long(SCM_CAR(entry), "foo", "bar")) % vec_len;
 > 
 > rather than
 > 
 > hash = my_fave_hasher(SCM_CAR(entry));
 > index = hash % vec_len;
 > 
 > but then again, the speed of the hasher depends on the size of the key
 > you're hashing.  So if you've small keys...

I can't imagine the hasher being faster than just looking up the
number, so it should always be faster to look them up than to
recompute all the hashes.  However, it might be a minor point for
small keys because of the other work needed when rehashing.

But, using the above trick it can make all ops that require locating
an object faster (as mentioned above).

-- 
Harvey J. Stein
BFM Financial Research
hjstein@bfr.co.il

Guile Home | Main Index | Thread Index