Guile Mailing List Archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Scheme style auto-resizing hashtable
Jay Glascoe <jglascoe@jay.giss.nasa.gov> writes:
> On 18 Oct 1998, Harvey J. Stein wrote:
>
> > What about with a bigger hash table size, such as 2999 or 1499 or
> > 1091? As far as I recall, TCL's (& thus STk's) hash tables resize
> > when a 4th element is inserted into a bucket.
> >
>
> these hashtables resize, the vector's length doubles, when a bucket
> exceeds the max-bucket-size. (make-hashtab 0) is a very bad idea.
>
> >
> > It'd also be interesting to run your speed test in STk.
I did a very rough comparison of Guile's hash tables vs STk's
hash tables. Of course, they're hard to compare because STk & guile
are different interpreters. But, the hash table insertion timings are
different enough & the overhead timings are close enough that it seems
that STk's hash tables are faster than Guile's (on the order of 20%
faster when computing runtime-overhead, and on the order of 50% when
doing runtime-gc-overhead, assuming (gc-run-time) returns time in
milliseconds).
Here're the details:
hashtest.scm:
(debug-disable 'debug)
(define (profile:clock)
(/ (get-internal-run-time)
internal-time-units-per-second))
(define-macro (timeit form)
(let ((g1 (gensym))
(g2 (gensym)))
`(let* ((,g2 ())
(,g1 (profile:clock)))
,form
(set! ,g2 (profile:clock))
(display "Elapsed time : ")
(display (- ,g2 ,g1))
(display " GC time : ")
(display (gc-run-time))
(newline)
(display "GC Stats : ")
(display (gc-stats))
(newline))))
(define (make-hash-table) (make-vector 1009))
(define (do-test func lines number-times)
(do ((i 1 (+ i 1)))
((>= i number-times))
(let ((mydict (make-hash-table)))
(for-each (func mydict) lines))))
(define (insert-entry mydict)
(lambda (line)
(let* ((len (string-length line))
(top (inexact->exact (/ len 10)))
(key (string->symbol (substring line 0 top))))
(hash-set! mydict key line))))
(define (port->list p)
(let loop ((ln (read-line p))
(l '()))
(cond ((not (eof-object? ln))
(loop (read-line p) (cons ln l)))
(else (reverse l)))))
hashtest.stk:
(define (do-test func lines number-times)
(do ((i 1 (+ i 1)))
((>= i number-times))
(let ((mydict (make-hash-table)))
(for-each (func mydict) lines))))
(define (insert-entry mydict)
(lambda (line)
(let* ((len (string-length line))
(top (inexact->exact (/ len 10)))
(key (string->symbol (substring line 0 top))))
(hash-table-put! mydict key line))))
(define (port->list p)
(let loop ((ln (read-line p))
(l '()))
(cond ((not (eof-object? ln))
(loop (read-line p) (cons ln l)))
(else (reverse l)))))
Guile run:
;;; Setup:
guile> (load "hashtest.scm")
guile> (define l (port->list (open-input-file "x.ps")))
;;; Test insertions:
guile> (timeit (do-test insert-entry l 10))
Elapsed time : 2.27 GC time : 76
GC Stats : ((gc-time-taken . 76) (cells-allocated . 59384) (cell-heap-size . 98304) (bytes-malloced . 168731) (gc-malloc-threshold . 227016) (cell-heap-segments (1075433480 . 1075171336) (1075961864 . 1075437576)))
guile> (timeit (do-test insert-entry l 10))
Elapsed time : 2.2 GC time : 111
GC Stats : ((gc-time-taken . 111) (cells-allocated . 87620) (cell-heap-size . 98304) (bytes-malloced . 181773) (gc-malloc-threshold . 227016) (cell-heap-segments (1075433480 . 1075171336) (1075961864 . 1075437576)))
guile> (timeit (do-test insert-entry l 10))
Elapsed time : 2.24 GC time : 155
GC Stats : ((gc-time-taken . 155) (cells-allocated . 48070) (cell-heap-size . 98304) (bytes-malloced . 164513) (gc-malloc-threshold . 227016) (cell-heap-segments (1075433480 . 1075171336) (1075961864 . 1075437576)))
;;; Test overhead:
guile> (timeit (do-test (lambda (h) (lambda (i) #f)) l 10))
Elapsed time : 0.18 GC time : 164
GC Stats : ((gc-time-taken . 164) (cells-allocated . 32618) (cell-heap-size . 98304) (bytes-malloced . 151438) (gc-malloc-threshold . 227016) (cell-heap-segments (1075433480 . 1075171336) (1075961864 . 1075437576)))
guile> (timeit (do-test (lambda (h) (lambda (i) #f)) l 10))
Elapsed time : 0.140000000000001 GC time : 168
GC Stats : ((gc-time-taken . 168) (cells-allocated . 85879) (cell-heap-size . 98304) (bytes-malloced . 167586) (gc-malloc-threshold . 227016) (cell-heap-segments (1075433480 . 1075171336) (1075961864 . 1075437576)))
guile> (timeit (do-test (lambda (h) (lambda (i) #f)) l 10))
Elapsed time : 0.18 GC time : 176
GC Stats : ((gc-time-taken . 176) (cells-allocated . 69803) (cell-heap-size . 98304) (bytes-malloced . 159520) (gc-malloc-threshold . 227016) (cell-heap-segments (1075433480 . 1075171336) (1075961864 . 1075437576)))
guile>
STk run:
;;; Setup:
STk> (load "hashtest.stk")
#[undefined]
STk> (define l (port->list (open-input-file "x.ps")))
#[undefined]
;;; Test insertions:
STk> (time (do-test insert-entry l 10))
;; Time: 1810.00ms
;; GC time: 580.00ms
;; Cells: 554247
#[undefined]
STk> (time (do-test insert-entry l 10))
;; Time: 1800.00ms
;; GC time: 560.00ms
;; Cells: 553911
#[undefined]
;;; Test overhead:
STk> (time (do-test (lambda (h) (lambda (i) #f)) l 10))
;; Time: 140.00ms
;; GC time: 70.00ms
;; Cells: 88769
#[undefined]
STk> (time (do-test (lambda (h) (lambda (i) #f)) l 10))
;; Time: 150.00ms
;; GC time: 80.00ms
;; Cells: 88757
#[undefined]
--
Harvey J. Stein
BFM Financial Research
hjstein@bfr.co.il
Guile Home |
Main Index |
Thread Index