Slow regexps.

Harvey J. Stein (hjstein@bfr.co.il)
Mon, 17 Aug 1998 12:43:24 +0300

I just looked over the regexp code (regex-posix.c) & noticed the
following:

1. It seems the SCM_* macros often expand to rather complicated
expressions. Consider SCM_ROCHARS(str) in regex-posix.c. You have
a test, possibly an addition & a few indirections. In
scm_regexp_exec, there are lots of them used, some of which are
used over & over again.
2. The return from regexec is packaged up as a vector (mallocing the
vector, repeated use of SCM_MAKINUM & SCM_VELTS, etc) then the
original space used is freed.

Wouldn't it help performance if:

1. In general, repeated uses of these macros were replaced by a single
assignment to a variable, followed by repeated accesses of that
variable, and
2. In scm_regexp_exec in particular, the return from regexec were
packaged up as an opaque type with accessors for the components, so
that the vector doesn't have to be built each time?
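
To make the two suggestions concrete, here is a rough sketch in plain C
against the POSIX regex API. It is not the Guile code: toy_string,
TOY_ROCHARS and the toy_match_* functions are hypothetical names made up
for illustration, with TOY_ROCHARS only standing in for the kind of
test-plus-indirection that SCM_ROCHARS does. The macro is evaluated once
into a local, and the regmatch_t array filled in by regexec is kept
inside an opaque struct and read through accessors instead of being
copied into a freshly built vector.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/* Hypothetical stand-in for a string object; NOT the real SCM layout. */
typedef struct {
    int is_substring;   /* nonzero if the text starts at base + offset */
    const char *base;
    size_t offset;
} toy_string;

/* Stand-in for an accessor macro that hides a test, an addition and a
   few indirections every time it is expanded. */
#define TOY_ROCHARS(s) \
    ((s)->is_substring ? (s)->base + (s)->offset : (s)->base)

/* Opaque match result: callers only go through the accessors below. */
typedef struct toy_match {
    size_t nmatch;        /* whole match plus parenthesized groups */
    regmatch_t pmatch[];  /* C99 flexible array, filled by regexec */
} toy_match;

/* Returns NULL on no match (or on allocation failure; a real version
   would want to tell the two apart). */
toy_match *toy_regexp_exec(const regex_t *rx, const toy_string *str)
{
    /* Suggestion 1: expand the macro once and reuse the local. */
    const char *chars = TOY_ROCHARS(str);
    size_t nmatch = rx->re_nsub + 1;

    /* Suggestion 2: regexec writes straight into the result object,
       so there is no second copy and no intermediate buffer to free. */
    toy_match *m = malloc(sizeof *m + nmatch * sizeof m->pmatch[0]);
    if (m == NULL)
        return NULL;
    m->nmatch = nmatch;
    if (regexec(rx, chars, nmatch, m->pmatch, 0) != 0) {
        free(m);
        return NULL;
    }
    return m;
}

size_t toy_match_count(const toy_match *m)
{
    assert(m != NULL);
    return m->nmatch;
}

/* Start offset of group i, or -1 if the group did not participate. */
long toy_match_start(const toy_match *m, size_t i)
{
    assert(m != NULL && i < m->nmatch);
    return (long) m->pmatch[i].rm_so;
}

/* End offset (exclusive) of group i, or -1 if it did not participate. */
long toy_match_end(const toy_match *m, size_t i)
{
    assert(m != NULL && i < m->nmatch);
    return (long) m->pmatch[i].rm_eo;
}

void toy_match_free(toy_match *m)
{
    free(m);
}
```

A caller would regcomp a pattern, call toy_regexp_exec, and then ask for
toy_match_start(m, 1) and so on only for the groups it actually cares
about. Whether this is a win inside Guile would still need measuring,
since the opaque object has to be made known to the garbage collector
somehow.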

-- 
Harvey J. Stein
BFM Financial Research
hjstein@bfr.co.il