This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.



Re: more guile for perl refugees (split, join)


On Mon, 26 Jun 2000, thi wrote:

> did you know guile has module `(ice-9 string-fun)'?  it would be
> interesting to see a performance study between your implementation and
> another one using those procedures.

I did rummage through (ice-9 string-fun) before writing my "split"
routine, and didn't find any immediately useful substitutes.  The closest
looked like it might be the commented-out "with-regexp-parts".


Benchmarking my "split" with the regular expression that is probably the
most common in perl programs, one-or-more-whitespace-characters
("[ \t\n]+"), might primarily test how well regcomp/regexec optimize that
case.  It might well be worth writing a special-case split-whitespace
routine to supplement string-fun.
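
A rough sketch of what such a special-case routine could look like, with
no regexps involved.  The name split-whitespace is only a placeholder and
nothing like it is in (ice-9 string-fun) as far as I know.  It assumes
that char-whitespace? is an acceptable stand-in for "[ \t\n]" (it also
accepts a few other characters, such as carriage return) and that empty
fields at either end should be dropped:

```scheme
;; Hypothetical helper: split STR on runs of whitespace, dropping
;; empty fields.  Walks the string once, character by character.
(define (split-whitespace str)
  (let loop ((chars (string->list str))
             (word '())      ; characters of the current word, reversed
             (words '()))    ; completed words, reversed
    ;; Push the current word (if any) onto the list of completed words.
    (define (flush)
      (if (null? word)
          words
          (cons (list->string (reverse word)) words)))
    (cond ((null? chars) (reverse (flush)))
          ((char-whitespace? (car chars))
           (loop (cdr chars) '() (flush)))
          (else
           (loop (cdr chars) (cons (car chars) word) words)))))

(split-whitespace "  foo bar\tbaz\n")   ; => ("foo" "bar" "baz")
```

Timing this against the regexp-based split on the same input would show
how much of the cost is really regcomp/regexec.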

Another perl-related comparison: has anyone thought about a module that
uses read-hash-extend to compile constant regular expressions once at read
time?  Sure you can (make-regexp) outside of a critical loop, but that
could put the regular expression string far away from the compiled
regexp's use.
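
For comparison, the conventional approach looks roughly like this.  The
names digits-rx and count-lines-with-digits are made up for illustration;
the point is only that the pattern string ends up defined somewhere other
than where it is used:

```scheme
;; Compiled once, at top level, possibly far from the loop below.
(define digits-rx (make-regexp "[0-9]+"))

;; Count the strings in LINES that contain at least one digit.
;; regexp-exec returns a match object on success and #f on failure.
(define (count-lines-with-digits lines)
  (let loop ((rest lines) (n 0))
    (cond ((null? rest) n)
          ((regexp-exec digits-rx (car rest))
           (loop (cdr rest) (+ n 1)))
          (else (loop (cdr rest) n)))))

(count-lines-with-digits '("abc" "a1" "22" ""))   ; => 2
```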

Trying not to be a "lazy bastard," I poked about in the source and figured
out how read-hash-extend works well enough to throw this together, and
made a few notes.  

Then while trying to write a little documentation, I discovered #., as in:
	#.(make-regexp "foo")
Is this standard scheme or a guile extension?

Anyway, maybe these ramblings still have some tutorial use:


(use-modules (ice-9 regex))

; Use the read-hash-extend facility to add a syntax for constant
; regular expressions that are to be compiled once when read in,
; instead of during the normal flow of execution.   This can let loops
; that repeatedly use a constant regexp be optimized without moving the
; expression's definition far away from its use.
;
; With this hash-extension, these two expressions behave identically:
;
; (let ((r (make-regexp "de+"))) (regexp-exec r "abcdeeef"))
; (regexp-exec #+"de+" "abcdeeef")
;
;
(read-hash-extend #\+
  (lambda (c port)
    ; c is the dispatch character (#\+); read the datum that follows it.
    (let ((s (read port)))
      (if (string? s)
          (make-regexp s)
          (error "syntax error; #+<string> expected")))))
;
; (very poorly written) general notes on read-hash-extend in lieu of real
; documentation.
;
; The read-hash-extend procedure takes two arguments, a character and a
; procedure.   The procedure is stored in a hash table keyed on the
; character.
; Later, when guile's reader encounters a token beginning with '#' 
; followed by a character that it doesn't otherwise recognize, it calls
; the hash-extend procedure associated with the character with two 
; arguments, the character and the reader's current input port.
; The procedure should call (read port) to consume guile tokens as
; necessary to implement its new syntax.  The procedure should return a
; single guile object which will be the value of the new #-syntax.
;
; Among the characters NOT available for use with read-hash-extend because
; they are reserved for other guile/RnRS syntax are:
;  most alphabetic characters
;  *{\!(&'.
; 
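
One way to watch the mechanism described in the notes above is to drive
the reader by hand on a string port.  This is only a sketch: it repeats
the #+ handler so that it stands alone, and it assumes
call-with-input-string and match:substring (the latter from
(ice-9 regex)) are available.  The names rx and m are just for
illustration.

```scheme
(use-modules (ice-9 regex))   ; for match:substring

;; Same handler as in the message: #+"..." reads as a compiled regexp.
(read-hash-extend #\+
  (lambda (c port)
    (let ((s (read port)))
      (if (string? s)
          (make-regexp s)
          (error "syntax error; #+<string> expected")))))

;; Hand the reader the text  #+"de+"  and see what object comes back.
;; The handler is called with #\+ and the port, reads the string "de+",
;; and its return value becomes the value of the whole #+ token.
(define rx (call-with-input-string "#+\"de+\"" read))

(regexp? rx)                           ; => #t

(define m (regexp-exec rx "abcdeeef"))
(match:substring m)                    ; => "deee"
```

Reading from a string port keeps the example independent of when the
surrounding file itself is read, which matters because the handler must
be installed before the reader meets the first #+ token.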





