This is the mail archive of the guile@cygnus.com mailing list for the guile project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Scheme style auto-resizing hashtable (fwd)


On Fri, 6 Nov 1998, Maciej Stachowiak wrote:

> 
> jglascoe@jay.giss.nasa.gov writes:
> > 
> * `ref' is short for `reference' or `dereference'. 

which means a lot of different things to a lot of different people.  To a
Perl hacker it means this:

$hash_ref = \%hash;  # reference
%hash = %$hash_ref;  # dereference

> * `vector-set!' is entirely different from plain old `set!' because it
> modifies a vector, not a variable, and I don't see why this is
> confusing at all. You may as well ask "a[x] = y;" is completely
> different from "a = y;" in C.

*(a + x) = y;
*a = y;

is consistent.

> * It's not clear to me why your criticism of "delete!" does not apply
> equally to "remove!". However, in general the reason it is obvious it
> does not mean "delete the whole data structure" is because there are
> no such operations in Scheme.

$ python
Python 1.5.1 (#1, Oct  2 1998, 23:15:39) [C] on aix4
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> d = {"one": 1, "two": 2}
>>> d
{'one': 1, 'two': 2}
>>> del d["one"]
>>> d
{'two': 2}
>>> del d
>>> d
Traceback (innermost last):
  File "<stdin>", line 1, in ?
NameError: d

<d has been deleted>

> The reason these are preferable to insert!/lookup/remove! is because
> they are parallel with the operations on Scheme basic types, as
> defined in R5RS, a standards document that is a lot harder to change
> than random data structure implementations.

point taken.

> For that matter, one could make up similar arguments against
> insert!/lookup/remove!, e.g.: 
> 
> * insert! implies adding a new element, but it might in fact be
> changing an existing element. set! expresses this much better

okay.

> Really, the choices between these are close to arbitrary, hence
> consistency with existing practice should be the deciding factor.

well, let's wait and see what other collection data structure authors
decide to use.  I'd like to be consistent with their naming conventions.

> > > Also there
> > > should probably be a `dictionary-add!' which inserts a given key-value
> > > cons cell, guaranteeing to share the memory.
> > 
> IMO, it makes a lot more sense to make adding a single key-value pair
> a primitive rather than to make adding a whole list of them a
> primitive.

right.  I agree.

> > *sigh* okay, you're the second person to criticize my spelling  ;)
> > I said "foreach" was good because, if we have a MOP OO system, then it
> > wouldn't collide with "for-each".  But, heck, why not invite Scheme's
> > basic list type to take part in the MOP?
> > 
> > > Although this might seem nice as a convenience,
> > > 
> > > (dictionary-insert-alist! my-dict my-assoc)
> > > 
> > > is barely shorter than the (IMO more clear)
> > > 
> > > (for-each (lambda (x) (dictionary-add! my-dict x)) my-assoc)
> > > 
> > > > 	dictionary-consume-alist!  -- delete alist while
> > > > 				   -- inserting pairs (to conserve memory)
> > > 
> > > I don't understand how this can possibly work. You can't "delete" an
> > > object in Scheme per se, you can only remove all references to it, at
> > > which point it gets GC'd. 
> > 
> > By doing set-c[ad]r!'s, I can reduce any alist down to (<last pair>).  I
> > then do (set-car! list '()) and (set-cdr! list '()) to get it down to
> > ((())).  The dictionary is left with sole ownership of the all the pairs
> > (unless somebody else is holding on to one of the pairs).
> > 
> 
> As I said in another message, dropping all references to the alist
> (e.g. set!'ing the variable that refers to it to '(), or better yet
> not storing it in a lasting variable at all) achieves the same effect
> much more efficiently.
> 
> 
> > If you do
> > 
> > (dictionary-consume-alist! dictionary big-huge-alist)
> > 
> > then, as "big-huge-alist" is deleted and "dictionary" is filled, the
> > dictionary will use the memory no longer used by the alist. 
> > 
> 
> It can only possibly reuse the cons cells that represent the
> associations, not the ones that form the list structure, which will
> become garbage. Sharing the association cells does not require
> destroying the alist.

wrong.  It'll grab the memory used by the cons cells making up the list.

(cons a (cons b (cons c '())))  <->  (a b c)

After (set-cdr! list (cdr list))

(cons a (cons c '()))           <->  (a c)

the (cons b *bleh*) cons cell is free for GC.  Guile may not GC it right
away, but if the list is very big, then it will before the whole
"consume-list!" operation is done and "dictionary" will gladly use the
memory. 

> > > `dictionary-stats' is gratuitous enough as it is that there don't need
> > > to be two of them :-)

it really is, big-nose  :)

> > yes, but someone with a "dictionaryx" may be seriously interested in
> > knowing how well their hasher function is working.

desperate argument.

> 
> Then make dictionary-stats do what dictionary-more-stats would do.
> 

okay.  I imagine "more-stats" will be an O(n) procedure (with a big old
constant attached), but it won't be called that often.  Also, any
"dictionary/hash-table" documentation should probably tuck the description
of this thing away with a in-depth description of "dictionaryx"'s. 


>  - Maciej
> 

	Jay Glascoe
	jglascoe@jay.giss.nasa.gov