[socal-piggies] Treaps vs heaps for best m of n problems

Darius Bacon darius at accesscom.com
Mon Jan 18 18:42:20 PST 2010


> From: Dan Stromberg <strombrg at gmail.com>
>    2. For the other application (not yet on the web) named gprog, I
>       believe the sometimes useful trick of storing a minheap of the m
>       largest values wouldn't work because you often need to evict old,
>       weak values and also often need to get a copy of the strongest
>       value - that is, you need both the least and the greatest to
>       obtain a safe blocksize choice and a new blocksize addition). 
>       gprog is a sort of bandwidth meter for pipes - a GUI one that uses
>       a cache oblivious algorithm to avoid becoming a bottleneck... So
>       it forms lots of guesses as to good transfer size and keeps around
>       the best m guesses out of n (where n is the number of block size
>       guesses tried) for later   It also benefits from being able to
>       detect collisions between the randomly generated current transfer
>       size, and a previously-guessed transfer size.

Ah, so I was missing context -- sorry. It's neat how the treap can
handle this well while still relatively simple to code.

I wonder how it compares to a dedicated double-ended-priority-queue
structure like
http://www.diku.dk/forskning/performance-engineering/Jesper/heaplab/heapsurvey_html/node11.html
but you might have covered that too and I really should have come to
the meeting! I get the impression it'd use less space by omitting the
random priorities, and run updates in comparable time, but
deterministically and with O(1) for min and max; OTOH it's probably
not worth the bother coding up when a balanced-tree structure like
treaps has more uses -- like detecting collisions in this case.

> I've today added a third mode to "highest" that uses heapq (not yet on 
> the web); a pair of performance graphs should complete within a few 
> days.  /Initial/ indications are that, at least for varied n, heapq 
> yields an implementation that's in between the insort and treap 
> implementations in performance, though closer to the insort times than 
> the treap times.

Interesting.

Darius




More information about the socal-piggies mailing list