[socal-piggies] Treaps vs heaps for best m of n problems

Dan Stromberg strombrg at gmail.com
Sat Jan 16 10:56:34 PST 2010


It was suggested that a heap or priority queue might work well for what 
I was using treaps for.

Reminder: n is the number of files examined, m is the number of the 
largest files you're interested in, and m is often much smaller than n.

IINM, a heap would allow a nice O(n*log(n)) algorithm, but wouldn't work 
that well for an O(n*log(m)) - because I believe that finding and 
evicting the smallest value in a max heap is O(n).  In "highest", the 
smallest value is evicted after an insertion, to keep the datastructure 
from becoming larger than it needs to be - we can do this, because we 
know we only care about the 1000 (or whatever) largest values.

AFAIK, a priority queue is an abstract data type that can be implemented 
in a variety of ways behind the scenes, including using a heap or treap.

It's a good question/suggestion.





More information about the socal-piggies mailing list