[socal-piggies] Treaps vs heaps for best m of n problems
Dan Stromberg
strombrg at gmail.com
Sat Jan 16 10:56:34 PST 2010
It was suggested that a heap or priority queue might work well for what
I was using treaps for.
Reminder: n is the number of files examined, m is the number of the
largest files you're interested in, and m is often much smaller than n.
IINM, a heap would allow a nice O(n*log(n)) algorithm, but wouldn't work
that well for an O(n*log(m)) - because I believe that finding and
evicting the smallest value in a max heap is O(n). In "highest", the
smallest value is evicted after an insertion, to keep the datastructure
from becoming larger than it needs to be - we can do this, because we
know we only care about the 1000 (or whatever) largest values.
AFAIK, a priority queue is an abstract data type that can be implemented
in a variety of ways behind the scenes, including using a heap or treap.
It's a good question/suggestion.
More information about the socal-piggies
mailing list