Last week, I read a blog post showing that, to find the unique elements in an array, sorting the array is faster than building a hash table. At first glance I thought, sure, random memory access is expensive. But when I sat down and wrote out the simple formula for n log n with 1,000,000 elements, I couldn't believe that random memory access could be 20 times slower than sequential access.
So I remeasured random memory access to exercise the cache-miss penalty: it is only 2 or 3 times slower than sequential access. Why, then, is building a hash table so slow?
I redid the experiment as follows, with each element 8 bytes:
1. Sorting the 1,000,000 elements takes K time.
2. On the same dataset, building a hash table with std::unordered_map takes 2K time.
3. After building the hash table, a complete lookup pass takes 3/4 K time.
4. I wrote my own hash table, similar to google dense hash, with % as the hash function; building it also takes 3/4 K time.
5. I optimized my hash function, replacing % with the bit operation &; now building takes 1/2 K time.
So the poor hash table performance that the post claimed is a combination of a poor implementation, a poor choice of hash function, and the random-memory-access effect.
std::unordered_set really does have poor performance, at least for 8-byte objects.
Next step: I will test unordered_map performance with real (key, val) pairs where val is large.
Friday, May 26, 2017