Monday, May 29, 2017

Visting Pigeon Point Light House

Piaw missed the ocean view Jacuzzi at the hostel. Do we planned a trip to visit there.  There are a few routes to get there. The one I like most starting from the intersection of 35(skyline blvd) and 84(La Honda Rd), following La Honda Rd to the south, then change to Pescadero Creek Rd.

A lot of motorcyclists and cyclists start from the intersection. It is an unbeatable drive through the redwood forest. It is on my list to do a bike tour there soon!

There is a little farm called Harley Farms Goat Diary along Pescadero. It is a sweet stop to rest and let kids play around. Another nice stop at Arcangeli Grocery Co - Norm’s Market. It has the best artichoke bread. There is a wonderful picnic area behind the shop too. And the beach is not that far from there. It needs luck to have a sunny clear sky though.

Don't forget to make reservation to use the Jacuzzi if you stay overnight at the hostel at Pigeon Point Light House.

I have a previous post about visiting half noon bay. If you stay overnight at the hostel, you can go up to visit half noon bay on the second day. The related post is here:

Friday, May 26, 2017

Hash or Sort

Last week, I read a blog post that shows to find the unique elements in an array, sorting an array is faster than building a hash table.  At the first glance I thought, ya, the random memory access cost is high. But when I sat down writing down the simple formula for 1000,000 elements for nlogn, I could believe that random memory access could be 20 times slower than sequential access.

I did remeasure the random memory access to exercise the cache miss penalty, it is 2 or 3 times penalty comparing sequential memory access. So why building a hash table is so slow?

I redid the experiment with the following:
Each element is 8 bytes.
1. sorting the 1000,000 elements takes K time.
2. the same dataset, building a hash table takes 2K time with std::unordered_map.
3. after building the hash table, do a complete lookup, it takes 3/4K time.
4. wrote my own hashtable similar to google dense hash, with % as the hash function, it takes 3/4K time too.
5. optimized my hashtable function, replace % with bit operation &, it takes 1/2K time.

So the poor performance of the hash claimed is the combination of poor implementation, poor hash function selection, and random memory access effect.

The std::unordered_set really has poor performance at least for 8 byte objects.

Next step, I will test the unordered_map performance with real (key, val) pairs with big val.