Friday, February 22, 2013

Hadoop! 8-node cluster

I've been following this blog:

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

I'm using Mac Minis to build a Hadoop cluster.  Big Data, and all that.

After figuring out how to image the Ubuntu Server install from one Mac Mini to many, I've got a 9-node cluster running tip-top.

The WordCount example job takes ~22 minutes to process a 20GB file.  That's a nearly linear improvement over the 2-node and 4-node clusters.
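
For a rough sense of what that means in throughput, here's the back-of-the-envelope arithmetic as a tiny Python sketch. It assumes 1 GB = 1024 MB and takes the ~22 minutes at face value, so treat the result as a ballpark figure for the whole cluster, not a benchmark.

```python
# Rough aggregate throughput for the WordCount run described above.
# Assumptions: 1 GB = 1024 MB, and the run took exactly 22 minutes.
size_gb = 20
minutes = 22

total_mb = size_gb * 1024
total_seconds = minutes * 60
mb_per_s = total_mb / total_seconds

print(f"{mb_per_s:.1f} MB/s across the whole cluster")
```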

The recent breakthrough with the multi-node (more than 2 nodes) cluster is that the /etc/hosts file on every node needs to have ALL the nodes in it... not just the master and the node itself.
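
Here's a minimal sketch of one way to generate those entries so that every node gets the same complete list. The hostnames (master, slave1 ... slave8) and the 192.168.0.x addresses are made up for illustration; substitute whatever your own machines use. It only prints the text, so you would still need to put it into /etc/hosts on each node yourself.

```python
def build_hosts_block(nodes):
    """Return /etc/hosts text with one line for every node in the cluster."""
    lines = ["127.0.0.1\tlocalhost"]
    for hostname, ip in nodes.items():
        lines.append(f"{ip}\t{hostname}")
    return "\n".join(lines) + "\n"


# Hypothetical names and addresses: one master plus eight workers.
nodes = {"master": "192.168.0.1"}
for i in range(1, 9):
    nodes[f"slave{i}"] = f"192.168.0.{i + 1}"

# The same block goes on EVERY node, not just the master.
hosts_block = build_hosts_block(nodes)
print(hosts_block)
```

The point is that the list is built once and copied everywhere unchanged, so no node ends up knowing only about the master and itself.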

Once I did that, the instructions work and life is grand :)
Thanks to Michael Noll for putting this together AND maintaining it for those of us making our first inroads into Hadoop and Big Data.

I'm off to build an 81-node cluster, and start pushing some real data through it.
