One of the first Cassandra tickets I worked on had me reviewing some code that visualized the node ring. Properly testing the code required that I run a cluster.
But I didn't have access to a cluster. Neither did I feel like creating a virtual cluster by building a VM and cloning it several times. What I wanted was to run several instances of Cassandra on a single machine with multiple interfaces, all pointed at the same compiled code (without multiple svn checkouts).
The Cassandra wiki explains how to tweak Cassandra settings by editing cassandra.in.sh, but doesn't explain what needs to be done to run concurrent instances.
It turned out not to be too difficult. Still, I figured it might be daunting enough to Cassandra noobs (of whom we're seeing more lately, thanks to some great exposure) that a blog post might be helpful.
This tutorial assumes that you'll want to run multiple instances of Cassandra on code built by ant and not a standalone jar. I am also assuming that you are a) just playing around, or b) intend to do some development. This is not a tutorial explaining how Cassandra should be run in production.
Note: I apologize for the way this looks. Blogger is not a friend of ordered lists.
- Make sure you've got aliases to localhost (e.g.: 127.0.0.2, 127.0.0.3, etc.). Mac OS X doesn't have this enabled by default, so you'll have to manually create aliases:
sudo ifconfig lo0 alias 127.0.0.2 up
sudo ifconfig lo0 alias 127.0.0.3 up
- Decide where you're going to keep things. You can keep them with your code, but that just isn't neat. Pick a directory somewhere and call it $cass_stuff.
- Then, for each node in your little cluster, do this:
- From your svn checkout, copy the conf directory into $cass_stuff. You can rename it to something like conf0 (or conf1, etc.). I'll refer to this directory as $conf from here on out.
- Copy bin/cassandra.in.sh to $cass_stuff. Give it a name that helps you associate it with the conf directory you just created (node0.in.sh or whatever).
- Open node0.in.sh in an editor and make the following changes:
- Hardcode cassandra_home to the location of your trunk. This will give you the flexibility to run Cassandra from anywhere.
- Set CASSANDRA_CONF to the conf directory you just created.
- In JVM_OPTS, change the jdwp address= setting. The default is 8888, but you should include the unique IP you chose for this node along with the port, e.g.: 127.0.0.2:8888. Not specifying a host causes the debugger to bind to 0.0.0.0:8888, and you'll have port binding problems when you bring up more than one node.
- Pick a unique port for com.sun.management.jmxremote.port, but make sure you have at least one node listening on 8080, since all the Cassandra tools assume JMX is listening there. Unfortunately, you can't pick the JMX host; 0.0.0.0 is assumed. I was under the impression this could be changed by specifying java.rmi.server.hostname, but had no luck going down that road. (Please leave a comment if you figure out a way for this to work, but I think it might be hopeless.)
- Open $cass_stuff/$conf/storage-conf.xml in an editor and make the following changes:
- Specify unique locations for CommitLogDirectory and DataFileDirectory. Don't bother with CalloutLocation or StagingFileDirectory.
- Replace ListenAddress with the unique IP you chose for this node (e.g.: 127.0.0.2).
- Replace RPCAddress with the same IP.
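If you'd rather not repeat the per-node steps above by hand, they can be roughly scripted. What follows is only a sketch under several assumptions of mine: make_node is a made-up helper name, the data$idx directory layout is arbitrary, the stock cassandra.in.sh has cassandra_home= and CASSANDRA_CONF= at the start of a line, and each storage-conf.xml element named above sits on a single line as <Tag>value</Tag>. Check the output against your own checkout before trusting it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cassandra):
#   make_node <trunk> <cass_stuff> <index> <ip> <jmx_port>
# Creates conf<index>/ and node<index>.in.sh under <cass_stuff>.
make_node() {
    trunk=$1; cass_stuff=$2; idx=$3; ip=$4; jmx_port=$5
    conf="$cass_stuff/conf$idx"
    data="$cass_stuff/data$idx"   # assumed layout, pick whatever you like

    # Refuse to overwrite an existing node environment.
    if [ -e "$conf" ]; then
        echo "make_node: $conf already exists" >&2
        return 1
    fi

    mkdir -p "$data/commitlog" "$data/data" || return 1
    cp -R "$trunk/conf" "$conf" || return 1

    # Copy cassandra.in.sh, hardcoding cassandra_home and CASSANDRA_CONF and
    # giving the debugger and JMX their own endpoints for this node.
    sed -e "s|^cassandra_home=.*|cassandra_home=$trunk|" \
        -e "s|^CASSANDRA_CONF=.*|CASSANDRA_CONF=$conf|" \
        -e "s|address=8888|address=$ip:8888|" \
        -e "s|jmxremote\.port=[0-9]*|jmxremote.port=$jmx_port|" \
        "$trunk/bin/cassandra.in.sh" > "$cass_stuff/node$idx.in.sh" || return 1

    # Rewrite the storage locations and addresses in this node's config.
    xml="$conf/storage-conf.xml"
    sed -e "s|<CommitLogDirectory>.*</CommitLogDirectory>|<CommitLogDirectory>$data/commitlog</CommitLogDirectory>|" \
        -e "s|<DataFileDirectory>.*</DataFileDirectory>|<DataFileDirectory>$data/data</DataFileDirectory>|" \
        -e "s|<ListenAddress>.*</ListenAddress>|<ListenAddress>$ip</ListenAddress>|" \
        -e "s|<RPCAddress>.*</RPCAddress>|<RPCAddress>$ip</RPCAddress>|" \
        "$xml" > "$xml.tmp" && mv "$xml.tmp" "$xml"
}
```

With made-up paths, a call such as make_node "$HOME/cassandra-trunk" "$HOME/cass_stuff" 1 127.0.0.2 8081 would set up the second node. The sed expressions use | as a delimiter, so paths containing that character will break them.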
One downside to this approach is that if you're tracking trunk, it is your responsibility to make sure you notice changes to the default storage-conf.xml and cassandra.in.sh and apply them to your environments.
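One low-tech way to keep an eye on that is to diff each node's copies against trunk after an svn update. This is only a sketch: show_drift is a made-up helper name, and it assumes the conf* and *.in.sh naming used in this post. Your own per-node edits will always show up in the output; anything else is a change in trunk that you probably want to carry over.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cassandra):
#   show_drift <trunk> <cass_stuff>
# Prints a unified diff of trunk's defaults against every node's copy.
show_drift() {
    trunk=$1; cass_stuff=$2

    # Compare each node's storage-conf.xml with the one in trunk.
    for conf in "$cass_stuff"/conf*; do
        [ -f "$conf/storage-conf.xml" ] || continue
        echo "== $conf/storage-conf.xml"
        diff -u "$trunk/conf/storage-conf.xml" "$conf/storage-conf.xml"
    done

    # Compare each node's include file with trunk's cassandra.in.sh.
    for inc in "$cass_stuff"/*.in.sh; do
        [ -f "$inc" ] || continue
        echo "== $inc"
        diff -u "$trunk/bin/cassandra.in.sh" "$inc"
    done

    # diff exits non-zero when files differ, which is expected here.
    return 0
}
```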