Was that a nightmare. They refused to talk to each other. It would have been easier if I had had any confidence in the VM network, but since I didn’t know if that was failing or not, there were many more variables.
The answer was simple, but “simple” does not mean “easy”. There are five ways to refer to one’s own self: DNS name, external IP, localhost, 127.0.0.1, and 0.0.0.0.
I thought I would be clever and name the VMs, so I put them in each other’s /etc/hosts file. That looks like this:
-e ZOO_SERVERS="server.1=stack1:2888:3888;2181 server.2=stack2:2888:3888;2181 server.3=stack3:2888:3888;2181"
Docker doesn’t like that. Apparently, it doesn’t use the host’s DNS, so it does not see /etc/hosts. On the bright side, the error messages were good (“no such host”). Moving on:
-e ZOO_SERVERS="server.1=126.96.36.199:2888:3888;2181 server.2=192.168.2.23:2888:3888;2181 server.3=192.168.2.25:2888:3888;2181"
No, I don’t know why they allocated out-of-order. This also fails (with “failed to bind” errors), but far more subtly. After much searching of the Intertubes, I was giving up, so I pasted the errors into my notes to pick up where I left off. Then I noticed that the connection failures from all three machines were to themselves. Ah hah! Docker doesn’t see its own machine’s external IP. That’s easy to fix:
-e ZOO_SERVERS="server.1=localhost:2888:3888;2181 server.2=192.168.2.23:2888:3888;2181 server.3=192.168.2.25:2888:3888;2181"
With appropriate adjustments to the other servers’ configurations. This also fails, but with “connection refused” errors. More searching ensued. It turns out that using “localhost” does something weird to Docker, and it gets confused about where IPs are pointing. So, let’s use 127.0.0.1. Well, no. The article said to use 0.0.0.0, which I had never heard of before. But I was giving up, remember? Why not?
-e ZOO_SERVERS="server.1=0.0.0.0:2888:3888;2181 server.2=192.168.2.23:2888:3888;2181 server.3=192.168.2.25:2888:3888;2181"
Also adjusted on a per-server basis. And Voila! (although what small violins have to do with anything, I’m not sure) Of course, that’s not all I tried. That’s just the path to success. There were many less travelled roads taken. I think that’s bad advice. For posterity, the zooup script (for the first server):
#!/bin/bash
source ~/.bashrc
title "Zookeeper: zoo1"

# create the network (should already be there with an error for this)
netup

# this could be bad, but it works for dev
docker stop zoo1

# cannot recreate with the same name (docker-compose does this on "down")
# do it on up so the logs are laying about should I want them
docker rm zoo1

# names don't resolve in the container; use IPs
# use 0.0.0.0 NOT the external IP (throws bind exceptions) or localhost (breaks the other server connections)
docker run --name zoo1 \
  --restart always \
  --net=host-net \
  -p 3888:3888 \
  -p 2181:2181 \
  -p 2888:2888 \
  -v ~/zoo/data:/data \
  -v ~/zoo/config:/config \
  -e ZOO_SERVERS="server.1=0.0.0.0:2888:3888;2181 server.2=192.168.2.23:2888:3888;2181 server.3=192.168.2.25:2888:3888;2181" \
  -e ZOO_CFG_EXTRA="electionPortBindRetry=0" \
  -e ZOO_MY_ID="1" \
  -e ZOO_STANDALONE_ENABLED="false" \
  zookeeper
electionPortBindRetry=0 tells it not to give up: the setting is the maximum number of times ZooKeeper retries binding the leader election port, and zero means “try forever”. If you don’t set it, you must bring all the other servers up within 10 seconds (5 * 2000msec) of each other. Why, yes, I had that problem, too. I don’t think ZOO_STANDALONE_ENABLED="false" matters, but I added it while trying to figure out the “connection refused” errors (maybe the other servers didn’t want to play).
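The per-server adjustment of ZOO_SERVERS is mechanical, so it can be generated instead of edited by hand on each VM. A minimal sketch, assuming the same ports as above; the function name zoo_servers and the SERVER1_IP placeholder are made up for illustration and are not part of the zookeeper image:

```shell
#!/bin/bash
# Build ZOO_SERVERS for one node: its own entry listens on 0.0.0.0,
# every other entry keeps its real address.
# Usage: zoo_servers MY_ID ADDR_1 ADDR_2 ...
zoo_servers() {
  local my_id="$1"; shift
  local id=1 addr out=""
  for addr in "$@"; do
    # The "me" entry is a listen address, not a destination.
    if [ "$id" -eq "$my_id" ]; then
      addr="0.0.0.0"
    fi
    out+="server.${id}=${addr}:2888:3888;2181 "
    id=$((id + 1))
  done
  # Trim the trailing space before printing.
  printf '%s\n' "${out% }"
}

# SERVER1_IP is a hypothetical placeholder; for node 1 it is replaced
# by 0.0.0.0 anyway.
zoo_servers 1 SERVER1_IP 192.168.2.23 192.168.2.25
```

Calling it with a different first argument on each VM yields that VM’s string, which can then be passed as -e ZOO_SERVERS="$(zoo_servers ...)".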
Update: I’ve been thinking about this. The “not me” server configuration is where to go and what port to connect to when you get there, which is where those servers are listening. The “me” server configuration looks the same, but has a different meaning: It’s where to listen for incoming connections, not where to send outgoing ones.
Docker creates a virtual network, which is inside the virtual network created by VirtualBox. The outgoing routing is handled properly via whatever routing configuration the two of them create. The listening side needs to pick an interface. localhost/127.0.0.1 is not a routable address; it represents the loopback (virtual) NIC. This is the “inside Docker” NIC. We don’t want to listen there because the other servers are not connecting from there. We want to listen on the “Docker/VM” NIC, but there is no way of specifying that, because it does not have a stable IP. 0.0.0.0 works because it means “any/all NICs”. And that’s exactly what causes the problem when trying to run multiple ZooKeeper cluster nodes on one VM: They cross-talk because of that “any” interface. If one could specify which one, it would be fine.
One would think a unique set of ports could fix that, but the configuration is faulty. One can uniquely map the Docker/VM port using the -p option. However, the “me” configuration does not override the ports; it just breaks things. The configuration file allows overriding the 2888 and 2181 ports, but not the 3888 port.
This is why a full cluster can run on one VM via docker-compose: Using the container name as the network address lets Docker assign the correct listening NIC.
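The same idea can be sketched without docker-compose: on a user-defined Docker network, containers can reach each other by container name. The script below only prints the commands rather than running them. It is an untested sketch: the network name zoo-net, the container names zooN, and the function name single_host_cmds are all hypothetical.

```shell
#!/bin/bash
# Print (not run) the docker commands for an N-node ensemble on one host,
# addressing every node by container name instead of by IP.
single_host_cmds() {
  local n="$1" i servers=""
  for ((i = 1; i <= n; i++)); do
    servers+="server.${i}=zoo${i}:2888:3888;2181 "
  done
  servers="${servers% }"
  echo "docker network create zoo-net"
  for ((i = 1; i <= n; i++)); do
    echo "docker run -d --name zoo${i} --net zoo-net -e ZOO_MY_ID=${i} -e ZOO_SERVERS=\"${servers}\" zookeeper"
  done
}

single_host_cmds 3
```

Every node gets the identical ZOO_SERVERS string here, so no per-server “me” entry needs adjusting.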