Scalability and Daughterships

A daughtership process reads its configuration from a mothership process and then acts as that mothership would. This can help in some configurations, such as superclusters composed of subclusters whose internal networks are relatively fast but which are connected to one another across relatively slow networks. If each subcluster has a single dedicated daughtership process, only the daughtership needs to connect to the mothership to collect data; the other processes in the subcluster can get their connection information from the daughtership.
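To put rough numbers on the subcluster case, the following minimal sketch counts how many processes must reach the mothership across the slow network. It assumes the mothership sits outside every subcluster and that each process makes one configuration connection; the function name and the example sizes are hypothetical.

```python
def slow_link_connections(subclusters, processes_per_subcluster, use_daughterships):
    """Count configuration connections that must cross the slow network.

    Assumption: the mothership lives outside every subcluster.
    """
    if use_daughterships:
        # Only the one dedicated daughtership per subcluster talks to the mothership.
        return subclusters
    # Without daughterships, every process contacts the mothership directly.
    return subclusters * processes_per_subcluster


# Hypothetical supercluster: 4 subclusters of 16 processes each.
direct = slow_link_connections(4, 16, use_daughterships=False)
relayed = slow_link_connections(4, 16, use_daughterships=True)
print(direct, relayed)
```

Under these assumptions the slow network carries one connection per subcluster instead of one per process.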

Daughterships can be arranged in something of a matriarchal hierarchy, with one grandmothership having multiple daughterships, and each of those daughterships also having daughterships of its own. This might provide startup performance advantages for very-large-scale clusters. (Note that only one grandmothership can exist in any configuration; there must be one and only one mothership at the top of the hierarchy.)
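The single-root rule can be checked mechanically. The sketch below is only an illustration: it models a hierarchy as a plain dictionary mapping each ship to the ship it reads its configuration from, and the ship names and helper functions are hypothetical rather than part of any mothership script.

```python
def find_grandmothership(parents):
    """Return the one top-level mothership, or raise ValueError.

    `parents` maps each ship name to the ship it reads its configuration
    from; the top-level mothership maps to None.
    """
    roots = [ship for ship, parent in parents.items() if parent is None]
    if len(roots) != 1:
        raise ValueError(
            f"expected exactly one top-level mothership, found {len(roots)}")
    for ship, parent in parents.items():
        if parent is not None and parent not in parents:
            raise ValueError(f"{ship} points at unknown ship {parent}")
    return roots[0]


def depth(parents, ship):
    """Number of hops from `ship` up to the top-level mothership."""
    hops = 0
    while parents[ship] is not None:
        ship = parents[ship]
        hops += 1
        if hops > len(parents):
            raise ValueError("cycle detected in hierarchy")
    return hops


# Hypothetical three-level hierarchy.
hierarchy = {
    "grandmother": None,
    "daughter-a": "grandmother",
    "daughter-b": "grandmother",
    "granddaughter-a1": "daughter-a",
}
print(find_grandmothership(hierarchy))
print(depth(hierarchy, "granddaughter-a1"))
```

A configuration with two entries mapped to None would be rejected, which mirrors the rule that only one mothership may sit at the top.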

Running a Daughtership Process

To create a daughtership process, set appropriate environment variables, then run the daughtership.py Python script:

$ export CRMOTHERSHIP=mothermachine:port
$ export CRDAUGHTERSHIP=localhost:port
$ python daughtership.py

The daughtership will connect to the specified mothership, and will request and read all of the configuration information from that mothership. The mothership will, in turn, register the existence of the daughtership (so that dynamic configuration information will be shared with the daughtership).

The daughtership will then start listening for connections just as a mothership would. Other processes (say, a crserver) can be told to connect to the daughtership (by setting the CRMOTHERSHIP environment variable to point at the daughtership, rather than the mothership) and can operate (with a few limitations) just as though they were connected to the mothership.
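The difference between the two kinds of process is only in where their environment variables point. As a minimal sketch, the helper below builds both environments; the helper name is hypothetical, and the port numbers 10000 and 10001 are placeholders chosen for illustration, not values taken from this document.

```python
import os


def ship_environment(mothership, daughtership=None, base=None):
    """Return a copy of the environment with the ship variables set.

    `mothership` is the host:port the process should contact.
    `daughtership`, if given, is the host:port a daughtership listens on.
    """
    env = dict(os.environ if base is None else base)
    env["CRMOTHERSHIP"] = mothership
    if daughtership is not None:
        env["CRDAUGHTERSHIP"] = daughtership
    return env


# Placeholder addresses; substitute the real hosts and ports.
# The daughtership contacts the real mothership and listens locally.
daughter_env = ship_environment("mothermachine:10000", "localhost:10001", base={})
# A client process (say, a crserver) is pointed at the daughtership instead.
client_env = ship_environment("localhost:10001", base={})

print(daughter_env)
print(client_env)
```

An environment built this way (with the default base, so that PATH and the like are inherited) could be passed as the env argument of subprocess.Popen when launching python daughtership.py or a client process.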

Daughtership Limitations

Dynamic host matching does work with daughterships, but performance suffers. When a daughtership needs to resolve a dynamic host, it must ask its mothership to resolve the host on its behalf. If that mothership is itself a daughtership, it passes the request on to its own mothership, and so on until the grandmothership is reached.

Once the grandmothership resolves the dynamic host, it sends the information to all of its daughterships, which register the information and pass it on to their own daughterships, and so on down the hierarchy.

So each dynamic host matched at a daughtership requires at least one round trip to its mothership, and more if the hierarchy is deeper. For optimum performance, all dynamic hosts should connect directly to the grandmothership; daughterships should be reserved for static host configurations or subconfigurations.

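The cost of resolving a dynamic host through the hierarchy can be illustrated with a toy model. This is not the mothership's actual code: the Ship class, the message counting, and the host names are all assumptions made for the sketch, which only mimics the forward-up, broadcast-down pattern described above.

```python
class Ship:
    """Toy mothership or daughtership that counts ship-to-ship messages."""

    def __init__(self, name, mother=None):
        self.name = name
        self.mother = mother
        self.daughters = []
        self.resolved = {}  # dynamic host name -> real host name
        if mother is not None:
            mother.daughters.append(self)

    def resolve(self, dynamic_name, real_host):
        """Resolve a dynamic host; return the number of messages exchanged."""
        if self.mother is not None:
            # A daughtership cannot decide locally: forward upward (one message).
            return 1 + self.mother.resolve(dynamic_name, real_host)
        # Top of the hierarchy: record the result and push it down the tree.
        return self._broadcast(dynamic_name, real_host)

    def _broadcast(self, dynamic_name, real_host):
        self.resolved[dynamic_name] = real_host
        messages = 0
        for daughter in self.daughters:
            # One message to each daughter, plus whatever she sends onward.
            messages += 1 + daughter._broadcast(dynamic_name, real_host)
        return messages


# Hypothetical hierarchy: grandmother -> (a, b), a -> a1.
grand = Ship("grandmother")
a = Ship("daughter-a", grand)
b = Ship("daughter-b", grand)
a1 = Ship("granddaughter-a1", a)

via_leaf = a1.resolve("dynamic-host-1", "node17")
via_root = grand.resolve("dynamic-host-2", "node18")
print(via_leaf, via_root)
```

In this model a host matched at the deepest daughtership costs two extra upward messages compared with one matched at the grandmothership, and the downward broadcast is paid in either case.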
Processes intended to communicate with each other must both contact the same daughtership or mothership. Each mothership and daughtership is capable of brokering connections, so any one of them can manage a connection as long as both the server and its associated client connect to it.

However, information on pending connections is not shared between daughterships and motherships. (If it were, every connection request and every connection match would require considerable communication between motherships and daughterships, defeating any performance improvement gained from the mothership/daughtership hierarchy.)

So when a particular client and a particular server need to connect with each other, each first connects to a mothership (or daughtership), which brokers the connection. If they connect to the same mothership or daughtership, everything works. But if they connect to different motherships or daughterships, both of them will hang, each waiting for the other to connect (which will never happen).
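The hang follows directly from each ship keeping its own private list of pending connections. The toy broker below is a sketch of that behavior only; the Broker class, its request method, and the connection identifiers are hypothetical and are not the mothership's real interface.

```python
class Broker:
    """Toy connection broker; pending requests are never shared between ships."""

    def __init__(self, name):
        self.name = name
        self.pending = {}  # connection id -> role ("client" or "server") waiting

    def request(self, conn_id, role):
        """Return True if the peer was already waiting here; else park the request."""
        waiting = self.pending.get(conn_id)
        if waiting is not None and waiting != role:
            # The other side is already waiting at this ship: match them.
            del self.pending[conn_id]
            return True
        # Nobody to match with yet: wait here for the peer.
        self.pending[conn_id] = role
        return False


mother = Broker("mothership")
daughter = Broker("daughtership")

# Both sides contact the same ship: the second request completes the match.
mother.request("conn-1", "server")
same_ship_matched = mother.request("conn-1", "client")

# The sides contact different ships: each request is parked and never matched.
mother.request("conn-2", "server")
split_matched = daughter.request("conn-2", "client")

print(same_ship_matched, split_matched)
print(mother.pending, daughter.pending)
```

In the split case both ships are left holding a pending entry for the same connection, which corresponds to the client and the server each waiting indefinitely.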