If I were to design a huge distributed system whose throughput should scale linearly with the number of subscribers and the number of channels in the system, which would be better?
1) Redis Cluster (only in Redis 3.0 alpha). In cluster mode you can publish on one node and subscribe on a completely different node, and the messages will propagate and reach you. The complexity of PUBLISH is O(N+M), where N is the number of clients subscribed to the receiving channel and M is the total number of subscribed patterns, but how does it scale in a Redis Cluster? I accept educated guesses on this.
2) ZeroMQ. Since 3.x it filters on the publisher (server) side, so it also has some time complexity there, but I have not seen anything about it in the documentation. To scale it, I could just have lots of servers publishing to whatever channels, and each subscriber would connect to all the servers and subscribe to the desired channel. That seems nice.
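To show how I read the O(N+M) figure from option 1, here is a toy Python model. It is only a sketch, not how Redis is implemented: the `ToyBroker` class and its method names are hypothetical, and `fnmatchcase` stands in for Redis glob-style pattern matching. Publishing touches each of the N subscribers of the target channel once, then tests each of the M patterns once.

```python
from fnmatch import fnmatchcase


class ToyBroker:
    """Hypothetical single-node pub/sub model used to count work per publish."""

    def __init__(self):
        self.channels = {}  # channel name -> set of client ids
        self.patterns = []  # list of (glob pattern, client id)

    def subscribe(self, client, channel):
        self.channels.setdefault(channel, set()).add(client)

    def psubscribe(self, client, pattern):
        self.patterns.append((pattern, client))

    def publish(self, channel, message):
        """Return (deliveries, steps); steps counts subscribers and patterns visited."""
        deliveries = []
        steps = 0
        # N: every client subscribed to this exact channel gets one copy.
        for client in self.channels.get(channel, ()):
            steps += 1
            deliveries.append((client, message))
        # M: every pattern in the system is tested, whether it matches or not.
        for pattern, client in self.patterns:
            steps += 1
            if fnmatchcase(channel, pattern):
                deliveries.append((client, message))
        return deliveries, steps


broker = ToyBroker()
for client in ("a", "b", "c"):
    broker.subscribe(client, "news")
broker.subscribe("f", "sports")
broker.psubscribe("d", "news*")
broker.psubscribe("e", "sp*")

deliveries, steps = broker.publish("news", "hi")
print(len(deliveries), steps)  # prints "4 5": N=3 subscribers plus M=2 patterns
```

Note that subscribers of other channels (client "f" here) add no work in this model, while every pattern does, even one that does not match.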
So which of those is better for horizontal scaling of a huge publisher system? What other solutions should I look into? Remember, I want to minimize latency and maximize throughput, while still being able to scale horizontally.
You want to minimize latency, I guess. The number of channels is irrelevant. The key factors are, roughly: the number of publishers and subscribers, the message size, the number of messages per second per publisher, and the number of messages received by each subscriber. ZeroMQ can do several million small messages per second from one node to another; your bottleneck will be the network long before it's the software. Most high-volume pub/sub architectures therefore use something like PGM multicast, which ZeroMQ supports.
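To make the network-bottleneck point concrete, here is a back-of-envelope sketch in Python. The figures are made-up assumptions (a 1 Gbit/s link, 100-byte messages, 1,000 subscribers), the function name is hypothetical, and protocol overhead is ignored. With unicast the publisher's link carries one copy per subscriber; with multicast it carries one copy in total.

```python
def max_messages_per_second(link_bits_per_second, message_bytes, subscribers, multicast=False):
    """Upper bound on publish rate imposed by the publisher's outgoing link."""
    # Unicast fan-out sends one copy per subscriber; multicast sends a single copy.
    copies = 1 if multicast else subscribers
    bits_per_message = message_bytes * 8 * copies
    return link_bits_per_second // bits_per_message


LINK = 1_000_000_000  # assumed 1 Gbit/s link
SIZE = 100            # assumed message size in bytes
SUBS = 1_000          # assumed number of subscribers

print(max_messages_per_second(LINK, SIZE, SUBS))                  # 1250
print(max_messages_per_second(LINK, SIZE, SUBS, multicast=True))  # 1250000
```

Under these assumptions the link caps a unicast publisher at about a thousand messages per second, far below what the software can push, which is why multicast is attractive at high fan-out.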