Cluster discovery with multiple Apache Kafka clusters

When you have multiple Kafka clusters for load balancing or failover purposes, a client can find a suitable cluster either using a DNS server or using a load balancer.

A common architectural pattern for Kafka is to have redundant Kafka clusters across multiple data centers so that applications can either load balance or fail over to the remote data center in case the local one becomes unavailable. Finding a suitable cluster by the client is called cluster discovery. If you connect to a cluster, you do not need all the brokers, one broker is enough. (However, to avoid a single point of failure, it is better to have multiple brokers.) This means that your discovery service should provide at least one active broker. The client gets metadata of all the cluster members after connecting to the broker it got from the service/mechanism. Getting brokers from the discovery service is only useful for the bootstrap phase of the connection. The client must connect to all the brokers directly to operate properly.

The two options for cluster discovery in case of multiple Apache Kafka clusters are using DNS records or load balancers. Cluster discovery using DNS records is the simpler solution.