What Are Node Groups in MySQL Cluster?
A node group is a set of data nodes that together are in charge of several fragments of data in MySQL Cluster. The size of each node group is determined by the NoOfReplicas setting in config.ini. So if NoOfReplicas = 3, each node group has 3 data nodes in it.
Data is stored in pieces called fragments. Each data node has a single fragment of which it is the master. That fragment is replicated synchronously to all of the other nodes in the node group. If a node fails, another node in the same node group automatically assumes the master role for that fragment.
Node groups are assigned in ascending order of the node ID that each data node is given. For example, if you have NoOfReplicas = 2, then the nodes with IDs 1 and 2 would be in one group, the nodes with IDs 3 and 4 would be in another, and so on for all of the nodes.
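The ascending-ID rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not code taken from MySQL Cluster; the function name assign_node_groups is hypothetical, and the sketch assumes the number of data nodes is an exact multiple of NoOfReplicas.

```python
def assign_node_groups(node_ids, no_of_replicas):
    """Map each data node ID to a node group number, lowest IDs first.

    Hypothetical helper: it mirrors the ascending-ID rule described in
    the text and is not part of any MySQL Cluster tool.
    """
    ordered = sorted(node_ids)
    # Assumption of this sketch: every node group must be completely filled.
    if len(ordered) % no_of_replicas != 0:
        raise ValueError("number of data nodes must be a multiple of NoOfReplicas")
    # Every consecutive run of no_of_replicas IDs forms one node group.
    return {node_id: index // no_of_replicas
            for index, node_id in enumerate(ordered)}


groups = assign_node_groups([1, 2, 3, 4], no_of_replicas=2)
print(groups)  # {1: 0, 2: 0, 3: 1, 4: 1}
```

Only the relative order of the IDs matters here, not their actual values, so gaps between IDs do not change the grouping.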
You can see which node is in which node group in the output of the show command from within the ndb_mgm client.
ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)] 4 node(s)
id=5 @<ip1> (Version: 4.1.8, Nodegroup: 0)
id=10@<ip2> (Version: 4.1.8, Nodegroup: 0, Master)
id=20@<ip3> (Version: 4.1.8, Nodegroup: 1)
id=30@<ip4> (Version: 4.1.8, Nodegroup: 1)
...
Notice the “Nodegroup” value for each data node in the example above. Nodes 5 and 10 are in the same node group, whereas nodes 20 and 30 are in a different one. Nodes 5 and 10 were grouped together because they have the two lowest node IDs, and nodes 20 and 30 because they have the next two lowest.
Which node group a node belongs to becomes very important when a node fails. For example, if node 5 failed in the above setup, node 10 would automatically take over as the master of the fragment that node 5 had. For the cluster to survive an additional failure, that failure needs to come from a different node group. So node 20 or 30 could fail and the cluster could still continue running (assuming the arbitrator decides in favor of the surviving nodes). However, if node 10 failed, then the entire node group would have failed and the cluster would shut down.
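The failure scenarios above can be checked with a small Python sketch. It encodes only the rule that the cluster needs at least one live node in every node group; it deliberately ignores arbitration, and the function name cluster_survives is hypothetical.

```python
def cluster_survives(node_groups, failed_nodes):
    """Return True if every node group still has at least one live node.

    node_groups maps a node group number to the set of data node IDs in it.
    Hypothetical helper; arbitration is not modeled here.
    """
    failed = set(failed_nodes)
    # A node group is lost once all of its members are in the failed set.
    return all(members - failed for members in node_groups.values())


# Node groups taken from the show output in the text.
node_groups = {0: {5, 10}, 1: {20, 30}}

print(cluster_survives(node_groups, [5]))      # True
print(cluster_survives(node_groups, [5, 20]))  # True
print(cluster_survives(node_groups, [5, 10]))  # False
```

In the second case exactly one node from each group remains, which is the situation where the arbitrator's decision matters in a real cluster.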
Because node groups determine how failover works, you might sometimes wish to control which nodes are members of particular node groups. For example, the members of a node group should always be on different physical servers. That way, if a server shuts down and you lose multiple nodes, they are all from different node groups, so the cluster can continue on without them.
To control which node group a data node is a member of, you can use the above information about configuration files and node IDs. An example configuration file that shows this is below.
[MGM]
HostName=[HOST0]
[NDB DEFAULT]
NoOfReplicas=2
[NDB]
Id=3
HostName=[HOST1]
[NDB]
Id=5
HostName=[HOST1]
[NDB]
Id=4
HostName=[HOST2]
[NDB]
Id=6
HostName=[HOST2]
[MYSQLD]
In this example, there are 4 data nodes that reside on 2 physical servers. We want to control the node groups to ensure that if one physical server is shut down, the entire cluster won’t be shut down. Because node groups are assigned in ascending order of node ID, nodes 3 and 4 form one node group and nodes 5 and 6 form the other. Each node group therefore has one member on each physical server, so losing a single server never takes out a whole node group.
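A layout like this can be sanity-checked before deployment. The Python sketch below takes a mapping of node ID to host name (typed in by hand here rather than parsed from config.ini), groups the nodes in ascending ID order, and reports any host whose loss would remove an entire node group. The function name risky_hosts and the second, deliberately bad layout are hypothetical.

```python
def risky_hosts(node_hosts, no_of_replicas):
    """List hosts whose failure alone would take out a whole node group.

    node_hosts maps a data node ID to the host it runs on.
    Hypothetical helper that assumes ascending-ID node group assignment.
    """
    ordered = sorted(node_hosts)
    # Consecutive runs of no_of_replicas IDs form the node groups.
    groups = [ordered[i:i + no_of_replicas]
              for i in range(0, len(ordered), no_of_replicas)]
    risky = set()
    for members in groups:
        hosts = {node_hosts[node_id] for node_id in members}
        if len(hosts) == 1:
            # Every member of this node group lives on the same host.
            risky.update(hosts)
    return sorted(risky)


# Layout from the configuration file in the text.
good = {3: "HOST1", 5: "HOST1", 4: "HOST2", 6: "HOST2"}
# Same hosts, but IDs assigned host by host (illustrative bad layout).
bad = {3: "HOST1", 4: "HOST1", 5: "HOST2", 6: "HOST2"}

print(risky_hosts(good, 2))  # []
print(risky_hosts(bad, 2))   # ['HOST1', 'HOST2']
```

The bad layout shows why the IDs in the example are interleaved across hosts: numbering the nodes host by host would put both members of each node group on one server.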