Tuesday, July 31, 2018

High Availability with MySQL Cluster, Setup From Command Line (II)

In the first delivery of this series of posts, delivered for whom who are interested to understand the basics of MySQL Cluster "by examples", I wrote about installing MySQL Cluster with a python utility called ndb_setup-py, which offers a nice web graphical interface to define and start our brand new cluster.

In this post I will share an example to do everything from scratch and manually, instead. Doing things manually is always the best recommendation to learn everything about processes life cycle through their:
  • Initialization
  • Administration (start/stop/reconfigure)
  • Monitoring (logs/counters/status)
  • Troubleshooting 
The resulting topology I'd like to setup is composed of 2 data nodes, 1 management node and 1 SQL node, as in the following picture:

Let's get down to business and deal today with processes initialization; I will discuss about setting up a MySQL Cluster environment from scratch and step by step on a Linux machine.
  1. Download generic binaries release for Linux
  2. File system preparation
  3. Initialize and start the management node
  4. Initialize and start the data nodes
  5. Initialize and start the SQL node
  6. Monitor the status 
In order to run a MySQL Cluster deployment, I will need to create:
  • A cluster configuration file, usually named config.ini. This is unique for all cluster processes.
  • The SQL node configuration file, usually named my.cnf. Must have one per SQL node.

1. Download generic binaries release for Linux

My favorite MySQL Cluster install method is using binaries, as to the purpose of learning how to install and administer a cluster it is not needed to install RPMs and their related scripts to start/stop a process as a service. Hence I recommend to download binaries by choosing "Linux - Generic" from related download page

2. File system preparation

Processes need to store stuff on the disk: not only to achieve durability, but also to store logs and other support files necessary to perform their duties. So every process will have a data directory. Plus, every process must define a nodeid, which is an integer number used to identify a process. Therefore, for simplicity, I recommend to create 4 folders under the desired path and name them with the nodeid I will set later in the configuration file. Important note: in order to make things easier, I will setup a whole cluster on my machine, while this is not the recommended topology in a real production environment, where data nodes and the management node should all run on dedicated hosts.
  • 49. This folder will store data directory for the management node
  • 1. This folder will store the data directory for data node 1
  • 2. This folder will store the data directory for data node 2 (replica of data node 1)
  • 50. This will store the data directory for the SQL node
Now that the four directories are created, let the fun begin!

3. Initialize and start the management node

Before starting the management node for the first time, I need to setup a config.ini file. Let's use this bare simple one.

[NDB_MGMD]
HostName=127.0.0.1
DataDir=/home/mortensi/cluster/49
NodeId=49

[NDBD]
NodeId=1
HostName=127.0.0.1
DataDir=/home/mortensi/cluster/1

[NDBD]
NodeId=2
HostName=127.0.0.1
DataDir=/home/mortensi/cluster/2

[MYSQLD]
HostName=127.0.0.1
NodeId=50

This configuration file, the easiest possible (all configuration parameter have default values), is telling that:
  • Node ids are: 49 for the management node, 1 and 2 for the data nodes and 50 for the SQL node (the mysqld instance)
  • We're setting up a topology where all processes are co-located on the same localhost machine
  • We will use those folders we created as data directory.
Let's start the management node now:

ndb_mgmd --config-file=/home/mortensi/cluster/config.ini --configdir=/home/mortensi/cluster/49 --initial &

If no error is reported, we can check if the management process is running with ndb_mgm command line client tool like this:

ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=1 (not connected, accepting connect from 127.0.0.1)
id=2 (not connected, accepting connect from 127.0.0.1)

[ndb_mgmd(MGM)] 1 node(s)
id=49 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6)

[mysqld(API)]   1 node(s)
id=50 (not connected, accepting connect from 127.0.0.1)

This output is telling that our 2 data nodes defined in the config.ini have not been started, but the management was started successfully. Finally, the SQL node is not connected, yet.

4. Initialize and start the data nodes

Now it's time to start the core processes for any cluster setup: data nodes! Data nodes can only be started after the management node, as they need to connect to it to retrieve the configuration of the cluster. Run the following:

ndbmtd --connect-string=localhost --initial --ndb-nodeid=1 &
ndbmtd --connect-string=localhost --initial --ndb-nodeid=2 &

Check again status with ndb_mgm:

ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=1 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6, Nodegroup: 0, *)
id=2 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=49 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6)

[mysqld(API)] 1 node(s)
id=50 (not connected, accepting connect from 127.0.0.1)

We're done! Now that the cluster is up and running, we only need to connect through the SQL node and play! Remember that future data nodes restart, must be done without --initial option, otherwise data nodes data directories will be erased!

5. Initialize and start the SQL node

To finalize this cluster installation, we need to initialize SQL node data directory as follows (must indicate the path to the binary release and to the brand new datadir candidate folder):

mysqld --initialize-insecure --datadir="/home/mortensi/cluster/50" --basedir="/export/home/mortensi/software/mysql-cluster-gpl-7.6.6-linux-glibc2.12-x86_64/" --user=mortensi

Now that the SQL node has been initialized, let's use this simple my.cnf configuration file to start it:

[mysqld]
ndbcluster=on
ndb_nodeid=50
datadir=/home/mortensi/cluster/50
basedir=/export/home/mortensi/software/mysql-cluster-gpl-7.6.6-linux-glibc2.12-x86_64/
socket=/home/mortensi/cluster/50/mysqld.sock
log_error=/home/mortensi/cluster/50/mysqld.log

Save the file, and let's go:

mysqld --defaults-file=/home/mortensi/cluster/my.cnf &

You will now be able to connect:

mysql -h127.0.0.1 -uroot

And check your brand new cluster status!

mysql> select * from ndbinfo.nodes;
+---------+--------+---------+-------------+-------------------+
| node_id | uptime | status  | start_phase | config_generation |
+---------+--------+---------+-------------+-------------------+
|       1 |   1691 | STARTED |           0 |                 1 |
|       2 |   1690 | STARTED |           0 |                 1 |
+---------+--------+---------+-------------+-------------------+
2 rows in set (0.02 sec)

Cool, isn't it?

6. Monitor status

To complete this mini tutorial, let's just check status for this cluster as already done:

ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=1 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6, Nodegroup: 0, *)
id=2 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=49 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6)

[mysqld(API)] 1 node(s)
id=50 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6)

All processes declared as belonging to the cluster (topology was defined in the config.ini file) are now up and running, mission accomplished!