Tuesday, July 24, 2018

High Availability with MySQL Cluster, a Quick How-To Guide for Dummies (I)

I have been playing with MySQL Cluster for some years now, and today I'd like to start writing a bit about it, how to set it up, configure, backup and also how to use it, as there's plenty of ways to drive operations towards the Cluster for brutal speed  and concurrency. MySQL Cluster, the open source in memory database from Oracle MySQL, is available for free from MySQL Cluster download page (Community version has GPL license).

But before starting with an overview of installation and setup, if you're new to MySQL Cluster, I would strongly recommend to have a look at this video.



What I find more interesting about MySQL Cluster, is that it is possible to have a setup running on commodity hardware (the bare laptop), as it can be configured to have a minimum footprint in terms of memory and storage requirements. About high availability, MySQL Cluster offers unique features to resist any single failure (it has no single point of failure: that means any process, communication link or hardware component may crash/fail but the cluster will still be available) while offering a consistent view of data (synchronous replication between data node, a unique feature in MySQL databases).

But.. what is MySQL Cluster?

MySQL Cluster is a cluster of processes, they are:
  1. ndb_mgmd management process. A MySQL Cluster setup must have at least one (can have more, for redundancy). It is recommended to execute it on a host where ndbmtd processes are not running, but does not need to be on a dedicated host, though. It consumes little resources and act as arbitrator to rule out eventual split brain situations and serves a number of functions. 
  2. ndbmtd data node process. Default configuration includes 2 data nodes processes, they implement the NDB (NDB stands for Network Database) storage engine and store data in memory. The recommendation is to execute one ndbmtd on its own host. Data nodes are replicated synchronously, hence any of these can crash and Cluster would survive and be serviceable.
  3. mysqld instances, aka SQL nodes. These are the classical MySQL Servers delivered with MySQL Cluster distribution (they differ from standard standalone distribution in that they are compiled to make use of NDB storage engine). They can be used to retrieve MySQL Cluster metrics, to execute SQL queries against the cluster (as we'll see this is not the only way to fetch data from it) and also are used to implement geographical replication (among other uses). SQL nodes can run on dedicated hosts but can also share host with data nodes or ndb_mgmd node. Best option is always away from a host running ndbmtd data nodes, as resources needed by SQL nodes can be variable depending on operations, then it's better not to add variable load to the host, which may affect data nodes performance (at least data nodes footprint is quite stable over time).
Full detailed summary is available at official documentation page.

How to install MySQL Cluster?

There are two main distributions: the RPM/DEB package and the binary compressed package, I usually prefer to get the binary one, as it is more flexible and also easier to manage for new users. With binary distribution, everything is contained within a folder. 

To get started quickly with MySQL Cluster, I would recommend to deploy a cluster on a single host using ndb_setup.py utility, available under the folder of binaries (bin). This command will open a graphical interface in a browser served by a minimalistic Python web server. It is really a simple tool and better suited to deploy all cluster processes at localhost (not the production recommended topology, but it's a good idea to start up quickly). This video shows a basic installation on multiple hosts (can be virtual machines as well).



MySQL Cluster documentation is pretty good at explaining the basics about the database, its specifications and tuning options, but I would also like to point to a couple of books:
  • Pro MySQL NDB Cluster, written by Jesper Wisborg Krogh and Mikiya Okuno, Support Engineers at Oracle MySQL. The book contemplates the basics and much more to get started and administer a MySQL Cluster database.
  • MySQL Cluster 7.5 Inside and Out, written by Mikael Ronstrom, NDB creator and Senior Architect at Oracle MySQL. It offers an in depth description of MySQL Cluster guts, together with the story of the product.
In the next chapter, I will share some basic scripts to start the cluster from command line. Stay tuned!