Saturday, November 2, 2019

Manage InnoDB Cluster using MySQL Shell Extensions

At times, when playing with different InnoDB Clusters for testing (I usually deploy all Group Replication instances on the same host on different ports), I find myself stopping the group and doing operations on every instance (e.g. a static reconfiguration), or needing to shut down all instances at once. Scripting is the usual approach, but MySQL Shell also offers a very nice (and powerful) way to integrate custom scripts into the Shell itself to manage an InnoDB Cluster. This is the purpose of MySQL Shell extensions: creating new custom reports and functions, with the flexibility to manage one or more instances at once. I find the plugin feature, introduced in MySQL Shell 8.0.17, particularly practical, as it aggregates reports and functions under the same umbrella.

As an example of what is possible, I have modified Rene's great example to stop Group Replication in one shot from MySQL Shell, and it's particularly easy; check the following script.

  1. Create the directory ~/.mysqlsh/plugins/ext/idc/
  2. Create an init.js script there with the following content; it will be loaded at MySQL Shell startup.

// Get cluster object, only if session is created
function get_cluster(session, context) {
  if (session) {
    try {
      return dba.getCluster();
    } catch (err) {
      throw "A session to a cluster instance is required: " + err.message
    }
  } else {
    throw "A session must be established to execute this " + context
  }
}


function stop_cluster() {
  var cluster = get_cluster(shell.getSession(), "function");
  var data = cluster.status();
  var topology = data.defaultReplicaSet.topology;
  var sess = shell.getSession();
  var uri = sess.getUri();
  var user = (uri.split('//')[1]).split('@')[0];

  // Iterate through the members in the cluster and stop GR on the ONLINE ones
  for (var index in topology) {
    if (topology[index].status == "ONLINE") {
      print("\n-----> " + topology[index].address + " is ONLINE and will be evicted from GR\n");

      // Connect to the member with the same user as the original session
      var memberSession = shell.connect(user + "@" + topology[index].address);
      var result = memberSession.runSql("STOP GROUP_REPLICATION;");
      //print(JSON.stringify(result, null, 4))
    }
  }

  // Reconnect original session
  shell.connect(uri);
  return;
}

// Creates the extension object
var idc = shell.createExtensionObject()

// Adds the desired functionality into it
shell.addExtensionObjectMember(idc, "stopCluster", stop_cluster, {
  brief: "Stops GR on all nodes, secondaries first, primary instance the last.",
  details: [
    "This function will stop GR on all nodes.",
    "The information is retrieved from the cluster topology."]});

// Register the extension object as a global object
shell.registerGlobal("idc", idc, {
  brief:"Utility functions for InnoDB Cluster."})


The script defines the stop_cluster function, invoked as idc.stopCluster(), which:

  1. Gets the cluster object from the session (a session against any member must be created beforehand)
  2. Fetches the topology from the cluster object
  3. Iterates through the members belonging to the topology and gets their addresses
  4. For every ONLINE member, establishes a session using the same user as the current session (administering the whole cluster with the same user is a prerequisite)
  5. Sends the command to stop Group Replication
  6. After iterating through all members, restores the original session

The script also creates an extension object, registers it as a global object and adds the function, so it can be invoked as follows:
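For example, from a MySQL Shell session (the user, host and port below are just the ones of my test setup; adjust them to yours):

\connect root@127.0.0.1:3310
idc.stopCluster()

The registered help should also be available with \? idc, showing the brief and details defined above.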


It is also possible to restart the cluster with the built-in dba global object, using the function dba.rebootClusterFromCompleteOutage();
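A minimal sequence could look like this (again, connection details are placeholders):

\connect root@127.0.0.1:3310
var cluster = dba.rebootClusterFromCompleteOutage()
cluster.status()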


So, in short, it is possible to start and stop the cluster with one command, from the same MySQL Shell session. This is only a quick skeleton for connecting to instances and doing operations (it can be improved, e.g. by stopping GR on the secondary instances first and on the primary last); there is no limit to what is possible. Read more on LeFred's blog.

Saturday, September 21, 2019

MySQL Server Performance Tuning: The Perfect Scalability

When data and concurrency grow, together with query complexity, a standard configuration no longer fits. This content walks through the ways MySQL can adapt to an ever-increasing amount of data, under the general theme of "multiplication":
—Scaling up through partitioning, and how partitioning can help manage/archive data (see the quick sketch after this list)
—How to relocate binlog, undo, redo, tablespaces or on-disk temporary tables to different mount points, to take advantage of multiple storage devices and tackle I/O bottlenecks
—The improvements to parallel slave replication
—A quick look at the OS, verifying swapping behavior and CPU affinity to get the most out of the machine
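As a quick sketch of the partitioning/archiving idea from the first bullet (table, column and partition names are made up for the example):

CREATE TABLE events (
  id      BIGINT NOT NULL,
  created DATE NOT NULL,
  payload VARCHAR(255),
  PRIMARY KEY (id, created)
)
PARTITION BY RANGE (YEAR(created)) (
  PARTITION p2017 VALUES LESS THAN (2018),
  PARTITION p2018 VALUES LESS THAN (2019),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- Archiving a year of data becomes a quick metadata operation instead of a huge DELETE
ALTER TABLE events DROP PARTITION p2017;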

All of this comes with different approaches to monitoring the instance, to spot which parts of the system and which queries need attention, mainly using the following (a quick example is shown after the list):

Sys Schema
Global Status
SHOW ENGINE INNODB STATUS
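For example, a few quick checks along these lines (the view and counters are standard; which thresholds matter depends on the workload):

-- Heaviest statements, from the sys schema (the view is already ordered by total latency)
SELECT query, exec_count, total_latency FROM sys.statement_analysis LIMIT 5;

-- A couple of global status counters worth tracking over time
SHOW GLOBAL STATUS LIKE 'Threads_running';
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';

-- InnoDB internals: semaphores, pending I/O, log activity
SHOW ENGINE INNODB STATUS\G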


I presented this content at Oracle Open World 2019 as a Hands-On Lab.

Friday, September 20, 2019

How To Bulk Import Data Into InnoDB Cluster?

If you need to do bulk importing into InnoDB Cluster, it is certainly possible to do so with any of the usual import methods.


Unfortunately, any such import will add load to the instances and to the channels interconnecting them: data imported on the primary instance needs to be replicated to the rest of the instances, and the bigger the data to import, the higher the load (which could end up affecting performance). The import operation could be batched to reduce load, and Group Replication at least allows throttling the workload with flow control, or splitting large transactions into several smaller messages with the group_replication_communication_max_message_size option.
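For reference, these are the knobs I am referring to (the values are only examples; the message size change may need Group Replication to be restarted on the member to take effect):

-- Flow control: check the mode and lower the applier queue threshold to throttle earlier
SELECT @@group_replication_flow_control_mode;
SET GLOBAL group_replication_flow_control_applier_threshold = 10000;

-- Split large transactions into smaller messages (value in bytes)
SET GLOBAL group_replication_communication_max_message_size = 1048576;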

How to import data into InnoDB Cluster?
But in case the data to import is a whole table (MySQL 8 also adds the flexibility to swap partitions and tables, which may come in handy), or the data can first be loaded into an InnoDB table elsewhere, there is a simple way to push an arbitrary amount of data to InnoDB Cluster: it takes advantage of the transportable tablespaces feature. I made a quick test to import a table.

I created the table t5 on an arbitrary instance and added a few rows. Then I exported it as in the instructions (this does nothing but flush the table and create an auxiliary .cfg file for definition validation at import time; using it is not mandatory, but recommended):

FLUSH TABLES t5 FOR EXPORT;

On the InnoDB Cluster setup, I created the table t5 with the same definition on the primary. Then, again on the primary:

ALTER TABLE t5 DISCARD TABLESPACE;

This removes the t5.ibd tablespace file on all three instances. With a simple SELECT, I made sure that this is the case:

mysql> select * from test.t5;
ERROR 1814 (HY000): Tablespace has been discarded for table 't5'

After that, I copied t5.ibd from the source instance into the related schema folder on *each* GR node.
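Something along these lines, with the source still holding the FLUSH TABLES ... FOR EXPORT lock (hostnames and datadir paths are placeholders for my test setup):

# Copy the tablespace file into the test schema folder of each GR node
scp /var/lib/mysql/test/t5.ibd node1:/var/lib/mysql/test/
scp /var/lib/mysql/test/t5.ibd node2:/var/lib/mysql/test/
scp /var/lib/mysql/test/t5.ibd node3:/var/lib/mysql/test/

# Once the copies are done, release the lock on the source with: UNLOCK TABLES;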
Let's check initial GTID set:

mysql> select @@GLOBAL.GTID_EXECUTED;
+------------------------------------------------------------+
| @@GLOBAL.GTID_EXECUTED                                     |
+------------------------------------------------------------+
| 550fa9ee-a1f8-4b6d-9bfe-c03c12cd1c72:1-270:1000011-1000014 |
+------------------------------------------------------------+
1 row in set (0.00 sec)

Then, on the primary, I ran:

mysql> ALTER TABLE t5 IMPORT TABLESPACE;
Query OK, 0 rows affected, 1 warning (0.03 sec)

I am lazy and did not perform validation using .cfg (more from the instructions):

mysql> show warnings;
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                 |
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1810 | InnoDB: IO Read error: (2, No such file or directory) Error opening './test/t5.cfg', will attempt to import without schema verification |
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

And the tablespace is loaded from the local file system on each GR member. The new GTID set is:

mysql> select @@GLOBAL.GTID_EXECUTED;
+------------------------------------------------------------+
| @@GLOBAL.GTID_EXECUTED                                     |
+------------------------------------------------------------+
| 550fa9ee-a1f8-4b6d-9bfe-c03c12cd1c72:1-271:1000011-1000014 |
+------------------------------------------------------------+
1 row in set (0.00 sec)

Let's test it's all ok:

mysql> select * from test.t5;
+------+
| a    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
|    7 |
|  777 |
+------+
8 rows in set (0.00 sec)

So the data becomes available on the rest of the nodes at no bandwidth or protocol cost; only the following statement is replicated, as seen in the binlog:

# at 2714
#190919  1:34:34 server id 1  end_log_pos 2821  Query   thread_id=34    exec_time=0     error_code=0
SET TIMESTAMP=1568849674/*!*/;
ALTER TABLE t5 IMPORT TABLESPACE
/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

Most important, I broke nothing!

mysql> SELECT MEMBER_ID, MEMBER_STATE FROM performance_schema.replication_group_members;
+--------------------------------------+--------------+
| MEMBER_ID                            | MEMBER_STATE |
+--------------------------------------+--------------+
| 43a67ea3-e1a0-11e7-8a9a-0021f66e910e | ONLINE       |
| 45ab082d-e1a0-11e7-8b73-0021f66e910e | ONLINE       |
| 48be28c0-e1a0-11e7-8c19-0021f66e910e | ONLINE       |
+--------------------------------------+--------------+
3 rows in set (0.00 sec)


To wrap up: instead of loading GB or TB into InnoDB Cluster and having the cluster replicate a massive amount of rows, this trick can push your data at virtually no replication cost.
Comments are welcome!

Tuesday, July 31, 2018

High Availability with MySQL Cluster, Setup From Command Line (II)

In the first post of this series, intended for those interested in understanding the basics of MySQL Cluster "by examples", I wrote about installing MySQL Cluster with a Python utility called ndb_setup.py, which offers a nice web graphical interface to define and start our brand new cluster.

In this post, I will instead share an example of doing everything from scratch and manually. Doing things manually is always the best way to learn about the processes' life cycle through their:
  • Initialization
  • Administration (start/stop/reconfigure)
  • Monitoring (logs/counters/status)
  • Troubleshooting 
The resulting topology I'd like to set up is composed of 2 data nodes, 1 management node and 1 SQL node.

Let's get down to business and deal today with process initialization: I will discuss setting up a MySQL Cluster environment from scratch, step by step, on a Linux machine.
  1. Download generic binaries release for Linux
  2. File system preparation
  3. Initialize and start the management node
  4. Initialize and start the data nodes
  5. Initialize and start the SQL node
  6. Monitor the status 
In order to run a MySQL Cluster deployment, I will need to create:
  • A cluster configuration file, usually named config.ini. This is unique for all cluster processes.
  • The SQL node configuration file, usually named my.cnf. Must have one per SQL node.

1. Download generic binaries release for Linux

My favorite MySQL Cluster install method is using the binaries: for the purpose of learning how to install and administer a cluster, there is no need to install RPMs and their related scripts to start/stop a process as a service. Hence I recommend downloading the binaries by choosing "Linux - Generic" from the related download page.

2. File system preparation

Processes need to store stuff on disk: not only to achieve durability, but also to keep logs and other support files necessary to perform their duties. So every process will have a data directory. Plus, every process must define a nodeid, an integer used to identify the process. Therefore, for simplicity, I recommend creating 4 folders under the desired path and naming them with the nodeids I will set later in the configuration file (the exact commands follow the list below). Important note: to make things easier, I will set up the whole cluster on my machine, but this is not the recommended topology for a real production environment, where data nodes and the management node should all run on dedicated hosts.
  • 49. This folder will store data directory for the management node
  • 1. This folder will store the data directory for data node 1
  • 2. This folder will store the data directory for data node 2 (replica of data node 1)
  • 50. This will store the data directory for the SQL node
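On my machine this boils down to the following (the base path matches the data directories used in the configuration files below):

mkdir -p /home/mortensi/cluster/49
mkdir -p /home/mortensi/cluster/1
mkdir -p /home/mortensi/cluster/2
mkdir -p /home/mortensi/cluster/50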
Now that the four directories are created, let the fun begin!

3. Initialize and start the management node

Before starting the management node for the first time, I need to set up a config.ini file. Let's use this bare-bones one.

[NDB_MGMD]
HostName=127.0.0.1
DataDir=/home/mortensi/cluster/49
NodeId=49

[NDBD]
NodeId=1
HostName=127.0.0.1
DataDir=/home/mortensi/cluster/1

[NDBD]
NodeId=2
HostName=127.0.0.1
DataDir=/home/mortensi/cluster/2

[MYSQLD]
HostName=127.0.0.1
NodeId=50

This configuration file, the simplest possible (all other configuration parameters keep their default values), states that:
  • Node ids are: 49 for the management node, 1 and 2 for the data nodes and 50 for the SQL node (the mysqld instance)
  • We're setting up a topology where all processes are co-located on the same localhost machine
  • We will use the folders we created as data directories.
Let's start the management node now:

ndb_mgmd --config-file=/home/mortensi/cluster/config.ini --configdir=/home/mortensi/cluster/49 --initial &

If no error is reported, we can check if the management process is running with ndb_mgm command line client tool like this:

ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=1 (not connected, accepting connect from 127.0.0.1)
id=2 (not connected, accepting connect from 127.0.0.1)

[ndb_mgmd(MGM)] 1 node(s)
id=49 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6)

[mysqld(API)]   1 node(s)
id=50 (not connected, accepting connect from 127.0.0.1)

This output tells us that the 2 data nodes defined in config.ini have not been started yet, while the management node was started successfully. The SQL node is not connected yet either.

4. Initialize and start the data nodes

Now it's time to start the core processes for any cluster setup: data nodes! Data nodes can only be started after the management node, as they need to connect to it to retrieve the configuration of the cluster. Run the following:

ndbmtd --connect-string=localhost --initial --ndb-nodeid=1 &
ndbmtd --connect-string=localhost --initial --ndb-nodeid=2 &

Check again status with ndb_mgm:

ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=1 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6, Nodegroup: 0, *)
id=2 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=49 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6)

[mysqld(API)] 1 node(s)
id=50 (not connected, accepting connect from 127.0.0.1)

We're done! Now that the cluster is up and running, we only need to connect through the SQL node and play. Remember that future data node restarts must be done without the --initial option, otherwise the data nodes' data directories will be erased!
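For instance, a later restart of data node 1 would simply omit that option:

ndbmtd --connect-string=localhost --ndb-nodeid=1 &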

5. Initialize and start the SQL node

To finalize this cluster installation, we need to initialize the SQL node data directory as follows (indicating the path to the binary release and to the brand new datadir candidate folder):

mysqld --initialize-insecure --datadir="/home/mortensi/cluster/50" --basedir="/export/home/mortensi/software/mysql-cluster-gpl-7.6.6-linux-glibc2.12-x86_64/" --user=mortensi

Now that the SQL node has been initialized, let's use this simple my.cnf configuration file to start it:

[mysqld]
ndbcluster=on
ndb_nodeid=50
datadir=/home/mortensi/cluster/50
basedir=/export/home/mortensi/software/mysql-cluster-gpl-7.6.6-linux-glibc2.12-x86_64/
socket=/home/mortensi/cluster/50/mysqld.sock
log_error=/home/mortensi/cluster/50/mysqld.log

Save the file, and let's go:

mysqld --defaults-file=/home/mortensi/cluster/my.cnf &

You will now be able to connect:

mysql -h127.0.0.1 -uroot

And check your brand new cluster status!

mysql> select * from ndbinfo.nodes;
+---------+--------+---------+-------------+-------------------+
| node_id | uptime | status  | start_phase | config_generation |
+---------+--------+---------+-------------+-------------------+
|       1 |   1691 | STARTED |           0 |                 1 |
|       2 |   1690 | STARTED |           0 |                 1 |
+---------+--------+---------+-------------+-------------------+
2 rows in set (0.02 sec)

Cool, isn't it?

6. Monitor status

To complete this mini tutorial, let's just check status for this cluster as already done:

ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=1 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6, Nodegroup: 0, *)
id=2 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=49 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6)

[mysqld(API)] 1 node(s)
id=50 @127.0.0.1 (mysql-5.7.22 ndb-7.6.6)

All processes declared as belonging to the cluster (topology was defined in the config.ini file) are now up and running, mission accomplished!

Wednesday, July 25, 2018

Top 5 Things to Consider to Get Started with MySQL Performance Tuning

Today I'll share a few slides I prepared last year for a presentation delivered at Oracle Open World. This is a quick and easy hands-on lab for fresh MySQL Server DBAs. I chose 5 of the most relevant topics for tuning and scaling a MySQL Server using InnoDB tables.

In particular, in this hands-on, I will talk about:
  • Scaling connections
  • The threads model
  • InnoDB REDO log
  • InnoDB Buffer Pool
  • The Execution plan
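To give an idea of the kind of checks behind these topics, a few starting points (the values shown are examples, not recommendations, and the table in the EXPLAIN is hypothetical):

-- Scaling connections: how close did we get to the configured limit?
SELECT @@max_connections;
SHOW GLOBAL STATUS LIKE 'Max_used_connections';

-- InnoDB Buffer Pool: configured size and reads that had to go to disk
SELECT @@innodb_buffer_pool_size;
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads';

-- The execution plan of a candidate query
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;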


MySQL Performance Tuning 101 from Mirko Ortensi

The rate at which MySQL is delivering new features and improvements is impressive; in fact MySQL Server 8 boosts performance in many fields, especially regarding InnoDB REDO logging. Hence, while the rest of the recommendations still apply to MySQL Server 8, tuning the REDO log flushing strategy is no longer mandatory to achieve improved throughput starting from MySQL 8. Improvements in this field are described in a blog post by Dimitri Kravtchuk, MySQL Performance Architect at Oracle.
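For reference, the REDO log flushing strategy is governed by innodb_flush_log_at_trx_commit; relaxing it trades some durability for throughput, so it is a deliberate choice rather than a default recommendation:

SELECT @@innodb_flush_log_at_trx_commit;      -- 1 = flush the redo log at every commit (fully durable)
SET GLOBAL innodb_flush_log_at_trx_commit = 2; -- write at commit, flush to disk roughly once per second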

For an overview of optimization techniques, I recommend having a look at the official documentation.