Migrate a MySQL Table to Redis


In this post I will list a few options to migrate a table from MySQL to Redis logically. I will run the examples using the classic world database. The world database can be downloaded here, so if you would like to try the examples, just import it into your MySQL instance.
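
Assuming the downloaded dump is a plain SQL file named world.sql (the exact file name may vary with the version you download), importing it is typically just a matter of sourcing it with the mysql client:

mysql -u root -p < world.sql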

mysql> use world;
Database changed
mysql> show tables;
+-----------------+
| Tables_in_world |
+-----------------+
| city            |
| country         |
| countrylanguage |
+-----------------+
3 rows in set (0.01 sec)

There are several ways to import a table into Redis, so let's go through some of them. In the following examples I will use the Hash data type, the closest thing to a traditional table row in Redis.

Redis Hashes are maps between string fields and string values, so they are the perfect data type to represent objects (e.g. a user with a number of fields like name, surname, age, and so forth).
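
For instance, a quick redis-cli session (using a hypothetical user:1 key on the same instance used throughout this post) shows how a Hash models such an object:

127.0.0.1:4321> HSET user:1 name "Ada" surname "Lovelace" age "36"
(integer) 3
127.0.0.1:4321> HGETALL user:1
1) "name"
2) "Ada"
3) "surname"
4) "Lovelace"
5) "age"
6) "36"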

Export in CSV format using SELECT INTO OUTFILE, import with Python

In order to export data from MySQL and import it into Redis, we can use MySQL's SELECT INTO OUTFILE facility, which generates an output file formatted as desired. In particular, I will make sure that the fields are separated by commas. Let's try it:

mysql> SELECT * INTO OUTFILE '/tmp/city.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' FROM world.city;
ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement

Oh, MySQL's default configuration disallows exporting data to the file system. Let's fix it in the my.cnf configuration file (a server restart is needed for the change to take effect):

[mysqld]
secure_file_priv=/tmp
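
After restarting the server, it is possible to double-check that the option is in place before retrying the export:

mysql> SHOW VARIABLES LIKE 'secure_file_priv';

The variable should now report the /tmp directory.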

Now I can finally export:

mysql> SELECT * INTO OUTFILE '/tmp/city.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' FROM world.city;
Query OK, 4079 rows affected (0.01 sec)

Here is an excerpt of the exported file:

"1","Kabul","AFG","Kabol","1780000"
"2","Qandahar","AFG","Qandahar","237500"
"3","Herat","AFG","Herat","186800"
"4","Mazar-e-Sharif","AFG","Balkh","127800"
"5","Amsterdam","NLD","Noord-Holland","731200"
"6","Rotterdam","NLD","Zuid-Holland","593321"
"7","Haag","NLD","Zuid-Holland","440900"
"8","Utrecht","NLD","Utrecht","234323"

Now that the data is in CSV format, I can import it with a simple Python script like the following (refer to the instructions to set up a Python environment to connect to Redis).

#!/usr/bin/python3
import redis
from csv import reader

# Connect to the Redis instance (custom port, no password)
r = redis.Redis(host='127.0.0.1', port=4321, password='')

# Read the CSV export and create one Hash per row, keyed as city:<ID>
with open('/tmp/city.csv', 'r') as cities:
    csv_reader = reader(cities)
    for row in csv_reader:
        r.hset("city:" + row[0], mapping={"Name": row[1], "CountryCode": row[2], "District": row[3], "Population": row[4]})
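
For larger tables, a possible variant is to batch the commands with a redis-py pipeline and cut down on network round trips; here is a minimal sketch (the batch size of 1000 is an arbitrary assumption):

# Same CSV and connection as above, but batching HSET commands in a pipeline
pipe = r.pipeline(transaction=False)
with open('/tmp/city.csv', 'r') as cities:
    for i, row in enumerate(reader(cities), start=1):
        pipe.hset("city:" + row[0], mapping={"Name": row[1], "CountryCode": row[2], "District": row[3], "Population": row[4]})
        if i % 1000 == 0:
            pipe.execute()
    pipe.execute()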

Let’s check if it was imported correctly:

127.0.0.1:4321> SCAN 0 MATCH city:* COUNT 5
1) "3072"
2) 1) "city:2316"
   2) "city:749"
   3) "city:570"
   4) "city:3625"
   5) "city:3328"
   6) "city:1771"

I can verify the content of a Hash entry as well:

127.0.0.1:4321> HGETALL city:4059
1) "Name"
2) "Cary"
3) "CountryCode"
4) "USA"
5) "District"
6) "North Carolina"
7) "Population"
8) "91213"

Export in CSV format using mysqldump

Instead of using SELECT INTO OUTFILE directly (which some may consider a security issue, because it is the server itself that dumps the rows to the file system, and it requires secure_file_priv to be configured), it is possible to export the data with the mysqldump backup utility. Note, however, that in --tab mode mysqldump still delegates the data dump to the server via SELECT INTO OUTFILE, so secure_file_priv applies and the output directory must be on the server host and writable by the server.

mysqldump --host=127.0.0.1 --user=root --password --tab=/tmp --fields-enclosed-by='"' --fields-terminated-by=',' --lines-terminated-by='\n' world city

Once the export is done (mysqldump produces two files: the table definition in city.sql, and the CSV-formatted data in city.txt), the data can be imported as explained above.

Import CSV dump with RIOT

Assuming you have a CSV dump of your MySQL table (in reality the CSV can come from any arbitrary source of data: an Excel spreadsheet, Postgres…) and you don't want to write a single line of code to import it into Redis, you can take advantage of RIOT (Redis Input Output Tools), which serves the purpose of migrating heterogeneous sources of data to Redis.

Redis Input Output Tools: RIOT

Redis Input/Output Tools (RIOT) is a series of utilities designed to help you get data in and out of Redis.

In this section I will explore the CSV import utility of RIOT, which is called RIOT File. First, install it (I have a Mac, but refer to the documentation for other installation methods).

brew install redis-developer/tap/riot-file

Once installed, test the help:

(redisvenv) bash-3.2$ riot-file --help
The operation couldn’t be completed. Unable to locate a Java Runtime.
Please visit http://www.java.com for information on installing Java.

And yes, you also need to download a JRE from www.java.com. Once done, go ahead:

(redisvenv) bash-3.2$ riot-file import hset --help
Usage: riot-file import hset [OPTIONS]
Set hashes from input
  -H, --help                 Show this help message and exit
  -s, --separator=<str>      Key separator (default: :)
  -r, --remove               Remove key or member fields the first time they are used
      --ignore-missing       Ignore missing fields
  -p, --keyspace=<str>       Keyspace prefix
  -k, --keys=<fields>...     Key fields
      --include=<field>...   Fields to include
      --exclude=<field>...   Fields to exclude

Now we have to map the CSV source of data to the proper Hash data structure. Our initial CSV file had no header in the first line. If we would like to have one, we could even extract the column names from the MySQL information_schema in the right order as follows.

Note: MySQL SELECT INTO OUTFILE does not add a header to the exported dataset, so it must be added manually, either using the proper SQL syntax or by editing the exported CSV file.

mysql> SELECT group_concat(COLUMN_NAME ORDER BY ORDINAL_POSITION separator ',') FROM information_schema.COLUMNS WHERE TABLE_NAME='city' AND TABLE_SCHEMA='world';
+-------------------------------------------------------------------+
| group_concat(COLUMN_NAME ORDER BY ORDINAL_POSITION separator ',') |
+-------------------------------------------------------------------+
| ID,Name,CountryCode,District,Population                           |
+-------------------------------------------------------------------+
1 row in set (0.00 sec)

And add the line ID,Name,CountryCode,District,Population as the first line of the CSV file. But we can still go ahead and do the mapping manually as follows. Here we specify the order of the fields as they come in the CSV file, what keyspace name we want, and what key will complement the keyspace name (e.g. city:1234).

riot-file -h 127.0.0.1 -p 4321 import /tmp/city.csv --fields ID Name CountryCode District Population hmset --keyspace city --keys ID

Alternatively, if the CSV file has the header obtained from MySQL (a shell one-liner to prepend it is sketched right after the next command), we can just add --header to let RIOT read the field names from the first line.

riot-file -h 127.0.0.1 -p 4321 import /tmp/city.csv --header  hmset --keyspace city --keys ID
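
If you would rather prepend the header to the exported CSV from the shell than edit the file by hand, a one-liner like this does the job (GNU sed syntax; on macOS, sed -i requires an empty '' argument):

sed -i '1i ID,Name,CountryCode,District,Population' /tmp/city.csv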

Use RIOT DB to import from MySQL

In order to import from MySQL with no intermediate dump, you can write your own solution including connectors to both the source database and the target Redis database (a minimal Python sketch is shown at the end of this section), or simply use RIOT DB. So install it:

brew install redis-developer/tap/riot-db

Then proceed to import as usual, specifying the keyspace name and the key, providing the correct JDBC connection string and, obviously, the query that defines the dataset to be imported.

riot-db -h 127.0.0.1 -p 4321 import "SELECT * FROM city" --url jdbc:mysql://root:Oracle1*@127.0.0.1:3306/world hmset --keyspace city --keys ID
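
If you prefer to write the direct migration yourself instead of using RIOT DB, a minimal Python sketch could look like the following (it assumes the mysql-connector-python and redis packages, and reuses the hosts, ports and credentials from the examples above; adjust them to your environment):

#!/usr/bin/python3
import mysql.connector
import redis

# Source and target connections (parameters are assumptions, change as needed)
db = mysql.connector.connect(host='127.0.0.1', port=3306, user='root',
                             password='Oracle1*', database='world')
r = redis.Redis(host='127.0.0.1', port=4321, password='')

# Read every row from MySQL and store it as a Hash keyed by city:<ID>
cursor = db.cursor(dictionary=True)
cursor.execute("SELECT ID, Name, CountryCode, District, Population FROM city")
for row in cursor:
    r.hset("city:" + str(row["ID"]),
           mapping={"Name": row["Name"],
                    "CountryCode": row["CountryCode"],
                    "District": row["District"],
                    "Population": str(row["Population"])})
cursor.close()
db.close()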

Export and do mass insertion in RESP format

So far I have explored logical approaches, but for mass insertion, where top speed is required to insert millions of keys, there is a different approach using the pipe mode of redis-cli, which reads data as it becomes available and sends it to the server as fast as possible. In particular, with this method we will use the Redis protocol, RESP. Read more about mass insertion.

It is possible to export a table in RESP format and stream it to the Redis server in one shot. So let's see how this works. Using SQL we can generate the entries to be sent to the Redis server using this format:

*<args><cr><lf>
$<len0><cr><lf>
<arg0><cr><lf>
$<len1><cr><lf>
<arg1><cr><lf>
...
$<lenN><cr><lf>
<argN><cr><lf>

Where:

  • args is the number of arguments
  • lenN is the length of the argument that will follow
  • argN is the argument
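
As an example, the first row of the city table (Kabul) encoded as a single HSET command in this format looks as follows (every line is terminated by <cr><lf>):

*10
$4
HSET
$6
city:1
$4
Name
$5
Kabul
$11
CountryCode
$3
AFG
$8
District
$5
Kabol
$10
Population
$7
1780000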

So, assuming that we want to add a Hash for every row in the MySQL table, and every Hash will contain 4 fields (which, together with the command name and the key, makes 10 arguments), the SQL statement to produce the dump in the RESP protocol would be the following:

SELECT CONCAT(
  '*10\r\n',
  '$', LENGTH(redis_cmd), '\r\n', redis_cmd, '\r\n',
  '$', LENGTH(redis_key), '\r\n', redis_key, '\r\n',
  '$', LENGTH(hkey1), '\r\n', hkey1, '\r\n', '$', LENGTH(hval1), '\r\n', hval1, '\r\n',
  '$', LENGTH(hkey2), '\r\n', hkey2, '\r\n', '$', LENGTH(hval2), '\r\n', hval2, '\r\n',
  '$', LENGTH(hkey3), '\r\n', hkey3, '\r\n', '$', LENGTH(hval3), '\r\n', hval3, '\r\n',
  '$', LENGTH(hkey4), '\r\n', hkey4, '\r\n', '$', LENGTH(hval4), '\r\n', hval4, '\r\n'
)
FROM (
  SELECT
    'HSET' AS redis_cmd, CONCAT('city:', ID) AS redis_key,
    'Name' AS hkey1, Name AS hval1,
    'CountryCode' AS hkey2, CountryCode AS hval2,
    'District' AS hkey3, District AS hval3,
    'Population' AS hkey4, Population AS hval4
  FROM world.city
) AS t;

Save this SQL as /tmp/resp.sql and stream it from MySQL into Redis using the pipe mode:

bash-3.2$ mysql -h 127.0.0.1 -uroot -p -Dworld --skip-column-names --raw </tmp/resp.sql |redis-cli -p 4321 --pipe
Enter password: 
All data transferred. Waiting for the last reply...
Last reply received from server.
errors: 0, replies: 4079

Data has been imported into the Redis Server!

Use Redis Connect for MySQL: redis-connect-mysql

The last option I will present here takes advantage of the Redis Connect suite, and in particular of redis-connect-mysql. As you can read in the docs:

The first time redis-connect-mysql connects to a MySQL database, it reads a consistent snapshot of all the schemas. When that snapshot is complete, the connector continuously streams the changes that were committed to MySQL and generates a corresponding insert, update or delete event.

So it is not just a migration tool, but a replication and transformation tool to connect heterogeneous databases. Redis CDC (Change Data Capture) was presented at RedisConf 2021, so have a look at the presentation for the details.

I am not running examples with RedisCDC here, but check the repository to learn how to operate it.

Clean up

In order to remove the Hashes that were imported, you can go ahead with the following command (the -i 0.01 option throttles the SCAN iterations to limit the load on the server):

redis-cli -p <PORT> --scan --pattern city:* -i 0.01 | xargs redis-cli -p <PORT> unlink

Wrapping Up

You are now able to migrate a table from MySQL (or another relational database: virtually any database can export to CSV, and both RIOT DB and RedisCDC include connectors to several databases) to Redis.

Connectors supported by RedisCDC are listed here; connectors supported by RIOT DB are here.

And now what? You may be wondering what you can do with those Hashes created in the Redis instance. So in my next post I will share a few example commands to store, retrieve, change, delete, index and search data in Redis. Stick around, I will write about it soon!

Redis is a registered trademark of Redis Ltd. Any rights therein are reserved to Redis Ltd. Any use by mortensi is for referential purposes only and does not indicate any sponsorship, endorsement or affiliation between Redis and mortensi.
