Migrate a MySQL Table to Redis
In this post I will list a few options to migrate a table from MySQL to Redis logically. I will run examples using the classical world
database. The world
database can be downloaded here, so if you would like to try the examples, just import it in your MySQL instance.
mysql> use world;
Database changed
mysql> show tables;
+-----------------+
| Tables_in_world |
+-----------------+
| city |
| country |
| countrylanguage |
+-----------------+
3 rows in set (0.01 sec)
There are several ways to go ahead and import a table into Redis, let’s go through some of them. In the following examples I will choose a Hash data type, the most similar thing to a traditional table row in Redis.
Redis Hashes are maps between string fields and string values, so they are the perfect data type to represent objects (e.g. A User with a number of fields like name, surname, age, and so forth)
Export in CSV format using SELECT INTO OUTFILE, import with Python
In order to export data from MySQL and import it into Redis, we can use MySQL SELECT INTO OUTFILE facility, which generates an output file formatted as desired. In particular, I will make sure that the fields are separated by commas. Let’s try it:
mysql> SELECT * INTO OUTFILE '/tmp/city.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' FROM world.city;
ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement
Oh, MySQL default configuration disallows exporting data to the file system. Let’s configure the my.cnf configuration file and fix it:
[mysqld]
secure_file_priv=/tmp
Now I can finally export:
mysql> SELECT * INTO OUTFILE '/tmp/city.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' FROM world.city;
Query OK, 4079 rows affected (0.01 sec)
Here is an example:
"1","Kabul","AFG","Kabol","1780000"
"2","Qandahar","AFG","Qandahar","237500"
"3","Herat","AFG","Herat","186800"
"4","Mazar-e-Sharif","AFG","Balkh","127800"
"5","Amsterdam","NLD","Noord-Holland","731200"
"6","Rotterdam","NLD","Zuid-Holland","593321"
"7","Haag","NLD","Zuid-Holland","440900"
"8","Utrecht","NLD","Utrecht","234323"
Now that data is in CSV format, I can import it with a simple Python script, to make an example it is possible to use the following (refer to instructions to setup a Python environment to connect to Redis).
#!/usr/bin/python3 import redis from csv import reader r = redis.Redis(host='127.0.0.1', port=4321, password='') with open('/tmp/city.csv', 'r') as cities: csv_reader = reader(cities) for row in csv_reader: r.hset("city:" + row[0], mapping={ "Name" : row[1], "CountryCode" : row[2], "District" : row[3], "Population" : row[4] })
Let’s check if it was imported correctly:
127.0.0.1:4321> SCAN 0 MATCH city:* COUNT 5
1) "3072"
2) 1) "city:2316"
2) "city:749"
3) "city:570"
4) "city:3625"
5) "city:3328"
6) "city:1771"
I can verify the content of a Hash entry as well:
127.0.0.1:4321> HGETALL city:4059
1) "Name"
2) "Cary"
3) "CountryCode"
4) "USA"
5) "District"
6) "North Carolina"
7) "Population"
8) "91213"
Export in CSV format using mysqldump
Instead of using SELECT INTO OUTFILE (to some it may represent a security issue because it is the server itself to dump the rows to the file system, and this needs secure_file_priv to be enabled), it is possible to export data remotely, using mysqldump backup utility which, acting as a client, does not require any change to server configuration.
mysqldump --host=127.0.0.1 --user=root --password --tab=/tmp --fields-enclosed-by='"' --fields-terminated-by=',' --lines-terminated-by='\n' world city
Once the file is exported (mysqldump exports two files, table definitions in city.sql
file, and the CSV formatted file, city.txt
), data can be imported as explained.
Import CSV dump with RIOT
Assuming you have a CSV dump of your MySQL table (in reality the CSV can proceed from any arbitrary source of data, an Excel spreadsheet, Postgres…) and you don’t want to code any single line to import it into Redis, you can take advantage of RIOT (Redis Input Output Tools), that serve the purpose of migrating heterogeneous sources of data to Redis.
Redis Input/Output Tools (RIOT) is a series of utilities designed to help you get data in and out of Redis.
In this section I will explore the CSV import utility of RIOT, that is called RIOT File. First, install it (I have a Mac, but refer to the documentation for other installation methods).
brew install redis-developer/tap/riot-file
Once installed, test the help:
(redisvenv) bash-3.2$ riot-file --help
The operation couldn’t be completed. Unable to locate a Java Runtime.
Please visit http://www.java.com for information on installing Java.
And yes, you need also to download a JRE from www.java.com. Once done, go ahead:
(redisvenv) bash-3.2$ riot-file import hset --help
Usage: riot-file import hset [OPTIONS]
Set hashes from input
-H, --help Show this help message and exit
-s, --separator=<str> Key separator (default: :)
-r, --remove Remove key or member fields the first time they are used
--ignore-missing Ignore missing fields
-p, --keyspace=<str> Keyspace prefix
-k, --keys=<fields>... Key fields
--include=<field>... Fields to include
--exclude=<field>... Fields to exclude
Now we have to map the CSV source of data to the proper Hash data structure. Our initial CSV file had no header in the first line. If we would like to have it, we could even extract them from MySQL information_schema
in the right order as follows.
Note: MySQL SELECT INTO OUTFILE does not add a header to the exported dataset, this must be added manually using the proper SQL syntax, or editing the exported CSV file.
mysql> SELECT group_concat(COLUMN_NAME ORDER BY ORDINAL_POSITION separator ',') FROM information_schema.COLUMNS WHERE TABLE_NAME='city' AND TABLE_SCHEMA='world'; +-------------------------------------------------------------------+ | group_concat(COLUMN_NAME ORDER BY ORDINAL_POSITION separator ',') | +-------------------------------------------------------------------+ | ID,Name,CountryCode,District,Population | +-------------------------------------------------------------------+ 1 row in set (0.00 sec)
And add the line 'ID,Name,CountryCode,District,Population
‘ as the first line in the CSV file. But we can still go ahead and do the mapping manually as follows. Here we specify the order of the fields as they come in the CSV file, what keyspace name we want, and what key will complement the keyspace name (e.g. city:1234
).
riot-file -h 127.0.0.1 -p 4321 import /tmp/city.csv --fields ID Name CountryCode District Population hmset --keyspace city --keys ID
Alternatively, if the CSV file has the header obtained from MySQL, we can just add --header
to use the CSV header added manually.
riot-file -h 127.0.0.1 -p 4321 import /tmp/city.csv --header hmset --keyspace city --keys ID
Use RIOT DB to import from MySQL
In order to import from MySQL with no intermediate exported dump, you can code your solution including connectors to both the source database and the target Redis database, or simply use RIOT DB. So install it:
brew install redis-developer/tap/riot-db
Then proceed to import as usual, specifying the keyspace name and the key, and providing the correct connection string. And obviously, the query that defines the dataset to be imported.
riot-db -h 127.0.0.1 -p 4321 import "SELECT * FROM city" --url jdbc:mysql://root:Oracle1*@127.0.0.1:3306/world hmset --keyspace city --keys ID
Export and do mass insertion in RESP format
So far I have explored logical approaches, but for mass insertion, where top speed is required to insert millions of keys, there is a different approach using the pipe mode of redis-cli, which reads data when it is available and send it to the server as soon as possible. In particular, using this method, we will use Redis Protocol, RESP. Read more about mass insertion.
It is possible to export a table in RESP format and stream it to the Redis Server in a shot. So let’s see how this works. Using SQL we can generate entries to be sent to Redis Server using this format:
*<args><cr><lf>
$<len0><cr><lf>
<arg0><cr><lf>
$<len1><cr><lf>
<arg1><cr><lf>
...
$<lenN><cr><lf>
<argN><cr><lf>
Where:
args
is the number of argumentslenN
is the length of the argument that will followargN
is the argument
So assuming that we want to add a Hash for every row in the MySQL table, and every Hash will insert 4 fields, the SQL statement to produce the dump in RESP protocol would be the following:
SELECT CONCAT( "*10\r\n", '$', LENGTH(redis_cmd), '\r\n',redis_cmd, '\r\n','$', LENGTH(redis_key), '\r\n',redis_key, '\r\n', '$', LENGTH(hkey1), '\r\n',hkey1, '\r\n','$', LENGTH(hval1), '\r\n', hval1, '\r\n' '$', LENGTH(hkey2), '\r\n',hkey2, '\r\n','$', LENGTH(hval2), '\r\n', hval2, '\r\n' '$', LENGTH(hkey3), '\r\n',hkey3, '\r\n','$', LENGTH(hval3), '\r\n', hval3, '\r\n' '$', LENGTH(hkey4), '\r\n',hkey4, '\r\n','$', LENGTH(hval4), '\r\n', hval4, '\r\n' ) FROM ( SELECT 'HSET' AS redis_cmd, CONCAT('city:',ID) AS redis_key, 'Name' AS hkey1, Name AS hval1, 'CountryCode' AS hkey2, CountryCode AS hval2, 'District' AS hkey3, District AS hval3, 'Population' AS hkey4, Population AS hval4 FROM world.city ) AS t;
Save this SQL in a resp.sql
file and stream from MySQL into Redis using the pipe mode:
bash-3.2$ mysql -h 127.0.0.1 -uroot -p -Dworld --skip-column-names --raw </tmp/resp.sql |redis-cli -p 4321 --pipe
Enter password:
All data transferred. Waiting for the last reply...
Last reply received from server.
errors: 0, replies: 4079
Data has been imported into the Redis Server!
Use Redis Connect for MySQL: redis-connect-mysql
The last option I will present here takes advantage of Redis Connect suite, and in particolar of redis-connect-mysql. As you can read in the docs:
The first time redis-connect-mysql connects to a MySQL database, it reads a consistent snapshot of all the schemas. When that snapshot is complete, the connector continuously streams the changes that were committed to MySQL and generates a corresponding insert, update or delete event.
So it is not just a migration tool, but a replication and transformation tool to connect heterogeneous databases. Redis CDC (Change Data Capture) was presented at RedisConf 2021, so have a look at the presentation for the details.
I am not running examples now using RedisCDC, but check the repository to learn how to operate with it.
Clean up
In order to remove the Hashes that were imported, you can go ahead with the following commands:
redis-cli -p <PORT> --scan --pattern city:* -i 0.01 | xargs redis-cli -p <PORT> unlink
Wrapping Up
You are now able to export a table from MySQL (or other relational database: any database can surely export to CSV, and either RIOT DB or RedisCDC includes connectors to several databases).
Connectors supported by RedisCDC are here, connectors supported by RIOT DB are here.
And now what? You may be wondering what you can do with those Hashes created in the Redis instance. So in my next post I will share a few example commands to store, retrieve, change, delete, index and search data in Redis. Stick around, I will write about it soon!
Redis is a registered trademark of Redis Ltd. Any rights therein are reserved to Redis Ltd. Any use by mortensi is for referential purposes only and does not indicate any sponsorship, endorsement or affiliation between Redis and mortensi.