# How to Efficiently Insert Large Amounts of Data into Redis (Repost)

Using a normal Redis client to perform mass insertion is not a good idea for a few reasons: the naive approach of sending one command after the other is slow because you have to pay for the round trip time for every command. It is possible to use pipelining, but for mass insertion of many records you need to write new commands while you read replies at the same time to make sure you are inserting as fast as possible.

Only a small percentage of clients support non-blocking I/O, and not all clients can parse the replies efficiently enough to maximize throughput. For all these reasons, the preferred way to mass import data into Redis is to generate a text file containing the Redis protocol, in raw format, in order to call the commands needed to insert the required data.

1> Every command sent by a Redis client incurs a round-trip delay.

2> Only some clients support non-blocking I/O.
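To make the first point concrete, here is a rough back-of-the-envelope model of the latency cost. It is only a sketch: the function names and the figures (100000 commands, a 0.2 ms round trip, batches of 1000) are hypothetical, and the model ignores server processing time and bandwidth.

```python
import math


def sequential_seconds(n_commands, rtt_seconds):
    """One round trip per command: the latency is paid n_commands times."""
    return n_commands * rtt_seconds


def pipelined_seconds(n_commands, rtt_seconds, batch_size):
    """One round trip per batch of commands that are sent together."""
    return math.ceil(n_commands / batch_size) * rtt_seconds


# Hypothetical figures, chosen only for illustration.
print(sequential_seconds(100_000, 0.0002))
print(pipelined_seconds(100_000, 0.0002, 1000))
```

Under these assumed numbers the latency term shrinks by the batch size, which is why sending commands without waiting for each reply matters for bulk loads.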

1. Create a text file containing the Redis commands

SET Key0 Value0
SET Key1 Value1
...
SET KeyN ValueN

2. Convert these commands into the Redis protocol.

3. Insert the data using pipe mode

cat data.txt | redis-cli --pipe
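The first two steps can also be combined by emitting the protocol directly. The following is a minimal sketch, assuming Python 3; the helper names `encode_command`, `generate_sets`, and `write_protocol_file` are hypothetical, and the `KeyN`/`ValueN` pattern simply mirrors the listing above.

```python
def encode_command(*args):
    """Encode one command as a Redis protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]  # *<number of arguments>
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # $<length in bytes>, then the payload, each terminated by \r\n
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def generate_sets(count):
    """Yield encoded SET Key<i> Value<i> commands for i in 0..count-1."""
    for i in range(count):
        yield encode_command("SET", f"Key{i}", f"Value{i}")


def write_protocol_file(path, count):
    """Write the encoded commands to a file in binary mode."""
    with open(path, "wb") as out:
        for chunk in generate_sets(count):
            out.write(chunk)
```

Calling `write_protocol_file("data.txt", 100000)` would produce a file that can be fed to `cat data.txt | redis-cli --pipe`. The length prefix counts bytes rather than characters, so values are encoded to UTF-8 before measuring, and the file is opened in binary mode so that `\r\n` is written unchanged.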

Shell vs. Redis pipe

Shell

#!/bin/bash
# Insert 100000 keys, starting one redis-cli process per key
for ((i=0;i<100000;i++))
do
echo -en "helloworld" | redis-cli -x set name$i >>redis.log
done

Every inserted value is helloworld, but the keys differ: name0, name1 ... name99999.

Redis pipe

The Redis pipe approach takes slightly more work.

1> First, build a text file of Redis commands

I used Python here:

#!/usr/bin/python
for i in range(100000):
    print('set name%d helloworld' % i)

# python 1.py > redis_commands.txt

# head -2 redis_commands.txt

set name0 helloworld
set name1 helloworld

2> Convert these commands into the Redis protocol

Here I used a shell script found on GitHub:

#!/bin/bash
while read CMD; do
  # each command begins with *{number arguments in command}\r\n
  XS=($CMD); printf "*${#XS[@]}\r\n"
  # for each argument, we append ${length}\r\n{argument}\r\n
  for X in $CMD; do printf "\$${#X}\r\n$X\r\n"; done
done < redis_commands.txt

# sh 20.sh > redis_data.txt

*3
$3
set
$5
name0
$10
helloworld
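The listing above is the encoded form of `set name0 helloworld`: an argument count, then a byte length and a payload for each argument, each terminated by `\r\n`. As a sanity check before loading, a generated file can be decoded back into commands. The sketch below assumes Python 3 and handles only arrays of bulk strings (the subset this article generates); `parse_commands` is a hypothetical helper name.

```python
def parse_commands(data):
    """Decode a bytes buffer of protocol arrays into lists of arguments."""
    commands = []
    pos = 0
    while pos < len(data):
        # Header line: *<number of arguments>\r\n
        end = data.index(b"\r\n", pos)
        if data[pos:pos + 1] != b"*":
            raise ValueError("expected '*' at offset %d" % pos)
        argc = int(data[pos + 1:end])
        pos = end + 2
        args = []
        for _ in range(argc):
            # Argument header: $<length in bytes>\r\n
            end = data.index(b"\r\n", pos)
            if data[pos:pos + 1] != b"$":
                raise ValueError("expected '$' at offset %d" % pos)
            length = int(data[pos + 1:end])
            start = end + 2
            args.append(data[start:start + length])
            # The payload must be followed by its own \r\n terminator.
            if data[start + length:start + length + 2] != b"\r\n":
                raise ValueError("missing CRLF after offset %d" % start)
            pos = start + length + 2
        commands.append(args)
    return commands
```

A wrong length prefix or a missing terminator raises `ValueError` instead of being sent to the server, which makes encoding mistakes easier to spot on a small sample of the file.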

• redis-cli --pipe tries to send data as fast as possible to the server.
• At the same time it reads data when available, trying to parse it.
• Once there is no more data to read from stdin, it sends a special ECHO command with a random 20-byte string: we are sure this is the last command sent, and we are sure we can match the reply by checking whether we receive the same 20 bytes as a bulk reply.
• Once this special final command is sent, the code receiving replies starts to match replies against these 20 bytes. When the matching reply is reached, it can exit with success.
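The termination check described in the last two points can be illustrated with a simplified model. This is not redis-cli's actual code: `build_echo_sentinel` and `sentinel_seen` are hypothetical names, and the sketch works on an in-memory byte buffer rather than a socket.

```python
import os


def build_echo_sentinel():
    """Return (magic, command): 20 random bytes and an ECHO carrying them."""
    magic = os.urandom(20)
    command = b"*2\r\n$4\r\nECHO\r\n$20\r\n" + magic + b"\r\n"
    return magic, command


def sentinel_seen(reply_stream, magic):
    """True once the bulk reply holding the 20 magic bytes has arrived."""
    return (b"$20\r\n" + magic + b"\r\n") in reply_stream


magic, command = build_echo_sentinel()
replies = b"+OK\r\n" * 3                      # replies to earlier SET commands
print(sentinel_seen(replies, magic))          # sentinel not echoed yet
replies += b"$20\r\n" + magic + b"\r\n"       # server echoes the sentinel
print(sentinel_seen(replies, magic))          # safe to stop reading
```

Because the ECHO is the final command written, seeing its reply implies that every earlier command has already been answered.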

[root@mysql-server1 ~]# time python 1.py > redis_commands.txt

real    0m0.110s
user    0m0.070s
sys    0m0.040s
[root@mysql-server1 ~]# time sh 20.sh > redis_data.txt

real    0m7.112s
user    0m5.861s
sys    0m1.255s
