A test comparing the memory consumption and performance of storing the same data directly as key-value pairs versus as Hash field-value pairs.
Test environment:
CentOS 5.x, 64-bit
Redis version 2.4.14
Test data:
key = 32 bytes, value = 36 bytes
How much space is needed, and how does it perform, when 100 million key-value pairs are stored directly;
how much space is needed, and how does it perform, when 100 million field-value pairs are stored in a single Hash.
When using hash, the key is the original key with its last 3 bytes removed, the field is those 3 bytes, and the value is unchanged;
in other words, the Hash variant stores a 29-byte key, a 3-byte field and a 36-byte value. (The hset.rb script below approximates this by formatting the counter as a 29-digit key and a zero-padded field of at least 3 digits.)
However, 3 bytes can only encode 2^24 = 16,777,216 distinct values, and fields within one HASH must be unique: setting an existing field again simply overwrites the old value with the new one.
So with only a 3-byte field it is clearly impossible to store 100 million field-value pairs in a single HASH.
What is tested here, though, is not one HASH holding many field-value pairs, but many HASHes each holding a single field-value pair.
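To make the split concrete, here is a small sketch (illustration only, not one of the test scripts below) of how a 32-byte key maps onto a 29-byte HASH key and a 3-byte field:
# Illustration only: split a 32-byte key into a 29-byte hash key
# and a 3-byte field, as described above.
require "rubygems"
require "redis"

rb  = Redis.new
key = "%032d" % 1        # "00000000000000000000000000000001"
hkey  = key[0, 29]       # first 29 bytes become the HASH key
field = key[29, 3]       # last 3 bytes become the field ("001")
rb.hset(hkey, field, "%036d" % 1)   # the 36-byte value is unchanged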
Admittedly, using HASH like this is not very useful in practice; as a kind reader pointed out, please do not use HASH this way. It is done here purely for the sake of the test.
A Hash is normally used with one key holding many field-value pairs: for example, if a student's name is the HASH key, the student's various attributes can be stored as multiple field-value pairs.
For usage, see the references at the end of this post.
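A minimal sketch of that usual pattern, with hypothetical student fields, using the same Ruby redis gem as the test scripts:
require "rubygems"
require "redis"

rb = Redis.new
# One HASH key per student, many field-value pairs under that key.
rb.hset("student:zhangsan", "age",   "21")
rb.hset("student:zhangsan", "class", "cs-01")
p rb.hgetall("student:zhangsan")    # => {"age"=>"21", "class"=>"cs-01"}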
Test scripts used in this post:
1. key-value storage:
[root@db5 ruby]# cat set.rb
#!/usr/local/rvm/bin/ruby
require "rubygems"
require "redis"
require "date"

# Connect to the local Redis instance (127.0.0.1:6379 by default).
rb = Redis.new

p DateTime.now   # start timestamp
# For each number in [ARGV[0], ARGV[1]], store one pair:
# a 32-byte zero-padded key and a 36-byte zero-padded value.
(ARGV[0].to_i..ARGV[1].to_i).each do |x|
  rb.set(("%032d" % x), ("%036d" % x))
end
p DateTime.now   # end timestamp
2. Hash storage:
[root@db5 ruby]# cat hset.rb
#!/usr/local/rvm/bin/ruby
require "rubygems"
require "redis"
require "date"

# Connect to the local Redis instance (127.0.0.1:6379 by default).
rb = Redis.new

p DateTime.now   # start timestamp
# For each number, write a single field-value pair into its own HASH:
# a 29-byte zero-padded key, a zero-padded field (3+ digits) and a 36-byte value.
(ARGV[0].to_i..ARGV[1].to_i).each do |x|
  rb.hset(("%029d" % x), ("%003d" % x), ("%036d" % x))
end
p DateTime.now   # end timestamp
Test driver script:
[root@db5 ruby]# cat test.sh
#!/bin/bash
# Launch 8 parallel workers; each worker loads a disjoint range of 125,000
# numbers, so together they insert 1,000,000 pairs.
set() {
for((i=1;i<=8;i++))
do
ruby ./set.rb $((($i-1)*125000+1)) $(($i*125000)) >./set_$i.log 2>&1 &
done
}
hset() {
for((i=1;i<=8;i++))
do
ruby ./hset.rb $((($i-1)*125000+1)) $(($i*125000)) >./hset_$i.log 2>&1 &
done
}
case "$1" in
set)
set
;;
hset)
hset
;;
*)
echo $"Usage: $0 {set|hset}"
exit 0
esac
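As a quick sanity check (not part of the original test), the range arithmetic in test.sh hands each of the 8 workers a disjoint block of 125,000 numbers, covering 1..1,000,000 in total:
# Mirror of the bash arithmetic: worker i loads (i-1)*125000+1 .. i*125000.
(1..8).each do |i|
  puts "worker #{i}: #{(i - 1) * 125_000 + 1}..#{i * 125_000}"
end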
Test:
1. Flush the database:
flushall
Test results:
1. key-value storage:
test.sh set
Inserting 1,000,000 pairs took about 20 seconds (roughly 50,000 inserts/sec across the 8 parallel clients) and consumed 146MB of memory.
Details:
info
used_memory:153115032
used_memory_human:146.02M
db0:keys=1000000,expires=0
Log:
[root@db-172-16-3-150 ~]# cat set*.log
#<DateTime: 2013-02-22T21:29:19+08:00 ((2456346j,48559s,721551000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:39+08:00 ((2456346j,48579s,38908000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:19+08:00 ((2456346j,48559s,721531000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:37+08:00 ((2456346j,48577s,856512000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:19+08:00 ((2456346j,48559s,721544000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:38+08:00 ((2456346j,48578s,525444000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:19+08:00 ((2456346j,48559s,721538000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:38+08:00 ((2456346j,48578s,381601000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:19+08:00 ((2456346j,48559s,721531000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:40+08:00 ((2456346j,48580s,330511000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:19+08:00 ((2456346j,48559s,721540000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:39+08:00 ((2456346j,48579s,377204000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:19+08:00 ((2456346j,48559s,722379000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:41+08:00 ((2456346j,48581s,595150000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:19+08:00 ((2456346j,48559s,721724000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:29:39+08:00 ((2456346j,48579s,96449000n),+28800s,2299161j)>
Data sample:
redis 127.0.0.1:6379> mget 00000000000000000000000000000001
1) "000000000000000000000000000000000001"
2. Hash storage:
test.sh hset
Inserting 1,000,000 pairs likewise took about 20 seconds and consumed 146MB of memory.
Details:
info
used_memory:153115152
used_memory_human:146.02M
db0:keys=1000000,expires=0
Log:
#<DateTime: 2013-02-22T21:31:09+08:00 ((2456346j,48669s,949762000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:34+08:00 ((2456346j,48694s,136926000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:09+08:00 ((2456346j,48669s,949507000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:30+08:00 ((2456346j,48690s,234948000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:09+08:00 ((2456346j,48669s,949754000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:30+08:00 ((2456346j,48690s,549032000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:09+08:00 ((2456346j,48669s,949755000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:32+08:00 ((2456346j,48692s,41759000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:09+08:00 ((2456346j,48669s,950409000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:30+08:00 ((2456346j,48690s,539557000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:09+08:00 ((2456346j,48669s,950731000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:30+08:00 ((2456346j,48690s,32225000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:09+08:00 ((2456346j,48669s,950718000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:29+08:00 ((2456346j,48689s,244081000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:09+08:00 ((2456346j,48669s,951798000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T21:31:30+08:00 ((2456346j,48690s,550450000n),+28800s,2299161j)>
Data sample:
redis 127.0.0.1:6379> hkeys 00000000000000000000000000001
1) "001"
【Other】
1. Storing the data as key-value pairs and as hashkey + field-value pairs consumes essentially the same amount of memory.
Why roughly the same? It is closely tied to the underlying data structures.
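A rough back-of-envelope breakdown, based only on the numbers reported above (an estimate, not taken from the Redis source):
used_memory = 153_115_032               # reported for 1,000,000 key-value pairs
payload     = 32 + 36                   # key + value bytes actually stored per pair
per_entry   = used_memory / 1_000_000.0 # ~153 bytes per pair
overhead    = per_entry - payload       # ~85 bytes (dict entry, object/sds headers, allocator)
puts "per entry: %.1f bytes, overhead: %.1f bytes" % [per_entry, overhead]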
The data structures can be found in the Redis source (redis.h); for example, the listNode structure from adlist.h:
/* Node, List, and Iterator are the only data structures used currently. */
typedef struct listNode {
struct listNode *prev;
struct listNode *next;
void *value;
} listNode;
If the same data is stored as lists instead, the space consumption does change: on a 64-bit build each listNode alone carries three 8-byte pointers (prev, next, value), i.e. 24 bytes per node before the value object itself is counted.
The list test is as follows:
[root@db-172-16-3-150 ~]# cat test.sh
#!/bin/bash
set() {
for((i=1;i<=8;i++))
do
ruby ./set.rb $((($i-1)*125000+1)) $(($i*125000)) >./set_$i.log 2>&1 &
done
}
hset() {
for((i=1;i<=8;i++))
do
ruby ./hset.rb $((($i-1)*125000+1)) $(($i*125000)) >./hset_$i.log 2>&1 &
done
}
list() {
for((i=1;i<=8;i++))
do
ruby ./list.rb $((($i-1)*125000+1)) $(($i*125000)) >./list_$i.log 2>&1 &
done
}
case "$1" in
set)
set
;;
hset)
hset
;;
list)
list
;;
*)
echo $"Usage: $0 {set|hset|list}"
exit 0
esac
[root@db-172-16-3-150 ~]# cat list.rb
#!/opt/bin/ruby
require "rubygems"
require "redis"
require "date"

# Connect to the local Redis instance (127.0.0.1:6379 by default).
rb = Redis.new

p DateTime.now   # start timestamp
# For each number, push a single 36-byte element onto its own list
# (the key is a 32-byte zero-padded string, as in set.rb).
(ARGV[0].to_i..ARGV[1].to_i).each do |x|
  rb.lpush(("%032d" % x), ("%036d" % x))
end
p DateTime.now   # end timestamp
[root@db-172-16-3-150 ~]# cat list*.log
#<DateTime: 2013-02-22T22:04:18+08:00 ((2456346j,50658s,381501000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:05:06+08:00 ((2456346j,50706s,933510000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:04:18+08:00 ((2456346j,50658s,380903000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:05:06+08:00 ((2456346j,50706s,455043000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:04:18+08:00 ((2456346j,50658s,380904000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:05:05+08:00 ((2456346j,50705s,924739000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:04:18+08:00 ((2456346j,50658s,380901000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:05:05+08:00 ((2456346j,50705s,499455000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:04:18+08:00 ((2456346j,50658s,380901000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:05:05+08:00 ((2456346j,50705s,706013000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:04:18+08:00 ((2456346j,50658s,381887000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:05:07+08:00 ((2456346j,50707s,770955000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:04:18+08:00 ((2456346j,50658s,380905000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:05:06+08:00 ((2456346j,50706s,104680000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:04:18+08:00 ((2456346j,50658s,382742000n),+28800s,2299161j)>
#<DateTime: 2013-02-22T22:05:07+08:00 ((2456346j,50707s,149242000n),+28800s,2299161j)>
info
used_memory:169117248
used_memory_human:161.28M
db0:keys=1000000,expires=0
redis 127.0.0.1:6379> lrange 00000000000000000000000000000001 0 0
1) "000000000000000000000000000000000001"
【References】
1. Redis内存容量的预估和优化 (Estimating and optimizing Redis memory capacity)
2. Redis DOC
3. HASH