Memcached
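Tags: Java and NoSQL
When implementing a program, we often overlook its running time. Even when similar implementation approaches are used, execution speed can sometimes differ greatly. In most cases, this difference in speed comes from a difference in data access speed.
– Yukihiro Matsumoto, "The Future of Code" (代码的未来)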
Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.
Free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
Compiling
- Environment
Compiling Memcached requires tools such as gcc, make, cmake, autoconf, and libtool:
yum install gcc make cmake autoconf libtool
Memcached's event loop is built on the libevent library:
wget https://github.com/libevent/libevent/releases/download/release-2.0.22-stable/libevent-2.0.22-stable.tar.gz
tar -zxvf libevent-2.0.22-stable.tar.gz
cd libevent-2.0.22-stable
./configure --prefix=/usr/local/libevent
make && make install
- Memcached
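wget http://memcached.org/files/memcached-1.4.25.tar.gz
tar -zxvf memcached-1.4.25.tar.gz
cd memcached-1.4.25
./configure --prefix=/usr/local/memcached --with-libevent=/usr/local/libevent
make && make install
/usr/local/memcached/bin/memcached -vv -d # start the server (when running as root, also pass -u <username>)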
- Manual
-p <num> TCP port number to listen on (default: 11211)
-U <num> UDP port number to listen on (default: 11211, 0 is off)
-s <file> UNIX socket path to listen on (disables network support)
-A enable ascii "shutdown" command
-a <mask> access mask for UNIX socket, in octal (default: 0700)
-l <addr> interface to listen on (default: INADDR_ANY, all addresses)
<addr> may be specified as host:port. If you don't specify
a port number, the value you specified with -p or -U is
used. You may specify multiple addresses separated by comma
or by using -l multiple times
-d run as a daemon
-u <username> assume identity of <username> (only when run as root)
-m <num> max memory to use for items in megabytes (default: 64 MB) ## sets the maximum memory to use
-c <num> max simultaneous connections (default: 1024) ## sets the maximum number of connections
-v verbose (print errors/warnings while in event loop)
-vv very verbose (also print client commands/responses)
-f <factor> chunk size growth factor (default: 1.25) ## sets the growth factor
-n <bytes> minimum space allocated for key+value+flags (default: 48)
-t <num> number of threads to use (default: 4)
-R Maximum number of requests per event, limits the number of
requests process for a given connection to prevent
starvation (default: 20)
-b <num> Set the backlog queue limit (default: 1024)
-I Override the size of each slab page. Adjusts max item size
(default: 1mb, min: 1k, max: 128m)
-F Disable flush_all command
- Client
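Memcached can be accessed from C, C++, Java, PHP, Python, Ruby, Perl, Erlang, Lua, and other languages. In addition, because Memcached's communication protocol consists of simple text, it is also easy to access with a tool such as Telnet: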
telnet host port
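Commands
The commands Memcached provides are simple and easy to understand, so only a brief overview is given here. For details, see the Commands section of the official Wiki.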
Storage Commands
<command name> <key> <flag> <expire> <bytes>
<data block>
Memcached provides the following storage commands:
Command | Description |
---|---|
set | Store this data, possibly overwriting any existing data. New items are at the top of the LRU. |
add | Store this data, only if it does not already exist. New items are at the top of the LRU. If an item already exists and an add fails, it promotes the item to the front of the LRU anyway. |
replace | Store this data, but only if the data already exists. Almost never used, and exists for protocol completeness (set, add, replace, etc) |
append | Add this data after the last byte in an existing item. This does not allow you to extend past the item limit. Useful for managing lists. |
prepend | Same as append, but adding new data before existing data. |
cas | Check And Set (or Compare And Swap). An operation that stores data, but only if no one else has updated the data since you read it last. Useful for resolving race conditions on updating cache data. |
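The parameters that follow each storage command have the same form and meaning:
Parameter | Description |
---|---|
key | Key: like a key in a Map, it must be unique |
flag | Custom flags (a non-negative integer) |
expire | Expiration time, in seconds |
bytes | Length of the cached content, in bytes |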
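To make the request layout concrete, here is a minimal sketch that serializes one storage request into the bytes a client would send. The function name `build_storage_command` is hypothetical, and the key checks (at most 250 bytes, no whitespace or control characters) reflect the text protocol's rules as I understand them rather than any particular client library.

```python
def build_storage_command(command: str, key: str, flags: int,
                          exptime: int, data: bytes) -> bytes:
    """Serialize one storage request: a header line, then the data block."""
    allowed = {"set", "add", "replace", "append", "prepend"}
    if command not in allowed:
        raise ValueError(f"not a storage command: {command}")
    key_bytes = key.encode("utf-8")
    if not key_bytes or len(key_bytes) > 250:
        raise ValueError("key must be 1 to 250 bytes long")
    if any(b <= 32 or b == 127 for b in key_bytes):
        raise ValueError("key must not contain whitespace or control characters")
    # <bytes> is the length of the data block, excluding the trailing \r\n.
    header = f"{command} {key} {flags} {exptime} {len(data)}\r\n".encode("utf-8")
    return header + data + b"\r\n"


request = build_storage_command("set", "greeting", 0, 60, b"hello")
```

The resulting bytes could be written to a socket connected to the server, or typed line by line into a Telnet session.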
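Note: the format of the cas command is slightly different
cas <key> <flag> <expire> <bytes> <value>
<data block>
The store takes effect only when value matches the identifier returned by gets (see below); otherwise the server returns EXISTS.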
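The check-and-set behaviour can be illustrated with a toy in-process model. This is only a sketch of the semantics: `CasStore` is a hypothetical class, it is not how Memcached is implemented, and a counter stands in for the server's 64-bit identifier.

```python
import itertools


class CasStore:
    """Toy in-process model of gets/cas; not Memcached's actual implementation."""

    def __init__(self):
        self._items = {}                      # key -> (value, cas_id)
        self._next_id = itertools.count(1)    # source of fresh identifiers

    def set(self, key, value):
        self._items[key] = (value, next(self._next_id))
        return "STORED"

    def gets(self, key):
        """Return (value, cas_id), or None when the key is absent."""
        return self._items.get(key)

    def cas(self, key, value, cas_id):
        if key not in self._items:
            return "NOT_FOUND"
        if self._items[key][1] != cas_id:
            return "EXISTS"                   # someone else updated the item
        return self.set(key, value)


store = CasStore()
store.set("stock", "10")
value, token = store.gets("stock")
store.set("stock", "9")                  # a concurrent writer gets in first
stale = store.cas("stock", "8", token)   # rejected: the identifier is stale
value, token = store.gets("stock")
fresh = store.cas("stock", "8", token)   # accepted: nothing changed in between
```

A client that receives EXISTS would typically call gets again and retry with the new identifier.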
Retrieval Commands
get/gets <key>
Command | Description |
---|---|
get | Command for retrieving data. Takes one or more keys and returns all found items. |
gets | An alternative get command for using with CAS. Returns a CAS identifier (a unique 64bit number) with the item. Return this value with the cas command. If the item’s CAS value has changed since you gets’ed it, it will not be stored. |
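On the wire, the reply to get or gets is a series of `VALUE <key> <flags> <bytes> [<cas unique>]` lines, each followed by a data block, and terminated by `END`. The sketch below parses such a reply; `parse_retrieval_response` is a hypothetical helper, and it assumes the whole reply is already in memory.

```python
def parse_retrieval_response(raw: bytes) -> dict:
    """Parse a get/gets reply into {key: (flags, data, cas_id_or_None)}."""
    items = {}
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        header = raw[pos:line_end].split()
        if header == [b"END"]:
            return items
        if not header or header[0] != b"VALUE" or len(header) not in (4, 5):
            raise ValueError(f"unexpected line: {raw[pos:line_end]!r}")
        key = header[1].decode("utf-8")
        flags, size = int(header[2]), int(header[3])
        cas_id = int(header[4]) if len(header) == 5 else None
        start = line_end + 2
        data = raw[start:start + size]   # read by length: data may contain \r\n
        items[key] = (flags, data, cas_id)
        pos = start + size + 2           # skip the data block and its trailing \r\n


reply = b"VALUE a 0 5\r\nhello\r\nVALUE b 7 3 42\r\nx\r\n\r\nEND\r\n"
parsed = parse_retrieval_response(reply)
```

Reading the data block by its declared length, rather than up to the next line break, matters because values are arbitrary bytes.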
Delete
delete key [time]
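Removes an item from the cache, if it exists. time is optional: for time seconds after the key is removed, no operations on that key are allowed. Memcached 1.4 and later no longer support this delay, so the argument should be omitted there.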
Incr/Decr
incr/decr <key> <number>
Increment and Decrement. If an item stored is the string representation of a 64bit integer, you may run incr or decr commands to modify that number. You may only incr by positive values, or decr by positive values. They do not accept negative values.
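- Application scenario: flash sales
Originally, every step of placing a flash-sale order went through the database (reading stock, writing the order, updating stock, collecting payment, and so on), which was slow and put enormous pressure on the database. The stock-related operations can instead be moved into Memcached: store a count (the stock level) in Memcached, decr it for each flash-sale request, and carry out the rest of the ordering steps only after that succeeds. Because the work happens mainly in memory, it is very fast.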
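The counter semantics and the flash-sale idea can be sketched with a toy in-process model. `CounterStore` and `try_reserve` are hypothetical names, and the lock merely stands in for the server applying each command atomically. One deliberate variation from the text: because decr stops at zero instead of going negative, a result of 0 cannot distinguish "took the last unit" from "already sold out", so this sketch counts sales upward with incr and compares against the stock level.

```python
import threading


class CounterStore:
    """Toy model of incr/decr on decimal strings; not a Memcached client."""

    LIMIT = 2 ** 64

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()   # stands in for the server's atomicity

    def set(self, key, value):
        with self._lock:
            self._data[key] = str(value)

    def incr(self, key, delta):
        with self._lock:
            new = (int(self._data[key]) + delta) % self.LIMIT  # wraps at 64 bits
            self._data[key] = str(new)
            return new

    def decr(self, key, delta):
        with self._lock:
            new = max(int(self._data[key]) - delta, 0)         # never below zero
            self._data[key] = str(new)
            return new


def try_reserve(store, stock):
    """Claim one unit by counting sales upward; True while stock remains."""
    return store.incr("sold", 1) <= stock


store = CounterStore()
store.set("sold", 0)
results = []
workers = [threading.Thread(target=lambda: results.append(try_reserve(store, 5)))
           for _ in range(20)]
for w in workers:
    w.start()
for w in workers:
    w.join()
winners = results.count(True)   # exactly 5 of the 20 requests succeed
```

Only the winning requests would go on to the slower database steps. Unlike this model, the real server answers NOT_FOUND when the counter key does not exist.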
Statistics
Command | Description |
---|---|
stats | The basic stats command. For example, the cache hit rate can be computed as (get_hits / cmd_get) * 100% |
stats items | Returns some information, broken down by slab, about items stored in memcached. |
stats slabs | Returns more information, broken down by slab, about items stored in memcached. More centered to performance of a slab rather than counts of particular items. |
stats sizes | A special command that shows you how items would be distributed if slabs were broken into 32byte buckets instead of your current number of slabs. Useful for determining how efficient your slab sizing is. |
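The hit-rate formula can be applied directly to the output of stats, which arrives as `STAT <name> <value>` lines. Below is a minimal sketch; the helper names are hypothetical and the sample numbers are made up for illustration.

```python
def parse_stats(raw: str) -> dict:
    """Turn 'STAT <name> <value>' lines into a dict of strings."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def hit_rate(stats: dict) -> float:
    """Return get_hits / cmd_get as a percentage (0.0 when no gets were issued)."""
    gets = int(stats["cmd_get"])
    return 100.0 * int(stats["get_hits"]) / gets if gets else 0.0


# Made-up sample output, for illustration only.
sample = "STAT cmd_get 200\r\nSTAT get_hits 150\r\nSTAT get_misses 50\r\nEND\r\n"
rate = hit_rate(parse_stats(sample))
```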
Flush
flush_all
Invalidate all existing cache items. Optionally takes a parameter, which means to invalidate all items after N seconds have passed.
This command does not pause the server, as it returns immediately. It does not free up or flush memory at all, it just causes all items to expire.
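Limits
Parameter | Limit |
---|---|
Key length | 250 bytes in the text protocol; the binary protocol's key-length field is 16 bits wide (at most 65535 bytes) |
Value size | 1 MB by default (adjustable with -I) |
Total memory | At most 2 GB on a 32-bit operating system |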
Memcached Slab Allocator
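Memcached manages memory with a Slab Allocator, which mitigates memory fragmentation:
At startup, Memcached first requests a large block of memory from the operating system, divides it into Chunks of various sizes, and groups Chunks of the same size into a Slab Class.
A Chunk is the smallest unit used to store key-value data. The ratio between the Chunk sizes of adjacent Slab Classes can be set with the -f option when Memcached starts (the default is 1.25, so if Chunks in the first class are 88 B, Chunks in the second class are 112 B, that is 88 × 1.25 = 110 rounded up to a multiple of 8 for alignment, and so on).
When Memcached receives data from a client, it first picks the best-fitting Slab Class for the data's size, then stores the data in a Chunk taken from that Slab Class's list of free Chunks.
When an item expires or is deleted, the Chunk it occupied can be reclaimed and put back on the free list.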
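The sizing and selection steps above can be sketched as follows. This is a simplified model, not the server's code: it assumes 8-byte alignment and integer truncation after multiplying by the factor, and the names `chunk_sizes` and `pick_chunk` are hypothetical.

```python
import bisect


def chunk_sizes(first: int, factor: float, largest: int, align: int = 8) -> list:
    """List chunk sizes from `first` up to `largest`, growing by `factor`."""
    if factor <= 1:
        raise ValueError("factor must be greater than 1")
    sizes = []
    size = first
    while size <= largest:
        if size % align:
            size += align - size % align   # round up to the alignment boundary
        sizes.append(size)
        size = int(size * factor)
    return sizes


def pick_chunk(sizes: list, item_size: int) -> int:
    """Return the smallest chunk size that can hold `item_size` bytes."""
    index = bisect.bisect_left(sizes, item_size)
    if index == len(sizes):
        raise ValueError("item is larger than the largest chunk")
    return sizes[index]


sizes = chunk_sizes(first=88, factor=1.25, largest=1000)
chosen = pick_chunk(sizes, 100)   # a 100-byte item lands in a 112-byte chunk
```

Starting from 88 bytes with a factor of 1.25, the model yields 88, 112, 144, 184, and so on.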
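Growth Factor Tuning
As the process above shows, Memcached's memory management is very efficient and does not fragment memory, but its biggest drawback is that it wastes memory space:
Because each Chunk has a "fixed" length, variable-length data cannot fully use the space:
For example, caching 100 bytes of data in a 128-byte Chunk wastes the remaining 28 bytes.
Wasted Chunk space cannot be eliminated, only reduced, for example by estimating the length of the cached data in advance and choosing suitable Chunk sizes. Unfortunately, Memcached does not currently support custom Chunk sizes, but the -f option adjusts the Growth Factor between the Chunk sizes of successive Slab Classes: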
/usr/local/memcached/bin/memcached -u nobody -vvv -f 1.25
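Note: with f=1.25, Memcached's output shows that the ratio between some adjacent Slab Class sizes is not exactly 1.25. These deviations are deliberate, to keep Chunks byte-aligned.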
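To get a feel for how the factor affects waste, the sketch below totals the unused bytes for a handful of item sizes under two factors. It is a rough model under stated assumptions: a first chunk of 88 bytes, 8-byte alignment, and made-up item sizes; `wasted_bytes` is a hypothetical helper.

```python
def wasted_bytes(item_sizes, first, factor, align=8):
    """Sum the unused bytes when each item goes into the smallest fitting chunk."""
    if factor <= 1:
        raise ValueError("factor must be greater than 1")
    total = 0
    for item in item_sizes:
        chunk = first
        while chunk < item:                 # walk up the chunk sizes
            chunk = int(chunk * factor)
            if chunk % align:
                chunk += align - chunk % align
        total += chunk - item
    return total


items = [100, 120, 200, 400]                # made-up sample of item sizes
waste_small = wasted_bytes(items, first=88, factor=1.25)
waste_large = wasted_bytes(items, first=88, factor=2.0)
```

For this sample the smaller factor wastes far less space, but that is not guaranteed for every workload, and a smaller factor also means more Slab Classes to spread memory across. Measuring with real item sizes is the safer guide.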
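Memcached Data Expiration
Memcached reuses the space of expired Items first. Even so, it can run out of space when a new record is added, and it then falls back on a Least Recently Used (LRU) policy to make room:
Memcached tracks how recently each Item was used: a request for an Item marks it as most recently used, and when space runs out, Memcached identifies the least recently used Item and evicts it.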
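The LRU policy named above can be sketched with an ordered dictionary. This illustrates the policy only; `LruCache` is a hypothetical class that bounds the number of keys, whereas the real server manages memory per Slab Class.

```python
from collections import OrderedDict


class LruCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()      # oldest access first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)     # a hit makes the key most recent
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        elif len(self._items) >= self.capacity:
            self._items.popitem(last=False)   # drop the least recently used
        self._items[key] = value


cache = LruCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")        # "a" is now more recent than "b"
cache.set("c", 3)     # the cache is full, so "b" is evicted
```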
Lazy Expiration
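Memcached does not actively monitor whether records have expired. Instead, it checks a record's timestamp on get to decide whether it has expired. This technique is called Lazy Expiration; its benefit is that no CPU time is spent monitoring Items for expiry.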
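A minimal sketch of lazy expiration follows: expiry is checked only when a key is read, and no background scan runs. `LazyExpiringCache` is a hypothetical class, and the injectable clock is an assumption added so the demo does not need to sleep.

```python
import time


class LazyExpiringCache:
    """Toy cache that checks expiry only when a key is read."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}                 # key -> (value, deadline or None)

    def set(self, key, value, ttl=0):
        deadline = self._clock() + ttl if ttl > 0 else None   # 0 = never expires
        self._items[key] = (value, deadline)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if deadline is not None and self._clock() >= deadline:
            del self._items[key]         # expired: reclaimed on access, not before
            return None
        return value

    def stored_count(self):
        return len(self._items)


now = [0.0]                              # hand-driven clock for the demo
cache = LazyExpiringCache(clock=lambda: now[0])
cache.set("session", "abc", ttl=10)
now[0] = 11.0                            # time passes; nothing is scanned
still_stored = cache.stored_count()      # the expired entry is still held
value = cache.get("session")             # the read notices the expiry
```

The trade-off is visible in `still_stored`: an expired entry keeps occupying space until something reads it.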
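Eviction of Permanent Data
"Memcached data loss" (permanent data being evicted): a key was explicitly set to never expire, yet it mysteriously disappears.
This situation needs to be analyzed from the following angles:
1. If many Chunks in a Slab have expired but have not been fetched with get since then, Memcached does not know that they have expired.
2. If permanent data has not been fetched with get for a long time and is therefore inactive, and memory in the relevant Slab is tight, the data is likely to be evicted by the LRU mechanism when a new Item is added.
Solution: store permanent and non-permanent data separately.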
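The interplay of the two points above can be reproduced in a small model that combines lazy expiry with LRU eviction. This follows the explanation given here and is much simpler than the real server, whose eviction logic is more involved. `LazyLruCache` is a hypothetical class, and using two separate caches is just one way to read "store separately".

```python
from collections import OrderedDict


class LazyLruCache:
    """Simplified model: expiry is noticed only on get, eviction is plain LRU."""

    def __init__(self, capacity, clock):
        self.capacity, self._clock = capacity, clock
        self._items = OrderedDict()          # key -> deadline (None = permanent)

    def set(self, key, ttl=0):
        if key not in self._items and len(self._items) >= self.capacity:
            self._items.popitem(last=False)  # evicts the LRU key, expired or not
        self._items[key] = self._clock() + ttl if ttl > 0 else None
        self._items.move_to_end(key)

    def get(self, key):
        """Return True on a hit, False on a miss or an expired key."""
        if key not in self._items:
            return False
        deadline = self._items[key]
        if deadline is not None and self._clock() >= deadline:
            del self._items[key]
            return False
        self._items.move_to_end(key)
        return True


now = [0]
clock = lambda: now[0]

shared = LazyLruCache(capacity=3, clock=clock)
shared.set("config")              # permanent, but rarely read
shared.set("tmp1", ttl=1)
shared.set("tmp2", ttl=1)
now[0] = 5                        # tmp1 and tmp2 expired, yet nobody has read them
shared.set("tmp3", ttl=1)         # full: the idle permanent key is evicted
lost = not shared.get("config")

permanent = LazyLruCache(capacity=3, clock=clock)   # the suggested fix:
volatile = LazyLruCache(capacity=3, clock=clock)    # keep the two kinds apart
permanent.set("config")
for name in ("tmp1", "tmp2", "tmp3", "tmp4"):
    volatile.set(name, ttl=1)
kept = permanent.get("config")
```

In the shared cache the permanent key is lost even though two expired entries were still occupying space; once the data is kept apart, churn in the short-lived keys no longer touches it.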