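Serialization is the process of converting an object's state into a form that can be stored or transmitted. During serialization, an object writes its current state to temporary or persistent storage. Later, the object can be recreated by reading, or deserializing, its state from that storage.
In scrapy_redis, a Request object first passes through the DupeFilter for deduplication and is then handed to the scheduler, which stores it in Redis. This raises a problem: a Request is a Python object, and Redis cannot store such an object directly, so the request has to be serialized before it is stored.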
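As a rough illustration of that problem, the sketch below flattens a request into a plain dict, pickles it to bytes, and pushes the bytes onto an in-memory list that stands in for a Redis list. `FakeRequest`, `request_to_plain_dict`, `request_from_plain_dict`, and `fake_redis_queue` are hypothetical names invented for this example. They are not part of scrapy or scrapy_redis, and no Redis server is involved.

```python
import pickle


class FakeRequest:
    """Hypothetical stand-in for a scrapy Request (not the real class)."""

    def __init__(self, url, method="GET", meta=None):
        self.url = url
        self.method = method
        self.meta = meta or {}


def request_to_plain_dict(request):
    # Flatten the object into built-in types only, so the pickle does not
    # depend on where the request class is defined.
    return {"url": request.url, "method": request.method, "meta": request.meta}


def request_from_plain_dict(data):
    # Rebuild a request object from the flattened dict.
    return FakeRequest(data["url"], data["method"], data["meta"])


# A plain list standing in for a Redis list; Redis stores bytes, not objects.
fake_redis_queue = []

req = FakeRequest("https://example.com/page/1", meta={"depth": 2})
fake_redis_queue.append(pickle.dumps(request_to_plain_dict(req), protocol=-1))

# Later, possibly in another process: pop the bytes and rebuild the request.
stored = fake_redis_queue.pop(0)
restored = request_from_plain_dict(pickle.loads(stored))
```

The point of the intermediate dict is that only built-in types end up in the byte string, which keeps the stored data independent of the class that produced it.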
The serialization module in scrapy_redis is as follows:
from scrapy_redis import picklecompat

"""A pickle wrapper module with protocol=-1 by default."""

try:
    import cPickle as pickle  # PY2
except ImportError:
    import pickle


def loads(s):
    return pickle.loads(s)


def dumps(obj):
    return pickle.dumps(obj, protocol=-1)
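Python 3 uses the pickle module directly, since cPickle no longer exists. The module's two most important functions are the ones shown above: dumps for serialization and loads for deserialization. A serialized object can be stored in a database or a file and quickly restored later.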
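A minimal round trip through the two wrappers might look like the sketch below. The `state` dict is made-up sample data, and the wrappers are repeated here only so the snippet runs on its own under Python 3.

```python
import pickle


def dumps(obj):
    # protocol=-1 asks pickle for the highest protocol it supports.
    return pickle.dumps(obj, protocol=-1)


def loads(s):
    return pickle.loads(s)


# Made-up sample data standing in for an object's state.
state = {"url": "https://example.com", "priority": 10, "tags": ("a", "b")}

blob = dumps(state)   # bytes, ready to be stored or transmitted
copy = loads(blob)    # an equal but separate object

# A negative protocol is the same as asking for HIGHEST_PROTOCOL explicitly.
same_as_highest = blob == pickle.dumps(state, protocol=pickle.HIGHEST_PROTOCOL)
```

The restored `copy` compares equal to `state` but is a new object, which is what makes this usable for saving and restoring state.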
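The memento pattern from design patterns is a natural fit for this approach (see "python设计模式(十九):备忘录模式", Python Design Patterns (19): The Memento Pattern). The following objects and data types can be pickled: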
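- None, True, and False
- integers, long integers, floating-point numbers, complex numbers
- normal strings and Unicode strings
- tuples, lists, sets, and dictionaries containing only picklable objects
- functions defined at the top level of a module
- built-in functions defined at the top level of a module
- classes defined at the top level of a module
- instances of such classes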
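The categories above can be checked with a quick round trip. This is a sketch for Python 3, where there is no separate long integer type and `bytes` plays the role of the old plain string. The sample values, and the choice of `json.dumps`, `math.sqrt`, and `Fraction` as representatives, are arbitrary.

```python
import json
import math
import pickle
from fractions import Fraction

samples = [
    None, True, False,                      # singletons
    42, 3.14, 2 + 3j,                       # numbers
    "text", b"bytes",                       # strings
    (1, "a"), [1, 2], {1, 2}, {"k": [1]},   # containers of picklable objects
    json.dumps,                             # function defined at module top level
    math.sqrt,                              # built-in function
    Fraction,                               # class defined at module top level
    Fraction(1, 3),                         # instance of such a class
]

round_tripped = [pickle.loads(pickle.dumps(obj)) for obj in samples]
all_equal = all(a == b for a, b in zip(samples, round_tripped))

# Functions and classes are pickled by reference (module name plus object
# name), so unpickling returns the very same object rather than a copy.
same_function = pickle.loads(pickle.dumps(json.dumps)) is json.dumps
same_class = pickle.loads(pickle.dumps(Fraction)) is Fraction
```

Because functions and classes are stored by name, the module that defines them must be importable wherever the data is unpickled.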
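Attempting to pickle an object that is not picklable raises PicklingError. Pickling a very deeply recursive data structure can exceed the maximum recursion depth, in which case a RuntimeError is raised.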
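Both failure modes can be reproduced with a short sketch. It assumes Python 3, where the recursion error is reported as RecursionError, a subclass of RuntimeError. The nesting depth of 100,000 is an arbitrary value chosen to be far above the default recursion limit.

```python
import pickle

# A lambda has no importable name, so pickle cannot store a reference to it.
# Python 3 normally raises pickle.PicklingError here; AttributeError is caught
# as well because the exact exception type has varied between versions.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_failed = False
except (pickle.PicklingError, AttributeError):
    lambda_failed = True

# Build a list nested far deeper than the interpreter's recursion limit.
nested = []
for _ in range(100_000):
    nested = [nested]

# Pickling walks the structure recursively and gives up with RecursionError,
# which is caught here through its base class RuntimeError.
try:
    pickle.dumps(nested)
    recursion_failed = False
except RuntimeError:
    recursion_failed = True
```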
pickle.dump(obj, file[, protocol])
    Write a pickled representation of obj to the open file object file. This is equivalent to Pickler(file, protocol).dump(obj).
    If the protocol parameter is omitted, protocol 0 is used. If protocol is specified as a negative value or HIGHEST_PROTOCOL, the highest protocol version will be used.
    Changed in version 2.3: Introduced the protocol parameter.
    file must have a write() method that accepts a single string argument. It can thus be a file object opened for writing, a StringIO object, or any other custom object that meets this interface.

pickle.load(file)
    Read a string from the open file object file and interpret it as a pickle data stream, reconstructing and returning the original object hierarchy. This is equivalent to Unpickler(file).load().
    file must have two methods, a read() method that takes an integer argument, and a readline() method that requires no arguments. Both methods should return a string. Thus file can be a file object opened for reading, a StringIO object, or any other custom object that meets this interface.
    This function automatically determines whether the data stream was written in binary mode or not.

pickle.dumps(obj[, protocol])
    Return the pickled representation of the object as a string, instead of writing it to a file.
    If the protocol parameter is omitted, protocol 0 is used. If protocol is specified as a negative value or HIGHEST_PROTOCOL, the highest protocol version will be used.
    Changed in version 2.3: The protocol parameter was added.

pickle.loads(string)
    Read a pickled object hierarchy from a string. Characters in the string past the pickled object's representation are ignored.
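The four functions can be exercised together as in the sketch below. It targets Python 3, where they read and write bytes rather than str and the default protocol is higher than 0. `io.BytesIO` stands in for a file opened in binary mode, and `record` is made-up sample data.

```python
import io
import pickle

record = {"url": "https://example.com", "retries": 3}

# dump()/load() work with any file-like object; BytesIO stands in for a file
# opened in binary mode ("wb" / "rb").
buffer = io.BytesIO()
pickle.dump(record, buffer, protocol=pickle.HIGHEST_PROTOCOL)
pickle.dump([1, 2, 3], buffer)   # a second pickle written to the same stream

buffer.seek(0)
first = pickle.load(buffer)      # each load() reads exactly one object
second = pickle.load(buffer)

# dumps()/loads() do the same thing in memory.
blob = pickle.dumps(record, protocol=-1)
from_bytes = pickle.loads(blob)

# Data after the end of the pickled object is ignored by loads().
with_trailing = pickle.loads(blob + b"trailing junk")
```

Note that unpickling executes instructions embedded in the data, so `load()` and `loads()` should only be given data from a trusted source.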