前面使用py-postgresql测试过PostgreSQL性能, 可能是这个驱动效率较低, 我们接下来使用psycopg2测试一下.
psycopg2使用libpq接口, 支持2PC, 支持异步提交等,
但是不支持绑定变量.
安装
异步操作, 对数据库端来说意义不大, 因为在执行的话, 不能再次提交SQL请求.
[root@localhost ~]# . /home/postgres/.bash_profile root@localhost-> which pg_config /opt/pgsql9.3.5/bin/pg_config [root@localhost ~]# pip3.4 install psycopg2 --upgrade 或 [root@localhost ~]# pip3.4 install psycopg2
AI 代码解读
异步操作, 对数据库端来说意义不大, 因为在执行的话, 不能再次提交SQL请求.
如下 :
同时psycopg2不支持prepared statement
为什么这么说呢?, 调用 curs.execute("insert into tt values(%(id)s, 'digoal.zhou', 32, 'digoal@126.com', '276732431')", {"id": 1})
import psycopg2 import time import select conn = psycopg2.connect(database="postgres", user="postgres", password="postgres", host="/data01/pgdata/pg_root", port="1921", async=True) >>> def wait(conn): ... while 1: ... state = conn.poll() ... if state == psycopg2.extensions.POLL_OK: ... break ... elif state == psycopg2.extensions.POLL_WRITE: ... select.select([], [conn.fileno()], []) ... elif state == psycopg2.extensions.POLL_READ: ... select.select([conn.fileno()], [], []) ... else: ... raise psycopg2.OperationalError("poll() returned %s" % state) ... wait(conn) curs = conn.cursor() curs.execute("insert into tt values(%(id)s, 'digoal.zhou', 32, 'digoal@126.com', '276732431')", {"id": 1}) curs.execute("insert into tt values(%(id)s, 'digoal.zhou', 32, 'digoal@126.com', '276732431')", {"id": 1}) Traceback (most recent call last): File "<stdin>", line 1, in <module> psycopg2.ProgrammingError: execute cannot be used while an asynchronous query is underway >>> wait(curs.connection) >>> wait(curs.connection) >>> curs.execute("insert into tt values(%(id)s, 'digoal.zhou', 32, 'digoal@126.com', '276732431')", {"id": 1}) >>> wait(curs.connection) >>> curs.execute("insert into tt values(%(id)s, 'digoal.zhou', 32, 'digoal@126.com', '276732431')", {"id": 1}) >>> wait(curs.connection) >>> conn.poll() 0 >>> curs.execute("insert into tt values(%(id)s, 'digoal.zhou', 32, 'digoal@126.com', '276732431')", {"id": 1}) >>> conn.poll() 0 >>> curs.execute("insert into tt values(%(id)s, 'digoal.zhou', 32, 'digoal@126.com', '276732431')", {"id": 1}) >>> conn.poll() 0 >>> psycopg2.extensions.POLL_OK 0 >>> psycopg2.extensions.POLL_WRITE 2 >>> psycopg2.extensions.POLL_READ 1
AI 代码解读
同时psycopg2不支持prepared statement
>>> curs.prepare("insert into tt values(%(id)s, 'digoal.zhou', 32, 'digoal@126.com', '276732431')") Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'psycopg2._psycopg.cursor' object has no attribute 'prepare' >>> dir(curs) ['__class__', '__delattr__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'arraysize', 'binary_types', 'callproc', 'cast', 'close', 'closed', 'connection', 'copy_expert', 'copy_from', 'copy_to', 'description', 'execute', 'executemany', 'fetchall', 'fetchmany', 'fetchone', 'itersize', 'lastrowid', 'mogrify', 'name', 'nextset', 'query', 'row_factory', 'rowcount', 'rownumber', 'scroll', 'scrollable', 'setinputsizes', 'setoutputsize', 'statusmessage', 'string_types', 'typecaster', 'tzinfo_factory', 'withhold']
AI 代码解读
为什么这么说呢?, 调用 curs.execute("insert into tt values(%(id)s, 'digoal.zhou', 32, 'digoal@126.com', '276732431')", {"id": 1})
你觉得是绑定变量吗? 看看PostgreSQL的日志吧 :
2015-02-05 22:37:32.901 CST,"postgres","postgres",69785,"[local]",54d37ee8.11099,1,"idle",2015-02-05 22:32:08 CST,3/1996971,0,LOG,00000,"statement: insert into tt values(1, 'digoal.zhou', 32, 'digoal@126.com', '276732431')",,,,,,,,"exec_simple_query, postgres.c:890",""
AI 代码解读
调用的是exec_simple_query接口, 当然不是绑定变量.
间接的使用绑定变量, 但实际上SQL调用的还是
exec_simple_query接口, 只是在处理execute这个SQL时使用到了绑定变量.
这个是较老的用法.
PostgreSQL日志 :
>>> curs.execute("prepare pre1(int) as insert into tt values($1, 'digoal.zhou', 32, 'digoal@126.com', '276732431')") >>> conn.poll() 0 >>> curs.execute("execute pre1(1)") >>> conn.poll() 0 >>> curs.execute("execute pre1(1)") >>> conn.poll() 0
AI 代码解读
PostgreSQL日志 :
2015-02-05 22:50:42.678 CST,"postgres","postgres",69785,"[local]",54d37ee8.11099,5,"idle",2015-02-05 22:32:08 CST,3/1996975,0,LOG,00000,"statement: prepare pre1(int) as insert into tt values($1, 'digoal.zhou', 32, 'digoal@126.com', '276732431')",,,,,,,,"exec_simple_query, postgres.c:890","" 2015-02-05 22:50:59.425 CST,"postgres","postgres",69785,"[local]",54d37ee8.11099,6,"idle",2015-02-05 22:32:08 CST,3/1996976,0,LOG,00000,"statement: execute pre1(1)","prepare: prepare pre1(int) as insert into tt values($1, 'digoal.zhou', 32, 'digoal@126.com', '276732431')",,,,,,,"exec_simple_query, postgres.c:890","" 2015-02-05 22:51:22.425 CST,"postgres","postgres",69785,"[local]",54d37ee8.11099,7,"idle",2015-02-05 22:32:08 CST,3/1996977,0,LOG,00000,"statement: execute pre1(1)","prepare: prepare pre1(int) as insert into tt values($1, 'digoal.zhou', 32, 'digoal@126.com', '276732431')",,,,,,,"exec_simple_query, postgres.c:890",""
AI 代码解读
效率如何呢?
使用8个线程插入100W数据, 耗时41秒.
py-postgresql驱动耗时226秒, psycopg2效率高了很多, 接近pgbench的16秒了.
测试结果 :
[root@localhost ~]# cat t.py import psycopg2 import time import threading class n_t(threading.Thread): #The timer class is derived from the class threading.Thread def __init__(self, num): threading.Thread.__init__(self) self.thread_num = num def run(self): #Overwrite run() method, put what you want the thread do here conn = psycopg2.connect(database="postgres", user="postgres", password="postgres", host="/data01/pgdata/pg_root", port="1921") curs = conn.cursor() conn.autocommit=True start_t = time.time() print("TID:" + str(self.thread_num) + " " + str(start_t)) for i in range(self.thread_num*125000, (self.thread_num+1)*125000): curs.execute("insert into tt values(%(id)s, 'digoal.zhou', 32, 'digoal@126.com', '276732431')", {"id": i}) stop_t = time.time() print("TID:" + str(self.thread_num) + " " + str(stop_t)) print(stop_t-start_t) def test(): t_names = dict() for i in range(0,8): t_names[i] = n_t(i) t_names[i].start() return if __name__ == '__main__': test()
AI 代码解读
测试结果 :
[root@localhost ~]# python t.py TID:0 1423148404.5820847 TID:1 1423148404.5828164 TID:2 1423148404.5850544 TID:3 1423148404.58532 TID:4 1423148404.5861893 TID:5 1423148404.5867805 TID:6 1423148404.5882406 TID:7 1423148404.588671 TID:2 1423148445.146449 40.561394691467285 TID:7 1423148445.2051861 40.616515159606934 TID:6 1423148445.2321012 40.64386057853699 TID:1 1423148445.263236 40.68041968345642 TID:5 1423148445.2703242 40.68354368209839 TID:0 1423148445.2967775 40.71469283103943 TID:3 1423148445.3065617 40.72124171257019 TID:4 1423148445.3345916 40.74840235710144 postgres=# select count(*),count(distinct id) from tt; count | count ---------+--------- 1000000 | 1000000 (1 row)
AI 代码解读
[参考]