Python 金融编程第二版（GPT 重译）（四）（4）-阿里云开发者社区

Python 金融编程第二版（GPT 重译）（四）（3）https://developer.aliyun.com/article/1559367

使用 PyTables 进行快速 I/O

PyTables是HDF5数据库标准的 Python 绑定（参见http://www.hdfgroup.org）。它专门设计用于优化 I/O 操作的性能，并充分利用可用的硬件。库的导入名称是tables。与pandas类似，当涉及到内存分析时，PyTables既不能也不意味着是对SQL数据库的完全替代。然而，它带来了一些进一步缩小差距的特性。例如，一个PyTables数据库可以有很多表，它支持压缩和索引以及对表的非平凡查询。此外，它可以有效地存储NumPy数组，并具有其自己的数组数据结构的风格。

首先，一些导入：

In [123]: import tables as tb  ![1](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/1.png)
          import datetime as dt

包名是PyTables，导入名称是tables。

与表格一起工作

PyTables提供了一种基于文件的数据库格式，类似于SQLite3。⁵。以下是打开数据库文件并创建表格的示例：

In [124]: filename = path + 'pytab.h5'
In [125]: h5 = tb.open_file(filename, 'w')  ![1](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/1.png)
In [126]: row_des = {
              'Date': tb.StringCol(26, pos=1),  ![2](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/2.png)
              'No1': tb.IntCol(pos=2),  ![3](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/3.png)
              'No2': tb.IntCol(pos=3),  ![3](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/3.png)
              'No3': tb.Float64Col(pos=4),  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
              'No4': tb.Float64Col(pos=5)  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
              }
In [127]: rows = 2000000
In [128]: filters = tb.Filters(complevel=0)  ![5](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/5.png)
In [129]: tab = h5.create_table('/', 'ints_floats',  ![6](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/6.png)
                                row_des,  ![7](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/7.png)
                                title='Integers and Floats',  ![8](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/8.png)
                                expectedrows=rows,  ![9](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/9.png)
                                filters=filters)  ![10](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/10.png)
In [130]: type(tab)
Out[130]: tables.table.Table
In [131]: tab
Out[131]: /ints_floats (Table(0,)) 'Integers and Floats'
            description := {
            "Date": StringCol(itemsize=26, shape=(), dflt=b'', pos=0),
            "No1": Int32Col(shape=(), dflt=0, pos=1),
            "No2": Int32Col(shape=(), dflt=0, pos=2),
            "No3": Float64Col(shape=(), dflt=0.0, pos=3),
            "No4": Float64Col(shape=(), dflt=0.0, pos=4)}
            byteorder := 'little'
            chunkshape := (2621,)

以HDF5二进制存储格式打开数据库文件。

用于日期时间信息的date列（作为str对象）。

用于存储int对象的两列。

用于存储float对象的两列。

通过Filters对象，可以指定压缩级别等。

表的节点（路径）和技术名称。

行数据结构的描述。

表的名称（标题）。

预期的行数；允许进行优化。

用于表格的Filters对象。

为了用数字数据填充表格，生成两个具有随机数字的ndarray对象。一个是随机整数，另一个是随机浮点数。通过一个简单的 Python 循环来填充表格。

In [132]: pointer = tab.row  ![1](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/1.png)
In [133]: ran_int = np.random.randint(0, 10000, size=(rows, 2))  ![2](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/2.png)
In [134]: ran_flo = np.random.standard_normal((rows, 2)).round(4)  ![3](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/3.png)
In [135]: %%time
          for i in range(rows):
              pointer['Date'] = dt.datetime.now()  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
              pointer['No1'] = ran_int[i, 0]  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
              pointer['No2'] = ran_int[i, 1]  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
              pointer['No3'] = ran_flo[i, 0]  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
              pointer['No4'] = ran_flo[i, 1]  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
              pointer.append()  ![5](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/5.png)
          tab.flush()  ![6](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/6.png)
          CPU times: user 8.36 s, sys: 136 ms, total: 8.49 s
          Wall time: 8.92 s
In [136]: tab  ![7](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/7.png)
Out[136]: /ints_floats (Table(2000000,)) 'Integers and Floats'
            description := {
            "Date": StringCol(itemsize=26, shape=(), dflt=b'', pos=0),
            "No1": Int32Col(shape=(), dflt=0, pos=1),
            "No2": Int32Col(shape=(), dflt=0, pos=2),
            "No3": Float64Col(shape=(), dflt=0.0, pos=3),
            "No4": Float64Col(shape=(), dflt=0.0, pos=4)}
            byteorder := 'little'
            chunkshape := (2621,)
In [137]: ll $path*
          -rw-r--r--  1 yves  staff  100156248 Jan 18 10:06 /Users/yves/Documents/Temp/data/pytab.h5

创建了一个指针对象。

具有随机int对象的ndarray对象。

具有随机float对象的ndarray对象。

datetime对象，两个int和两个float对象被逐行写入。

新行被附加。

所有写入的行都会被刷新，即作为永久更改提交。

更改反映在 Table 对象描述中。

在这种情况下，Python 循环相当慢。有一种更高效和 Pythonic 的方法可以实现相同的结果，即使用 NumPy 结构化数组。使用存储在结构化数组中的完整数据集，表的创建归结为一行代码。请注意，不再需要行描述; PyTables 使用结构化数组的 dtype 对象来推断数据类型：

In [138]: dty = np.dtype([('Date', 'S26'), ('No1', '<i4'), ('No2', '<i4'),
                                           ('No3', '<f8'), ('No4', '<f8')])  ![1](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/1.png)
In [139]: sarray = np.zeros(len(ran_int), dtype=dty)  ![2](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/2.png)
In [140]: sarray[:4]  ![3](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/3.png)
Out[140]: array([(b'', 0, 0,  0.,  0.), (b'', 0, 0,  0.,  0.), (b'', 0, 0,  0.,  0.),
                 (b'', 0, 0,  0.,  0.)],
                dtype=[('Date', 'S26'), ('No1', '<i4'), ('No2', '<i4'), ('No3', '<f8'), ('No4', '<f8')])
In [141]: %%time
          sarray['Date'] = dt.datetime.now()  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
          sarray['No1'] = ran_int[:, 0]  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
          sarray['No2'] = ran_int[:, 1]  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
          sarray['No3'] = ran_flo[:, 0]  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
          sarray['No4'] = ran_flo[:, 1]  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
          CPU times: user 82.7 ms, sys: 37.9 ms, total: 121 ms
          Wall time: 133 ms
In [142]: %%time
          h5.create_table('/', 'ints_floats_from_array', sarray,
                                title='Integers and Floats',
                                expectedrows=rows, filters=filters)  ![5](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/5.png)
          CPU times: user 39 ms, sys: 61 ms, total: 100 ms
          Wall time: 123 ms
Out[142]: /ints_floats_from_array (Table(2000000,)) 'Integers and Floats'
            description := {
            "Date": StringCol(itemsize=26, shape=(), dflt=b'', pos=0),
            "No1": Int32Col(shape=(), dflt=0, pos=1),
            "No2": Int32Col(shape=(), dflt=0, pos=2),
            "No3": Float64Col(shape=(), dflt=0.0, pos=3),
            "No4": Float64Col(shape=(), dflt=0.0, pos=4)}
            byteorder := 'little'
            chunkshape := (2621,)

定义特殊的 dtype 对象。

使用零（和空字符串）创建结构化数组。

来自 ndarray 对象的几条记录。

ndarray 对象的列一次性填充。

这将创建 Table 对象，并用数据填充它。

这种方法快了一个数量级，代码更简洁，且实现了相同的结果。

In [143]: type(h5)
Out[143]: tables.file.File
In [144]: h5  ![1](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/1.png)
Out[144]: File(filename=/Users/yves/Documents/Temp/data/pytab.h5, title='', mode='w', root_uep='/', filters=Filters(complevel=0, shuffle=False, bitshuffle=False, fletcher32=False, least_significant_digit=None))
          / (RootGroup) ''
          /ints_floats (Table(2000000,)) 'Integers and Floats'
            description := {
            "Date": StringCol(itemsize=26, shape=(), dflt=b'', pos=0),
            "No1": Int32Col(shape=(), dflt=0, pos=1),
            "No2": Int32Col(shape=(), dflt=0, pos=2),
            "No3": Float64Col(shape=(), dflt=0.0, pos=3),
            "No4": Float64Col(shape=(), dflt=0.0, pos=4)}
            byteorder := 'little'
            chunkshape := (2621,)
          /ints_floats_from_array (Table(2000000,)) 'Integers and Floats'
            description := {
            "Date": StringCol(itemsize=26, shape=(), dflt=b'', pos=0),
            "No1": Int32Col(shape=(), dflt=0, pos=1),
            "No2": Int32Col(shape=(), dflt=0, pos=2),
            "No3": Float64Col(shape=(), dflt=0.0, pos=3),
            "No4": Float64Col(shape=(), dflt=0.0, pos=4)}
            byteorder := 'little'
            chunkshape := (2621,)
In [145]: h5.remove_node('/', 'ints_floats_from_array')  ![2](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/2.png)

带有两个 Table 对象的 File 对象的描述。

这会删除具有冗余数据的第二个 Table 对象。

Table 对象在大多数情况下的行为与 NumPy 结构化的 ndarray 对象非常相似（另见图 9-5）：

In [146]: tab[:3]  ![1](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/1.png)
Out[146]: array([(b'2018-01-18 10:06:28.516235', 8576, 5991, -0.0528,  0.2468),
                 (b'2018-01-18 10:06:28.516332', 2990, 9310, -0.0261,  0.3932),
                 (b'2018-01-18 10:06:28.516344', 4400, 4823,  0.9133,  0.2579)],
                dtype=[('Date', 'S26'), ('No1', '<i4'), ('No2', '<i4'), ('No3', '<f8'), ('No4', '<f8')])
In [147]: tab[:4]['No4']  ![2](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/2.png)
Out[147]: array([ 0.2468,  0.3932,  0.2579, -0.5582])
In [148]: %time np.sum(tab[:]['No3'])  ![3](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/3.png)
          CPU times: user 64.5 ms, sys: 97.1 ms, total: 162 ms
          Wall time: 165 ms
Out[148]: 88.854299999999697
In [149]: %time np.sum(np.sqrt(tab[:]['No1']))  ![3](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/3.png)
          CPU times: user 59.3 ms, sys: 69.4 ms, total: 129 ms
          Wall time: 130 ms
Out[149]: 133349920.36892509
In [150]: %%time
          plt.figure(figsize=(10, 6))
          plt.hist(tab[:]['No3'], bins=30);  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
          plt.savefig('../../images/ch09/io_05.png');
          CPU times: user 244 ms, sys: 67.6 ms, total: 312 ms
          Wall time: 340 ms

通过索引选择行。

仅通过索引选择列值。

应用 NumPy 通用函数。

从 Table 对象绘制列。

图 9-5. 列数据的直方图

PyTables 还提供了通过典型的 SQL-like 语句查询数据的灵活工具，如下例所示（其结果如图 9-6 所示；与图 9-2 相比，基于 pandas 查询）：

In [151]: query = '((No3 < -0.5) | (No3 > 0.5)) & ((No4 < -1) | (No4 > 1))'  ![1](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/1.png)
In [152]: iterator = tab.where(query)  ![2](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/2.png)
In [153]: %time res = [(row['No3'], row['No4']) for row in iterator]  ![3](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/3.png)
          CPU times: user 487 ms, sys: 128 ms, total: 615 ms
          Wall time: 637 ms
In [154]: res = np.array(res)  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
          res[:3]
Out[154]: array([[ 0.7694,  1.4866],
                 [ 0.9201,  1.3346],
                 [ 1.4701,  1.8776]])
In [155]: plt.figure(figsize=(10, 6))
          plt.plot(res.T[0], res.T[1], 'ro');
          plt.savefig('../../images/ch09/io_06.png');

查询作为 str 对象，由逻辑运算符组合的四个条件。

基于查询的迭代器对象。

通过列表推导收集查询结果的行…

… 并转换为 ndarray 对象。

图 9-6. 列数据的直方图

快速查询

pandas和PyTables都能够处理相对复杂的、类似SQL的查询和选择。它们在执行此类操作时都进行了速度优化。但是，与关系型数据库相比，这些方法当然存在限制。但对于大多数数值和金融应用程序，它们通常并不决定性。

正如以下示例所示，使用存储在PyTables中的数据作为Table对象让您感觉就像是在NumPy或pandas中工作且是内存中的，从语法和性能方面都是如此：

In [156]: %%time
          values = tab[:]['No3']
          print('Max %18.3f' % values.max())
          print('Ave %18.3f' % values.mean())
          print('Min %18.3f' % values.min())
          print('Std %18.3f' % values.std())
          Max              5.224
          Ave              0.000
          Min             -5.649
          Std              1.000
          CPU times: user 88.9 ms, sys: 70 ms, total: 159 ms
          Wall time: 156 ms
In [157]: %%time
          res = [(row['No1'], row['No2']) for row in
                  tab.where('((No1 > 9800) | (No1 < 200)) \
 & ((No2 > 4500) & (No2 < 5500))')]
          CPU times: user 78.4 ms, sys: 38.9 ms, total: 117 ms
          Wall time: 80.9 ms
In [158]: for r in res[:4]:
              print(r)
          (91, 4870)
          (9803, 5026)
          (9846, 4859)
          (9823, 5069)
In [159]: %%time
          res = [(row['No1'], row['No2']) for row in
                  tab.where('(No1 == 1234) & (No2 > 9776)')]
          CPU times: user 58.9 ms, sys: 40.1 ms, total: 99 ms
          Wall time: 133 ms
In [160]: for r in res:
              print(r)
          (1234, 9841)
          (1234, 9821)
          (1234, 9867)
          (1234, 9987)
          (1234, 9849)
          (1234, 9800)

使用压缩表

使用PyTables的一个主要优势是它采用的压缩方法。它不仅使用压缩来节省磁盘空间，还利用了在某些硬件场景下改善 I/O 操作性能的压缩。这是如何实现的？当 I/O 成为瓶颈，而 CPU 能够快速（解）压缩数据时，压缩在速度方面的净效果可能是积极的。由于以下示例基于标准 SSD 的 I/O，因此观察不到压缩的速度优势。但是，使用压缩也几乎没有缺点：

In [161]: filename = path + 'pytabc.h5'
In [162]: h5c = tb.open_file(filename, 'w')
In [163]: filters = tb.Filters(complevel=5,  ![1](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/1.png)
                               complib='blosc')  ![2](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/2.png)
In [164]: tabc = h5c.create_table('/', 'ints_floats', sarray,
                                  title='Integers and Floats',
                                  expectedrows=rows, filters=filters)
In [165]: query = '((No3 < -0.5) | (No3 > 0.5)) & ((No4 < -1) | (No4 > 1))'
In [166]: iteratorc = tabc.where(query)  ![3](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/3.png)
In [167]: %time res = [(row['No3'], row['No4']) for row in iteratorc]  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)
          CPU times: user 362 ms, sys: 55.3 ms, total: 418 ms
          Wall time: 445 ms
In [168]: res = np.array(res)
          res[:3]
Out[168]: array([[ 0.7694,  1.4866],
                 [ 0.9201,  1.3346],
                 [ 1.4701,  1.8776]])

压缩级别（complevel）可以取 0（无压缩）到 9（最高压缩）的值。

使用了经过优化的Blosc压缩引擎（Blosc），该引擎旨在提高性能。

给定前面查询的迭代器对象。

通过列表推导收集查询结果行。

使用原始数据生成压缩的Table对象并对其进行分析比使用未压缩的Table对象稍慢一些。那么将数据读入ndarray对象呢？让我们来检查一下：

In [169]: %time arr_non = tab.read()  ![1](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/1.png)
          CPU times: user 42.9 ms, sys: 69.9 ms, total: 113 ms
          Wall time: 117 ms
In [170]: tab.size_on_disk
Out[170]: 100122200
In [171]: arr_non.nbytes
Out[171]: 100000000
In [172]: %time arr_com = tabc.read()  ![2](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/2.png)
          CPU times: user 123 ms, sys: 60.5 ms, total: 184 ms
          Wall time: 191 ms
In [173]: tabc.size_on_disk
Out[173]: 40612465
In [174]: arr_com.nbytes
Out[174]: 100000000
In [175]: ll $path*  ![3](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/3.png)
          -rw-r--r--  1 yves  staff  200312336 Jan 18 10:06 /Users/yves/Documents/Temp/data/pytab.h5
          -rw-r--r--  1 yves  staff   40647761 Jan 18 10:06 /Users/yves/Documents/Temp/data/pytabc.h5
In [176]: h5c.close()  ![4](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/4.png)

从未压缩的Table对象tab中读取。

从压缩的Table对象tabc中读取。

压缩表的大小显着减小了。

关闭数据库文件。

例子表明，与未压缩的Table对象相比，使用压缩的Table对象工作时几乎没有速度差异。但是，磁盘上的文件大小可能会根据数据的质量而显着减少，这有许多好处：

存储成本：存储成本降低了
备份成本：备份成本降低了
网络流量：网络流量减少了
网络速度：存储在远程服务器上并从中检索的速度更快
CPU 利用率：为了克服 I/O 瓶颈而增加了 CPU 利用率

使用数组

“Python 基本 I/O”演示了NumPy对于ndarray对象具有内置的快速写入和读取功能。当涉及到存储和检索ndarray对象时，PyTables也非常快速和高效。由于它基于分层数据库结构，因此提供了许多便利功能：

In [177]: %%time
          arr_int = h5.create_array('/', 'integers', ran_int)  ![1](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/1.png)
          arr_flo = h5.create_array('/', 'floats', ran_flo)  ![2](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/2.png)
          CPU times: user 3.24 ms, sys: 33.1 ms, total: 36.3 ms
          Wall time: 41.6 ms
In [178]: h5  ![3](https://gitee.com/OpenDocCN/ibooker-quant-zh/raw/master/docs/py-fin-2e/img/3.png)
Out[178]: File(filename=/Users/yves/Documents/Temp/data/pytab.h5, title='', mode='w', root_uep='/', filters=Filters(complevel=0, shuffle=False, bitshuffle=False, fletcher32=False, least_significant_digit=None))
          / (RootGroup) ''
          /floats (Array(2000000, 2)) ''
            atom := Float64Atom(shape=(), dflt=0.0)
            maindim := 0
            flavor := 'numpy'
            byteorder := 'little'
            chunkshape := None
          /integers (Array(2000000, 2)) ''
            atom := Int64Atom(shape=(), dflt=0)
            maindim := 0
            flavor := 'numpy'
            byteorder := 'little'
            chunkshape := None
          /ints_floats (Table(2000000,)) 'Integers and Floats'
            description := {
            "Date": StringCol(itemsize=26, shape=(), dflt=b'', pos=0),
            "No1": Int32Col(shape=(), dflt=0, pos=1),
            "No2": Int32Col(shape=(), dflt=0, pos=2),
            "No3": Float64Col(shape=(), dflt=0.0, pos=3),
            "No4": Float64Col(shape=(), dflt=0.0, pos=4)}
            byteorder := 'little'
            chunkshape := (2621,)
In [179]: ll $path*
          -rw-r--r--  1 yves  staff  262344490 Jan 18 10:06 /Users/yves/Documents/Temp/data/pytab.h5
          -rw-r--r--  1 yves  staff   40647761 Jan 18 10:06 /Users/yves/Documents/Temp/data/pytabc.h5
In [180]: h5.close()
In [181]: !rm -f $path*

存储ran_int ndarray对象。

存储ran_flo ndarray对象。

更改反映在对象描述中。

将这些对象直接写入HDF5数据库比遍历对象并逐行将数据写入Table对象或使用结构化ndarray对象的方法更快。

Python 金融编程第二版（GPT 重译）（四）（5）https://developer.aliyun.com/article/1559373

Python 金融编程第二版（GPT 重译）（四）（4）

使用 PyTables 进行快速 I/O

与表格一起工作

图 9-5. 列数据的直方图

图 9-6. 列数据的直方图

快速查询

使用压缩表

使用数组

热门文章

最新文章

相关课程

相关电子书

相关实验场景

推荐镜像

探索云世界

热门

云计算

大数据

云原生

人工智能

数据库

开发与运维

活动广场

任务中心

训练营

直播

乘风者计划

下载

镜像站

技术资料

Python 金融编程第二版（GPT 重译）（四）（4）

使用 PyTables 进行快速 I/O

与表格一起工作

图 9-5. 列数据的直方图

图 9-6. 列数据的直方图

快速查询

使用压缩表

使用数组

热门文章

最新文章

相关课程

相关电子书

相关实验场景

推荐镜像