Python for Finance, 2nd Edition (Part 1) (4) - Alibaba Cloud Developer Community


2024-07-04


Python for Finance, 2nd Edition (Part 1) (3): https://developer.aliyun.com/article/1559398

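Strings

Now that we can represent natural and floating-point numbers, we turn to text. The basic data type for representing text in Python is str. The str object has a number of helpful built-in methods. In fact, Python is generally considered a good choice for working with text files of any kind and size. A str object is generally defined with single or double quotation marks, or by converting another object with the str() function (i.e., using the object's standard or user-defined str representation):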

In [56]: t = 'this is a string object'

With regard to the built-in methods, you can, for example, capitalize the first letter of this object:

In [57]: t.capitalize()
Out[57]: 'This is a string object'

Or you can split it into its single-word components to get a list object of all the words (more on list objects later):

In [58]: t.split()
Out[58]: ['this', 'is', 'a', 'string', 'object']

You can also search for a word and, if the search succeeds, get the position (i.e., the index value) of the word's first letter:

In [59]: t.find('string')
Out[59]: 10

If the word is not in the str object, the method returns -1:

In [60]: t.find('Python')
Out[60]: -1
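Because -1 is itself a valid index (it addresses the last character), the result of find() is best checked before it is used for slicing. The following is a minimal sketch of that check, reusing the string from above; the variable names has_python, pos, and tail are illustrative and not part of the original text.

```python
t = 'this is a string object'

# Membership test: reads more clearly than comparing find() against -1
has_python = 'Python' in t

# Guard the index before slicing; an unchecked -1 would mean "last character"
pos = t.find('string')
tail = t[pos:] if pos != -1 else ''

print(has_python)
print(tail)
```

The in operator is usually enough when only a yes/no answer is needed; find() is the better fit when the position itself is used afterwards.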

Replacing characters in a string is a typical task that is easily accomplished with the replace() method:

In [61]: t.replace(' ', '|')
Out[61]: 'this|is|a|string|object'

Stripping strings, i.e., deleting certain leading and trailing characters, is also often necessary:

In [62]: 'http://www.python.org'.strip('htp:/')
Out[62]: 'www.python.org'

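Table 3-1 lists a number of helpful methods of the str object.

Table 3-1. Selected string methods

Method	Arguments	Returns/result
`capitalize`	`()`	Copy of the string with the first letter capitalized
`count`	`(sub[, start[, end]])`	Count of the number of occurrences of the substring
`decode`	`([encoding[, errors]])`	Decoded version of the data, using `encoding` (e.g., UTF-8); in Python 3 this is a method of `bytes` objects, not of `str`
`encode`	`([encoding[, errors]])`	Encoded version of the string
`find`	`(sub[, start[, end]])`	(Lowest) index at which the substring is found
`join`	`(seq)`	Concatenation of the strings in the sequence `seq`
`replace`	`(old, new[, count])`	Copy with the first `count` occurrences of `old` replaced by `new` (all occurrences if `count` is omitted)
`split`	`([sep[, maxsplit]])`	List of the words in the string, with `sep` as the separator
`splitlines`	`([keepends])`	List of the lines in the string, with line endings kept if `keepends` is `True`
`strip`	`([chars])`	Copy of the string with leading/trailing characters in `chars` removed
`upper`	`()`	Copy with all letters uppercased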
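Several of the methods in the table have not been shown in the examples so far. The sketch below exercises count, join, replace with a count argument, splitlines, and upper on the string used above; the result names are chosen only for illustration.

```python
t = 'this is a string object'

n_is = t.count('is')                    # non-overlapping occurrences of 'is'
joined = '|'.join(t.split())            # join() is called on the separator
first_two = t.replace(' ', '|', 2)      # replace only the first two blanks
lines = 'a\nb\n'.splitlines()           # line endings dropped by default
lines_keep = 'a\nb\n'.splitlines(True)  # line endings kept
shout = t.upper()                       # all letters uppercased

print(n_is, joined, first_two, lines, lines_keep, shout, sep='\n')
```

Note that join() is a method of the separator string rather than of the sequence, which is easy to get backwards at first.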

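Note

A fundamental change from Python 2.7 (first edition of the book) to Python 3.6 (the version used in this second edition) is the encoding and decoding of string objects and the introduction of Unicode (see https://docs.python.org/3/howto/unicode.html). This chapter cannot go into the many details that matter in this context. For the purposes of this book, which mainly deals with numerical data and standard strings containing English words, this omission seems justified.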
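To make the distinction concrete without going into detail, here is a minimal sketch of the round trip between str and bytes in Python 3. The example string is an assumption chosen because it contains one non-ASCII character; it does not come from the original text.

```python
s = 'Zürich'                 # str: a sequence of Unicode code points
b = s.encode('utf-8')        # bytes: the UTF-8 encoded representation
back = b.decode('utf-8')     # decode() is a bytes method and returns a str

print(type(s), len(s))       # length counted in characters
print(type(b), len(b))       # length counted in bytes
print(back == s)
```

The two lengths differ because the character ü occupies two bytes in UTF-8, which is the kind of detail that matters once text leaves the ASCII range.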

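Excursion: Printing and String Replacements

Printing str objects, or the string representations of other Python objects, is usually done with the print() function (which was a statement in Python 2.7).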

In [63]: print('Python for Finance')  # (1)
         Python for Finance
In [64]: print(t)  # (2)
         this is a string object
In [65]: i = 0
         while i < 4:
             print(i)  # (3)
             i += 1
         0
         1
         2
         3
In [66]: i = 0
         while i < 4:
             print(i, end='|')  # (4)
             i += 1
         0|1|2|3|

(1) Prints a str object.

(2) Prints a str object referenced by a variable name.

(3) Prints the string representation of an int object.

(4) Specifies the final character(s) when printing (the default is the line break \n, as seen before).
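Besides end, print() also accepts sep (the string placed between several arguments) and file (the stream to write to). The sketch below uses both; writing into an io.StringIO buffer is only a device to make the exact characters visible and is an assumption of this example, not something the original text does.

```python
import io

buf = io.StringIO()                    # in-memory text stream
print(0, 1, 2, 3, sep='|', file=buf)   # sep replaces the default blank
print('done', end='', file=buf)        # end='' suppresses the line break
out = buf.getvalue()

print(repr(out))
```

Compared with the loop using end='|', passing all values at once with sep='|' avoids the trailing separator.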

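Python provides powerful string replacement operations. There is the old way, via the % character, and the new way, via curly braces {} and format(). Both are still used in practice. This section cannot provide an exhaustive account of all options, but the following code snippets show some important ones. First, the old way.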

In [67]: 'this is an integer %d' % 15  # (1)
Out[67]: 'this is an integer 15'
In [68]: 'this is an integer %4d' % 15  # (2)
Out[68]: 'this is an integer   15'
In [69]: 'this is an integer %04d' % 15  # (3)
Out[69]: 'this is an integer 0015'
In [70]: 'this is a float %f' % 15.3456  # (4)
Out[70]: 'this is a float 15.345600'
In [71]: 'this is a float %.2f' % 15.3456  # (5)
Out[71]: 'this is a float 15.35'
In [72]: 'this is a float %8f' % 15.3456  # (6)
Out[72]: 'this is a float 15.345600'
In [73]: 'this is a float %8.2f' % 15.3456  # (7)
Out[73]: 'this is a float    15.35'
In [74]: 'this is a float %08.2f' % 15.3456  # (8)
Out[74]: 'this is a float 00015.35'
In [75]: 'this is a string %s' % 'Python'  # (9)
Out[75]: 'this is a string Python'
In [76]: 'this is a string %10s' % 'Python'  # (10)
Out[76]: 'this is a string     Python'

(1) int object replacement.

(2) With a fixed number of characters.

(3) With leading zeros if necessary.

(4) float object replacement.

(5) With a fixed number of decimals.

(6) With a fixed (minimum) number of characters, and decimals filled up to the default of six.

(7) With a fixed number of characters and decimals ...

(8) ... and leading zeros if necessary.

(9) str object replacement.

(10) With a fixed number of characters.

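Now the same examples implemented the new way. Note that the output differs slightly in some places (for example, {:10s} pads the string on the right, whereas %10s pads it on the left).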

In [77]: 'this is an integer {:d}'.format(15)
Out[77]: 'this is an integer 15'
In [78]: 'this is an integer {:4d}'.format(15)
Out[78]: 'this is an integer   15'
In [79]: 'this is an integer {:04d}'.format(15)
Out[79]: 'this is an integer 0015'
In [80]: 'this is a float {:f}'.format(15.3456)
Out[80]: 'this is a float 15.345600'
In [81]: 'this is a float {:.2f}'.format(15.3456)
Out[81]: 'this is a float 15.35'
In [82]: 'this is a float {:8f}'.format(15.3456)
Out[82]: 'this is a float 15.345600'
In [83]: 'this is a float {:8.2f}'.format(15.3456)
Out[83]: 'this is a float    15.35'
In [84]: 'this is a float {:08.2f}'.format(15.3456)
Out[84]: 'this is a float 00015.35'
In [85]: 'this is a string {:s}'.format('Python')
Out[85]: 'this is a string Python'
In [86]: 'this is a string {:10s}'.format('Python')
Out[86]: 'this is a string Python    '
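The one visible difference between the two listings is the alignment of the padded str object. If the same alignment is wanted in both styles, an explicit alignment flag can be given. This is a small sketch; the names right and left are illustrative.

```python
# '>' forces right alignment in format(), matching the default of %10s
right = 'this is a string {:>10s}'.format('Python')

# '-' forces left alignment in %-formatting, matching the default of {:10s}
left = 'this is a string %-10s' % 'Python'

print(repr(right))
print(repr(left))
```

Numbers are right-aligned by default in both styles, which is why the integer and float examples produce identical output.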

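String replacements are particularly useful in the context of multiple printing operations where the printed data is updated, for instance, during a while loop.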

In [87]: i = 0
         while i < 4:
             print('the number is %d' % i)
             i += 1
         the number is 0
         the number is 1
         the number is 2
         the number is 3
In [88]: i = 0
         while i < 4:
             print('the number is {:d}'.format(i))
             i += 1
         the number is 0
         the number is 1
         the number is 2
         the number is 3

Excursion: Regular Expressions

A powerful tool when working with str objects is regular expressions. Python provides this functionality in the module re:

In [89]: import re

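Suppose you are faced with a large text file, such as a comma-separated values (CSV) file, that contains certain time series and the respective date-time information. More often than not, the date-time information is delivered in a format that Python cannot interpret directly. However, the date-time information can generally be described by a regular expression. Consider the following str object, containing three date-time elements, three integers, and three strings. Note that triple quotation marks allow the definition of str objects over multiple lines: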

In [90]: series = """
 '01/18/2014 13:00:00', 100, '1st';
 '01/18/2014 13:30:00', 110, '2nd';
 '01/18/2014 14:00:00', 120, '3rd'
 """

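The following regular expression describes the format of the date-time information provided in the str object:⁴

In [91]: dt = re.compile(r"'[0-9/:\s]+'")  # datetime (raw string keeps \s intact)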

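Equipped with this regular expression, we can go on and find all the date-time elements. In general, applying regular expressions to str objects can also lead to performance improvements for typical parsing tasks: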

In [92]: result = dt.findall(series)
         result
Out[92]: ["'01/18/2014 13:00:00'", "'01/18/2014 13:30:00'", "'01/18/2014 14:00:00'"]
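The same approach can pull out the other two kinds of elements in the string. The patterns num and label below are hypothetical and written only for this sample data (an integer between two commas, and a quoted ordinal such as '1st'); they are not taken from the original text.

```python
import re

series = """
'01/18/2014 13:00:00', 100, '1st';
'01/18/2014 13:30:00', 110, '2nd';
'01/18/2014 14:00:00', 120, '3rd'
"""

# Hypothetical patterns for this sample only
num = re.compile(r",\s*(\d+)\s*,")          # digits between two commas
label = re.compile(r"'(\d(?:st|nd|rd))'")   # quoted ordinal, e.g. '1st'

# With one capture group, findall() returns just the captured text
values = [int(v) for v in num.findall(series)]
labels = label.findall(series)

print(values)
print(labels)
```

Using a capture group keeps the surrounding commas and quotation marks out of the result, so no replace() call is needed afterwards.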

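Regular Expressions

When parsing str objects, consider using regular expressions, which can bring both convenience and performance to such operations.

The resulting str objects can then be parsed to generate Python datetime objects (see [Link to Come] for an overview of handling date and time data with Python). To parse the str objects containing the date-time information, we need to provide information on how to parse them, again as a str object: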

In [93]: from datetime import datetime
         pydt = datetime.strptime(result[0].replace("'", ""),
                                  '%m/%d/%Y %H:%M:%S')
         pydt
Out[93]: datetime.datetime(2014, 1, 18, 13, 0)
In [94]: print(pydt)
         2014-01-18 13:00:00
In [95]: print(type(pydt))
         <class 'datetime.datetime'>

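Later chapters provide more information on date-time data, the handling of such data, and datetime objects and their methods. This is just meant to be a teaser for this important topic in finance.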
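As a small extension of the parsing step above, all matches can be converted in one pass, after which ordinary arithmetic on datetime objects becomes available. This is a sketch under the assumption that every match follows the same format; the names fmt, stamps, and deltas are illustrative.

```python
import re
from datetime import datetime, timedelta

series = """
'01/18/2014 13:00:00', 100, '1st';
'01/18/2014 13:30:00', 110, '2nd';
'01/18/2014 14:00:00', 120, '3rd'
"""

dt = re.compile(r"'[0-9/:\s]+'")        # same pattern as in the text
fmt = '%m/%d/%Y %H:%M:%S'               # same format string as in the text

# strip("'") removes the enclosing quotation marks before parsing
stamps = [datetime.strptime(s.strip("'"), fmt) for s in dt.findall(series)]

# Differences between consecutive timestamps are timedelta objects
deltas = [b - a for a, b in zip(stamps, stamps[1:])]

print(stamps[0])
print(deltas)
```

If a single element deviated from the format, strptime() would raise a ValueError, so real-world data may need a try/except around the conversion.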

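Basic Data Structures

As a general rule, data structures are objects that contain a possibly large number of other objects. Among those that Python provides as built-in structures are:

tuple

An immutable collection of arbitrary objects; only a few methods available

list

A mutable collection of arbitrary objects; many methods available

dict

A key-value store object

set

An unordered collection object for other unique objects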
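The four structures can be put side by side in a few lines. The sketch below is only meant to illustrate the properties named in the list (immutable, mutable, key-value, unique); the values and variable names are arbitrary.

```python
t = (1, 2.5, 'data')          # tuple: fixed once created
lst = [1, 2.5, 'data']        # list: can grow and change
d = {'a': 1, 'b': 2}          # dict: values looked up by key
s = {1, 2, 2, 3}              # set: duplicates collapse

lst.append('more')            # in-place change works for a list
d['c'] = 3                    # add a new key-value pair

try:
    t[0] = 0                  # item assignment is not supported
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False

print(lst, d, s, tuple_is_mutable)
```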

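Tuples

A tuple is an advanced data structure, yet still quite simple and limited in its applications. It is defined by providing objects in parentheses: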

In [96]: t = (1, 2.5, 'data')
         type(t)
Out[96]: tuple

You can even drop the parentheses and provide multiple objects, just separated by commas:

In [97]: t = 1, 2.5, 'data'
         type(t)
Out[97]: tuple

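Like almost all data structures in Python, the tuple has a built-in index, with the help of which you can retrieve single or multiple elements of the tuple. It is important to remember that Python uses zero-based numbering, such that the third element of a tuple is at index position 2: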

In [98]: t[2]
Out[98]: 'data'
In [99]: type(t[2])
Out[99]: str
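Retrieving multiple elements works through slicing, and a few related operations follow the same indexing rules. A minimal sketch with the tuple from above; the result names are illustrative.

```python
t = (1, 2.5, 'data')

last = t[-1]           # negative indices count from the end
first_two = t[:2]      # slicing returns a new tuple
a, b, c = t            # unpacking into three names
n = t.count('data')    # number of occurrences of an object
pos = t.index(2.5)     # index of the first occurrence

print(last, first_two, a, b, c, n, pos)
```

count() and index() are the only two regular methods a tuple provides, which is what "only a few methods" refers to.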

Python for Finance, 2nd Edition (Part 1) (5): https://developer.aliyun.com/article/1559400
