Computing the Mean Stock Prediction Error with a Moving Window in Python


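1. Problem Description

Given the daily return series of Kweichow Moutai stock from January 2019 to the present (data below), forecast with the moving-window average method and compute the mean prediction error. Then find the window length in [2, 20] that minimizes this error.

The moving-window average forecast is computed as follows:

Original series: 0.3, 0.1, 0.5, 0.4, 0.6, 0.3, 0.6, 0.8, 0.3, 0.2

With a window length of 3, the steps are:

  1. Average 0.3, 0.1, 0.5 to get 0.3 and use it as the forecast for day 4. The error between this forecast and the actual value is |0.4-0.3|=0.1.
  2. Shift one position along the original series: the average of 0.1, 0.5, 0.4 is 0.33, which is the forecast for day 5. The error is |0.6-0.33|=0.27.
  3. Shift one more position to the right: the average of 0.5, 0.4, 0.6 is 0.5, which is the forecast for day 6. The error is |0.3-0.5|=0.2.
  4. Continue in the same way until the window is 0.6, 0.8, 0.3, where the error is |0.2-0.57|=0.37. Averaging the errors of all steps gives the mean prediction error.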
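
The worked example above can be reproduced in a few lines. This is only a minimal sketch: the names `toy_series`, `window`, and `errors` are illustrative and do not come from the original solution.

```python
# Toy series from the worked example (names here are illustrative).
toy_series = [0.3, 0.1, 0.5, 0.4, 0.6, 0.3, 0.6, 0.8, 0.3, 0.2]
window = 3

errors = []
for start in range(len(toy_series) - window):
    # The mean of the current window is the forecast for the next day.
    forecast = sum(toy_series[start:start + window]) / window
    actual = toy_series[start + window]
    errors.append(abs(actual - forecast))

mean_error = sum(errors) / len(errors)
print(len(errors))           # 7 forecasts, for days 4 through 10
print(round(mean_error, 4))  # 0.2381
```

A series of 10 values with a window of 3 yields 7 forecasts, so the mean is taken over 7 errors rather than 10.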

The original stock return series is:

num_list = [0.02931,0.048,0.0076,0.00302,
            0.03759,-0.0058,-0.02738,0.00462,
            0.00832,0.01201,-0.0095,0.01106,
            -0.00728,0.02506,-0.00293,0.04029,
            0,0.01134,0.00503,-0.00528,0.01363,
            -0.00804,-0.02041,-0.00417,0.01918,
            0.02333,-0.01979,0.00668,0.03198,0.04399,
            -0.01258,-0.01274,-0.01649,-0.03191,0.0029,
            0.03213,-0.02398,0.01543,0.0251,0.00361,0.03107,
            -0.01619,-0.00253,0.00028,-0.00282,-0.00763,0.00128,
            0.01592,0.05239,0.02994,-0.00001,-0.0231,-0.00014,
            0.07143,0.00333,0,0.04983,-0.04114,0.0242,-0.02803,
            0.03658,0.0079,-0.00153,0.01646,-0.00886,0.03037,-0.01132,
            -0.02321,-0.00137,0.02434,-0.0439,-0.03481,-0.03726,0.01744,
            0.00075,0.02649,-0.01542,0.00592,0.03231,0.00873,-0.02915,
            -0.01667,0.01128,-0.01562,-0.01816,0.0163,-0.00126,0.01141,
            0.02476,-0.0104,-0.00947,0.00336,-0.01229,-0.00948,-0.02015,
            0.021,0.0345,0.00417,0.0011,0.12514,-0.00488,-0.03353,0.0002,
            0.00918,-0.01583,0.00166,0.00921,-0.00875,-0.01376,0.0079,
            -0.00764,-0.00671,-0.00676,0.00524,0.0024,-0.00415,-0.01288,
            0.01691,0.00006,0.01504,-0.00154,0.00155,-0.03329,0.00106,
            -0.01481,0.01987,0.00421,0.02622,0.03251,0.00579,0.01364,
            -0.00196,0.02125,0.01063,-0.0566,0.01431,0.0027,0.02983,
            0.00724,0.00359,-0.00716,-0.00361,0.0181,0.01332,
            0.00001,0.0007,-0.01034,0.01373,0.00044,-0.0934,
            -0.01329,-0.04755,0.01595,0.01016,0.01325,
            0.03563,0.00261,0.00521,-0.00173,0.02165,
            -0.00429,-0.00931,-0.00086,-0.0086,0.01145,
            -0.03361,0.02662,0.02506,-0.00506,-0.01017,
            0.00685,0.00255,-0.00606,0.00524,-0.00849,
            -0.00261,-0.01,0.01102,0.01295,0.00753,
            0.00504,-0.01254,0.00762,0.00798,-0.00208,
            -0.00041,0.0071,-0.0029,0.00208,0.00249,
            0.01657,0.00244,-0.00397,0.0025,0.00163,
            0.00081,-0.0065,-0.02858,-0.00115,0.00516,
            0.00182,-0.02466,-0.04058,0.01324,0.00618,
            -0.01316,0.00975,0.03436,-0.01311,0.00724,-0.01027,
            -0.00865,0.01483,-0.01144,0.02114,-0.00937,
            -0.00071,-0.01995,0.01229,-0.00867,-0.00962,
            0.0159,0.01757,0.01094,-0.04649,-0.00975,-0.04131,
            0.0062,0.00701,0.00825,0.01371,0.00316,0.01052,
            -0.01351,0.00889,-0.00793,0.00168,-0.02776,-0.01018,
            0.00561,-0.08457,0.03046,0.03448,0.00898,0.00999,
            -0.00749,0.00094,0.02446,0.00826,-0.00688,-0.00729,
            0.00691,-0.0046,0.01078,0.01346,-0.00536,-0.02488,
            -0.01484,0.01318,-0.0053,-0.01886,0.0456,0.02459,
            0.01005,0.02349,-0.02408,-0.01938,0.03819,-0.01236,
            -0.04488,0.01283,-0.04437,-0.01422,-0.04424,0.01711,
            -0.01088,0.043,0.04313,-0.01347,0.01087,-0.02281,
            0.02051,0.03235,-0.01164,0.03173,0.02012,-0.00856,0.01215,-0.00696]

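2. Problem Analysis

First we need to know what a sliding window is.

A sliding window is essentially a two-pointer technique. The range enclosed by the left and right pointers is the window to be examined. As both pointers move forward, they form a continuously moving frame, and the elements inside that frame are evaluated at each step. That is a sliding window.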
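
As a small illustration of the two-pointer idea, the sketch below slides a fixed-length window across a list and updates the window sum incrementally instead of re-summing each slice. The function name `window_sums` and its parameters are hypothetical, chosen only for this example. The solution in this post re-sums each slice, which is simpler and fast enough for this data size.

```python
def window_sums(values, size):
    """Return the sum of every contiguous window of length `size`."""
    if size <= 0 or size > len(values):
        return []
    left, right = 0, size          # the window covers values[left:right]
    current = sum(values[left:right])
    sums = [current]
    while right < len(values):
        # Slide one step: add the element entering, drop the one leaving.
        current += values[right] - values[left]
        left += 1
        right += 1
        sums.append(current)
    return sums


print(window_sums([1, 2, 3, 4, 5], 3))  # [6, 9, 12]
```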

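With that definition the task is clear. It breaks down into two steps:

  • Take window length 3 as an example. Set left to 0 and right to left+3, slice the stock series between them, and after each slice do left+=1 and right+=1 so the window keeps advancing. Average each slice, take the absolute difference between that average and the value at index right, and append it to a second list, fin_result. When right equals the length of the whole series, exit the loop, sum fin_result, and divide by the counter count (the number of windows taken). The result is the mean prediction error for window length 3.
  • With window length 3 worked out, use a for loop to try every window length from 2 to 20 (range(2, 21) in Python) and compute the mean prediction error the same way each time. Store the values in a list, and use a dictionary to associate each window length with its error so the results are easy to print later.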
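
The two steps can also be packaged as functions. The sketch below is one possible arrangement, not the solution given in this post: `mean_prediction_error` and `sweep_windows` are hypothetical names, and the code assumes the series is longer than every window length tried (otherwise there would be no forecast to average). It runs on the short series from the problem statement rather than on the stock data.

```python
def mean_prediction_error(series, window):
    """Mean absolute error of the moving-average forecast for one window length."""
    errors = [
        abs(series[right] - sum(series[right - window:right]) / window)
        for right in range(window, len(series))
    ]
    return sum(errors) / len(errors)


def sweep_windows(series, lengths):
    """Map each window length to its mean prediction error."""
    return {w: mean_prediction_error(series, w) for w in lengths}


# Demo on the short series from the problem statement, not the stock data.
demo = [0.3, 0.1, 0.5, 0.4, 0.6, 0.3, 0.6, 0.8, 0.3, 0.2]
results = sweep_windows(demo, range(2, 5))
for w, err in results.items():
    print(w, round(err, 4))
```

Passing `num_list` and `range(2, 21)` instead would cover the window lengths the problem asks for.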

3. Solution Code

#!/usr/bin/env python
# -*- coding: UTF-8 -*-
"""
@Project :python
@File    :python_23_4.py
@IDE     :PyCharm
@Author  :咋
@Date    :2022/10/21 20:10
"""
# Import libraries
import matplotlib.pyplot as plt
import numpy as np
# Daily return series of Kweichow Moutai (same data as in the problem statement)
num_list = [0.02931,0.048,0.0076,0.00302,
            0.03759,-0.0058,-0.02738,0.00462,
            0.00832,0.01201,-0.0095,0.01106,
            -0.00728,0.02506,-0.00293,0.04029,
            0,0.01134,0.00503,-0.00528,0.01363,
            -0.00804,-0.02041,-0.00417,0.01918,
            0.02333,-0.01979,0.00668,0.03198,0.04399,
            -0.01258,-0.01274,-0.01649,-0.03191,0.0029,
            0.03213,-0.02398,0.01543,0.0251,0.00361,0.03107,
            -0.01619,-0.00253,0.00028,-0.00282,-0.00763,0.00128,
            0.01592,0.05239,0.02994,-0.00001,-0.0231,-0.00014,
            0.07143,0.00333,0,0.04983,-0.04114,0.0242,-0.02803,
            0.03658,0.0079,-0.00153,0.01646,-0.00886,0.03037,-0.01132,
            -0.02321,-0.00137,0.02434,-0.0439,-0.03481,-0.03726,0.01744,
            0.00075,0.02649,-0.01542,0.00592,0.03231,0.00873,-0.02915,
            -0.01667,0.01128,-0.01562,-0.01816,0.0163,-0.00126,0.01141,
            0.02476,-0.0104,-0.00947,0.00336,-0.01229,-0.00948,-0.02015,
            0.021,0.0345,0.00417,0.0011,0.12514,-0.00488,-0.03353,0.0002,
            0.00918,-0.01583,0.00166,0.00921,-0.00875,-0.01376,0.0079,
            -0.00764,-0.00671,-0.00676,0.00524,0.0024,-0.00415,-0.01288,
            0.01691,0.00006,0.01504,-0.00154,0.00155,-0.03329,0.00106,
            -0.01481,0.01987,0.00421,0.02622,0.03251,0.00579,0.01364,
            -0.00196,0.02125,0.01063,-0.0566,0.01431,0.0027,0.02983,
            0.00724,0.00359,-0.00716,-0.00361,0.0181,0.01332,
            0.00001,0.0007,-0.01034,0.01373,0.00044,-0.0934,
            -0.01329,-0.04755,0.01595,0.01016,0.01325,
            0.03563,0.00261,0.00521,-0.00173,0.02165,
            -0.00429,-0.00931,-0.00086,-0.0086,0.01145,
            -0.03361,0.02662,0.02506,-0.00506,-0.01017,
            0.00685,0.00255,-0.00606,0.00524,-0.00849,
            -0.00261,-0.01,0.01102,0.01295,0.00753,
            0.00504,-0.01254,0.00762,0.00798,-0.00208,
            -0.00041,0.0071,-0.0029,0.00208,0.00249,
            0.01657,0.00244,-0.00397,0.0025,0.00163,
            0.00081,-0.0065,-0.02858,-0.00115,0.00516,
            0.00182,-0.02466,-0.04058,0.01324,0.00618,
            -0.01316,0.00975,0.03436,-0.01311,0.00724,-0.01027,
            -0.00865,0.01483,-0.01144,0.02114,-0.00937,
            -0.00071,-0.01995,0.01229,-0.00867,-0.00962,
            0.0159,0.01757,0.01094,-0.04649,-0.00975,-0.04131,
            0.0062,0.00701,0.00825,0.01371,0.00316,0.01052,
            -0.01351,0.00889,-0.00793,0.00168,-0.02776,-0.01018,
            0.00561,-0.08457,0.03046,0.03448,0.00898,0.00999,
            -0.00749,0.00094,0.02446,0.00826,-0.00688,-0.00729,
            0.00691,-0.0046,0.01078,0.01346,-0.00536,-0.02488,
            -0.01484,0.01318,-0.0053,-0.01886,0.0456,0.02459,
            0.01005,0.02349,-0.02408,-0.01938,0.03819,-0.01236,
            -0.04488,0.01283,-0.04437,-0.01422,-0.04424,0.01711,
            -0.01088,0.043,0.04313,-0.01347,0.01087,-0.02281,
            0.02051,0.03235,-0.01164,0.03173,0.02012,-0.00856,0.01215,-0.00696]
all_result_dic = {}    # window length -> mean prediction error
all_result_list = []   # mean prediction errors, in window-length order
y_list = []
for i in range(2, 21):
    fin_result = []    # absolute errors for the current window length
    left = 0
    count = 0          # number of windows taken
    right = left + i
    # Slide the window until right reaches the end of the series
    while right != len(num_list):
        temp_1 = sum(num_list[left:right]) / i    # forecast: mean of the window
        temp_2 = abs(num_list[right] - temp_1)    # absolute error against the actual value
        fin_result.append(temp_2)
        left += 1
        right += 1
        count += 1
    all_result_dic[i] = sum(fin_result) / count
    all_result_list.append(sum(fin_result) / count)
# Window length(s) whose error equals the minimum
k2 = [k for k, v in all_result_dic.items() if v == min(all_result_list)]
for i in range(2, 21):
    y_list.append(all_result_dic[i])
    print("Window length:", i, "mean prediction error:", all_result_dic[i])
print("Window length in [2, 20] with the smallest error:", k2[0], "mean prediction error:", all_result_dic[k2[0]])
# Set up the canvas. A larger dpi gives a sharper figure but takes longer to draw
fig = plt.figure(figsize=(4, 4), dpi=300)
# Load the data
x = list(np.arange(2, 21))
y = y_list
# Plot
plt.plot(x, y, lw=4, ls='-', c='b', alpha=0.1)
# Save the figure before showing it (the file name "画布" means "canvas")
fig.savefig("画布")
# Show the figure
plt.show()

Finally, I used matplotlib.pyplot for a simple visualization to make the result more intuitive. The program output is:

Window length: 2 mean prediction error: 0.020459651567944236
Window length: 3 mean prediction error: 0.01908844988344989
Window length: 4 mean prediction error: 0.018718245614035088
Window length: 5 mean prediction error: 0.018167260563380288
Window length: 6 mean prediction error: 0.01809260895170789
Window length: 7 mean prediction error: 0.01771991894630194
Window length: 8 mean prediction error: 0.017835907473309605
Window length: 9 mean prediction error: 0.017389440476190475
Window length: 10 mean prediction error: 0.017071688172043017
Window length: 11 mean prediction error: 0.0169103008502289
Window length: 12 mean prediction error: 0.01677201564380265
Window length: 13 mean prediction error: 0.016765702341137122
Window length: 14 mean prediction error: 0.016638602597402602
Window length: 15 mean prediction error: 0.016635518248175184
Window length: 16 mean prediction error: 0.016579548992673998
Window length: 17 mean prediction error: 0.01662127162629757
Window length: 18 mean prediction error: 0.01668552685526856
Window length: 19 mean prediction error: 0.0166987680311891
Window length: 20 mean prediction error: 0.016531403345724907
Window length in [2, 20] with the smallest error: 20 mean prediction error: 0.016531403345724907

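4. Summary

The difficulty of this problem lies in using the two variables left and right and in handling the boundary condition. It requires familiarity with dictionary and list operations, such as appending to and searching a list and looking up a dictionary key by its value. At the end, plotting the data makes the result more intuitive.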
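
For the "look up a dictionary key by its value" step, Python's built-in min accepts a key function, which may be a shorter alternative to the list comprehension used in the solution. The sketch below uses made-up error values purely for illustration; `errors_by_window` and `best_window` are hypothetical names.

```python
# Hypothetical error values, for illustration only (not the results above).
errors_by_window = {2: 0.30, 3: 0.24, 4: 0.27}

# min() iterates over the keys and compares them by their mapped values.
best_window = min(errors_by_window, key=errors_by_window.get)
print(best_window, errors_by_window[best_window])  # 3 0.24
```

If several keys share the minimum value, min returns the first one it encounters, which matches taking k2[0] from the list comprehension.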

