该篇文章实现了对深证综指收益率数据进行ARIMA建模及预测,包括对原始收益数据的处理;平稳性及白噪声检验;ACF/PACF定阶;EACF表定阶;模型拟合;残差检验;模型优化;模型预测,同时附带完整代码及相关分析。
1 对数收益率时序可视化
运行程序:
#加载包 data<-read.xlsx("F:\\2021-03-15 上证指数,深证综指日交易数据.xls",'Sheet1', encoding ="UTF-8")#读取数据 rt=log(data$收盘价.元.点._ClPr)-log(data$昨收盘.元.点._PrevClPr)#对数收益率 rt=na.omit(rt) #清除缺失数据 rt1=ts(rt,start =c(1991,4,4),frequency = 244) #转化为时序数据 plot(rt1,sub="图1 深证综指日对数收益率时序图")
运行结果:
首先由收益率公式:
计算深证综指日对数收益率,并对缺失值进行处理,得到从1991-2021年深证综指的对数收益率共7379个观测值。由于该证券数据一年多为244天,故以244天作为1年绘制证券收益率时序图(图1),接下来利用清洗过后的数据进行其分布拟合。
2 平稳性及白噪声检验
2.1 平稳性检验
运行程序:
library(aTSA) #加载包 adf.test(rt) #平稳性检验
运行结果:
## Augmented Dickey-Fuller Test ## alternative: stationary ## ## Type 1: no drift no trend ## lag ADF p.value ## [1,] 0 -81.3 0.01 ## [2,] 1 -58.0 0.01 ## [3,] 2 -46.5 0.01 ## [4,] 3 -38.8 0.01 ## [5,] 4 -35.0 0.01 ## [6,] 5 -33.5 0.01 ## [7,] 6 -30.2 0.01 ## [8,] 7 -28.3 0.01 ## [9,] 8 -26.4 0.01 ## [10,] 9 -25.6 0.01 ## [11,] 10 -24.4 0.01 ## Type 2: with drift no trend ## lag ADF p.value ## [1,] 0 -81.3 0.01 ## [2,] 1 -58.0 0.01 ## [3,] 2 -46.6 0.01 ## [4,] 3 -38.9 0.01 ## [5,] 4 -35.0 0.01 ## [6,] 5 -33.6 0.01 ## [7,] 6 -30.2 0.01 ## [8,] 7 -28.3 0.01 ## [9,] 8 -26.4 0.01 ## [10,] 9 -25.6 0.01 ## [11,] 10 -24.5 0.01 ## Type 3: with drift and trend ## lag ADF p.value ## [1,] 0 -81.3 0.01 ## [2,] 1 -58.0 0.01 ## [3,] 2 -46.6 0.01 ## [4,] 3 -38.9 0.01 ## [5,] 4 -35.0 0.01 ## [6,] 5 -33.6 0.01 ## [7,] 6 -30.2 0.01 ## [8,] 7 -28.3 0.01 ## [9,] 8 -26.4 0.01 ## [10,] 9 -25.6 0.01 ## [11,] 10 -24.5 0.01 ## ---- ## Note: in fact, p.value = 0.01 means p.value <= 0.01
2.2 白噪声检验
运行程序:
for(i in 1:2){ print(Box.test(rt,lag=6*i,type="Ljung-Box")) }
运行结果:
## ## Box-Ljung test ## ## data: rt ## X-squared = 60.804, df = 6, p-value = 3.089e-11 ## ## ## Box-Ljung test ## ## data: rt ## X-squared = 84.94, df = 12, p-value = 4.674e-13
通过图1可以大致看出深证综指的日对数收益率收益率为平稳的时间序列,通过ADF检验P值均小于0.05,拒绝“为非平稳时序数据”的原假设,认为该序列为平稳序列,经过6阶、12阶的白噪声检验,发现其p值均小于0.05,说明该时序数据为平稳非白噪声序列,适合建模。
3 模型定阶
3.1 ACF、PACF图定阶
运行程序:
par(mfrow=c(1,2)) #创建1×2画布 acf(rt1,sub="图5 自相关图") #自相关图 pacf(rt1,sub="图6 偏自相关图") #偏自相关图
运行结果:
由该序列自相关图与偏自相关图可以知道,该序列自相关图与偏自相关图均拖尾,故该序列应该拟合ARIMA模型。
3.2 EACF表定阶
3.2.1 EACF简表
运行程序:
par(mfrow=c(1,2)) #创建1×1画布 library(TSA) #加载包 m1<-eacf(rt1) #EACF简化表
运行结果:
## AR/MA ## 0 1 2 3 4 5 6 7 8 9 10 11 12 13 ## 0 x o x x o x x o o o o x o o ## 1 x o o x x o x o o o o x o o ## 2 x x o o x o x o o o o x o o ## 3 x x x o x x x x o o o o x o ## 4 x x x x o x x o o o o o x o ## 5 x x x x x x o x o o o o x o ## 6 x x x x x x o x o o o x o o ## 7 x x x x x x x o o o o x o o
运行程序:
m1
运行结果:
## $eacf ## [,1] [,2] [,3] [,4] [,5] ## [1,] 0.05186075 0.01823984 0.0325773192 0.054479116 0.01637515 ## [2,] -0.27566564 -0.01859668 0.0006709612 0.042149140 0.02510675 ## [3,] -0.40185155 -0.03309427 0.0010163577 0.020239247 0.03371968 ## [4,] -0.44286371 0.04988986 -0.2128448359 -0.003718364 0.02897970 ## [5,] -0.19118004 0.38128140 -0.2144903835 -0.088680936 0.01158896 ## [6,] 0.26705846 0.19626575 -0.1450178522 -0.222002007 -0.17385311 ## [7,] 0.49969401 0.29144002 -0.1646513342 -0.337466248 -0.24228788 ## [8,] -0.15020845 -0.27865513 0.4074569846 -0.136264204 -0.20457862 ## [,6] [,7] [,8] [,9] [,10] [,11] ## [1,] -0.031052623 0.03357787 0.01095985 0.01882619 -0.014338401 0.005463929 ## [2,] -0.005811746 0.04075826 -0.01247056 0.01625908 -0.006116629 0.002232497 ## [3,] 0.008914872 0.03225957 0.02313849 0.02103490 0.001852557 0.008683467 ## [4,] -0.036415647 0.05209669 0.02677686 0.01459742 -0.018470501 0.004610147 ## [5,] -0.042585501 0.03814105 -0.02166573 0.01710606 -0.017224865 0.010382543 ## [6,] -0.028772804 -0.01335083 -0.03251102 -0.01477290 -0.015664739 0.007008101 ## [7,] 0.204890840 0.01125851 -0.02889638 0.01763848 -0.009981756 0.010181724 ## [8,] 0.171302086 0.11324458 -0.01197408 -0.00758839 -0.014792266 0.016635518 ## [,12] [,13] [,14] ## [1,] 0.0380495740 0.015792999 -0.0005846271 ## [2,] 0.0316646229 0.017095597 0.0002921578 ## [3,] 0.0275045416 0.010343692 -0.0175732761 ## [4,] 0.0054253399 0.027250487 0.0064546092 ## [5,] -0.0009724119 0.027170687 -0.0147765589 ## [6,] 0.0043693392 0.027724684 0.0040001865 ## [7,] 0.0392960113 0.005843992 0.0089263044 ## [8,] 0.0427462184 -0.014826829 -0.0003964176 ## ## $ar.max ## [1] 8 ## ## $ma.ma ## [1] 14 ## ## $symbol ## 0 1 2 3 4 5 6 7 8 9 10 11 12 13 ## 0 "x" "o" "x" "x" "o" "x" "x" "o" "o" "o" "o" "x" "o" "o" ## 1 "x" "o" "o" "x" "x" "o" "x" "o" "o" "o" "o" "x" "o" "o" ## 2 "x" "x" "o" "o" "x" "o" "x" "o" "o" "o" "o" "x" "o" "o" ## 3 "x" "x" "x" "o" "x" "x" "x" "x" "o" "o" "o" "o" "x" "o" ## 4 "x" "x" "x" "x" "o" "x" "x" "o" "o" "o" "o" "o" "x" "o" ## 5 "x" "x" "x" "x" "x" "x" "o" "x" "o" "o" "o" "o" "x" "o" ## 6 "x" "x" "x" "x" "x" "x" "o" "x" "o" "o" "o" "x" "o" "o" ## 7 "x" "x" "x" "x" "x" "x" "x" "o" "o" "o" "o" "x" "o" "o"
3.2.2 EACF表
运行程序:
print(m1$eacf,digits=2) #EACF表
运行结果:
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] ## [1,] 0.052 0.018 0.03258 0.0545 0.016 -0.0311 0.034 0.011 0.0188 ## [2,] -0.276 -0.019 0.00067 0.0421 0.025 -0.0058 0.041 -0.012 0.0163 ## [3,] -0.402 -0.033 0.00102 0.0202 0.034 0.0089 0.032 0.023 0.0210 ## [4,] -0.443 0.050 -0.21284 -0.0037 0.029 -0.0364 0.052 0.027 0.0146 ## [5,] -0.191 0.381 -0.21449 -0.0887 0.012 -0.0426 0.038 -0.022 0.0171 ## [6,] 0.267 0.196 -0.14502 -0.2220 -0.174 -0.0288 -0.013 -0.033 -0.0148 ## [7,] 0.500 0.291 -0.16465 -0.3375 -0.242 0.2049 0.011 -0.029 0.0176 ## [8,] -0.150 -0.279 0.40746 -0.1363 -0.205 0.1713 0.113 -0.012 -0.0076 ## [,10] [,11] [,12] [,13] [,14] ## [1,] -0.0143 0.0055 0.03805 0.0158 -0.00058 ## [2,] -0.0061 0.0022 0.03166 0.0171 0.00029 ## [3,] 0.0019 0.0087 0.02750 0.0103 -0.01757 ## [4,] -0.0185 0.0046 0.00543 0.0273 0.00645 ## [5,] -0.0172 0.0104 -0.00097 0.0272 -0.01478 ## [6,] -0.0157 0.0070 0.00437 0.0277 0.00400 ## [7,] -0.0100 0.0102 0.03930 0.0058 0.00893 ## [8,] -0.0148 0.0166 0.04275 -0.0148 -0.00040
一般地,由“O”组成的三角形的左上顶点位于ARIMA模型的(p,q)处,由EACF简化表可以看出由“O”组成的三角形的左上顶点位于阶(p,q)=(0,1)处。,结合EACF表可以看到由(0,1)作为三角形顶点的三角形范围EACF值只有少数“X”是例外,但其对应EACF值仅比 稍微大一点,事实上,如果用1%的临界值,简化表中的“X”将变为“O”,因此EACF表明深证综指的日对数收益率服从定ARIMA(0,0,1)模型。
4 模型拟合
运行程序:
library(forecast) fit3=arima(rt1,order = c(0,0,1)) fit3 #模型拟合
运行结果:
## ## Call: ## arima(x = rt1, order = c(0, 0, 1)) ## ## Coefficients: ## ma1 intercept ## 0.0504 4e-04 ## s.e. 0.0115 3e-04 ## ## sigma^2 estimated as 0.0004579: log likelihood = 17791.28, aic = -35578.57
运行程序:
t=abs(fit3$coef)/sqrt(diag(fit3$var.coef)) pt(t,length(rt)-length(fit3$coef),lower.tail = F)
运行程序:
## ma1 intercept ## 5.998510e-06 5.070172e-02
5 残差检验
运行程序:
tsdiag(fit3) #残差检验
运行结果:
由结果可知,建立的ARIMA(0,0,1)模型系数通过了显著性检验,但是残差为非白噪声,因此该模型不理想。再尝试建立ARIMA(1,0,1)模型。
6 模型优化
运行程序:
fit4=Arima(rt1,order = c(1,0,1)) fit4 #模型拟合
运行结果:
## Series: rt1 ## ARIMA(1,0,1) with non-zero mean ## ## Coefficients: ## ar1 ma1 mean ## 0.8435 -0.8053 4e-04 ## s.e. 0.0647 0.0717 3e-04 ## ## sigma^2 estimated as 0.0004569: log likelihood=17800.25 ## AIC=-35592.49 AICc=-35592.49 BIC=-35564.89
运行程序:
t=abs(fit4$coef)/sqrt(diag(fit4$var.coef)) pt(t,length(rt)-length(fit3$coef),lower.tail = F)
运行结果:
## ar1 ma1 intercept ## 1.015093e-38 2.628404e-29 8.081087e-02
运行程序:
tsdiag(fit4) #残差检验
运行结果:
由结果可知,拟合的ARIMA(1,0,1)模型的残差依旧不是白噪声,两模型的AIC差距较小,ARIMA(1,0,1)模型的AIC要较小一些,故模型比ARIMA(0,0,1)拟合效果相对较好。
7 模型预测
运行程序:
fore1=forecast(fit4,h=5) plot(fore1,lty=2,sub="图9 模型拟合效果图") lines(fore1$fitted,col=4)
运行结果:
拟合效果与预测效果见上图。由于模型未通过残差显著性检验,因此拟合效果也不是很理性。但经过多种尝试,建立较优的模型为ARIMA(1,0,1):}