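ML: NB: Text classification and prediction on the news text dataset using a pure statistical method, kNN, naive Bayes (Gaussian / multivariate Bernoulli / multinomial), linear discriminant analysis (LDA), the perceptron, and other algorithms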



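Contents

Text classification and prediction on the news text dataset using a pure statistical method, kNN, naive Bayes (Gaussian / multivariate Bernoulli / multinomial), linear discriminant analysis (LDA), the perceptron, and other algorithms

Design approach

Output

Core code

Related articles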

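ML: NB: Text classification and prediction on the news text dataset using a pure statistical method, kNN, naive Bayes (Gaussian / multivariate Bernoulli / multinomial), linear discriminant analysis (LDA), the perceptron, and other algorithms

ML: NB: Text classification and prediction on the news text dataset using a pure statistical method, kNN, naive Bayes (Gaussian / multivariate Bernoulli / multinomial), linear discriminant analysis (LDA), the perceptron, and other algorithms (implementation)

Text classification and prediction on the news text dataset using a pure statistical method, kNN, naive Bayes (Gaussian / multivariate Bernoulli / multinomial), linear discriminant analysis (LDA), the perceptron, and other algorithms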

Design approach
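
The output log in this article suggests a preprocessing pipeline of roughly this shape: keep only Chinese tokens (the log prints `chinese_pattern re.compile('[\\u4e00-\\u9fff]+')`), map tokens to integer ids, turn each document into `(id, count)` pairs (the `corpus` column), and weight those pairs by TF-IDF (the `tfidf` column). The following is a minimal standard-library sketch of those steps, not the article's code, which uses jieba and gensim. The function names, the sample documents, and the exact weighting (count times log2(N/df), then L2 normalisation) are assumptions made for illustration, and the documents are assumed to be already segmented into words.

```python
import math
import re
from collections import Counter

# Same character class as the `chinese_pattern` printed in the output log.
CHINESE_PATTERN = re.compile(r"[\u4e00-\u9fff]+")


def keep_chinese(tokens):
    """Drop tokens that are not made up purely of CJK unified ideographs."""
    return [t for t in tokens if CHINESE_PATTERN.fullmatch(t)]


def build_dictionary(docs):
    """Map each distinct token to an integer id, in order of first appearance."""
    token2id = {}
    for doc in docs:
        for token in doc:
            token2id.setdefault(token, len(token2id))
    return token2id


def doc2bow(doc, token2id):
    """Return sorted (token_id, count) pairs, the shape of the `corpus` column."""
    counts = Counter(token2id[t] for t in doc if t in token2id)
    return sorted(counts.items())


def tfidf(bows):
    """Weight counts by log2(N / df) and L2-normalise each document (assumed scheme)."""
    n_docs = len(bows)
    df = Counter(tid for bow in bows for tid, _ in bow)
    weighted = []
    for bow in bows:
        vec = [(tid, cnt * math.log2(n_docs / df[tid])) for tid, cnt in bow]
        norm = math.sqrt(sum(w * w for _, w in vec))
        weighted.append([(tid, w / norm) for tid, w in vec if norm > 0 and w > 0])
    return weighted


# Hypothetical, already segmented documents (not taken from the dataset).
raw_docs = [
    ["雄安", "新区", "规划", "2017"],
    ["楼市", "调控", "政策", "规划"],
    ["自行车", "共享", "单车", "自行车"],
]
docs = [keep_chinese(d) for d in raw_docs]
token2id = build_dictionary(docs)
bows = [doc2bow(d, token2id) for d in docs]
weights = tfidf(bows)
print(bows[2])
print(weights[0])
```

A word that occurs in several documents (here "规划") gets a lower weight than a word unique to one document, which is the property the classifiers rely on when TF-IDF vectors are used as features.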

 

Output

Dataset used in the code: https://download.csdn.net/download/qq_41185868/13757777
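
Of the classifiers named in the title, multinomial naive Bayes works directly on `(id, count)` pairs like those in the `corpus` column. The sketch below is an illustration under stated assumptions, not the article's implementation: the function names, the Laplace smoothing parameter `alpha`, and the toy training data are all hypothetical. It works in log space, because a product of many small word probabilities quickly shrinks to values on the order of 1e-43 or smaller, as the per-class numbers at the end of the log appear to show.

```python
import math
from collections import defaultdict


def train_multinomial_nb(bows, labels, vocab_size, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log likelihoods per class."""
    class_docs = defaultdict(int)
    class_counts = defaultdict(lambda: defaultdict(int))
    for bow, label in zip(bows, labels):
        class_docs[label] += 1
        for tid, cnt in bow:
            class_counts[label][tid] += cnt
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(class_counts[label].values())
        denom = total + alpha * vocab_size
        model[label] = {
            # Class prior: share of training documents with this label.
            "log_prior": math.log(n_docs / len(labels)),
            # Smoothed probability of a word never seen in this class.
            "log_unseen": math.log(alpha / denom),
            "log_lik": {
                tid: math.log((c + alpha) / denom)
                for tid, c in class_counts[label].items()
            },
        }
    return model


def predict_log_scores(model, bow):
    """Return log P(class) + sum(count * log P(word | class)) for every class."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for tid, cnt in bow:
            score += cnt * params["log_lik"].get(tid, params["log_unseen"])
        scores[label] = score
    return scores


# Toy data with a hypothetical vocabulary of 6 ids and two classes
# (labels 2 and 5, reusing the log's indices for finance and cars).
train_bows = [[(0, 2), (1, 1)], [(0, 1), (2, 1)], [(3, 2), (4, 1)], [(4, 1), (5, 2)]]
train_labels = [2, 2, 5, 5]
model = train_multinomial_nb(train_bows, train_labels, vocab_size=6)
scores = predict_log_scores(model, [(0, 1), (1, 1)])
predicted = max(scores, key=scores.get)
print(predicted, scores)
```

Comparing log scores gives the same ranking as comparing the raw products, but avoids floating-point underflow on long documents.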

F:\Program Files\Python\Python36\lib\site-packages\gensim\utils.py:1209: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
  warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1293 entries, 0 to 1292
Data columns (total 6 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   Unnamed: 0  1293 non-null   int64 
 1   content     1292 non-null   object
 2   id          1293 non-null   int64 
 3   tags        1293 non-null   object
 4   time        1293 non-null   object
 5   title       1293 non-null   object
dtypes: int64(2), object(4)
memory usage: 60.7+ KB
None
   Unnamed: 0                                            content  \
0           0   牵动人心的雄安新区规划细节内容和出台时间表敲定。日前,北京商报记者从业内获悉,京津冀协同发...   
1           1  去年以来,多个城市先后发布了多项楼市调控政策。在限购、限贷甚至限售的政策“组合拳”下,房地产...   
2           2  在今年中国国际自行车展上,上海凤凰自行车总裁王朝阳表示,共享单车的到来把我们打懵了,影响更是...   
3           3  25家上市银行迎来了一年一度的“分红季”,21世纪经济报道记者根据公开信息梳理发现,25家银...   
4           4  说起卷饼,大家其实并不陌生,这个来自中原的传统美食,发展至今也衍生出各种各样的种类,卷边的制...   

                    id                                  tags  \
0  6428905748545732865   ['财经', '白洋淀', '城市规划', '徐匡迪', '太行山']   
1  6428954136200855810   ['财经', '碧桂园', '万科集团', '投资', '广州恒大']   
2  6420576443738784002    ['财经', '自行车', '凤凰', '王朝阳', '汽车展览']   
3  6429007290541031681  ['财经', '银行', '工商银行', '兴业银行', '交通银行']   
4  6397481672254619905     ['财经', '小吃', '装修', '市场营销', '手工艺']   

                  time                   title  
0  2017-06-07 22:52:55  雄安新区规划“骨架”敲定,方案有望9月底出炉  
1  2017-06-08 08:01:13       “红五月”不红 房企资金链压力攀升  
2  2017-05-16 12:03:00      凤凰自行车总裁:共享单车把我们打懵了  
3  2017-06-08 07:00:00    25家银行分红季派出3536亿“大红包”  
4  2017-03-15 07:03:22      五万以下的小本餐饮项目,卷饼赚钱最稳  
chinese_pattern re.compile('[\\u4e00-\\u9fff]+')
Building prefix dict from F:\File_Jupyter\实用代码\naive_bayes(简单贝叶斯)\jieba_dict\dict.txt.big ...
Loading model from cache C:\Users\niu\AppData\Local\Temp\jieba.ue3752d4e13420d2dc6b66831a5a4ab13.cache
Loading model cost 1.326 seconds.
Prefix dict has been built succesfully.
dictionary
<class 'gensim.corpora.dictionary.Dictionary'> Dictionary(46351 unique tokens: ['一个', '一个个', '一举一动', '一些', '一体']...)
<class 'method'> <bound method Dictionary.doc2bow of <gensim.corpora.dictionary.Dictionary object at 0x000001BDC62291D0>>
F:\Program Files\Python\Python36\lib\site-packages\numpy\core\_asarray.py:83: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  return array(a, dtype, copy=False, order=order)
   Unnamed: 0                                            content  \
0           0   牵动人心的雄安新区规划细节内容和出台时间表敲定。日前,北京商报记者从业内获悉,京津冀协同发...   
1           1  去年以来,多个城市先后发布了多项楼市调控政策。在限购、限贷甚至限售的政策“组合拳”下,房地产...   
2           2  在今年中国国际自行车展上,上海凤凰自行车总裁王朝阳表示,共享单车的到来把我们打懵了,影响更是...   

                    id                                 tags  \
0  6428905748545732865  ['财经', '白洋淀', '城市规划', '徐匡迪', '太行山']   
1  6428954136200855810  ['财经', '碧桂园', '万科集团', '投资', '广州恒大']   
2  6420576443738784002   ['财经', '自行车', '凤凰', '王朝阳', '汽车展览']   

                  time                   title  \
0  2017-06-07 22:52:55  雄安新区规划“骨架”敲定,方案有望9月底出炉   
1  2017-06-08 08:01:13       “红五月”不红 房企资金链压力攀升   
2  2017-05-16 12:03:00      凤凰自行车总裁:共享单车把我们打懵了   

                                           doc_words  \
0  [牵动人心, 雄安, 新区, 规划, 细节, 内容, 出台, 时间表, 敲定, 日前, 北京...   
1  [去年, 以来, 多个, 城市, 先后, 发布, 多项, 楼市, 调控, 政策, 限购, 限...   
2  [今年, 中国, 国际, 自行车, 展上, 上海, 凤凰, 自行车, 总裁, 王, 朝阳, ...   

                                              corpus  \
0  [(0, 6), (1, 1), (2, 1), (3, 3), (4, 2), (5, 2...   
1  [(0, 1), (3, 3), (13, 1), (17, 1), (41, 1), (5...   
2  [(15, 1), (53, 1), (167, 1), (262, 1), (396, 1...   

                                               tfidf  
0  [(0, 0.005554342859788116), (1, 0.007470250835...  
1  [(0, 0.002081356679198299), (3, 0.012288034179...  
2  [(15, 0.057457146244872616), (53, 0.0543395377...  
after abs 4.7683716e-07
foo: (1293, 1293)
dis2TSNE_Visual:  (1293, 2)
{'养生': 0, '科技': 1, '财经': 2, '游戏': 3, '育儿': 4, '汽车': 5}
data_frame.keyword_index: 1    379
2    287
5    283
4    148
3    141
0     55
Name: keyword_index, dtype: int64
   Unnamed: 0                                            content  \
0           0   牵动人心的雄安新区规划细节内容和出台时间表敲定。日前,北京商报记者从业内获悉,京津冀协同发...   
1           1  去年以来,多个城市先后发布了多项楼市调控政策。在限购、限贷甚至限售的政策“组合拳”下,房地产...   
2           2  在今年中国国际自行车展上,上海凤凰自行车总裁王朝阳表示,共享单车的到来把我们打懵了,影响更是...   

                    id                                 tags  \
0  6428905748545732865  ['财经', '白洋淀', '城市规划', '徐匡迪', '太行山']   
1  6428954136200855810  ['财经', '碧桂园', '万科集团', '投资', '广州恒大']   
2  6420576443738784002   ['财经', '自行车', '凤凰', '王朝阳', '汽车展览']   

                  time                   title  \
0  2017-06-07 22:52:55  雄安新区规划“骨架”敲定,方案有望9月底出炉   
1  2017-06-08 08:01:13       “红五月”不红 房企资金链压力攀升   
2  2017-05-16 12:03:00      凤凰自行车总裁:共享单车把我们打懵了   

                                           doc_words  \
0  [牵动人心, 雄安, 新区, 规划, 细节, 内容, 出台, 时间表, 敲定, 日前, 北京...   
1  [去年, 以来, 多个, 城市, 先后, 发布, 多项, 楼市, 调控, 政策, 限购, 限...   
2  [今年, 中国, 国际, 自行车, 展上, 上海, 凤凰, 自行车, 总裁, 王, 朝阳, ...   

                                              corpus  \
0  [(0, 6), (1, 1), (2, 1), (3, 3), (4, 2), (5, 2...   
1  [(0, 1), (3, 3), (13, 1), (17, 1), (41, 1), (5...   
2  [(15, 1), (53, 1), (167, 1), (262, 1), (396, 1...   

                                               tfidf   visual01   visual02  \
0  [(0, 0.005554342859788116), (1, 0.007470250835... -65.903542 -14.433964
1  [(0, 0.002081356679198299), (3, 0.012288034179... -29.659267 -14.811647
2  [(15, 0.057457146244872616), (53, 0.0543395377... -22.118195 -48.148167

   keyword_index  
0              2
1              2
2              2
Childcare,label_category_ID_pos.tfidf)[:20]: ['孩子', '家长', '教育', '学习', '男孩子', '成绩', '爸爸', '分享', '帮助', '方法', '小学', '数学', '交流', '男孩', '妈妈', '成长', '父母', '懂', '免费', '翼航']
Childcare,label_category_ID_neg.tfidf)[:20]: []
train_index MatrixSimilarity<646 docs, 46329 features>
hot_words shape: 6 300
{0: {1536, 7681, 17410, 17411, 17415, 6664, 17420, 15886, 4623, 17935, 4625, 5139, 4631, 17916, 17437, 544, 16422, 5671, 1065, 4650, 4651, 4653, 4690, 16943, 4657, 17458, 15921, 51, 7222, 17464, 17465, 10299, 15932, 64, 6209, 66, 17474, 4680, 8264, 8266, 40008, 6730, 8273, 6738, 5203, 5206, 18005, 15958, 597, 15960, 18009, 7258, 4697, 7260, 16989, 3674, 91, 87, 16993, 18020, 616, 4714, 5228, 40044, 1646, 4720, 3185, 15986, 34928, 5236, 113, 34936, 6777, 126, 15999, 127, 4737, 40067, 5252, 643, 4739, 13444, 8840, 1157, 133, 4749, 3219, 10388, 17562, 5278, 46239, 5287, 3751, 167, 680, 6827, 4784, 16048, 16050, 180, 46260, 16054, 6839, 4792, 2743, 4789, 17083, 16060, 4790, 16062, 43200, 5315, 46276, 46279, 17098, 6860, 5836, 16081, 43219, 1237, 1750, 15575, 8921, 2266, 6877, 12511, 12512, 21216, 226, 4834, 6884, 16101, 4838, 742, 2280, 2281, 227, 7915, 6886, 6893, 2798, 6894, 5870, 4849, 242, 1779, 4852, 21215, 44791, 4864, 3329, 258, 4865, 4866, 44805, 4877, 21264, 4882, 274, 8986, 8987, 796, 32029, 4382, 21277, 4896, 1825, 801, 3363, 36644, 1830, 4393, 36138, 303, 815, 4401, 12594, 21299, 7986, 820, 310, 1337, 21307, 4411, 317, 33598, 5953, 17730, 5954, 10050, 17733, 17734, 25927, 21320, 17739, 4939, 21324, 4942, 33615, 6885, 16210, 6071, 18261, 5976, 860, 16740, 16745, 2922, 4969, 17263, 6512, 33649, 16242, 2419, 17775, 373, 1398, 880, 1916, 17276, 16255, 1920, 43394, 3974, 4999, 396, 8080, 16788, 18325, 1942, 16279, 1433, 43418, 36252, 17311, 43425, 16802, 7585, 15959, 7594, 36268, 4525, 7597, 5551, 6063, 36272, 36275, 4533, 16309, 18358, 36280, 1465, 441, 7611, 16825, 16829, 4538, 2488, 2495, 8129, 4545, 4547, 16836, 4549, 7621, 1484, 1997, 11214, 1999, 16846, 16847, 4563, 7636, 14293, 7638, 4567, 16855, 17369, 16861, 478, 16351, 18400, 17377, 993, 9699, 5085, 6111, 7645, 6119, 6124, 17903, 1011, 4597, 6646, 16376, 6138, 16891, 16892, 7165, 4606}, 1: {0, 3, 11785, 2569, 32779, 9227, 526, 21519, 530, 4116, 533, 11805, 2590, 2591, 3105, 7203, 1571, 8740, 1574, 
12836, 1062, 1577, 2553, 4654, 1071, 2094, 30257, 51, 30260, 53, 28213, 24633, 1082, 1087, 68, 8779, 78, 12367, 11859, 2647, 91, 13916, 13917, 15455, 608, 9825, 1634, 12387, 13412, 613, 12391, 28267, 12396, 109, 9836, 12399, 11884, 12401, 12400, 12403, 627, 117, 629, 9847, 628, 17020, 637, 9855, 639, 12418, 643, 1668, 133, 3715, 14470, 1160, 12424, 11912, 9867, 33420, 10376, 655, 12433, 148, 150, 3735, 1176, 12440, 154, 21659, 1180, 3742, 10399, 11936, 1185, 31904, 675, 13472, 167, 1704, 7337, 11946, 171, 172, 8876, 8878, 2734, 1200, 1709, 2226, 8877, 180, 1155, 697, 12475, 189, 8894, 1215, 1218, 4291, 708, 709, 3271, 2760, 6354, 2771, 1748, 213, 3798, 727, 730, 20187, 44767, 225, 2786, 2787, 13028, 1765, 1254, 13543, 26344, 740, 11497, 1771, 3819, 13549, 11502, 751, 1775, 752, 242, 21743, 12524, 759, 11511, 2809, 2812, 35581, 257, 8962, 771, 259, 15623, 1288, 3849, 12048, 1810, 786, 788, 3862, 793, 7450, 798, 24862, 7458, 12579, 31524, 31523, 7459, 1322, 810, 25391, 12081, 1329, 820, 3386, 1850, 9023, 319, 835, 9029, 325, 4424, 330, 12107, 13134, 846, 3409, 3924, 1878, 854, 344, 11609, 5978, 1883, 11612, 343, 11615, 358, 4457, 362, 875, 1385, 1900, 4462, 3439, 12144, 369, 3438, 1396, 38773, 28025, 2428, 13305, 13183, 12161, 12674, 1922, 34690, 2438, 1926, 13193, 907, 9100, 911, 13204, 1431, 10135, 2456, 44956, 925, 413, 32670, 1952, 928, 23455, 5540, 1956, 1447, 12200, 1448, 1452, 8109, 12205, 1965, 9651, 2486, 5559, 1464, 956, 1982, 959, 3522, 12235, 976, 3025, 10194, 1491, 12244, 465, 30675, 5585, 472, 470, 10714, 475, 3027, 478, 1503, 479, 5089, 483, 2532, 995, 9190, 5607, 1512, 1513, 9703, 10728, 494, 1518, 1520, 2545, 1007, 1524, 501, 503, 1017, 1534}, 2: {0, 3, 520, 1547, 12300, 2062, 3599, 1040, 26641, 18, 25616, 2577, 13846, 2583, 4121, 25114, 1051, 1052, 25629, 1054, 1567, 2591, 3105, 3616, 4126, 1060, 4125, 1062, 1063, 26663, 1577, 13863, 1066, 1580, 45, 1071, 51, 3123, 53, 2614, 3125, 1082, 2622, 66, 2627, 11843, 1093, 1606, 1605, 3651, 3146, 1100, 
26701, 1614, 1102, 592, 3577, 35410, 2639, 2644, 3159, 25688, 1626, 91, 3162, 1119, 608, 21089, 1634, 102, 2662, 31848, 2665, 11881, 27242, 12907, 1131, 1132, 15388, 2672, 3185, 1138, 627, 43124, 2675, 113, 1657, 2682, 3194, 127, 3715, 1668, 133, 3717, 135, 2696, 3209, 1162, 1158, 1676, 2701, 11916, 1167, 138, 1169, 148, 2710, 1174, 152, 1177, 22167, 26779, 21659, 157, 158, 1183, 30880, 1185, 26784, 2209, 2724, 3232, 672, 167, 4256, 8876, 685, 4269, 1202, 2226, 691, 1205, 3253, 1207, 2231, 2242, 4291, 14026, 27340, 1740, 1231, 14032, 24273, 3284, 1749, 213, 727, 217, 730, 2266, 14044, 1246, 1248, 225, 1254, 742, 745, 3819, 14060, 12013, 750, 1775, 242, 1780, 1268, 759, 760, 249, 33536, 1281, 261, 262, 2311, 1290, 267, 37132, 5902, 1810, 7958, 39191, 280, 793, 43813, 1318, 807, 295, 45354, 1324, 28461, 1838, 28462, 815, 1329, 820, 1333, 317, 2366, 39743, 832, 2365, 45378, 835, 330, 1356, 845, 334, 1359, 4433, 4438, 854, 14168, 1370, 1883, 1372, 1371, 860, 863, 3935, 3937, 1378, 11618, 3426, 870, 358, 3942, 361, 874, 362, 875, 28010, 3438, 2416, 369, 880, 14196, 886, 4472, 1403, 894, 895, 2432, 385, 904, 905, 27528, 907, 909, 911, 1431, 409, 1433, 925, 1950, 415, 928, 413, 13731, 3494, 20902, 937, 1452, 942, 1968, 1973, 1464, 1977, 956, 34240, 3009, 32706, 14278, 3015, 456, 1993, 973, 975, 976, 465, 466, 1491, 14290, 2512, 1494, 472, 475, 480, 3554, 995, 2532, 3048, 1513, 23529, 3564, 494, 498, 500, 501, 503, 1017, 3070}, 3: {1536, 0, 10242, 3, 37889, 1029, 10248, 2569, 9740, 9745, 10770, 17938, 2577, 10257, 9238, 3094, 9752, 9751, 9754, 30235, 9243, 18425, 9246, 2590, 24096, 9249, 9250, 9251, 4643, 10272, 9252, 5666, 3616, 3625, 4133, 4136, 1071, 9264, 4657, 51, 9267, 22583, 10808, 40504, 10304, 6210, 3650, 37444, 68, 9284, 6731, 9293, 31823, 2133, 9303, 601, 91, 43615, 608, 9314, 10338, 25709, 1646, 10349, 6257, 7794, 27763, 11381, 9337, 7801, 637, 3709, 639, 11391, 9345, 7299, 3715, 1668, 41606, 11401, 11402, 4233, 9868, 10893, 142, 5259, 9872, 25744, 25741, 148, 
10389, 34455, 3735, 8345, 8857, 154, 10396, 1178, 7839, 10399, 8554, 1704, 10409, 9900, 10412, 2734, 14512, 10416, 7858, 9394, 9904, 6325, 2232, 1721, 38589, 8894, 6336, 1220, 9925, 11461, 3271, 9420, 719, 14544, 2773, 3286, 3287, 214, 20187, 9438, 26335, 6048, 13534, 226, 3811, 19172, 1766, 2280, 36585, 14575, 2801, 9457, 10993, 10485, 23797, 759, 27896, 5882, 8443, 23803, 1790, 767, 8962, 9476, 7433, 6924, 2316, 2318, 3853, 14608, 4371, 9494, 8983, 6425, 793, 362, 6433, 7458, 2339, 810, 1835, 8493, 6447, 1329, 28466, 44855, 9527, 1338, 10044, 317, 3390, 10047, 41280, 31554, 2372, 9029, 11592, 9547, 3916, 9042, 10066, 3925, 343, 10072, 5978, 860, 8030, 10079, 10593, 9572, 2916, 9061, 3430, 6501, 4969, 10089, 30571, 10603, 11117, 9582, 10607, 6505, 14193, 28529, 14707, 7197, 369, 11639, 23929, 894, 1919, 3459, 11652, 2438, 10631, 907, 10642, 9109, 2454, 14743, 2456, 29594, 11164, 6559, 9631, 3999, 1951, 14754, 14756, 31653, 9638, 31654, 33704, 45984, 3500, 31661, 1453, 1455, 9645, 9649, 41394, 9651, 9652, 10165, 30718, 2999, 31672, 1982, 9662, 44483, 11205, 2505, 5581, 10704, 465, 977, 31699, 9172, 4053, 9174, 31703, 4567, 470, 10714, 475, 5076, 478, 480, 23008, 9186, 30692, 9190, 9703, 10216, 491, 30699, 1005, 2542, 31726, 1007, 494, 25586, 10222, 18417, 10736, 8178, 3064, 1529, 509, 1534}, 4: {0, 5121, 4098, 3, 3078, 7175, 1543, 1545, 22027, 5131, 14, 4623, 4625, 22547, 533, 2588, 2590, 1570, 4643, 2597, 5669, 5159, 6183, 2602, 45, 6702, 18937, 5168, 5169, 48, 4657, 3063, 51, 1590, 12343, 5686, 5689, 2105, 1586, 5175, 5694, 6721, 68, 2630, 29767, 29778, 4692, 2133, 5204, 6740, 601, 7258, 91, 5722, 5214, 4703, 608, 3679, 2143, 101, 6758, 5224, 616, 7277, 2158, 4723, 5236, 6267, 1660, 637, 639, 4737, 4739, 5252, 133, 1668, 4606, 23688, 5768, 17035, 2188, 5772, 38034, 5779, 3220, 6805, 2199, 1688, 5273, 154, 155, 1694, 4767, 5280, 5278, 5284, 1191, 1704, 167, 3754, 5802, 5290, 3751, 3247, 5296, 3257, 5818, 5823, 3265, 708, 5318, 5830, 4294, 1738, 5841, 5330, 4825, 
4316, 734, 6369, 5349, 4838, 4326, 2280, 4329, 46315, 6380, 29660, 44269, 5871, 5873, 242, 7927, 759, 760, 2812, 1277, 8448, 3329, 4866, 2304, 4869, 5382, 7430, 3848, 3339, 2318, 782, 3857, 5906, 26513, 788, 2841, 7450, 4382, 1825, 7458, 801, 37156, 4393, 810, 7979, 3886, 815, 4911, 4401, 7986, 1329, 820, 5942, 3896, 8506, 2874, 317, 5441, 835, 5445, 5958, 6578, 5964, 5965, 4942, 8016, 8024, 344, 4952, 860, 1884, 29533, 8545, 8037, 3430, 6504, 7017, 2922, 4457, 362, 5998, 2928, 373, 374, 2935, 1398, 8057, 6011, 6015, 32127, 384, 4994, 8579, 4996, 8072, 396, 6541, 5006, 6540, 5009, 1938, 1427, 7571, 2965, 1942, 6039, 1940, 7574, 2970, 409, 7068, 7575, 8606, 5014, 5018, 7585, 5017, 6561, 7588, 1447, 3497, 6058, 5547, 1965, 6065, 4529, 21939, 4531, 6069, 5043, 5559, 7096, 1465, 6074, 3515, 4533, 6077, 5054, 7103, 448, 6080, 6076, 4547, 8132, 4552, 4555, 1484, 39372, 39374, 4561, 6611, 5078, 470, 1496, 5081, 472, 7131, 4572, 7133, 5598, 5086, 4576, 4577, 6111, 478, 4580, 1508, 480, 1503, 5096, 1506, 4584, 23019, 493, 494, 498, 5108, 18935, 1529, 6138, 7163, 10238, 5119}, 5: {0, 14849, 512, 3, 11266, 14853, 2053, 23047, 1527, 2569, 15370, 14861, 13, 19471, 2577, 11793, 14867, 18423, 533, 15384, 14875, 15388, 11807, 15396, 4132, 1574, 14890, 14893, 14896, 14897, 1586, 51, 1590, 14911, 1088, 15429, 14406, 23111, 16968, 14921, 14925, 16461, 14929, 15442, 8789, 14934, 2647, 3161, 7770, 91, 14940, 9308, 14937, 14943, 608, 6755, 1124, 13924, 14950, 5219, 14947, 9325, 3697, 14961, 11893, 14968, 12408, 15485, 637, 5247, 1668, 1157, 23172, 647, 15492, 15498, 5773, 19087, 13969, 9362, 15506, 1681, 148, 11926, 1176, 2713, 155, 1180, 15517, 1692, 20124, 10401, 19105, 675, 674, 19109, 167, 1704, 11946, 15019, 12458, 1709, 682, 9091, 2224, 15025, 20656, 176, 180, 7858, 12982, 15031, 15543, 41136, 14013, 2239, 1729, 708, 9413, 21700, 712, 15562, 15051, 2765, 15057, 15061, 9942, 15063, 21718, 22747, 15068, 15069, 32475, 13535, 15583, 15074, 227, 19683, 2789, 1766, 13542, 13036, 2799, 
752, 3312, 13552, 242, 26867, 1268, 15618, 759, 2809, 763, 28924, 2812, 10495, 2817, 2818, 14083, 769, 259, 15622, 2823, 1288, 8962, 15109, 19720, 15629, 19213, 3345, 786, 788, 280, 25375, 2337, 15650, 804, 15653, 3366, 807, 2349, 15151, 7984, 1329, 21810, 820, 12602, 1338, 317, 11582, 5953, 2370, 835, 323, 15688, 1864, 15693, 854, 13142, 344, 15705, 4955, 860, 23899, 11615, 863, 15199, 15711, 13155, 15205, 872, 4457, 15722, 362, 15724, 875, 3438, 15215, 369, 883, 19828, 24437, 374, 29179, 9593, 19834, 15227, 894, 19326, 13186, 35203, 2436, 15749, 389, 19847, 15750, 19849, 2438, 1922, 6028, 909, 15752, 2446, 13200, 2448, 409, 21923, 9644, 14766, 22959, 14771, 23989, 12728, 9145, 14778, 14779, 3000, 12733, 7102, 3007, 9665, 14786, 12226, 2498, 14789, 8645, 15301, 15305, 15818, 461, 976, 5585, 977, 1489, 15358, 472, 1496, 42457, 2524, 478, 19422, 480, 15330, 15843, 20452, 26084, 6631, 14827, 492, 15343, 3571, 14836, 15348, 19446, 14839, 11765, 1017, 14843, 14844, 14846}}
word_bagNum shape: 6 50
{0: [1536, 7681, 17410, 17411, 17415, 6664, 17420, 15886, 4623, 17935, 4625, 5139, 4631, 17916, 17437, 544, 16422, 5671, 1065, 4650, 4651, 4653, 4690, 16943, 4657, 17458, 15921, 51, 7222, 17464, 17465, 10299, 15932, 64, 6209, 66, 17474, 4680, 8264, 8266, 40008, 6730, 8273, 6738, 5203, 5206, 18005, 15958, 597, 15960], 1: [0, 3, 11785, 2569, 32779, 9227, 526, 21519, 530, 4116, 533, 11805, 2590, 2591, 3105, 7203, 1571, 8740, 1574, 12836, 1062, 1577, 2553, 4654, 1071, 2094, 30257, 51, 30260, 53, 28213, 24633, 1082, 1087, 68, 8779, 78, 12367, 11859, 2647, 91, 13916, 13917, 15455, 608, 9825, 1634, 12387, 13412, 613], 2: [0, 3, 520, 1547, 12300, 2062, 3599, 1040, 26641, 18, 25616, 2577, 13846, 2583, 4121, 25114, 1051, 1052, 25629, 1054, 1567, 2591, 3105, 3616, 4126, 1060, 4125, 1062, 1063, 26663, 1577, 13863, 1066, 1580, 45, 1071, 51, 3123, 53, 2614, 3125, 1082, 2622, 66, 2627, 11843, 1093, 1606, 1605, 3651], 3: [1536, 0, 10242, 3, 37889, 1029, 10248, 2569, 9740, 9745, 10770, 17938, 2577, 10257, 9238, 3094, 9752, 9751, 9754, 30235, 9243, 18425, 9246, 2590, 24096, 9249, 9250, 9251, 4643, 10272, 9252, 5666, 3616, 3625, 4133, 4136, 1071, 9264, 4657, 51, 9267, 22583, 10808, 40504, 10304, 6210, 3650, 37444, 68, 9284], 4: [0, 5121, 4098, 3, 3078, 7175, 1543, 1545, 22027, 5131, 14, 4623, 4625, 22547, 533, 2588, 2590, 1570, 4643, 2597, 5669, 5159, 6183, 2602, 45, 6702, 18937, 5168, 5169, 48, 4657, 3063, 51, 1590, 12343, 5686, 5689, 2105, 1586, 5175, 5694, 6721, 68, 2630, 29767, 29778, 4692, 2133, 5204, 6740], 5: [0, 14849, 512, 3, 11266, 14853, 2053, 23047, 1527, 2569, 15370, 14861, 13, 19471, 2577, 11793, 14867, 18423, 533, 15384, 14875, 15388, 11807, 15396, 4132, 1574, 14890, 14893, 14896, 14897, 1586, 51, 1590, 14911, 1088, 15429, 14406, 23111, 16968, 14921, 14925, 16461, 14929, 15442, 8789, 14934, 2647, 3161, 7770, 91]}
after all_words, word_bag shape: 6 300
{0: [1536, 7681, 17410, 17411, 17415, 6664, 17420, 15886, 4623, 17935, 4625, 5139, 4631, 17916, 17437, 544, 16422, 5671, 1065, 4650, 4651, 4653, 4690, 16943, 4657, 17458, 15921, 51, 7222, 17464, 17465, 10299, 15932, 64, 6209, 66, 17474, 4680, 8264, 8266, 40008, 6730, 8273, 6738, 5203, 5206, 18005, 15958, 597, 15960, 0, 3, 11785, 2569, 32779, 9227, 526, 21519, 530, 4116, 533, 11805, 2590, 2591, 3105, 7203, 1571, 8740, 1574, 12836, 1062, 1577, 2553, 4654, 1071, 2094, 30257, 51, 30260, 53, 28213, 24633, 1082, 1087, 68, 8779, 78, 12367, 11859, 2647, 91, 13916, 13917, 15455, 608, 9825, 1634, 12387, 13412, 613, 0, 3, 520, 1547, 12300, 2062, 3599, 1040, 26641, 18, 25616, 2577, 13846, 2583, 4121, 25114, 1051, 1052, 25629, 1054, 1567, 2591, 3105, 3616, 4126, 1060, 4125, 1062, 1063, 26663, 1577, 13863, 1066, 1580, 45, 1071, 51, 3123, 53, 2614, 3125, 1082, 2622, 66, 2627, 11843, 1093, 1606, 1605, 3651, 1536, 0, 10242, 3, 37889, 1029, 10248, 2569, 9740, 9745, 10770, 17938, 2577, 10257, 9238, 3094, 9752, 9751, 9754, 30235, 9243, 18425, 9246, 2590, 24096, 9249, 9250, 9251, 4643, 10272, 9252, 5666, 3616, 3625, 4133, 4136, 1071, 9264, 4657, 51, 9267, 22583, 10808, 40504, 10304, 6210, 3650, 37444, 68, 9284, 0, 5121, 4098, 3, 3078, 7175, 1543, 1545, 22027, 5131, 14, 4623, 4625, 22547, 533, 2588, 2590, 1570, 4643, 2597, 5669, 5159, 6183, 2602, 45, 6702, 18937, 5168, 5169, 48, 4657, 3063, 51, 1590, 12343, 5686, 5689, 2105, 1586, 5175, 5694, 6721, 68, 2630, 29767, 29778, 4692, 2133, 5204, 6740, 0, 14849, 512, 3, 11266, 14853, 2053, 23047, 1527, 2569, 15370, 14861, 13, 19471, 2577, 11793, 14867, 18423, 533, 15384, 14875, 15388, 11807, 15396, 4132, 1574, 14890, 14893, 14896, 14897, 1586, 51, 1590, 14911, 1088, 15429, 14406, 23111, 16968, 14921, 14925, 16461, 14929, 15442, 8789, 14934, 2647, 3161, 7770, 91], 1: [1536, 7681, 17410, 17411, 17415, 6664, 17420, 15886, 4623, 17935, 4625, 5139, 4631, 17916, 17437, 544, 16422, 5671, 1065, 4650, 4651, 4653, 4690, 16943, 4657, 17458, 15921, 
51, 7222, 17464, 17465, 10299, 15932, 64, 6209, 66, 17474, 4680, 8264, 8266, 40008, 6730, 8273, 6738, 5203, 5206, 18005, 15958, 597, 15960, 0, 3, 11785, 2569, 32779, 9227, 526, 21519, 530, 4116, 533, 11805, 2590, 2591, 3105, 7203, 1571, 8740, 1574, 12836, 1062, 1577, 2553, 4654, 1071, 2094, 30257, 51, 30260, 53, 28213, 24633, 1082, 1087, 68, 8779, 78, 12367, 11859, 2647, 91, 13916, 13917, 15455, 608, 9825, 1634, 12387, 13412, 613, 0, 3, 520, 1547, 12300, 2062, 3599, 1040, 26641, 18, 25616, 2577, 13846, 2583, 4121, 25114, 1051, 1052, 25629, 1054, 1567, 2591, 3105, 3616, 4126, 1060, 4125, 1062, 1063, 26663, 1577, 13863, 1066, 1580, 45, 1071, 51, 3123, 53, 2614, 3125, 1082, 2622, 66, 2627, 11843, 1093, 1606, 1605, 3651, 1536, 0, 10242, 3, 37889, 1029, 10248, 2569, 9740, 9745, 10770, 17938, 2577, 10257, 9238, 3094, 9752, 9751, 9754, 30235, 9243, 18425, 9246, 2590, 24096, 9249, 9250, 9251, 4643, 10272, 9252, 5666, 3616, 3625, 4133, 4136, 1071, 9264, 4657, 51, 9267, 22583, 10808, 40504, 10304, 6210, 3650, 37444, 68, 9284, 0, 5121, 4098, 3, 3078, 7175, 1543, 1545, 22027, 5131, 14, 4623, 4625, 22547, 533, 2588, 2590, 1570, 4643, 2597, 5669, 5159, 6183, 2602, 45, 6702, 18937, 5168, 5169, 48, 4657, 3063, 51, 1590, 12343, 5686, 5689, 2105, 1586, 5175, 5694, 6721, 68, 2630, 29767, 29778, 4692, 2133, 5204, 6740, 0, 14849, 512, 3, 11266, 14853, 2053, 23047, 1527, 2569, 15370, 14861, 13, 19471, 2577, 11793, 14867, 18423, 533, 15384, 14875, 15388, 11807, 15396, 4132, 1574, 14890, 14893, 14896, 14897, 1586, 51, 1590, 14911, 1088, 15429, 14406, 23111, 16968, 14921, 14925, 16461, 14929, 15442, 8789, 14934, 2647, 3161, 7770, 91], 2: [1536, 7681, 17410, 17411, 17415, 6664, 17420, 15886, 4623, 17935, 4625, 5139, 4631, 17916, 17437, 544, 16422, 5671, 1065, 4650, 4651, 4653, 4690, 16943, 4657, 17458, 15921, 51, 7222, 17464, 17465, 10299, 15932, 64, 6209, 66, 17474, 4680, 8264, 8266, 40008, 6730, 8273, 6738, 5203, 5206, 18005, 15958, 597, 15960, 0, 3, 11785, 2569, 32779, 9227, 526, 21519, 
530, 4116, 533, 11805, 2590, 2591, 3105, 7203, 1571, 8740, 1574, 12836, 1062, 1577, 2553, 4654, 1071, 2094, 30257, 51, 30260, 53, 28213, 24633, 1082, 1087, 68, 8779, 78, 12367, 11859, 2647, 91, 13916, 13917, 15455, 608, 9825, 1634, 12387, 13412, 613, 0, 3, 520, 1547, 12300, 2062, 3599, 1040, 26641, 18, 25616, 2577, 13846, 2583, 4121, 25114, 1051, 1052, 25629, 1054, 1567, 2591, 3105, 3616, 4126, 1060, 4125, 1062, 1063, 26663, 1577, 13863, 1066, 1580, 45, 1071, 51, 3123, 53, 2614, 3125, 1082, 2622, 66, 2627, 11843, 1093, 1606, 1605, 3651, 1536, 0, 10242, 3, 37889, 1029, 10248, 2569, 9740, 9745, 10770, 17938, 2577, 10257, 9238, 3094, 9752, 9751, 9754, 30235, 9243, 18425, 9246, 2590, 24096, 9249, 9250, 9251, 4643, 10272, 9252, 5666, 3616, 3625, 4133, 4136, 1071, 9264, 4657, 51, 9267, 22583, 10808, 40504, 10304, 6210, 3650, 37444, 68, 9284, 0, 5121, 4098, 3, 3078, 7175, 1543, 1545, 22027, 5131, 14, 4623, 4625, 22547, 533, 2588, 2590, 1570, 4643, 2597, 5669, 5159, 6183, 2602, 45, 6702, 18937, 5168, 5169, 48, 4657, 3063, 51, 1590, 12343, 5686, 5689, 2105, 1586, 5175, 5694, 6721, 68, 2630, 29767, 29778, 4692, 2133, 5204, 6740, 0, 14849, 512, 3, 11266, 14853, 2053, 23047, 1527, 2569, 15370, 14861, 13, 19471, 2577, 11793, 14867, 18423, 533, 15384, 14875, 15388, 11807, 15396, 4132, 1574, 14890, 14893, 14896, 14897, 1586, 51, 1590, 14911, 1088, 15429, 14406, 23111, 16968, 14921, 14925, 16461, 14929, 15442, 8789, 14934, 2647, 3161, 7770, 91], 3: [1536, 7681, 17410, 17411, 17415, 6664, 17420, 15886, 4623, 17935, 4625, 5139, 4631, 17916, 17437, 544, 16422, 5671, 1065, 4650, 4651, 4653, 4690, 16943, 4657, 17458, 15921, 51, 7222, 17464, 17465, 10299, 15932, 64, 6209, 66, 17474, 4680, 8264, 8266, 40008, 6730, 8273, 6738, 5203, 5206, 18005, 15958, 597, 15960, 0, 3, 11785, 2569, 32779, 9227, 526, 21519, 530, 4116, 533, 11805, 2590, 2591, 3105, 7203, 1571, 8740, 1574, 12836, 1062, 1577, 2553, 4654, 1071, 2094, 30257, 51, 30260, 53, 28213, 24633, 1082, 1087, 68, 8779, 78, 12367, 11859, 
2647, 91, 13916, 13917, 15455, 608, 9825, 1634, 12387, 13412, 613, 0, 3, 520, 1547, 12300, 2062, 3599, 1040, 26641, 18, 25616, 2577, 13846, 2583, 4121, 25114, 1051, 1052, 25629, 1054, 1567, 2591, 3105, 3616, 4126, 1060, 4125, 1062, 1063, 26663, 1577, 13863, 1066, 1580, 45, 1071, 51, 3123, 53, 2614, 3125, 1082, 2622, 66, 2627, 11843, 1093, 1606, 1605, 3651, 1536, 0, 10242, 3, 37889, 1029, 10248, 2569, 9740, 9745, 10770, 17938, 2577, 10257, 9238, 3094, 9752, 9751, 9754, 30235, 9243, 18425, 9246, 2590, 24096, 9249, 9250, 9251, 4643, 10272, 9252, 5666, 3616, 3625, 4133, 4136, 1071, 9264, 4657, 51, 9267, 22583, 10808, 40504, 10304, 6210, 3650, 37444, 68, 9284, 0, 5121, 4098, 3, 3078, 7175, 1543, 1545, 22027, 5131, 14, 4623, 4625, 22547, 533, 2588, 2590, 1570, 4643, 2597, 5669, 5159, 6183, 2602, 45, 6702, 18937, 5168, 5169, 48, 4657, 3063, 51, 1590, 12343, 5686, 5689, 2105, 1586, 5175, 5694, 6721, 68, 2630, 29767, 29778, 4692, 2133, 5204, 6740, 0, 14849, 512, 3, 11266, 14853, 2053, 23047, 1527, 2569, 15370, 14861, 13, 19471, 2577, 11793, 14867, 18423, 533, 15384, 14875, 15388, 11807, 15396, 4132, 1574, 14890, 14893, 14896, 14897, 1586, 51, 1590, 14911, 1088, 15429, 14406, 23111, 16968, 14921, 14925, 16461, 14929, 15442, 8789, 14934, 2647, 3161, 7770, 91], 4: [1536, 7681, 17410, 17411, 17415, 6664, 17420, 15886, 4623, 17935, 4625, 5139, 4631, 17916, 17437, 544, 16422, 5671, 1065, 4650, 4651, 4653, 4690, 16943, 4657, 17458, 15921, 51, 7222, 17464, 17465, 10299, 15932, 64, 6209, 66, 17474, 4680, 8264, 8266, 40008, 6730, 8273, 6738, 5203, 5206, 18005, 15958, 597, 15960, 0, 3, 11785, 2569, 32779, 9227, 526, 21519, 530, 4116, 533, 11805, 2590, 2591, 3105, 7203, 1571, 8740, 1574, 12836, 1062, 1577, 2553, 4654, 1071, 2094, 30257, 51, 30260, 53, 28213, 24633, 1082, 1087, 68, 8779, 78, 12367, 11859, 2647, 91, 13916, 13917, 15455, 608, 9825, 1634, 12387, 13412, 613, 0, 3, 520, 1547, 12300, 2062, 3599, 1040, 26641, 18, 25616, 2577, 13846, 2583, 4121, 25114, 1051, 1052, 25629, 1054, 
1567, 2591, 3105, 3616, 4126, 1060, 4125, 1062, 1063, 26663, 1577, 13863, 1066, 1580, 45, 1071, 51, 3123, 53, 2614, 3125, 1082, 2622, 66, 2627, 11843, 1093, 1606, 1605, 3651, 1536, 0, 10242, 3, 37889, 1029, 10248, 2569, 9740, 9745, 10770, 17938, 2577, 10257, 9238, 3094, 9752, 9751, 9754, 30235, 9243, 18425, 9246, 2590, 24096, 9249, 9250, 9251, 4643, 10272, 9252, 5666, 3616, 3625, 4133, 4136, 1071, 9264, 4657, 51, 9267, 22583, 10808, 40504, 10304, 6210, 3650, 37444, 68, 9284, 0, 5121, 4098, 3, 3078, 7175, 1543, 1545, 22027, 5131, 14, 4623, 4625, 22547, 533, 2588, 2590, 1570, 4643, 2597, 5669, 5159, 6183, 2602, 45, 6702, 18937, 5168, 5169, 48, 4657, 3063, 51, 1590, 12343, 5686, 5689, 2105, 1586, 5175, 5694, 6721, 68, 2630, 29767, 29778, 4692, 2133, 5204, 6740, 0, 14849, 512, 3, 11266, 14853, 2053, 23047, 1527, 2569, 15370, 14861, 13, 19471, 2577, 11793, 14867, 18423, 533, 15384, 14875, 15388, 11807, 15396, 4132, 1574, 14890, 14893, 14896, 14897, 1586, 51, 1590, 14911, 1088, 15429, 14406, 23111, 16968, 14921, 14925, 16461, 14929, 15442, 8789, 14934, 2647, 3161, 7770, 91], 5: [1536, 7681, 17410, 17411, 17415, 6664, 17420, 15886, 4623, 17935, 4625, 5139, 4631, 17916, 17437, 544, 16422, 5671, 1065, 4650, 4651, 4653, 4690, 16943, 4657, 17458, 15921, 51, 7222, 17464, 17465, 10299, 15932, 64, 6209, 66, 17474, 4680, 8264, 8266, 40008, 6730, 8273, 6738, 5203, 5206, 18005, 15958, 597, 15960, 0, 3, 11785, 2569, 32779, 9227, 526, 21519, 530, 4116, 533, 11805, 2590, 2591, 3105, 7203, 1571, 8740, 1574, 12836, 1062, 1577, 2553, 4654, 1071, 2094, 30257, 51, 30260, 53, 28213, 24633, 1082, 1087, 68, 8779, 78, 12367, 11859, 2647, 91, 13916, 13917, 15455, 608, 9825, 1634, 12387, 13412, 613, 0, 3, 520, 1547, 12300, 2062, 3599, 1040, 26641, 18, 25616, 2577, 13846, 2583, 4121, 25114, 1051, 1052, 25629, 1054, 1567, 2591, 3105, 3616, 4126, 1060, 4125, 1062, 1063, 26663, 1577, 13863, 1066, 1580, 45, 1071, 51, 3123, 53, 2614, 3125, 1082, 2622, 66, 2627, 11843, 1093, 1606, 1605, 3651, 1536, 0, 
10242, 3, 37889, 1029, 10248, 2569, 9740, 9745, 10770, 17938, 2577, 10257, 9238, 3094, 9752, 9751, 9754, 30235, 9243, 18425, 9246, 2590, 24096, 9249, 9250, 9251, 4643, 10272, 9252, 5666, 3616, 3625, 4133, 4136, 1071, 9264, 4657, 51, 9267, 22583, 10808, 40504, 10304, 6210, 3650, 37444, 68, 9284, 0, 5121, 4098, 3, 3078, 7175, 1543, 1545, 22027, 5131, 14, 4623, 4625, 22547, 533, 2588, 2590, 1570, 4643, 2597, 5669, 5159, 6183, 2602, 45, 6702, 18937, 5168, 5169, 48, 4657, 3063, 51, 1590, 12343, 5686, 5689, 2105, 1586, 5175, 5694, 6721, 68, 2630, 29767, 29778, 4692, 2133, 5204, 6740, 0, 14849, 512, 3, 11266, 14853, 2053, 23047, 1527, 2569, 15370, 14861, 13, 19471, 2577, 11793, 14867, 18423, 533, 15384, 14875, 15388, 11807, 15396, 4132, 1574, 14890, 14893, 14896, 14897, 1586, 51, 1590, 14911, 1088, 15429, 14406, 23111, 16968, 14921, 14925, 16461, 14929, 15442, 8789, 14934, 2647, 3161, 7770, 91]}
features_data_frame.shape: (6, 255)
0 30
1 185
2 139
3 66
4 69
5 157
class_Proportion: 
 [0.04643962848297214, 0.28637770897832815, 0.21517027863777088, 0.1021671826625387, 0.10681114551083591, 0.24303405572755418]
test_data_frame.head(2) 
      Unnamed: 0                                            content  \
854         854  据Mobileexpose报道,华硕已经正式向媒体发出邀请,定于6月14日在台湾举办记者会,...   
101         101   6月6日,王者荣耀猴三棍重做引起王者峡谷一阵轩然大波,毕竟这个强势的猴子已经陪伴我们好几个...   

                      id                                   tags  \
854  6429089676803440897  ['科技', '华硕', '华硕ZenFone', '台湾', '手机']   
101  6429098400347586818       ['游戏', '猴子', '王者荣耀', '黄忠', '游戏']   

                    time                     title  \
854  2017-06-07 10:11:00        华硕ZenFone AR宣布本月发售   
101  2017-06-07 10:39:20  猴子重做之后是加强还是削弱?狂到站对面泉水拿双杀   

                                             doc_words  \
854  [报道, 华硕, 已经, 正式, 媒体, 发出, 邀请, 定于, 月, 日, 台湾, 举办,...   
101  [月, 日, 王者, 荣耀, 猴三棍, 重, 做, 引起, 王者, 峡谷, 一阵, 轩然大波...   

                                                corpus  \
854  [(142, 1), (362, 1), (472, 1), (475, 1), (494,...   
101  [(0, 2), (68, 3), (133, 1), (184, 1), (226, 1)...   

                                                 tfidf   visual01   visual02  \
854  [(142, 0.13953435619531032), (362, 0.046441336...  21.684397 -30.567736
101  [(0, 0.012838015508020575), (68, 0.04742284222...  67.188065  21.183245

     keyword_index  
854              1
101              3
167. print the first sample 
168.  Unnamed: 0                                                     854
169. content          据Mobileexpose报道,华硕已经正式向媒体发出邀请,定于6月14日在台湾举办记者会,...
170. id                                             6429089676803440897
171. tags                         ['科技', '华硕', '华硕ZenFone', '台湾', '手机']
172. time                                           2017-06-07 10:11:00
173. title                                           华硕ZenFone AR宣布本月发售
174. doc_words        [报道, 华硕, 已经, 正式, 媒体, 发出, 邀请, 定于, 月, 日, 台湾, 举办,...
175. corpus           [(142, 1), (362, 1), (472, 1), (475, 1), (494,...
176. tfidf            [(142, 0.13953435619531032), (362, 0.046441336...
177. visual01                                                   21.6844
178. visual02                                                  -30.5677
179. keyword_index                                                    1
180. Name: 854, dtype: object
181. test_data_frame.iloc[0].corpus:  [(142, 1), (362, 1), (472, 1), (475, 1), (494, 1), (530, 1), (872, 1), (909, 1), (1254, 1), (1312, 1), (1878, 1), (2577, 1), (2783, 1), (2979, 1), (3697, 1), (5508, 1), (9052, 1), (12204, 1), (12256, 1), (12591, 1), (12936, 1), (12991, 1), (13128, 1), (13194, 1), (13244, 1), (13317, 1), (31670, 1), (31683, 1), (33417, 1)]
182. [1.45708072e-43 1.78656934e-66 7.12148875e-63 1.71090490e-53
183. 4.71385662e-54 2.08405934e-64]
184. [-35.34436300647761, -16.431856044032266, -20.267559000416433, -22.405433968586664, -27.97121661401147, -18.05089965903481]
185. F:\File_Jupyter\实用代码\naive_bayes(简单贝叶斯)\TextClassPrediction_kNN_NB_LDA_P.py:346: SettingWithCopyWarning: 
186. A value is trying to be set on a copy of a slice from a DataFrame.
187. Try using .loc[row_indexer,col_indexer] = value instead
188. 
189. See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
190.   test_data_frame['predicted_class'] = test_data_frame['corpus'].apply(predict_text_ByMax)       # predict all test documents
191. test_data_frame:       Unnamed: 0                                            content  \
192. 854          854  据Mobileexpose报道,华硕已经正式向媒体发出邀请,定于6月14日在台湾举办记者会,...   
193. 101          101   6月6日,王者荣耀猴三棍重做引起王者峡谷一阵轩然大波,毕竟这个强势的猴子已经陪伴我们好几个...   
194. 738          738  骗子往往都很会讲故事,比如以下这些硅谷骗局:验血公司Theranos,号称只要从指尖抽几滴血...   
195. 511          511  专访 Whyd 创始人 孟崨在学校,他是最调皮,却又成绩最好的学生,让老师头疼不已。在公司,...   
196. 725          725  据介绍,喜马拉雅FM会员月费为18元,年度会员188元,价格与视频网站会员价格相仿。在会员福...   
197. ...          ...                                                ...   
198. 805          805  每经记者 王海慜 每经编辑 叶峰今日盘中,昨日领涨的中小创出现休整,而昨日暂时休整的一批龙头...   
199. 448          448  中国人买什么都喜欢大的,房子要买面积大的、手机要买屏大的,买车自然也是要挑选空间大的。抛开拉...   
200. 782          782  中证网讯 (记者 徐金忠)6月7日,国能电动汽车瑞典有限公司(NEVS)亮相CES亚洲消费电...   
201. 1264        1264  目前日系豪华品牌讴歌已经开启了国产之路,在推出CDX车型后,讴歌在国内的知名度一度飙升。CD...   
202. 1195        1195  近日有爆料称,乐视位于北京达美中心的办公地因未及时缴纳办公地费用已被停止物业一切服务;物业公...   
203. 
204. id                                   tags  \
205. 854   6429089676803440897  ['科技', '华硕', '华硕ZenFone', '台湾', '手机']   
206. 101   6429098400347586818       ['游戏', '猴子', '王者荣耀', '黄忠', '游戏']   
207. 738   6413133652368982274     ['科技', '厨卫电器', '榨汁机', '小家电', '硅谷']   
208. 511   6428827159980867842     ['科技', '智能家居', '音箱', '苹果公司', '法国']   
209. 725   6428841852455354625                  ['科技', '喜马拉雅山', '科技']   
210. ...                   ...                                    ...   
211. 805   6429151552733069569                           ['财经', '财经']   
212. 448   6415852634885341441    ['汽车', 'SUV', '国产车', '概念车', '汽车用品']   
213. 782   6428858665063383297   ['科技', '新能源汽车', '电动汽车', '新能源', '经济']   
214. 1264  6427822755417194753    ['汽车', '日本汽车', '讴歌汽车', 'SUV', '空调']   
215. 1195  6429093420292210945                     ['科技', '乐视', '科技']   
216. 
217.                      time                        title  \
218. 854   2017-06-07 10:11:00           华硕ZenFone AR宣布本月发售   
219. 101   2017-06-07 10:39:20     猴子重做之后是加强还是削弱?狂到站对面泉水拿双杀   
220. 738   2017-04-26 10:41:39                绝!他用一台榨汁机骗了8亿   
221. 511   2017-06-08 11:06:00    他的智能音箱一上市,苹果公司就推出了HomePod   
222. 725   2017-06-07 18:37:00  喜马拉雅FM推出“付费会员”,当天召集超221万名会员   
223. ...                   ...                          ...   
224. 805   2017-06-08 14:30:00          盘中近20家龙头白马股集体创下历史新高   
225. 448   2017-05-03 18:37:20      别瞎找了!10万左右尺寸最大的SUV都在这里了   
226. 782   2017-06-07 19:12:00      倡导移动出行新概念 NEVS两款概念量产车亮相   
227. 1264  2017-06-08 09:54:40        居然还有一款车,最低配和中高配看不出差别?   
228. 1195  2017-06-08 10:45:00     乐视被爆未及时缴物业费,员工或将被阻止进大楼办公   
229. 
230.                                               doc_words  \
231. 854   [报道, 华硕, 已经, 正式, 媒体, 发出, 邀请, 定于, 月, 日, 台湾, 举办,...   
232. 101   [月, 日, 王者, 荣耀, 猴三棍, 重, 做, 引起, 王者, 峡谷, 一阵, 轩然大波...   
233. 738   [骗子, 往往, 很会, 讲故事, 以下, 硅谷, 骗局, 验血, 公司, 号称, 指尖, ...   
234. 511   [专访, 创始人, 孟, 崨, 学校, 最, 调皮, 却, 成绩, 最好, 学生, 老师, ...   
235. 725   [据介绍, 喜马拉雅, 会员, 月费, 元, 年度, 会员, 元, 价格, 视频, 网站, ...   
236. ...                                                 ...   
237. 805   [每经, 记者, 王海, 慜, 每经, 编辑, 叶峰, 今日, 盘中, 昨日, 领涨, 中小...   
238. 448   [中国, 人买, 喜欢, 房子, 买, 面积, 手机, 买, 屏大, 买车, 自然, 挑选,...   
239. 782   [中证网, 讯, 记者, 徐金忠, 月, 日, 国, 电动汽车, 瑞典, 有限公司, 亮相,...   
240. 1264  [目前, 日系, 豪华, 品牌, 讴歌, 已经, 开启, 国产, 路, 推出, 车型, 后,...   
241. 1195  [近日, 爆料, 称, 乐视, 位于, 北京, 达美, 中心, 办公地, 因未, 及时, 缴...   
242. 
243.                                                  corpus  \
244. 854   [(142, 1), (362, 1), (472, 1), (475, 1), (494,...   
245. 101   [(0, 2), (68, 3), (133, 1), (184, 1), (226, 1)...   
246. 738   [(0, 2), (45, 1), (48, 1), (133, 2), (155, 1),...   
247. 511   [(0, 10), (13, 2), (14, 2), (20, 1), (45, 1), ...   
248. 725   [(30, 1), (102, 1), (142, 1), (154, 1), (189, ...   
249. ...                                                 ...   
250. 805   [(113, 1), (167, 1), (169, 1), (214, 1), (258,...   
251. 448   [(4, 2), (8, 1), (14, 1), (51, 6), (53, 2), (6...   
252. 782   [(15, 2), (30, 1), (53, 7), (93, 1), (143, 1),...   
253. 1264  [(0, 1), (20, 1), (51, 1), (176, 1), (225, 1),...   
254. 1195  [(57, 1), (111, 1), (191, 1), (361, 1), (476, ...   
255. 
256.                                                   tfidf   visual01   visual02  \
257. 854   [(142, 0.13953435619531032), (362, 0.046441336...  21.684397 -30.567736
258. 101   [(0, 0.012838015508020575), (68, 0.04742284222...  67.188065  21.183245
259. 738   [(0, 0.008984009118453712), (45, 0.01791359767... -22.855194 -11.270862
260. 511   [(0, 0.04361196171462796), (13, 0.028607388065... -22.198786  12.217076
261. 725   [(30, 0.05815947983270004), (102, 0.0450585853...  26.268911  21.240065
262. ...                                                 ...        ...        ...   
263. 805   [(113, 0.030899018921031703), (167, 0.02103003... -66.232071   0.221611
264. 448   [(4, 0.04071064284477513), (8, 0.0235138776022...  41.836094 -44.539528
265. 782   [(15, 0.03392075672049564), (30, 0.03003603467... -26.810091 -29.602842
266. 1264  [(0, 0.009883726180653873), (20, 0.04080153677...  36.279522 -52.474297
267. 1195  [(57, 0.09668298763559263), (111, 0.1255406499...  -6.373239  16.101738
268. 
269.       keyword_index  predicted_class  
270. 854               1                1
271. 101               3                3
272. 738               1                1
273. 511               1                2
274. 725               1                1
275. ...             ...              ...  
276. 805               2                2
277. 448               5                5
278. 782               1                1
279. 1264              5                5
280. 1195              1                1
281. 
282. [647 rows x 13 columns]
283. SModel_CS_acc_score: 0.7047913446676971
284. 300
285. label_category_ID 2
286. 一个
287. 一些
288. 概念
289. 经营
290. 补贴
291. 股市
292. 增持
293. 成本
294. 乳业
295. 万吨
296. train_data_frame.corpus[0] 
297.  [(0, 6), (1, 1), (2, 1), (3, 3), (4, 2), (5, 2), (6, 1), (7, 1), (8, 2), (9, 1), (10, 3), (11, 1), (12, 2), (13, 2), (14, 2), (15, 1), (16, 1), (17, 2), (18, 1), (19, 1), (20, 2), (21, 1), (22, 2), (23, 2), (24, 1), (25, 1), (26, 1), (27, 1), (28, 1), (29, 2), (30, 3), (31, 4), (32, 3), (33, 1), (34, 1), (35, 1), (36, 7), (37, 1), (38, 1), (39, 2), (40, 3), (41, 1), (42, 1), (43, 1), (44, 1), (45, 2), (46, 1), (47, 1), (48, 1), (49, 2), (50, 4), (51, 21), (52, 3), (53, 7), (54, 1), (55, 2), (56, 1), (57, 4), (58, 2), (59, 1), (60, 5), (61, 1), (62, 1), (63, 1), (64, 2), (65, 1), (66, 3), (67, 1), (68, 2), (69, 2), (70, 1), (71, 1), (72, 1), (73, 1), (74, 2), (75, 1), (76, 1), (77, 1), (78, 1), (79, 2), (80, 1), (81, 1), (82, 1), (83, 4), (84, 7), (85, 2), (86, 3), (87, 1), (88, 9), (89, 1), (90, 1), (91, 8), (92, 3), (93, 1), (94, 4), (95, 1), (96, 2), (97, 1), (98, 7), (99, 1), (100, 2), (101, 1), (102, 1), (103, 1), (104, 1), (105, 1), (106, 1), (107, 1), (108, 1), (109, 2), (110, 1), (111, 2), (112, 1), (113, 1), (114, 1), (115, 1), (116, 1), (117, 1), (118, 1), (119, 1), (120, 1), (121, 2), (122, 1), (123, 1), (124, 1), (125, 1), (126, 5), (127, 1), (128, 4), (129, 1), (130, 1), (131, 1), (132, 2), (133, 2), (134, 1), (135, 5), (136, 1), (137, 1), (138, 3), (139, 1), (140, 1), (141, 1), (142, 1), (143, 1), (144, 1), (145, 2), (146, 1), (147, 1), (148, 2), (149, 4), (150, 1), (151, 1), (152, 2), (153, 2), (154, 1), (155, 3), (156, 1), (157, 1), (158, 1), (159, 1), (160, 1), (161, 2), (162, 1), (163, 1), (164, 1), (165, 2), (166, 1), (167, 3), (168, 1), (169, 1), (170, 3), (171, 3), (172, 1), (173, 2), (174, 1), (175, 1), (176, 2), (177, 5), (178, 1), (179, 1), (180, 1), (181, 1), (182, 1), (183, 1), (184, 4), (185, 1), (186, 1), (187, 1), (188, 1), (189, 3), (190, 1), (191, 14), (192, 2), (193, 2), (194, 2), (195, 1), (196, 3), (197, 1), (198, 1), (199, 11), (200, 6), (201, 1), (202, 1), (203, 2), (204, 1), (205, 8), (206, 2), (207, 2), (208, 2), (209, 1), 
(210, 1), (211, 1), (212, 1), (213, 1), (214, 1), (215, 1), (216, 3), (217, 1), (218, 1), (219, 2), (220, 2), (221, 1), (222, 1), (223, 1), (224, 1), (225, 17), (226, 1), (227, 1), (228, 1), (229, 1), (230, 1), (231, 1), (232, 2), (233, 1), (234, 1), (235, 3), (236, 1), (237, 1), (238, 2), (239, 1), (240, 1), (241, 1), (242, 1), (243, 2), (244, 2), (245, 1), (246, 1), (247, 2), (248, 2), (249, 2), (250, 1), (251, 1), (252, 2), (253, 1), (254, 1), (255, 1), (256, 1), (257, 1), (258, 3), (259, 3), (260, 1), (261, 3), (262, 2), (263, 1), (264, 1), (265, 6), (266, 1), (267, 3), (268, 1), (269, 1), (270, 3), (271, 2), (272, 1), (273, 2), (274, 1), (275, 1), (276, 5), (277, 1), (278, 4), (279, 4), (280, 25), (281, 2), (282, 2), (283, 2), (284, 7), (285, 1), (286, 1), (287, 2), (288, 2), (289, 1), (290, 1), (291, 1), (292, 1), (293, 3), (294, 2), (295, 1), (296, 3), (297, 1), (298, 3), (299, 2), (300, 1), (301, 1), (302, 1), (303, 2), (304, 1), (305, 1), (306, 1), (307, 2), (308, 2), (309, 1), (310, 1), (311, 1), (312, 1), (313, 1), (314, 1), (315, 1), (316, 7), (317, 2), (318, 2), (319, 1), (320, 1), (321, 1), (322, 1), (323, 1), (324, 1), (325, 4), (326, 1), (327, 2), (328, 1), (329, 1), (330, 3), (331, 3), (332, 1), (333, 2), (334, 2), (335, 1), (336, 1), (337, 2), (338, 1), (339, 1), (340, 1), (341, 1), (342, 1), (343, 1), (344, 2), (345, 1), (346, 1), (347, 2), (348, 1), (349, 2), (350, 5), (351, 2), (352, 3), (353, 1), (354, 4), (355, 1), (356, 1), (357, 2), (358, 4), (359, 2), (360, 2), (361, 1), (362, 9), (363, 2), (364, 2), (365, 1), (366, 1), (367, 7), (368, 1), (369, 4), (370, 2), (371, 1), (372, 1), (373, 1), (374, 1), (375, 1), (376, 1), (377, 1), (378, 2), (379, 1), (380, 3), (381, 1), (382, 2), (383, 1), (384, 3), (385, 26), (386, 1), (387, 1), (388, 1), (389, 3), (390, 1), (391, 2), (392, 1), (393, 4), (394, 4), (395, 4), (396, 2), (397, 1), (398, 40), (399, 2), (400, 4), (401, 1), (402, 1), (403, 2), (404, 1), (405, 1), (406, 2), (407, 1), (408, 1), (409, 
3), (410, 1), (411, 1), (412, 2), (413, 7), (414, 4), (415, 2), (416, 1), (417, 1), (418, 1), (419, 3), (420, 1), (421, 1), (422, 1), (423, 1), (424, 1), (425, 1), (426, 1), (427, 2), (428, 1), (429, 1), (430, 1), (431, 1), (432, 5), (433, 1), (434, 1), (435, 1), (436, 1), (437, 1), (438, 1), (439, 1), (440, 1), (441, 1), (442, 1), (443, 3), (444, 3), (445, 2), (446, 5), (447, 1), (448, 1), (449, 1), (450, 4), (451, 1), (452, 2), (453, 2), (454, 1), (455, 4), (456, 1), (457, 1), (458, 1), (459, 2), (460, 1), (461, 1), (462, 5), (463, 2), (464, 1), (465, 5), (466, 74), (467, 2), (468, 1), (469, 1), (470, 2), (471, 22), (472, 2), (473, 1), (474, 1), (475, 2), (476, 2), (477, 2), (478, 2), (479, 1), (480, 1), (481, 1), (482, 1), (483, 2), (484, 1), (485, 1), (486, 2), (487, 1), (488, 2), (489, 1), (490, 1), (491, 1), (492, 4), (493, 1), (494, 2), (495, 4), (496, 2), (497, 1), (498, 1), (499, 1), (500, 1), (501, 5), (502, 1), (503, 13), (504, 4), (505, 3), (506, 1), (507, 7), (508, 1), (509, 1), (510, 1), (511, 1), (512, 1), (513, 1), (514, 2), (515, 1), (516, 3), (517, 4), (518, 1), (519, 1), (520, 1), (521, 1), (522, 1), (523, 1), (524, 1), (525, 1), (526, 2), (527, 2), (528, 1), (529, 1), (530, 1), (531, 1), (532, 1), (533, 1), (534, 1), (535, 2), (536, 5), (537, 2), (538, 1), (539, 1), (540, 1), (541, 7), (542, 1), (543, 1), (544, 1), (545, 2), (546, 1), (547, 3), (548, 2), (549, 1), (550, 1), (551, 2), (552, 1), (553, 2), (554, 1), (555, 1), (556, 2), (557, 1), (558, 2), (559, 5), (560, 2), (561, 1), (562, 1), (563, 1), (564, 1), (565, 1), (566, 1), (567, 7), (568, 2), (569, 1), (570, 2), (571, 1), (572, 1), (573, 1), (574, 4), (575, 1), (576, 2), (577, 2), (578, 1), (579, 2), (580, 1), (581, 1), (582, 1), (583, 2), (584, 1), (585, 1), (586, 1), (587, 4), (588, 1), (589, 4), (590, 2), (591, 1), (592, 1), (593, 1), (594, 2), (595, 1), (596, 1), (597, 1), (598, 1), (599, 1), (600, 1), (601, 1), (602, 1), (603, 1), (604, 1), (605, 1), (606, 1), (607, 1), (608, 2), 
(609, 1), (610, 2), (611, 1), (612, 1), (613, 11), (614, 1), (615, 1), (616, 3), (617, 1), (618, 1), (619, 1), (620, 1), (621, 1), (622, 1), (623, 1), (624, 32), (625, 2), (626, 1), (627, 8), (628, 1), (629, 3), (630, 3), (631, 1), (632, 1), (633, 4), (634, 1), (635, 1), (636, 2), (637, 1), (638, 3), (639, 2), (640, 1), (641, 1), (642, 1), (643, 3), (644, 5), (645, 4), (646, 1), (647, 1), (648, 3), (649, 1), (650, 1), (651, 1), (652, 1), (653, 1), (654, 1), (655, 2), (656, 1), (657, 7), (658, 1), (659, 2), (660, 1), (661, 2), (662, 1), (663, 1), (664, 1), (665, 1), (666, 1), (667, 1), (668, 4), (669, 1), (670, 1), (671, 3), (672, 1), (673, 1), (674, 2), (675, 1), (676, 1), (677, 1), (678, 1), (679, 1), (680, 2), (681, 2), (682, 1), (683, 1), (684, 1), (685, 3), (686, 1), (687, 1), (688, 1), (689, 1), (690, 4), (691, 1), (692, 2), (693, 3), (694, 1), (695, 2), (696, 1), (697, 1), (698, 2), (699, 1), (700, 1), (701, 4), (702, 1), (703, 1), (704, 2), (705, 1), (706, 1), (707, 1), (708, 1), (709, 2), (710, 1), (711, 3), (712, 1), (713, 1), (714, 4), (715, 1), (716, 1), (717, 1), (718, 2), (719, 1), (720, 1), (721, 2), (722, 1), (723, 1), (724, 4), (725, 1), (726, 1), (727, 1), (728, 1), (729, 2), (730, 12), (731, 2), (732, 1), (733, 2), (734, 3), (735, 1), (736, 26), (737, 1), (738, 5), (739, 1), (740, 2), (741, 5), (742, 2), (743, 3), (744, 3), (745, 2), (746, 1), (747, 3), (748, 2), (749, 2), (750, 2), (751, 1), (752, 1), (753, 2), (754, 1), (755, 1), (756, 1), (757, 1), (758, 1), (759, 4), (760, 1), (761, 1), (762, 1), (763, 1), (764, 1), (765, 2), (766, 1), (767, 1), (768, 1), (769, 2), (770, 8), (771, 2), (772, 4), (773, 1), (774, 8), (775, 3), (776, 1), (777, 1), (778, 3), (779, 1), (780, 1), (781, 1), (782, 5), (783, 2), (784, 2), (785, 1), (786, 4), (787, 1), (788, 1), (789, 1), (790, 1), (791, 1), (792, 1), (793, 4), (794, 1), (795, 1), (796, 1), (797, 5), (798, 3), (799, 5), (800, 3), (801, 1), (802, 1), (803, 1), (804, 1), (805, 2), (806, 2), (807, 2), (808, 
1), (809, 1), (810, 1), (811, 1), (812, 1), (813, 1), (814, 1), (815, 3), (816, 1), (817, 2), (818, 1), (819, 1), (820, 11), (821, 1), (822, 1), (823, 2), (824, 3), (825, 1), (826, 1), (827, 1), (828, 1), (829, 1), (830, 3), (831, 4), (832, 46), (833, 1), (834, 1), (835, 2), (836, 2), (837, 1), (838, 1), (839, 2), (840, 2), (841, 1), (842, 1), (843, 2), (844, 2), (845, 2), (846, 1), (847, 1), (848, 2), (849, 1), (850, 1), (851, 1), (852, 3), (853, 1), (854, 1), (855, 6), (856, 1), (857, 1), (858, 1)]
298. [33. 74. 73. 31. 47. 48.]
299. <class 'numpy.ndarray'>
300. SModel_acc_score: 0.8114374034003091
301. kNNC_acc_score: 0.8160741885625966
302. GNBC_acc_score: 0.6352395672333848
303. MNBC_acc_score: 0.6352395672333848
304. BNBC_acc_score: 0.29675425038639874
305. LDAC_acc_score: 0.8238021638330757
306. PerceptronC_acc_score: 0.8222565687789799
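
The class proportions and the log-scale scores in the log above can be reproduced with a few lines of plain Python. This is only a minimal sketch, not the script's actual code: the counts and scores are copied from the log, while `word_probs` is a hypothetical illustration of why scores are kept on a log scale (a product of many small probabilities underflows to zero).

```python
import math

# Training documents per class (classes 0..5), copied from the log above.
class_counts = [30, 185, 139, 66, 69, 157]
total = sum(class_counts)
class_proportion = [count / total for count in class_counts]

# Log-scale scores printed for the first test document (index 854).
log_scores = [-35.34436300647761, -16.431856044032266, -20.267559000416433,
              -22.405433968586664, -27.97121661401147, -18.05089965903481]
# Predict the class whose score is largest.
predicted = max(range(len(log_scores)), key=lambda k: log_scores[k])

# Hypothetical per-word probabilities: the raw product underflows,
# whereas the sum of logarithms stays representable.
word_probs = [1e-5] * 100
product = math.prod(word_probs)
log_sum = sum(math.log(p) for p in word_probs)

print(class_proportion)
print(predicted, product, log_sum)
```

The index of the largest logged score is 1, which agrees with the `predicted_class` value shown for document 854 in the table above.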

 

 

Core code

1. class GaussianNB Found at: sklearn.naive_bayes
2. 
3. class GaussianNB(_BaseNB):
4.     """
5.     Gaussian Naive Bayes (GaussianNB)
6.     
7.     Can perform online updates to model parameters via :meth:`partial_fit`.
8.     For details on algorithm used to update feature means and variance online,
9.     see Stanford CS tech report STAN-CS-79-773 by Chan, Golub, and LeVeque:
10.     
11.     http://i.stanford.edu/pub/cstr/reports/cs/tr/79/773/CS-TR-79-773.pdf
12.     
13.     Read more in the :ref:`User Guide <gaussian_naive_bayes>`.
14.     
15.     Parameters
16.     ----------
17.     priors : array-like of shape (n_classes,)
18.         Prior probabilities of the classes. If specified the priors are not
19.         adjusted according to the data.
20. 
21.     var_smoothing : float, default=1e-9
22.         Portion of the largest variance of all features that is added to
23.         variances for calculation stability.
24. 
25.         .. versionadded:: 0.20
26. 
27.     Attributes
28.     ----------
29.     class_count_ : ndarray of shape (n_classes,)
30.         number of training samples observed in each class.
31. 
32.     class_prior_ : ndarray of shape (n_classes,)
33.         probability of each class.
34. 
35.     classes_ : ndarray of shape (n_classes,)
36.         class labels known to the classifier
37. 
38.     epsilon_ : float
39.         absolute additive value to variances
40. 
41.     sigma_ : ndarray of shape (n_classes, n_features)
42.         variance of each feature per class
43. 
44.     theta_ : ndarray of shape (n_classes, n_features)
45.         mean of each feature per class
46.     
47.     Examples
48.     --------
49.     >>> import numpy as np
50.     >>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
51.     >>> Y = np.array([1, 1, 1, 2, 2, 2])
52.     >>> from sklearn.naive_bayes import GaussianNB
53.     >>> clf = GaussianNB()
54.     >>> clf.fit(X, Y)
55.     GaussianNB()
56.     >>> print(clf.predict([[-0.8, -1]]))
57.     [1]
58.     >>> clf_pf = GaussianNB()
59.     >>> clf_pf.partial_fit(X, Y, np.unique(Y))
60.     GaussianNB()
61.     >>> print(clf_pf.predict([[-0.8, -1]]))
62.     [1]
63.     """
64.     @_deprecate_positional_args
65.     def __init__(self, *, priors=None, var_smoothing=1e-9):
66.         self.priors = priors
67.         self.var_smoothing = var_smoothing
68. 
69.     def fit(self, X, y, sample_weight=None):
70.         """Fit Gaussian Naive Bayes according to X, y
71. 
72.         Parameters
73.         ----------
74.         X : array-like of shape (n_samples, n_features)
75.             Training vectors, where n_samples is the number of samples
76.             and n_features is the number of features.
77. 
78.         y : array-like of shape (n_samples,)
79.             Target values.
80. 
81.         sample_weight : array-like of shape (n_samples,), default=None
82.             Weights applied to individual samples (1. for unweighted).
83. 
84.             .. versionadded:: 0.17
85.                Gaussian Naive Bayes supports fitting with *sample_weight*.
86. 
87.         Returns
88.         -------
89.         self : object
90.         """
91.         X, y = self._validate_data(X, y)
92.         y = column_or_1d(y, warn=True)
93.         return self._partial_fit(X, y, np.unique(y), _refit=True,
94.                                  sample_weight=sample_weight)
95. 
96.     def _check_X(self, X):
97.         return check_array(X)
98. 
99.     @staticmethod
100.     def _update_mean_variance(n_past, mu, var, X, sample_weight=None):
101.         """Compute online update of Gaussian mean and variance.
102. 
103.         Given starting sample count, mean, and variance, a new set of
104.         points X, and optionally sample weights, return the updated mean and
105.         variance. (NB - each dimension (column) in X is treated as independent
106.         -- you get variance, not covariance).
107. 
108.         Can take scalar mean and variance, or vector mean and variance to
109.         simultaneously update a number of independent Gaussians.
110. 
111.         See Stanford CS tech report STAN-CS-79-773 by Chan, Golub, and 
112.          LeVeque:
113. 
114.         http://i.stanford.edu/pub/cstr/reports/cs/tr/79/773/CS-TR-79-773.pdf
115. 
116.         Parameters
117.         ----------
118.         n_past : int
119.             Number of samples represented in old mean and variance. If sample
120.             weights were given, this should contain the sum of sample
121.             weights represented in old mean and variance.
122. 
123.         mu : array-like of shape (number of Gaussians,)
124.             Means for Gaussians in original set.
125. 
126.         var : array-like of shape (number of Gaussians,)
127.             Variances for Gaussians in original set.
128. 
129.         sample_weight : array-like of shape (n_samples,), default=None
130.             Weights applied to individual samples (1. for unweighted).
131. 
132.         Returns
133.         -------
134.         total_mu : array-like of shape (number of Gaussians,)
135.             Updated mean for each Gaussian over the combined set.
136. 
137.         total_var : array-like of shape (number of Gaussians,)
138.             Updated variance for each Gaussian over the combined set.
139.         """
140.         if X.shape[0] == 0:
141.             return mu, var
142.         # Compute (potentially weighted) mean and variance of new datapoints
143.         if sample_weight is not None:
144.             n_new = float(sample_weight.sum())
145.             new_mu = np.average(X, axis=0, weights=sample_weight)
146.             new_var = np.average((X - new_mu) ** 2, axis=0,
147.                                  weights=sample_weight)
148.         else:
149.             n_new = X.shape[0]
150.             new_var = np.var(X, axis=0)
151.             new_mu = np.mean(X, axis=0)
152.         if n_past == 0:
153.             return new_mu, new_var
154.         n_total = float(n_past + n_new)
155.         # Combine mean of old and new data, taking into consideration
156.         # (weighted) number of observations
157.         total_mu = (n_new * new_mu + n_past * mu) / n_total
158.         # Combine variance of old and new data, taking into consideration
159.         # (weighted) number of observations. This is achieved by combining
160.         # the sum-of-squared-differences (ssd)
161.         old_ssd = n_past * var
162.         new_ssd = n_new * new_var
163.         total_ssd = (old_ssd + new_ssd +
164.                      (n_new * n_past / n_total) * (mu - new_mu) ** 2)
165.         total_var = total_ssd / n_total
166.         return total_mu, total_var
167. 
168.     def partial_fit(self, X, y, classes=None, sample_weight=None):
169.         """Incremental fit on a batch of samples.
170. 
171.         This method is expected to be called several times consecutively
172.         on different chunks of a dataset so as to implement out-of-core
173.         or online learning.
174. 
175.         This is especially useful when the whole dataset is too big to fit in
176.         memory at once.
177. 
178.         This method has some performance and numerical stability overhead,
179.         hence it is better to call partial_fit on chunks of data that are
180.         as large as possible (as long as fitting in the memory budget) to
181.         hide the overhead.
182. 
183.         Parameters
184.         ----------
185.         X : array-like of shape (n_samples, n_features)
186.             Training vectors, where n_samples is the number of samples and
187.             n_features is the number of features.
188. 
189.         y : array-like of shape (n_samples,)
190.             Target values.
191. 
192.         classes : array-like of shape (n_classes,), default=None
193.             List of all the classes that can possibly appear in the y vector.
194. 
195.             Must be provided at the first call to partial_fit, can be omitted
196.             in subsequent calls.
197. 
198.         sample_weight : array-like of shape (n_samples,), default=None
199.             Weights applied to individual samples (1. for unweighted).
200. 
201.             .. versionadded:: 0.17
202. 
203.         Returns
204.         -------
205.         self : object
206.         """
207.         return self._partial_fit(X, y, classes, _refit=False,
208.                                  sample_weight=sample_weight)
209. 
210.     def _partial_fit(self, X, y, classes=None, _refit=False,
211.                      sample_weight=None):
212.         """Actual implementation of Gaussian NB fitting.
213. 
214.         Parameters
215.         ----------
216.         X : array-like of shape (n_samples, n_features)
217.             Training vectors, where n_samples is the number of samples and
218.             n_features is the number of features.
219. 
220.         y : array-like of shape (n_samples,)
221.             Target values.
222. 
223.         classes : array-like of shape (n_classes,), default=None
224.             List of all the classes that can possibly appear in the y vector.
225. 
226.             Must be provided at the first call to partial_fit, can be omitted
227.             in subsequent calls.
228. 
229.         _refit : bool, default=False
230.             If true, act as though this were the first time we called
231.             _partial_fit (ie, throw away any past fitting and start over).
232. 
233.         sample_weight : array-like of shape (n_samples,), default=None
234.             Weights applied to individual samples (1. for unweighted).
235. 
236.         Returns
237.         -------
238.         self : object
239.         """
240.         X, y = check_X_y(X, y)
241.         if sample_weight is not None:
242.             sample_weight = _check_sample_weight(sample_weight, X)
243.         # If the ratio of data variance between dimensions is too small, it
244.         # will cause numerical errors. To address this, we artificially
245.         # boost the variance by epsilon, a small fraction of the standard
246.         # deviation of the largest dimension.
247.         self.epsilon_ = self.var_smoothing * np.var(X, axis=0).max()
248.         if _refit:
249.             self.classes_ = None
250.         if _check_partial_fit_first_call(self, classes):
251.             # This is the first call to partial_fit:
252.             # initialize various cumulative counters
253.             n_features = X.shape[1]
254.             n_classes = len(self.classes_)
255.             self.theta_ = np.zeros((n_classes, n_features))
256.             self.sigma_ = np.zeros((n_classes, n_features))
257.             self.class_count_ = np.zeros(n_classes, dtype=np.float64)
258.             # Initialise the class prior
259.             # Take into account the priors
260.             if self.priors is not None:
261.                 priors = np.asarray(self.priors)
262.                 # Check that the provide prior match the number of classes
263.                 if len(priors) != n_classes:
264.                     raise ValueError('Number of priors must match number of'
265.                                      ' classes.')
266.                 # Check that the sum is 1
267.                 if not np.isclose(priors.sum(), 1.0):
268.                     raise ValueError('The sum of the priors should be 1.')
269.                 # Check that the prior are non-negative
270.                 if (priors < 0).any():
271.                     raise ValueError('Priors must be non-negative.')
272.                 self.class_prior_ = priors
273.             else:
274.                 # Initialize the priors to zeros for each class
275.                 self.class_prior_ = np.zeros(len(self.classes_), dtype=np.float64)
276.         else:
277.             if X.shape[1] != self.theta_.shape[1]:
278.                 msg = "Number of features %d does not match previous data %d."
279.                 raise ValueError(msg % (X.shape[1], self.theta_.shape[1]))
280.             # Put epsilon back in each time
281.             self.sigma_[:, :] -= self.epsilon_
282. 
283.         classes = self.classes_
284.         unique_y = np.unique(y)
285.         unique_y_in_classes = np.in1d(unique_y, classes)
286.         if not np.all(unique_y_in_classes):
287.             raise ValueError("The target label(s) %s in y do not exist in the "
288.                              "initial classes %s" %
289.                              (unique_y[~unique_y_in_classes], classes))
290.         for y_i in unique_y:
291.             i = classes.searchsorted(y_i)
292.             X_i = X[y == y_i, :]
293.             if sample_weight is not None:
294.                 sw_i = sample_weight[y == y_i]
295.                 N_i = sw_i.sum()
296.             else:
297.                 sw_i = None
298.                 N_i = X_i.shape[0]
299.             new_theta, new_sigma = self._update_mean_variance(
300.                 self.class_count_[i], self.theta_[i, :], self.sigma_[i, :],
301.                 X_i, sw_i)
302.             self.theta_[i, :] = new_theta
303.             self.sigma_[i, :] = new_sigma
304.             self.class_count_[i] += N_i
305. 
306.         self.sigma_[:, :] += self.epsilon_
307.         # Update if only no priors is provided
308.         if self.priors is None:
309.             # Empirical prior, with sample_weight taken into account
310.             self.class_prior_ = self.class_count_ / self.class_count_.sum()
311.         return self
312. 
313.     def _joint_log_likelihood(self, X):
314.         joint_log_likelihood = []
315.         for i in range(np.size(self.classes_)):
316.             jointi = np.log(self.class_prior_[i])
317.             n_ij = -0.5 * np.sum(np.log(2. * np.pi * self.sigma_[i, :]))
318.             n_ij -= 0.5 * np.sum(((X - self.theta_[i, :]) ** 2) /
319.                                  (self.sigma_[i, :]), 1)
320.             joint_log_likelihood.append(jointi + n_ij)
321. 
322.         joint_log_likelihood = np.array(joint_log_likelihood).T
323.         return joint_log_likelihood
324. 
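
The merge formula used by `_update_mean_variance` above can be checked by hand. The following is a minimal sketch for the scalar, unweighted case, written with the standard library only; the function name `update_mean_variance` and the two sample batches are assumptions made for illustration and are not part of scikit-learn.

```python
from statistics import fmean, pvariance


def update_mean_variance(n_past, mu, var, new_points):
    """Merge a running (count, mean, variance) with a new batch of scalars."""
    n_new = len(new_points)
    if n_new == 0:
        return mu, var
    new_mu = fmean(new_points)
    new_var = pvariance(new_points, new_mu)  # population variance, as np.var
    if n_past == 0:
        return new_mu, new_var
    n_total = n_past + n_new
    total_mu = (n_new * new_mu + n_past * mu) / n_total
    # Combine the sums of squared differences of both parts, plus a
    # correction term for the gap between the two means.
    total_ssd = (n_past * var + n_new * new_var
                 + (n_new * n_past / n_total) * (mu - new_mu) ** 2)
    return total_mu, total_ssd / n_total


# Hypothetical data, fed in two chunks as partial_fit would receive it.
first_batch = [1.0, 2.0, 4.0]
second_batch = [3.0, 5.0, 9.0, 10.0]

mu, var = update_mean_variance(0, 0.0, 0.0, first_batch)
mu, var = update_mean_variance(len(first_batch), mu, var, second_batch)

# Reference values computed on all points at once.
all_points = first_batch + second_batch
batch_mu = fmean(all_points)
batch_var = pvariance(all_points, batch_mu)
print(mu, batch_mu)
print(var, batch_var)
```

Up to floating-point rounding, the incremental result should match the statistics computed on the full data in one pass, which is what allows `partial_fit` to process a dataset chunk by chunk.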
325. 
326. 
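
To make `_joint_log_likelihood` and the role of `var_smoothing` more concrete, here is a rough re-implementation in plain Python, using the toy data from the `GaussianNB` docstring above. It is a sketch under simplifying assumptions (no sample weights, empirical priors, a single query point); the helper names `column_stats` and `joint_log_likelihood` are hypothetical.

```python
import math

# Toy data from the GaussianNB docstring example.
X = [[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]]
Y = [1, 1, 1, 2, 2, 2]
VAR_SMOOTHING = 1e-9


def column_stats(rows):
    """Per-feature mean and population variance of a list of rows."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    variances = [sum((v - m) ** 2 for v in col) / n
                 for col, m in zip(columns, means)]
    return means, variances


# epsilon: a small fraction of the largest per-feature variance.
_, overall_var = column_stats(X)
epsilon = VAR_SMOOTHING * max(overall_var)

# Per-class prior, mean (theta) and smoothed variance (sigma).
params = {}
for c in sorted(set(Y)):
    rows = [x for x, y in zip(X, Y) if y == c]
    theta, sigma = column_stats(rows)
    sigma = [s + epsilon for s in sigma]
    params[c] = (len(rows) / len(X), theta, sigma)


def joint_log_likelihood(x):
    """log P(c) + sum_j log N(x_j; theta_cj, sigma_cj) for every class c."""
    scores = {}
    for c, (prior, theta, sigma) in params.items():
        score = math.log(prior)
        score -= 0.5 * sum(math.log(2.0 * math.pi * s) for s in sigma)
        score -= 0.5 * sum((xi - t) ** 2 / s
                           for xi, t, s in zip(x, theta, sigma))
        scores[c] = score
    return scores


scores = joint_log_likelihood([-0.8, -1])
predicted = max(scores, key=scores.get)
print(scores, predicted)
```

For the query point `[-0.8, -1]` the larger score belongs to class 1, in line with the `[1]` shown in the docstring example.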
327. class MultinomialNB Found at: sklearn.naive_bayes
328. 
329. class MultinomialNB(_BaseDiscreteNB):
330.     """
331.     Naive Bayes classifier for multinomial models
332.     
333.     The multinomial Naive Bayes classifier is suitable for classification with
334.     discrete features (e.g., word counts for text classification). The
335.     multinomial distribution normally requires integer feature counts. However,
336.     in practice, fractional counts such as tf-idf may also work.
337.     
338.     Read more in the :ref:`User Guide <multinomial_naive_bayes>`.
339.     
340.     Parameters
341.     ----------
342.     alpha : float, default=1.0
343.         Additive (Laplace/Lidstone) smoothing parameter
344.         (0 for no smoothing).
345. 
346.     fit_prior : bool, default=True
347.         Whether to learn class prior probabilities or not.
348.         If false, a uniform prior will be used.
349. 
350.     class_prior : array-like of shape (n_classes,), default=None
351.         Prior probabilities of the classes. If specified the priors are not
352.         adjusted according to the data.
353. 
354.     Attributes
355.     ----------
356.     class_count_ : ndarray of shape (n_classes,)
357.         Number of samples encountered for each class during fitting. This
358.         value is weighted by the sample weight when provided.
359. 
360.     class_log_prior_ : ndarray of shape (n_classes, )
361.         Smoothed empirical log probability for each class.
362. 
363.     classes_ : ndarray of shape (n_classes,)
364.         Class labels known to the classifier
365. 
366.     coef_ : ndarray of shape (n_classes, n_features)
367.         Mirrors ``feature_log_prob_`` for interpreting MultinomialNB
368.         as a linear model.
369. 
370.     feature_count_ : ndarray of shape (n_classes, n_features)
371.         Number of samples encountered for each (class, feature)
372.         during fitting. This value is weighted by the sample weight when
373.         provided.
374. 
375.     feature_log_prob_ : ndarray of shape (n_classes, n_features)
376.         Empirical log probability of features
377.         given a class, ``P(x_i|y)``.
378. 
379.     intercept_ : ndarray of shape (n_classes, )
380.         Mirrors ``class_log_prior_`` for interpreting MultinomialNB
381.         as a linear model.
382. 
383.     n_features_ : int
384.         Number of features of each sample.
385.     
386.     Examples
387.     --------
388.     >>> import numpy as np
389.     >>> rng = np.random.RandomState(1)
390.     >>> X = rng.randint(5, size=(6, 100))
391.     >>> y = np.array([1, 2, 3, 4, 5, 6])
392.     >>> from sklearn.naive_bayes import MultinomialNB
393.     >>> clf = MultinomialNB()
394.     >>> clf.fit(X, y)
395.     MultinomialNB()
396.     >>> print(clf.predict(X[2:3]))
397.     [3]
398.     
399.     Notes
400.     -----
401.     For the rationale behind the names `coef_` and `intercept_`, i.e.
402.     naive Bayes as a linear classifier, see J. Rennie et al. (2003),
403.     Tackling the poor assumptions of naive Bayes text classifiers, ICML.
404.     
405.     References
406.     ----------
407.     C.D. Manning, P. Raghavan and H. Schuetze (2008). Introduction to
408.     Information Retrieval. Cambridge University Press, pp. 234-265.
409.     https://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-
410.      classification-1.html
411.     """
    @_deprecate_positional_args
    def __init__(self, *, alpha=1.0, fit_prior=True, class_prior=None):
        self.alpha = alpha
        self.fit_prior = fit_prior
        self.class_prior = class_prior

    def _more_tags(self):
        return {'requires_positive_X': True}

    def _count(self, X, Y):
        """Count and smooth feature occurrences."""
        check_non_negative(X, "MultinomialNB (input X)")
        self.feature_count_ += safe_sparse_dot(Y.T, X)
        self.class_count_ += Y.sum(axis=0)

    def _update_feature_log_prob(self, alpha):
        """Apply smoothing to raw counts and recompute log probabilities"""
        smoothed_fc = self.feature_count_ + alpha
        smoothed_cc = smoothed_fc.sum(axis=1)
        self.feature_log_prob_ = (np.log(smoothed_fc) -
                                  np.log(smoothed_cc.reshape(-1, 1)))

    def _joint_log_likelihood(self, X):
        """Calculate the posterior log probability of the samples X"""
        return (safe_sparse_dot(X, self.feature_log_prob_.T) +
                self.class_log_prior_)
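
The two methods above amount to additive smoothing of per-class feature counts followed by a dot product with the sample's count vector. Below is a small standard-library sketch of that idea, under the assumption of dense list inputs; `fit_multinomial`, `multinomial_joint_log_likelihood`, and the toy counts are hypothetical names and values chosen for illustration and are not part of scikit-learn.

```python
import math


def fit_multinomial(feature_counts, alpha=1.0):
    """Smooth raw counts and return log P(feature j | class c).

    feature_counts[c][j] is the total count of feature j in class c.
    """
    log_probs = []
    for row in feature_counts:
        smoothed = [count + alpha for count in row]
        total = sum(smoothed)  # includes alpha once per feature
        log_probs.append([math.log(s) - math.log(total) for s in smoothed])
    return log_probs


def multinomial_joint_log_likelihood(x, log_probs, class_log_prior):
    """Return sum_j x_j * log P(j | c) + log P(c) for each class c."""
    return [
        sum(xj * lp for xj, lp in zip(x, row)) + prior
        for row, prior in zip(log_probs, class_log_prior)
    ]


# Toy word counts (assumed): two classes, vocabulary of three words.
counts = [[3, 0, 1], [0, 2, 2]]
log_probs = fit_multinomial(counts, alpha=1.0)
class_log_prior = [math.log(0.5), math.log(0.5)]

scores = multinomial_joint_log_likelihood([2, 0, 1], log_probs, class_log_prior)
best = max(range(len(scores)), key=scores.__getitem__)
```

With `alpha=1.0`, a word never seen in a class still receives a small non-zero probability, so a single unseen word cannot drive a class score to negative infinity.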


class BernoulliNB Found at: sklearn.naive_bayes

class BernoulliNB(_BaseDiscreteNB):
    """Naive Bayes classifier for multivariate Bernoulli models.

    Like MultinomialNB, this classifier is suitable for discrete data. The
    difference is that while MultinomialNB works with occurrence counts,
    BernoulliNB is designed for binary/boolean features.

    Read more in the :ref:`User Guide <bernoulli_naive_bayes>`.

    Parameters
    ----------
    alpha : float, default=1.0
        Additive (Laplace/Lidstone) smoothing parameter
        (0 for no smoothing).

    binarize : float or None, default=0.0
        Threshold for binarizing (mapping to booleans) of sample features.
        If None, input is presumed to already consist of binary vectors.

    fit_prior : bool, default=True
        Whether to learn class prior probabilities or not.
        If false, a uniform prior will be used.

    class_prior : array-like of shape (n_classes,), default=None
        Prior probabilities of the classes. If specified the priors are not
        adjusted according to the data.

    Attributes
    ----------
    class_count_ : ndarray of shape (n_classes)
        Number of samples encountered for each class during fitting. This
        value is weighted by the sample weight when provided.

    class_log_prior_ : ndarray of shape (n_classes)
        Log probability of each class (smoothed).

    classes_ : ndarray of shape (n_classes,)
        Class labels known to the classifier

    feature_count_ : ndarray of shape (n_classes, n_features)
        Number of samples encountered for each (class, feature)
        during fitting. This value is weighted by the sample weight when
        provided.

    feature_log_prob_ : ndarray of shape (n_classes, n_features)
        Empirical log probability of features given a class, P(x_i|y).

    n_features_ : int
        Number of features of each sample.

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.RandomState(1)
    >>> X = rng.randint(5, size=(6, 100))
    >>> Y = np.array([1, 2, 3, 4, 4, 5])
    >>> from sklearn.naive_bayes import BernoulliNB
    >>> clf = BernoulliNB()
    >>> clf.fit(X, Y)
    BernoulliNB()
    >>> print(clf.predict(X[2:3]))
    [3]

    References
    ----------
    C.D. Manning, P. Raghavan and H. Schuetze (2008). Introduction to
    Information Retrieval. Cambridge University Press, pp. 234-265.
    https://nlp.stanford.edu/IR-book/html/htmledition/the-bernoulli-model-1.html

    A. McCallum and K. Nigam (1998). A comparison of event models for naive
    Bayes text classification. Proc. AAAI/ICML-98 Workshop on Learning for
    Text Categorization, pp. 41-48.

    V. Metsis, I. Androutsopoulos and G. Paliouras (2006). Spam filtering with
    naive Bayes -- Which naive Bayes? 3rd Conf. on Email and Anti-Spam (CEAS).
    """
    @_deprecate_positional_args
    def __init__(self, *, alpha=1.0, binarize=.0, fit_prior=True,
                 class_prior=None):
        self.alpha = alpha
        self.binarize = binarize
        self.fit_prior = fit_prior
        self.class_prior = class_prior

    def _check_X(self, X):
        X = super()._check_X(X)
        if self.binarize is not None:
            X = binarize(X, threshold=self.binarize)
        return X

    def _check_X_y(self, X, y):
        X, y = super()._check_X_y(X, y)
        if self.binarize is not None:
            X = binarize(X, threshold=self.binarize)
        return X, y

    def _count(self, X, Y):
        """Count and smooth feature occurrences."""
        self.feature_count_ += safe_sparse_dot(Y.T, X)
        self.class_count_ += Y.sum(axis=0)

    def _update_feature_log_prob(self, alpha):
        """Apply smoothing to raw counts and recompute log probabilities"""
        smoothed_fc = self.feature_count_ + alpha
        smoothed_cc = self.class_count_ + alpha * 2
        self.feature_log_prob_ = (np.log(smoothed_fc) -
                                  np.log(smoothed_cc.reshape(-1, 1)))

    def _joint_log_likelihood(self, X):
        """Calculate the posterior log probability of the samples X"""
        n_classes, n_features = self.feature_log_prob_.shape
        n_samples, n_features_X = X.shape
        if n_features_X != n_features:
            raise ValueError(
                "Expected input with %d features, got %d instead" %
                (n_features, n_features_X))
        neg_prob = np.log(1 - np.exp(self.feature_log_prob_))
        # Compute  neg_prob · (1 - X).T  as  ∑neg_prob - X · neg_prob
        jll = safe_sparse_dot(X, (self.feature_log_prob_ - neg_prob).T)
        jll += self.class_log_prior_ + neg_prob.sum(axis=1)
        return jll
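
The Bernoulli `_joint_log_likelihood` above avoids building `1 - X` explicitly by rewriting the sum over absent features. The sketch below, written with the standard library only, checks that the direct and rewritten forms agree for one class and one binary sample; the helper names (`smoothed_prob`, `bernoulli_direct`, `bernoulli_rewritten`) and the toy counts are assumptions for illustration, not scikit-learn API.

```python
import math


def smoothed_prob(feature_count, class_count, alpha=1.0):
    """P(feature present | class) with additive smoothing.

    The denominator adds 2 * alpha because each feature has two outcomes
    (present or absent).
    """
    return (feature_count + alpha) / (class_count + 2.0 * alpha)


def bernoulli_direct(x, p):
    """Sum log p_j for present features and log(1 - p_j) for absent ones."""
    return sum(
        math.log(pj) if xj else math.log(1.0 - pj) for xj, pj in zip(x, p)
    )


def bernoulli_rewritten(x, p):
    """Same quantity as sum_j log(1 - p_j) + sum_j x_j * (log p_j - log(1 - p_j))."""
    neg = [math.log(1.0 - pj) for pj in p]
    return sum(neg) + sum(
        xj * (math.log(pj) - nj) for xj, pj, nj in zip(x, p, neg)
    )


# Toy class (assumed): 4 documents; the three words appear in 3, 0 and 2 of them.
p = [smoothed_prob(c, 4) for c in (3, 0, 2)]
x = [1, 0, 1]  # binary sample: word 0 and word 2 present

direct = bernoulli_direct(x, p)
rewritten = bernoulli_rewritten(x, p)
```

The rewritten form is convenient for sparse input, since only the non-zero entries of the sample take part in the dot product while the `sum(neg)` term depends on the class alone.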

 
