This section is excerpted from Chapter 2, Section 2.10 of the 异步社区 edition of Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data, by EMC Education Services (USA).
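2.10 Exercises
1. In which phase would the team expect to spend the most time? Why? In which phase would the team expect to spend the least time?
2. What are the benefits of running a pilot project before rolling out a new analytical method at full scale?
3. What kinds of tools might be used in the following phases, and for which kinds of use scenarios?
a. Phase 2: Data preparation
b. Phase 4: Model building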