Today I fired up a big-data VM cluster that had sat unused for years and ran into some puzzling problems. For example, launching spark-shell threw an error:
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
...
Caused by: java.lang.reflect.InvocationTargetException
...
Caused by: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
...
Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "BONECP" plugin to create a ConnectionPool gave an error : The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
...
Caused by: org.datanucleus.store.rdbms.connectionpool.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
...
<console>:14: error: not found: value spark
       import spark.implicits._
              ^
<console>:14: error: not found: value spark
       import spark.sql
The key line is this one:
The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
What is the MySQL driver?
You probably already have an intuition for what a driver is: for Windows to use a webcam, you may first have to install a webcam driver. The MySQL driver is the same idea for code. If you want a program to connect to MySQL, you need a driver, and in the Java world that driver is just a jar (the JDBC driver). Both Spark SQL and Hive (which we will briefly study next) use MySQL, and both need this driver on the classpath when they connect.
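To make the "the driver is just a jar on the classpath" point concrete, here is a minimal sketch of the kind of classpath lookup the framework performs at startup. The class name `com.mysql.jdbc.Driver` is the one from the error above; the helper method itself is hypothetical, not part of any library:

```java
public class DriverCheck {
    // Returns true if the given JDBC driver class can be loaded from the
    // classpath. This mirrors what the metastore's connection-pool code
    // effectively does before it can open a connection to MySQL.
    static boolean driverOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // On a stock JDK without the MySQL connector jar, this prints
        // that the driver is missing, matching the Spark error above.
        System.out.println(driverOnClasspath("com.mysql.jdbc.Driver")
                ? "driver found" : "driver NOT on classpath");
    }
}
```

If the MySQL connector jar is not on the classpath, `Class.forName` throws, and that is exactly the failure DataNucleus reports in the stack trace.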
Cause: the error means Spark could not find this driver on its classpath.
How to approach it: frameworks like Hadoop, Hive, Spark, and ZooKeeper all load their jars and configuration files from fairly fixed directories at startup. Look inside each framework's installation path and you should be able to find that directory.
The fix: copy the MySQL driver jar from Hive's jar directory into the directory Spark loads jars from by default, then restart, and the problem should go away.
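The fix above might look like this on a typical install. The paths and the jar name are assumptions (set `HIVE_HOME` and `SPARK_HOME` to your own layout, and the connector jar's version will differ):

```shell
# Locate the MySQL driver jar inside Hive's lib directory
ls "$HIVE_HOME/lib" | grep -i mysql

# Copy it into the directory Spark loads jars from by default
# (jars/ on Spark 2.x and later; lib/ on Spark 1.x)
cp "$HIVE_HOME"/lib/mysql-connector-java-*.jar "$SPARK_HOME/jars/"

# Restart spark-shell; the metastore connection should now come up
spark-shell
```

Alternatively, instead of copying the jar you could pass it per-session with `spark-shell --jars /path/to/mysql-connector-java-<version>.jar`, but copying it into Spark's jar directory fixes it for every session.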