Problem description
When Sqoop imports a MySQL TINYINT(1) column into a Hive column of type tinyint, the values show up as NULL.
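Cause
When Sqoop extracts data to Hive or HDFS, it automatically converts columns of type tinyint(1) to boolean. The extracted data therefore keeps only two states (0 and 1) instead of the original values, and a Hive column declared as tinyint can end up reading them as NULL. The conversion happens because, by default, the MySQL JDBC connector maps tinyint(1) to java.sql.Types.BIT, which Sqoop in turn maps to boolean by default.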
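Solution
Add tinyInt1isBit=false to the MySQL JDBC connection URL. For example: jdbc:mysql://pc-uf6ehtk5iia47i303.rwlb.rds.aliyuncs.com:3306/micro_user?tinyInt1isBit=false
Note: multiple parameters are joined with &. When the command runs in a shell script, the & must be escaped (for example as \&) or the whole URL must be quoted, otherwise the shell treats & as the background operator. An example URL with two parameters: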
jdbc:mysql://14.21.xx.21:51x3x/${database}?zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false
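A minimal sketch of passing such a URL to sqoop import from a shell script. The host, port, table, user, password-file path, and Hive table name are hypothetical placeholders, not values from the original setup. The point it illustrates is the quoting: inside double quotes, ${database} still expands while & stays a literal character.

```shell
#!/bin/sh
# Hypothetical placeholders: replace host, port, database, table, user, and paths.
database="micro_user"

# Double quotes let ${database} expand while keeping & literal,
# so the shell does not treat it as the background operator.
jdbc_url="jdbc:mysql://localhost:3306/${database}?zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false"

if command -v sqoop >/dev/null 2>&1; then
  sqoop import \
    --connect "${jdbc_url}" \
    --username "example_user" \
    --password-file "/user/example/mysql.password" \
    --table "example_table" \
    --hive-import \
    --hive-table "example_db.example_table" \
    --num-mappers 1
else
  # No Sqoop on this machine: just show the URL that would be used.
  echo "sqoop not found; JDBC URL would be: ${jdbc_url}"
fi
```

If the URL is written without quotes, each & would need a backslash in front of it instead.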
Official documentation
https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_mysql_import_of_tinyint_1_from_mysql_behaves_strangely
27.2.5. MySQL: Import of TINYINT(1) from MySQL behaves strangely
Problem: Sqoop is treating TINYINT(1) columns as booleans, which is for example causing issues with HIVE import. This is because by default the MySQL JDBC connector maps the TINYINT(1) to java.sql.Types.BIT, which Sqoop by default maps to Boolean.
Solution: A more clean solution is to force MySQL JDBC Connector to stop converting TINYINT(1) to java.sql.Types.BIT by adding tinyInt1isBit=false into your JDBC path (to create something like jdbc:mysql://localhost/test?tinyInt1isBit=false). Another solution would be to explicitly override the column mapping for the datatype TINYINT(1) column. For example, if the column name is foo, then pass the following option to Sqoop during import: --map-column-hive foo=tinyint. In the case of non-Hive imports to HDFS, use --map-column-java foo=integer.
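The second approach in the quoted documentation, overriding the column mapping, might look like the sketch below for a Hive import. The column name foo and the URL jdbc:mysql://localhost/test come from the quoted text; the table, user, and password-file path are hypothetical placeholders.

```shell
#!/bin/sh
# "foo" is the example column name from the quoted docs; other values are hypothetical.
jdbc_url="jdbc:mysql://localhost/test"
hive_mapping="foo=tinyint"

if command -v sqoop >/dev/null 2>&1; then
  sqoop import \
    --connect "${jdbc_url}" \
    --username "example_user" \
    --password-file "/user/example/mysql.password" \
    --table "example_table" \
    --hive-import \
    --map-column-hive "${hive_mapping}"
else
  # No Sqoop on this machine: just show the override that would be passed.
  echo "sqoop not found; would pass: --map-column-hive ${hive_mapping}"
fi
```

This variant leaves the JDBC URL untouched, which can be convenient when the connection string is shared with other jobs, but the mapping has to be repeated for every affected column.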