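DataWorks: I uploaded the package, but it cannot parse Excel files. I tried CSV and text files and both work, and I have installed the package. What is wrong?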
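It just keeps adding a "tar" to it. I tried both options and the result is the same. Now I can extract the archive and import the package, but parsing the Excel file raises an error:
File "", line 18, in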
  excelData = pd.read_excel('a.xlsx')
File "/home/tops/lib/python3.7/site-packages/pandas/io/excel/_base.py", line 304, in read_excel
  io = ExcelFile(io, engine=engine)
File "/home/tops/lib/python3.7/site-packages/pandas/io/excel/_base.py", line 824, in __init__
  self._reader = self._engines[engine](self._io)
File "/home/tops/lib/python3.7/site-packages/pandas/io/excel/_xlrd.py", line 21, in __init__
  super().__init__(filepath_or_buffer)
File "/home/tops/lib/python3.7/site-packages/pandas/io/excel/_base.py", line 353, in __init__
  self.book = self.load_workbook(filepath_or_buffer)
File "/home/tops/lib/python3.7/site-packages/pandas/io/excel/_xlrd.py", line 36, in load_workbook
  return open_workbook(filepath_or_buffer)
File "/home/admin/alisatasknode/taskinfo/20230908/datastudio/13/48/05/a7vn7osd4tfj85n6fqwy793d/packages/xlrd/__init__.py", line 166, in open_workbook
  file_format = inspect_format(filename, file_contents)
File "/home/admin/alisatasknode/taskinfo/20230908/datastudio/13/48/05/a7vn7osd4tfj85n6fqwy793d/packages/xlrd/__init__.py", line 67, in inspect_format
  zf = zipfile.ZipFile(timemachine.BYTES_IO(content) if content else path)
File "/home/tops/lib/python3.7/zipfile.py", line 1222, in __init__
  self._RealGetContents()
File "/home/tops/lib/python3.7/zipfile.py", line 1307, in _RealGetContents
  fp.seek(self.start_dir, 0)
OSError: [Errno 22] Invalid argument
It looks fine here, but once it is downloaded into the sandbox environment it has an extra "tar" on it.
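The traceback ends inside zipfile, which suggests xlrd treated a.xlsx as a ZIP container and then could not read its structure. One way to narrow this down before calling pd.read_excel is to inspect the file directly. The sketch below is only an illustration: the function name sniff_format and its return labels are hypothetical, and it assumes the file is reachable by a local path inside the task's working directory.

```python
import tarfile
import zipfile

ZIP_MAGIC = b"PK\x03\x04"                            # start of a ZIP file (.xlsx is a ZIP)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"     # start of a legacy .xls file


def sniff_format(path):
    """Return a rough label for the on-disk format of `path` (hypothetical helper)."""
    with open(path, "rb") as fh:
        head = fh.read(8)

    if head.startswith(ZIP_MAGIC):
        # The signature alone is not enough: try to read the archive directory.
        try:
            with zipfile.ZipFile(path) as zf:
                zf.namelist()
            return "zip"
        except (zipfile.BadZipFile, OSError):
            return "corrupt-zip"

    if head.startswith(OLE2_MAGIC):
        return "ole2"

    # A file that was wrapped or renamed as a tar archive ends up here.
    if tarfile.is_tarfile(path):
        return "tar"

    return "unknown"
```

For example, print(sniff_format('a.xlsx')) in the PyODPS 3 node: "corrupt-zip" would point at a damaged or truncated upload, while "tar" would point at the file having been wrapped in an archive.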
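Running an operating-system command from Python shows that this file inside the container has an extra "tar" in its name. Create a PyODPS 3 task and try this:
import os
import sys
os.system('ls -trl')
All the other files are normal; only this tar file is off. DataWorks probably recognizes it as a tar-format file and renames it.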
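The same check can be done without shelling out, which also makes it easy to test whether the renamed entry is still a readable tar archive. This is a minimal sketch under assumptions: list_workdir and unwrap_if_tar are hypothetical helper names, and the guess that the renamed file is a tar archive that can simply be extracted is not confirmed anywhere in this thread.

```python
import os
import tarfile


def list_workdir(directory="."):
    """Return (name, size_in_bytes) pairs, oldest first, similar to `ls -trl`."""
    entries = []
    for name in os.listdir(directory):
        st = os.stat(os.path.join(directory, name))
        entries.append((st.st_mtime, name, st.st_size))
    entries.sort()
    return [(name, size) for _, name, size in entries]


def unwrap_if_tar(path, dest_dir):
    """If `path` is a tar archive, extract its regular files into `dest_dir`.

    Returns the list of extracted paths, or [path] when it is not a tar archive.
    """
    if not tarfile.is_tarfile(path):
        return [path]

    extracted = []
    with tarfile.open(path) as tf:
        for member in tf.getmembers():
            parts = member.name.split("/")
            # Skip directories, links, and names that would escape dest_dir.
            if not member.isfile() or os.path.isabs(member.name) or ".." in parts:
                continue
            tf.extract(member, dest_dir)
            extracted.append(os.path.join(dest_dir, member.name))
    return extracted
```

Printing list_workdir() shows the exact names and sizes the task sees, and unwrap_if_tar can then be pointed at whichever entry carries the unexpected "tar" in its name.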
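Did you select File as the resource type?
I tried it and could not reproduce the problem. Could the Python output have been post-processed? Run list resources; and take a look.
Create an ODPS SQL task to run it. As for the Excel parsing problem, if it is an issue with the PyODPS script ...
This answer was compiled from the DingTalk group "DataWorks交流群(答疑@机器人)".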