MaxCompute question: error when writing a DataFrame to a table from a DataWorks Python 3 script

I have a question about MaxCompute big data computing: in a DataWorks Python 3 script, writing pandas DataFrame data to a table raises an error.
from odps import options
import pandas as pd

t = o.get_table('tmp_malei_test_df')

records = [[2, 'aaa'],
           [222, 'bbb']]
df = pd.DataFrame(records, columns=["mer_id", "tag_obj_type"])

with t.open_writer(partition='pt_tag_source=malei', create_partition=True) as writer:
    writer.write(df)
Executing user script with PyODPS 0.10.7
Traceback (most recent call last):
  File "", line 52, in
    writer.write(df)
  File "/home/tops/lib/python3.7/site-packages/odps/models/table.py", line 666, in write
    raise ValueError('Unsupported record type.')
ValueError: Unsupported record type.

Asked by 真的很搞笑, 2023-07-30 15:37:16

2 answers

Star时光
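This error usually occurs because the PyODPS version in DataWorks does not support writing a pandas DataFrame directly to a MaxCompute table through the default writer. There are two ways to resolve it:
1. Use the PyODPS DataFrame `persist` method: in PyODPS 0.10.7, the writer returned by `open_writer` does not accept a pandas DataFrame in its default record mode. Instead, you can wrap the pandas DataFrame in a PyODPS DataFrame (`odps.df.DataFrame`) and call `persist` to write the data to the MaxCompute table. In your code, replace the `with` block with the following (assuming the installed PyODPS version supports these `persist` options):
```
from odps.df import DataFrame

# Wrap the pandas DataFrame in a PyODPS DataFrame and append it to the partition.
DataFrame(df).persist('your_table_name', partition='pt_tag_source=malei',
                      create_partition=True, overwrite=False, odps=o)
```
In the code above, replace 'your_table_name' with the name of the target table you want to write to.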
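2. Convert the DataFrame to rows: if the first method is unavailable or does not fit your case, you can iterate over the DataFrame as plain Python tuples and write the data row by row with the writer's `write` method. For example: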
```
with t.open_writer(partition='pt_tag_source=malei', create_partition=True) as writer:
    for row in df.itertuples(index=False):
        writer.write(row)
```
This iterates over each row of the DataFrame and writes it with `writer.write`.
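
Writing one row per call works, but for larger frames it may be worth grouping rows first, since `writer.write` also accepts a list of row lists (as with the `records` list in the question). A minimal sketch, assuming a hypothetical helper named `write_in_batches` with an arbitrary default `batch_size`; neither is part of PyODPS:

```python
from itertools import islice


def write_in_batches(writer, rows, batch_size=1000):
    """Write an iterable of rows to `writer` in lists of at most `batch_size` rows.

    `writer` is any object with a `write(list_of_rows)` method, such as the
    one returned by `t.open_writer(...)`. Hypothetical helper, not a PyODPS API.
    Returns the number of rows written.
    """
    iterator = iter(rows)
    total = 0
    while True:
        # Convert each row (tuple, namedtuple, ...) to a plain list.
        batch = [list(row) for row in islice(iterator, batch_size)]
        if not batch:
            break
        writer.write(batch)
        total += len(batch)
    return total
```

Inside the `with` block it could be called as `write_in_batches(writer, df.itertuples(index=False, name=None))`, where `name=None` makes pandas yield plain tuples.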

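Note that both methods require the column order and data types of your DataFrame to match those of the MaxCompute table. Otherwise you may run into other errors during the write.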
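
To illustrate the column-order point, here is a small sketch that reorders plain rows into the order the table expects before they are written. The function name `align_rows` is an assumption made for this example; with a pandas DataFrame, selecting columns in the target order (`df[target_columns]`) achieves the same result.

```python
def align_rows(rows, source_columns, target_columns):
    """Reorder each row from `source_columns` order into `target_columns` order.

    Raises ValueError if a target column is missing from the source.
    Hypothetical helper for illustration only.
    """
    missing = [c for c in target_columns if c not in source_columns]
    if missing:
        raise ValueError("Missing columns: %s" % ", ".join(missing))
    # Position of each target column within a source row.
    positions = [source_columns.index(c) for c in target_columns]
    return [[row[i] for i in positions] for row in rows]
```

For the table in the question, `target_columns` would be `["mer_id", "tag_obj_type"]`, the table's non-partition columns in schema order.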

I hope these solutions help. If you have any other questions, feel free to ask.
2023-07-31 18:29:11

芯在这

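I tested this on my side.
The `Unsupported record type.` error occurs because the call `with t.open_writer(partition='pt_tag_source=malei', create_partition=True) as writer:` is missing the `arrow=True` parameter.

Without `arrow=True`, writing `records` directly to the table is supported, but once the data is converted to a DataFrame it is not.

There are two options: try adding the `arrow=True` parameter, or write to the table without using a DataFrame. This answer was compiled from the DingTalk group "MaxCompute开发者社区2群".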
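
The first option can be sketched as below. The wrapper name `write_dataframe` is an assumption for illustration; the `arrow=True` argument is taken from this answer and is assumed to be supported by the PyODPS version in use.

```python
def write_dataframe(table, df, partition):
    """Write a pandas DataFrame to `table` through an Arrow-mode writer.

    Assumes the installed PyODPS accepts `arrow=True` in `open_writer`,
    as reported in this answer. Hypothetical wrapper, not a PyODPS API.
    """
    with table.open_writer(partition=partition, create_partition=True,
                           arrow=True) as writer:
        # In Arrow mode the writer is expected to take the DataFrame as a whole.
        writer.write(df)
```

With the objects from the question, this would be called as `write_dataframe(t, df, 'pt_tag_source=malei')`.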

2023-07-30 16:08:12
