In Python code, the JSON returned by the SDK function ocr_api_20210707_models.RecognizeHandwritingRequest() uses single quotes around its fields, which does not conform to the JSON specification. How can this OCR problem be solved? "errorMessage": "the JSON object must be str, bytes or bytearray, not RecognizeHandwritingResponse", {'headers': {'date': 'Sat, 03 Jun 2023 00:57:09 GMT', 'content-type': 'application/json;charset=utf-8', 'content-length': '5861', 'connection': 'keep-alive', 'vary': 'Accept-Encoding', 'access-control-allow-origin': '', 'access-control-expose-headers': '', 'x-acs-request-id': '1115FB4E-863E-583F-A1C6-325F47AB1FFB', 'x-acs-trace-id': '493f13c7aad82948daa1737481a34e53'}, 'statusCode': 200, 'body': {'Data': '{"algo_version":"2b125695688a89ff0d971fe9e7be5b4d6aac4c92","content":"杭州大成学校信纸 i“
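Python's JSON parsing library can only parse strings that conform to the JSON specification: keys and string values must be enclosed in double quotes. If the JSON string returned by the SDK function uses single quotes for its keys and strings, you can use Python's string replace function to convert the single quotes to double quotes, and then pass the result to the JSON parser.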
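For example, suppose the JSON string returned by the SDK function is response_str:
response_str = "{'key1': 'value1', 'key2': 'value2'}"
You can use the string replace function to convert the single quotes to double quotes:
response_str = response_str.replace("'", '"')
Then pass the replaced string to the JSON parser:
import json
response_dict = json.loads(response_str)
This parses the JSON string returned by the SDK function into a Python dictionary.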
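Blanket quote replacement works for the flat example above, but it can break when a value itself contains quotes, as the nested 'Data' string in the question does. The following is a minimal sketch, assuming the single-quoted text is the printed form of a Python dict; the sample string and variable names are hypothetical. It uses ast.literal_eval to read the Python literal and json.dumps to produce double-quoted JSON.

```python
import ast
import json

# Hypothetical sample: the printed form (repr) of a Python dict whose
# 'Data' value is itself a JSON string containing double quotes.
printed = "{'statusCode': 200, 'body': {'Data': '{\"content\": \"it is\"}'}}"

# Blanket replacement also rewrites the quotes that delimit the inner
# JSON string, so the result is no longer valid JSON.
try:
    json.loads(printed.replace("'", '"'))
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

# Parsing the text as a Python literal keeps the nesting intact.
as_dict = ast.literal_eval(printed)
inner = json.loads(as_dict["body"]["Data"])  # the nested JSON document
valid_json = json.dumps(as_dict)             # double-quoted JSON text
```

If the data is already a dict in memory, json.dumps alone is enough and no string surgery is needed.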
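You can use the loads() function from Python's built-in json library to convert the string into a JSON object. For example:
import json
# RecognizeHandwritingRequest() only builds the request; the response comes from the client call.
# Assumption: `client` is an initialized OCR client, as in the SDK sample code.
request = ocr_api_20210707_models.RecognizeHandwritingRequest()
response = client.recognize_handwriting(request)
json_str = response.body.data  # the Data field, itself a JSON string
json_obj = json.loads(json_str)
This converts the returned JSON string into a JSON object that you can process further. Note that if the returned JSON string contains single quotes instead of double quotes, you need to replace the single quotes with double quotes before converting it, so that it conforms to the JSON specification. You can use the replace() function to do this, for example:
json_str = json_str.replace("'", "\"")
This replaces the single quotes with double quotes, after which loads() can convert the string into a JSON object.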
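The error message quoted in the question states that json.loads received a RecognizeHandwritingResponse object rather than a string. The sketch below reproduces that failure and the fix with hypothetical stand-in classes (FakeBody, FakeResponse) in place of the real SDK types; the body.data layout is an assumption based on the dict printed in the question.

```python
import json
from dataclasses import dataclass


# Hypothetical stand-ins for the SDK response types; only the shape matters.
@dataclass
class FakeBody:
    data: str  # the 'Data' field: a JSON document encoded as a string


@dataclass
class FakeResponse:
    status_code: int
    body: FakeBody


response = FakeResponse(200, FakeBody('{"content": "hello", "height": 1280}'))

# Passing the object itself raises the same kind of TypeError as in the question.
try:
    json.loads(response)
    error_text = ""
except TypeError as exc:
    error_text = str(exc)

# Passing the string field works without any quote replacement.
parsed = json.loads(response.body.data)
```

In this sketch the nested string is already valid JSON, so only the argument passed to json.loads needs to change.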
I tested this locally and the returned data is fine; please check your code against the SDK sample code. {"headers": {"date": "Mon, 05 Jun 2023 06:24:42 GMT", "content-type": "application/json;charset=utf-8", "content-length": "5819", "connection": "keep-alive", "keep-alive": "timeout=25", "vary": "Accept-Encoding", "access-control-allow-origin": "", "access-control-expose-headers": "", "x-acs-request-id": "7519D8D9-1901-5D47-A2A5-924A96C0B928", "x-acs-trace-id": "a30682a433677aca2ac09ce0335ed25f"}, "statusCode": 200, "body": {"Data": "{"algo_version":"2b125695688a89ff0d971fe9e7be5b4d6aac4c92","content":"\u592a\u539f\u5927\u6210\u5b66\u6821\u4fe1\u7eb8 inchude<stdio.h> intmainl) { intdis[10]\uff0cbak[10]\uff0ci\uff0ck\uff0cnm\uff0cu[10]\uff0cV[10]\uff0cw[10]\uff0ccheck\uff0cflag\uff1b intinf=99999999\uff1b scanfl\u201c%d%od\u201d\uff0cn\uff0c8(m)i forlizl\uff1bi=m\uff1bi+t) scanf(\"%d%d%d\"\uff0cSXu[i]\u533av[i]\uff0cXw[i])\uff1b for(i=1\uff1bt=n\uff1bi+t) dis[i]=inf\uff1b dis[1]=0\uff1b for(k=\u5341ik\u2190=n-1\uff1bk+t) check=0\uff1b forli=1\uff1bc=mii+t) 1 if(dis[v[i]]>dis[uli]]tw[i]) { dis[v[i]]=dis[u[illtw[i]\uff1b check=l\uff1b \u4ea7 (2)\u7b2c \u9875 
","height":1280,"orgHeight":1280,"orgWidth":960,"prism_version":"1.0.9","prism_wnum":23,"prism_wordsInfo":[{"angle":-89,"direction":0,"height":518,"pos":[{"x":182,"y":81},{"x":701,"y":81},{"x":701,"y":121},{"x":182,"y":121}],"prob":95,"width":40,"word":"\u592a\u539f\u5927\u6210\u5b66\u6821\u4fe1\u7eb8","x":421,"y":-158},{"angle":0,"direction":0,"height":29,"pos":[{"x":88,"y":156},{"x":307,"y":154},{"x":307,"y":184},{"x":89,"y":185}],"prob":99,"width":218,"word":"inchude<stdio.h>","x":88,"y":154},{"angle":-2,"direction":0,"height":27,"pos":[{"x":84,"y":215},{"x":211,"y":208},{"x":213,"y":235},{"x":85,"y":242}],"prob":96,"width":127,"word":"intmainl)","x":85,"y":210},{"angle":-90,"direction":0,"height":23,"pos":[{"x":82,"y":256},{"x":106,"y":256},{"x":106,"y":285},{"x":82,"y":285}],"prob":98,"width":29,"word":"{","x":80,"y":258},{"angle":0,"direction":0,"height":31,"pos":[{"x":110,"y":298},{"x":769,"y":294},{"x":769,"y":327},{"x":110,"y":330}],"prob":95,"width":660,"word":"intdis[10]\uff0cbak[10]\uff0ci\uff0ck\uff0cnm\uff0cu[10]\uff0cV[10]\uff0cw[10]\uff0ccheck\uff0cflag\uff1b","x":109,"y":296},{"angle":0,"direction":0,"height":30,"pos":[{"x":109,"y":345},{"x":329,"y":342},{"x":329,"y":372},{"x":110,"y":375}],"prob":99,"width":220,"word":"intinf=99999999\uff1b","x":109,"y":342},{"angle":-90,"direction":0,"height":297,"pos":[{"x":112,"y":389},{"x":409,"y":387},{"x":409,"y":418},{"x":112,"y":420}],"prob":90,"width":32,"word":"scanfl\u201c%d%od\u201d\uff0cn\uff0c8(m)i","x":245,"y":255},{"angle":-88,"direction":0,"height":198,"pos":[{"x":111,"y":436},{"x":309,"y":440},{"x":309,"y":468},{"x":111,"y":464}
This answer was compiled from the DingTalk group “阿里云读光OCR客户交流反馈群 2” (Alibaba Cloud Duguang OCR customer feedback group 2).
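The dump in this answer uses double quotes and \uXXXX escapes, which matches what json.dumps emits by default for a dict, whereas printing a dict directly gives the single-quoted form shown in the question. A short sketch of the difference, using a hypothetical sample dict with the same shape as the response:

```python
import json

# Hypothetical sample with the same shape as the response dump.
result = {"statusCode": 200, "body": {"Data": "太原"}}

printed = str(result)                              # Python repr, single quotes
escaped = json.dumps(result)                       # JSON, non-ASCII as \uXXXX
readable = json.dumps(result, ensure_ascii=False)  # JSON, characters kept
```

Only the two json.dumps results are valid JSON; the str() form is meant for display and should not be fed back into json.loads.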
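Because the JSON string returned by the SDK function uses single quotes instead of double quotes, json.loads cannot parse it and raises the error above. To work around this, replace the quotes in the string first and then use the loads method of Python's built-in json library to convert it into a Python dictionary. For example: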
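import json
from alibabacloud_ocr_api20210707 import models
# Assumption: `client` is an initialized OCR client, as in the SDK sample code.
# The request object carries no result; the response comes from the client call.
response = client.recognize_handwriting(models.RecognizeHandwritingRequest(...))
ocr_result = response.body.data  # the Data field, a JSON string
# Replace all single quotes with double quotes, then parse into a Python dict with loads
result_dict = json.loads(ocr_result.replace("'", "\""))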
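In this example, the replace method converts every single quote in the string to a double quote so that it conforms to the JSON specification, and json.loads then converts the result into a Python dictionary.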