Problem solved: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcf in position 0: invalid continuation byte


Problem scenario


1. Reading a file with Pandas fails with UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcf in position 0: invalid continuation byte


Traceback (most recent call last):
  File "D:\workspace_py\pandas_learning\venv\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "D:\workspace_py\pandas_learning\venv\lib\site-packages\pandas\io\parsers\readers.py", line 586, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "D:\workspace_py\pandas_learning\venv\lib\site-packages\pandas\io\parsers\readers.py", line 482, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "D:\workspace_py\pandas_learning\venv\lib\site-packages\pandas\io\parsers\readers.py", line 811, in __init__
    self._engine = self._make_engine(self.engine)
  File "D:\workspace_py\pandas_learning\venv\lib\site-packages\pandas\io\parsers\readers.py", line 1040, in _make_engine
    return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
  File "D:\workspace_py\pandas_learning\venv\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 69, in __init__
    self._reader = parsers.TextReader(self.handles.handle, **kwds)
  File "pandas\_libs\parsers.pyx", line 542, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas\_libs\parsers.pyx", line 734, in pandas._libs.parsers.TextReader._get_header
  File "pandas\_libs\parsers.pyx", line 843, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas\_libs\parsers.pyx", line 1917, in pandas._libs.parsers.raise_parser_error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcf in position 0: invalid continuation byte


2. The current code looks like this:

info = pd.read_csv("xxx.csv", delimiter=",", encoding="utf-8", names=["xxx","xxx"])


Solution


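This problem is actually easy to fix:

Method 1: change the encoding argument so that it matches the actual encoding of your CSV file.

Method 2: convert the CSV file to the encoding you want, so that it matches the code.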

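So how do you check the encoding of a CSV file?

Right-click the file, open it with Notepad++, and look at the bottom-right corner of the status bar, where the file's encoding is displayed (GB2312 for the file in this example).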
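If Notepad++ is not at hand, the encoding can also be narrowed down in Python. The following is only a heuristic sketch: `guess_encoding`, `guess_file_encoding`, and the candidate list are hypothetical names chosen for illustration and are not part of pandas. Each candidate is tried with strict decoding and the first one that succeeds is returned. `gb18030` is listed because it is a superset of GB2312 and GBK.

```python
from pathlib import Path


def guess_encoding(raw, candidates=("utf-8", "gb18030")):
    """Return the first candidate that decodes `raw` without error, else None."""
    for name in candidates:
        try:
            raw.decode(name)  # strict decoding raises on any invalid byte sequence
        except UnicodeDecodeError:
            continue
        return name
    return None


def guess_file_encoding(path, candidates=("utf-8", "gb18030")):
    """Apply guess_encoding to the raw bytes of a file (hypothetical helper)."""
    return guess_encoding(Path(path).read_bytes(), candidates)
```

The returned name can then be passed as the encoding argument of pd.read_csv. A successful decode does not prove the guess is correct, so it is still worth checking that the resulting text looks right.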


Update the code:

info = pd.read_csv("xxx.csv", delimiter=",", encoding="gb2312", names=["xxx","xxx"])


With this change the error no longer occurs.
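To see why the encoding argument matters, the error can be reproduced on a few raw bytes without any file. This is a small illustration, not the author's code: the sample bytes and the helper `try_decode` are assumptions made up for the example. The bytes start with 0xcf, as in the traceback. In UTF-8, 0xcf announces a two-byte sequence whose second byte must lie in the range 0x80 to 0xbf, and 0xee does not, so strict UTF-8 decoding fails at position 0.

```python
# Four bytes forming two GB2312-encoded characters (sample data, assumed for illustration).
raw = b"\xcf\xee\xc4\xbf"


def try_decode(data, encoding):
    """Decode `data`, or return a short description of the decoding error."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"{type(exc).__name__}: {exc.reason}"


utf8_result = try_decode(raw, "utf-8")      # fails: 0xee is not a continuation byte
gb2312_result = try_decode(raw, "gb2312")   # succeeds: two characters
```

The same bytes are valid under gb2312, which is why changing only the encoding argument is enough to make read_csv succeed.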

Alternatively, you can convert the CSV file itself to another encoding, for example UTF-8, and keep encoding="utf-8" in the code.
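The conversion can also be scripted instead of done in an editor. The sketch below assumes the source file is in a GB-family encoding. The function name `convert_to_utf8` and the default `src_encoding="gb18030"` are hypothetical choices, so adjust them to whatever encoding your file really uses.

```python
from pathlib import Path


def convert_to_utf8(src, dst, src_encoding="gb18030"):
    """Re-encode the text file `src` as UTF-8 and write it to `dst`.

    `src_encoding` is an assumption; pass the file's real encoding.
    Bytes are read and written directly so line endings are left untouched.
    """
    text = Path(src).read_bytes().decode(src_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

After converting, the original call with encoding="utf-8" can read the new file unchanged. Writing to a separate destination path keeps the original file available in case the assumed source encoding turns out to be wrong.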
