Contents
- Problem scenario
- Solution
Problem scenario
1. Reading a file with Pandas fails with UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcf in position 0: invalid continuation byte
Traceback (most recent call last):
File "D:\workspace_py\pandas_learning\venv\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "D:\workspace_py\pandas_learning\venv\lib\site-packages\pandas\io\parsers\readers.py", line 586, in read_csv
return _read(filepath_or_buffer, kwds)
File "D:\workspace_py\pandas_learning\venv\lib\site-packages\pandas\io\parsers\readers.py", line 482, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "D:\workspace_py\pandas_learning\venv\lib\site-packages\pandas\io\parsers\readers.py", line 811, in __init__
self._engine = self._make_engine(self.engine)
File "D:\workspace_py\pandas_learning\venv\lib\site-packages\pandas\io\parsers\readers.py", line 1040, in _make_engine
return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
File "D:\workspace_py\pandas_learning\venv\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 69, in __init__
self._reader = parsers.TextReader(self.handles.handle, **kwds)
File "pandas\_libs\parsers.pyx", line 542, in pandas._libs.parsers.TextReader.__cinit__
File "pandas\_libs\parsers.pyx", line 734, in pandas._libs.parsers.TextReader._get_header
File "pandas\_libs\parsers.pyx", line 843, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas\_libs\parsers.pyx", line 1917, in pandas._libs.parsers.raise_parser_error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcf in position 0: invalid continuation byte
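The same message can be reproduced without pandas, which helps show what it means. The sketch below is only an illustration: the byte 0xCF comes from the traceback, but the second byte 0xD6 is an assumed stand-in, since the traceback does not show what actually followed it in the file.

```python
# 0xCF is the byte named in the traceback; 0xD6 is a hypothetical second
# byte chosen for illustration (the real file content is not shown).
raw = b"\xcf\xd6"

try:
    raw.decode("utf-8")
    reason = None
except UnicodeDecodeError as exc:
    # In UTF-8, 0xCF opens a two-byte sequence, so the next byte must have
    # the bit pattern 10xxxxxx. 0xD6 is 11010110, so decoding fails.
    reason = exc.reason

print(reason)

# Under GB2312 the same two bytes form a single valid character.
text = raw.decode("gb2312")
print(len(text))
```

In other words, the file is probably not corrupt; it is just being read with the wrong codec.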
2. The current code looks like this:
import pandas as pd
info = pd.read_csv("xxx.csv", delimiter=",", encoding="utf-8", names=["xxx","xxx"])
Solution
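This problem is actually easy to fix:
Method 1: change the encoding argument to match the encoding your CSV file is really saved in.
Method 2: convert the CSV file to the encoding you want, so that it matches the code.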
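So how do you check a CSV file's encoding?
Right-click the file and open it with Notepad++.
Then look at the bottom-right corner of the window, where the status bar shows the file's encoding.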
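If Notepad++ is not available, the encoding can also be guessed in Python by trial decoding. This is a rough sketch, not a reliable detector: guess_encoding is a hypothetical helper, the candidate list is an assumption suited to Chinese-language files, and the first codec that happens to decode the bytes wins even if it is not the one the file was written with.

```python
from pathlib import Path


def guess_encoding(path, candidates=("utf-8", "gb2312", "gbk", "gb18030")):
    """Return the first candidate that decodes the whole file, or None.

    Hypothetical helper: the candidate list and its order are assumptions.
    UTF-8 goes first because it is strict and rarely accepts other encodings.
    """
    data = Path(path).read_bytes()
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # this codec cannot read the file, try the next one
        return name
    return None
```

The returned name can then be passed on as the encoding argument, for example pd.read_csv("xxx.csv", encoding=guess_encoding("xxx.csv")), after checking that it is not None.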
Modify the code to use the encoding shown there (GB2312 in this case):
info = pd.read_csv("xxx.csv", delimiter=",", encoding="gb2312", names=["xxx","xxx"])
With this change the error no longer occurs.
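Alternatively, you can convert the CSV file itself to another encoding, such as UTF-8. In Notepad++, choose Encoding > Convert to UTF-8 and save the file; the original encoding="utf-8" call then works unchanged.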
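The conversion can also be scripted rather than done by hand. The following is a minimal sketch, assuming the source file really is GB2312; convert_to_utf8 and its parameters are hypothetical names introduced for illustration.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="gb2312"):
    """Re-encode a text file as UTF-8 and write it to a new path.

    Hypothetical helper: source_encoding must match the file's real
    encoding, otherwise decoding raises UnicodeDecodeError.
    """
    # Work on raw bytes so that line endings are preserved exactly.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

Writing to a separate destination path keeps the original file intact in case the assumed source encoding turns out to be wrong.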