Reputation: 930
pandas
I cannot read the following text file with pandas:
NothGrassland Meteor Sites
MTCLIM v4.3 OUTPUT FILE : Mon Jun 26 16:57:31 2017
year yday Tmax Tmin Tday prcp VPD srad daylen
(deg C) (deg C) (deg C) (cm) (Pa) (W m-2) (s)
1961 1 -24.08 -36.19 -27.41 0.00 36.81 128.45 28460
1961 2 -16.08 -29.79 -19.85 0.02 75.12 135.12 28524
1961 3 -16.08 -26.19 -18.86 0.05 65.86 118.79 28594
1961 4 -23.58 -33.29 -26.25 0.00 34.87 116.98 28668
1961 5 -24.28 -37.49 -27.91 0.00 37.27 163.75 28748
1961 6 -20.68 -33.19 -24.12 0.01 49.79 133.63 28832
1961 7 -19.48 -31.29 -22.73 0.18 53.78 131.91 28922
I read the file with the following code:
df=pd.read_csv(file,sep=' ',header=0,skiprows=[0,1,3])
It raises the following error:
runfile('C:/temp/python/Models/GSI.py', wdir='C:/temp/python')
Traceback (most recent call last):
File "<ipython-input-115-7bbdd08f49f8>", line 1, in <module>
runfile('C:/temp/python/Models/GSI.py', wdir='C:/temp/python')
File "C:\Program Files\Winpython\python-3.6.1.amd64\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile
execfile(filename, namespace)
File "C:\Program Files\Winpython\python-3.6.1.amd64\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/temp/python/Models/GSI.py", line 23, in <module>
df=pd.read_csv(file,header=0,sep=' ')
File "C:\Program Files\Winpython\python-3.6.1.amd64\lib\site-packages\pandas\io\parsers.py", line 646, in parser_f
return _read(filepath_or_buffer, kwds)
File "C:\Program Files\Winpython\python-3.6.1.amd64\lib\site-packages\pandas\io\parsers.py", line 401, in _read
data = parser.read()
File "C:\Program Files\Winpython\python-3.6.1.amd64\lib\site-packages\pandas\io\parsers.py", line 939, in read
ret = self._engine.read(nrows)
File "C:\Program Files\Winpython\python-3.6.1.amd64\lib\site-packages\pandas\io\parsers.py", line 1508, in read
data = self._reader.read(nrows)
File "pandas\parser.pyx", line 848, in pandas.parser.TextReader.read (pandas\parser.c:10415)
File "pandas\parser.pyx", line 870, in pandas.parser.TextReader._read_low_memory (pandas\parser.c:10691)
File "pandas\parser.pyx", line 924, in pandas.parser.TextReader._read_rows (pandas\parser.c:11437)
File "pandas\parser.pyx", line 911, in pandas.parser.TextReader._tokenize_rows (pandas\parser.c:11308)
File "pandas\parser.pyx", line 2024, in pandas.parser.raise_parser_error (pandas\parser.c:27037)
CParserError: Error tokenizing data. C error: Expected 10 fields in line 3, saw 34
If I remove sep=' ' and read the file as follows:
df=pd.read_csv(file,header=None,skiprows=4)
the code runs.
Upvotes: 1
Views: 57
Reputation: 862921
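The columns are separated by runs of whitespace rather than single spaces, so sep=' ' produces extra empty fields. For me, sep="\s+" or delim_whitespace=True works (note that delim_whitespace is deprecated in recent pandas versions in favor of sep="\s+"):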
import pandas as pd
from io import StringIO  # pandas.compat.StringIO is no longer available in newer pandas versions
temp=u"""NothGrassland Meteor Sites
MTCLIM v4.3 OUTPUT FILE : Mon Jun 26 16:57:31 2017
year yday Tmax Tmin Tday prcp VPD srad daylen
(deg C) (deg C) (deg C) (cm) (Pa) (W m-2) (s)
1961 1 -24.08 -36.19 -27.41 0.00 36.81 128.45 28460
1961 2 -16.08 -29.79 -19.85 0.02 75.12 135.12 28524
1961 3 -16.08 -26.19 -18.86 0.05 65.86 118.79 28594
1961 4 -23.58 -33.29 -26.25 0.00 34.87 116.98 28668
1961 5 -24.28 -37.49 -27.91 0.00 37.27 163.75 28748
1961 6 -20.68 -33.19 -24.12 0.01 49.79 133.63 28832
1961 7 -19.48 -31.29 -22.73 0.18 53.78 131.91 28922"""
# after testing, replace StringIO(temp) with 'filename.csv'
# rows 0, 1 and 3 (title, timestamp and units) are skipped; row 2 becomes the header
df = pd.read_csv(StringIO(temp), sep=r"\s+", skiprows=[0,1,3], header=0)
print(df)
   year  yday   Tmax   Tmin   Tday  prcp    VPD    srad  daylen
0  1961     1 -24.08 -36.19 -27.41  0.00  36.81  128.45   28460
1  1961     2 -16.08 -29.79 -19.85  0.02  75.12  135.12   28524
2  1961     3 -16.08 -26.19 -18.86  0.05  65.86  118.79   28594
3  1961     4 -23.58 -33.29 -26.25  0.00  34.87  116.98   28668
4  1961     5 -24.28 -37.49 -27.91  0.00  37.27  163.75   28748
5  1961     6 -20.68 -33.19 -24.12  0.01  49.79  133.63   28832
6  1961     7 -19.48 -31.29 -22.73  0.18  53.78  131.91   28922
Alternatively, with delim_whitespace=True:
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), delim_whitespace=True, skiprows=[0,1,3], header=0)
print(df)
   year  yday   Tmax   Tmin   Tday  prcp    VPD    srad  daylen
0  1961     1 -24.08 -36.19 -27.41  0.00  36.81  128.45   28460
1  1961     2 -16.08 -29.79 -19.85  0.02  75.12  135.12   28524
2  1961     3 -16.08 -26.19 -18.86  0.05  65.86  118.79   28594
3  1961     4 -23.58 -33.29 -26.25  0.00  34.87  116.98   28668
4  1961     5 -24.28 -37.49 -27.91  0.00  37.27  163.75   28748
5  1961     6 -20.68 -33.19 -24.12  0.01  49.79  133.63   28832
6  1961     7 -19.48 -31.29 -22.73  0.18  53.78  131.91   28922
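To see why sep=' ' reports far more fields than expected, here is a minimal sketch using only the standard library. The sample line and its three-space padding are assumptions for illustration (the real file's spacing may differ), and the two splits only approximate what pandas does internally for sep=' ' and sep="\s+":

```python
import re

# Hypothetical row: four values padded with three spaces each
# (the actual spacing in the MTCLIM file is an assumption here).
line = (" " * 3).join(["1961", "1", "-24.08", "-36.19"])

# Splitting on a single space: every extra space creates an empty field,
# which is roughly what happens with sep=' '.
single_space_fields = line.split(" ")

# Splitting on runs of whitespace: one field per value,
# which is roughly what sep=r"\s+" does.
whitespace_fields = re.split(r"\s+", line.strip())

print(len(single_space_fields), single_space_fields)
print(len(whitespace_fields), whitespace_fields)
```

The first split returns many empty strings between the values, which is consistent with the "Expected 10 fields in line 3, saw 34" message in the question.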
Upvotes: 2