Reputation: 9109
I have this code, where wine.csv
(I added the names of columns) is data from here:
import pandas
data = pandas.read_csv("wine.csv")
df = pandas.DataFrame(data, columns=['W', 'E', 'R', 'T', 'Y', 'U', 'I', 'O', 'P', 'A', 'S', 'D', 'F'])
print df
But it gives only NaN
s.
Do you have any advice on how to solve this problem?
Upvotes: 0
Views: 301
Reputation: 862406
I think you can use read_csv
with url
address and parameter names
for columns names
:
import pandas
df = pandas.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data", names=['W', 'E', 'R', 'T', 'Y', 'U', 'I', 'O', 'P', 'A', 'S', 'D', 'F'])
print df.head()
W E R T Y U I O P A S D \
1 14.23 1.71 2.43 15.6 127 2.80 3.06 0.28 2.29 5.64 1.04 3.92
1 13.20 1.78 2.14 11.2 100 2.65 2.76 0.26 1.28 4.38 1.05 3.40
1 13.16 2.36 2.67 18.6 101 2.80 3.24 0.30 2.81 5.68 1.03 3.17
1 14.37 1.95 2.50 16.8 113 3.85 3.49 0.24 2.18 7.80 0.86 3.45
1 13.24 2.59 2.87 21.0 118 2.80 2.69 0.39 1.82 4.32 1.04 2.93
F
1 1065
1 1050
1 1185
1 1480
1 735
If you need columns from E
to D
, use parameter usecols
as filter of columns:
import pandas
df = pandas.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data",
header=0,
index_col=0,
names=['IDX', 'W', 'E', 'R', 'T', 'Y', 'U', 'I', 'O', 'P', 'A', 'S', 'D', 'F'],
usecols=['IDX', 'E', 'R', 'T', 'Y', 'U', 'I', 'O', 'P', 'A', 'S', 'D', 'F'])
print df.head()
E R T Y U I O P A S D F
IDX
1 1.78 2.14 11.2 100 2.65 2.76 0.26 1.28 4.38 1.05 3.40 1050
1 2.36 2.67 18.6 101 2.80 3.24 0.30 2.81 5.68 1.03 3.17 1185
1 1.95 2.50 16.8 113 3.85 3.49 0.24 2.18 7.80 0.86 3.45 1480
1 2.59 2.87 21.0 118 2.80 2.69 0.39 1.82 4.32 1.04 2.93 735
1 1.76 2.45 15.2 112 3.27 3.39 0.34 1.97 6.75 1.05 2.85 1450
EDIT:
If you dont want use first column as index, use index_col=None
and remove index column IDX
from list of columns in usecols
:
import pandas
df = pandas.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data",
header=0,
index_col=None,
names=['IDX', 'W', 'E', 'R', 'T', 'Y', 'U', 'I', 'O', 'P', 'A', 'S', 'D', 'F'],
usecols=[ 'E', 'R', 'T', 'Y', 'U', 'I', 'O', 'P', 'A', 'S', 'D', 'F'])
print df.head()
E R T Y U I O P A S D F
0 1.78 2.14 11.2 100 2.65 2.76 0.26 1.28 4.38 1.05 3.40 1050
1 2.36 2.67 18.6 101 2.80 3.24 0.30 2.81 5.68 1.03 3.17 1185
2 1.95 2.50 16.8 113 3.85 3.49 0.24 2.18 7.80 0.86 3.45 1480
3 2.59 2.87 21.0 118 2.80 2.69 0.39 1.82 4.32 1.04 2.93 735
4 1.76 2.45 15.2 112 3.27 3.39 0.34 1.97 6.75 1.05 2.85 1450
Upvotes: 3
Reputation: 8164
I think you are looking for this
import pandas
df = pandas.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data", names=['W', 'E', 'R', 'T', 'Y', 'U', 'I', 'O', 'P', 'A', 'S', 'D', 'F'])
s=df[['E', 'R', 'T', 'Y', 'U', 'I', 'O', 'P', 'A', 'S', 'D', 'F']]
print s
I got
E R T Y U I O P A S D F
1 1.71 2.43 15.6 127 2.80 3.06 0.28 2.29 5.640000 1.04 3.92 1065
1 1.78 2.14 11.2 100 2.65 2.76 0.26 1.28 4.380000 1.05 3.40 1050
1 2.36 2.67 18.6 101 2.80 3.24 0.30 2.81 5.680000 1.03 3.17 1185
1 1.95 2.50 16.8 113 3.85 3.49 0.24 2.18 7.800000 0.86 3.45 1480
1 2.59 2.87 21.0 118 2.80 2.69 0.39 1.82 4.320000 1.04 2.93 735
1 1.76 2.45 15.2 112 3.27 3.39 0.34 1.97 6.750000 1.05 2.85 1450
1 1.87 2.45 14.6 96 2.50 2.52 0.30 1.98 5.250000 1.02 3.58 1290
1 2.15 2.61 17.6 121 2.60 2.51 0.31 1.25 5.050000 1.06 3.58 1295
1 1.64 2.17 14.0 97 2.80 2.98 0.29 1.98 5.200000 1.08 2.85 1045
Upvotes: 1