user1149518

Reputation: 101

Pandas Dataframe related problem. Getting syntax error

I am stuck on one line while following this guide: https://app.pluralsight.com/guides/machine-learning-concepts-python-and-scikit-learn

import pandas as pd
wine_data_frame = pd.DataFrame(data=wine_data['data'], columns=wine_data['feature_names'])
wine_data_frame['class'] = wine_data['target']

wine_classes = [wine_data_frame[wine_data_frame['class'] == x for x in range(3)]

I get an "invalid syntax" error on the wine_classes line and have tried to fix it without any luck. Can anybody help?

Upvotes: 0

Views: 1038

Answers (3)

Scott Boston

Reputation: 153510

The line is missing a closing bracket after x; the code on that site has a typo.

wine_classes = [wine_data_frame[wine_data_frame['class'] == x] for x in range(3)]
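One way to confirm the diagnosis without loading any data is to hand both versions of the line to Python's built-in compile(), which parses the source without running it. This is only a sketch; the helper name parses_ok is made up for illustration.

```python
# Check whether a line of source parses, without executing it.
# `parses_ok` is a hypothetical helper, not part of pandas or the guide.
def parses_ok(source: str) -> bool:
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False

# The line from the question: the inner [ ... ] is never closed.
broken = "wine_classes = [wine_data_frame[wine_data_frame['class'] == x for x in range(3)]"
# The corrected line: the filter is closed with ] before the `for`.
fixed = "wine_classes = [wine_data_frame[wine_data_frame['class'] == x] for x in range(3)]"

print(parses_ok(broken))
print(parses_ok(fixed))
```

Counting brackets gives the same answer: the broken line opens three square brackets before `== x` but closes only two by the end of the line.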

Complete code:

import pandas as pd
from sklearn.datasets import load_wine

wine_data = load_wine()
wine_data_frame = pd.DataFrame(data=wine_data['data'], columns=wine_data['feature_names'])
wine_data_frame['class'] = wine_data['target']

wine_classes = [wine_data_frame[wine_data_frame['class'] == x] for x in range(3)]

testing_data = []
for wine_class in wine_classes:
    row = wine_class.sample()
    testing_data.append(row)
    wine_data_frame = wine_data_frame.drop(row.index)

Output (testing_data):

[   alcohol  malic_acid   ash  alcalinity_of_ash  magnesium  total_phenols  flavanoids  nonflavanoid_phenols  proanthocyanins  color_intensity   hue  od280/od315_of_diluted_wines  proline  class
2    13.16        2.36  2.67               18.6      101.0            2.8        3.24                   0.3             2.81             5.68  1.03                          3.17   1185.0      0,     alcohol  malic_acid   ash  alcalinity_of_ash  magnesium  total_phenols  flavanoids  nonflavanoid_phenols  proanthocyanins  color_intensity   hue  od280/od315_of_diluted_wines  proline  class
91     12.0        1.51  2.42               22.0       86.0           1.45        1.25                   0.5             1.63              3.6  1.05                          2.65    450.0      1,      alcohol  malic_acid   ash  alcalinity_of_ash  magnesium  total_phenols  flavanoids  nonflavanoid_phenols  proanthocyanins  color_intensity   hue  od280/od315_of_diluted_wines  proline  class
150     13.5        3.12  2.62               24.0      123.0            1.4        1.57                  0.22             1.25              8.6  0.59                           1.3    500.0      2]

and wine_data_frame:

     alcohol  malic_acid   ash  alcalinity_of_ash  magnesium  total_phenols  flavanoids  nonflavanoid_phenols  proanthocyanins  color_intensity   hue  od280/od315_of_diluted_wines  proline  class
0      14.23        1.71  2.43               15.6      127.0           2.80        3.06                  0.28             2.29             5.64  1.04                          3.92   1065.0      0
1      13.20        1.78  2.14               11.2      100.0           2.65        2.76                  0.26             1.28             4.38  1.05                          3.40   1050.0      0
3      14.37        1.95  2.50               16.8      113.0           3.85        3.49                  0.24             2.18             7.80  0.86                          3.45   1480.0      0
4      13.24        2.59  2.87               21.0      118.0           2.80        2.69                  0.39             1.82             4.32  1.04                          2.93    735.0      0
5      14.20        1.76  2.45               15.2      112.0           3.27        3.39                  0.34             1.97             6.75  1.05                          2.85   1450.0      0
..       ...         ...   ...                ...        ...            ...         ...                   ...              ...              ...   ...                           ...      ...    ...
173    13.71        5.65  2.45               20.5       95.0           1.68        0.61                  0.52             1.06             7.70  0.64                          1.74    740.0      2
174    13.40        3.91  2.48               23.0      102.0           1.80        0.75                  0.43             1.41             7.30  0.70                          1.56    750.0      2
175    13.27        4.28  2.26               20.0      120.0           1.59        0.69                  0.43             1.35            10.20  0.59                          1.56    835.0      2
176    13.17        2.59  2.37               20.0      120.0           1.65        0.68                  0.53             1.46             9.30  0.60                          1.62    840.0      2
177    14.13        4.10  2.74               24.5       96.0           2.05        0.76                  0.56             1.35             9.20  0.61                          1.60    560.0      2

[175 rows x 14 columns]

Upvotes: 1

fmarm

Reputation: 4284

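First, add a closing bracket after == x. Second, you can use .loc for the boolean filter. It is not required, since plain bracket indexing with a boolean mask selects the same rows, but .loc makes it explicit that you are selecting rows.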

wine_classes = [wine_data_frame.loc[wine_data_frame['class'] == x] for x in range(3)]

This gives you a list of three DataFrames, one per class.
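To see how the .loc form compares with plain bracket indexing, here is a small sketch on a toy frame. The toy DataFrame and its values are invented for illustration and only stand in for wine_data_frame.

```python
import pandas as pd

# Toy stand-in for wine_data_frame; the values are invented for illustration.
toy = pd.DataFrame({"alcohol": [14.2, 12.0, 13.5, 13.2], "class": [0, 1, 2, 0]})

# Boolean mask: True for the rows whose class is 0.
mask = toy["class"] == 0

plain = toy[mask]          # boolean indexing directly on the frame
with_loc = toy.loc[mask]   # the same row selection through .loc

# Both selections contain the same rows and keep the original index labels.
print(plain.equals(with_loc))
print(list(plain.index))
```

With a boolean mask the two forms select the same rows, so the choice between them here is mostly a matter of style.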

Upvotes: 1

Searching_answers

Reputation: 11

I am assuming wine_data is already in a DataFrame named wine_data_frame, with the labels in a column called target.

wine_data_frame = wine_data_frame.rename(columns = {'target':'class'})

wine_classes = wine_data_frame[wine_data_frame['class'].isin(range(3))]
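Note that the isin filter returns a single DataFrame containing every row whose class is 0, 1 or 2, not one DataFrame per class. If a per-class list is what you want, groupby is one possible alternative. The sketch below uses an invented toy frame in place of wine_data_frame.

```python
import pandas as pd

# Toy stand-in for wine_data_frame; the values are invented for illustration.
toy = pd.DataFrame({"alcohol": [14.2, 12.0, 13.5, 13.2], "class": [0, 1, 2, 0]})

# isin keeps every row whose class is in {0, 1, 2}: one frame, not three.
filtered = toy[toy["class"].isin(range(3))]

# groupby yields (key, sub-frame) pairs, one per distinct class value,
# sorted by key by default.
per_class = [group for _, group in toy.groupby("class")]

print(len(filtered))
print(len(per_class), [len(g) for g in per_class])
```

Here per_class plays the same role as the list built by the comprehension over range(3), except that groupby only produces frames for class values that actually occur in the data.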

Upvotes: 0
