Reputation: 425
I'm trying to filter a large CSV file that does not contain any headers. I would like to get a second DataFrame containing only the rows where the value in the last column is positive.
Here is what I'm trying:
input_data = pd.read_csv(infile, delimiter=',').values
print(input_data.shape) # (832650, 200)
pos_data = input_data.iloc[:, 199] > 0
The last line gives the error: AttributeError: 'numpy.ndarray' object has no attribute 'iloc'
I'm on 0.24.1 of pandas and 1.16.1 of numpy.
Thank you
EDIT: Removing .values gets rid of the error, but I still can't filter the DataFrame.
input_data = pd.read_csv(infile, delimiter=',')
print(input_data.shape) # (832650, 200)
pos_data = input_data.iloc[:, -1] > 0
print(pos_data.shape) # (832650,)
Upvotes: 3
Views: 5641
Reputation: 862851
Use boolean indexing:
input_data = pd.read_csv(infile, header=None)  # the file has no header row
df = input_data[input_data.iloc[:, -1] > 0]
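To see why the attempt in the EDIT stops one step short, here is a minimal sketch on a tiny in-memory CSV. The sample data and the names csv_text and mask are made up for illustration and stand in for the real file. The comparison only produces a boolean Series with one entry per row; passing that Series back into the DataFrame is what actually selects the rows.

```python
import io

import pandas as pd

# Hypothetical stand-in for the headerless file from the question
csv_text = "1,2,-3\n4,5,6\n7,8,0\n9,10,2.5\n"

# header=None keeps the first line as data instead of using it as column names
input_data = pd.read_csv(io.StringIO(csv_text), header=None)

# Boolean Series, one True/False per row (this is what the EDIT computes)
mask = input_data.iloc[:, -1] > 0

# Indexing the DataFrame with the mask keeps only the rows where it is True
pos_data = input_data[mask]

print(mask.shape)      # (4,)
print(pos_data.shape)  # (2, 3)
```

The filtered frame keeps the original row labels (here 1 and 3); calling reset_index(drop=True) on it renumbers them from 0 if that matters downstream.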
Upvotes: 4