Windy71

Reputation: 909

Pythonic way to iterate over a DataFrame when a column value in a row matches a condition

I want to plot a density graph, which needs two values from the data as coordinates and a third as the density value. I have a text file that I read into a DataFrame; whenever one column's value in a row matches a condition, I want to save two other values from that row. My code works for one of the conditions, but it saves the row values in a nested list. Is there a more Pythonic way to do this? I think it would make plotting easier later.

Data:

Accum   EdgeThr NumberOfBlobs   durationMin Vol Perom   X   Y
50  0   0   0.03    0   0   0   0
50  2   0   0.03    0   0   0   0
50  4   0   0.03    0   0   0   0
50  6   0   0.03    0   0   0   0
50  8   2   0.03    27.833133599054975  0.0 1032.0  928.0
50  10  2   0.03    27.833133599054975  0.0 1032.0  928.0
46  30  2   0.17    27.833133599054975  0.0 968.0   962.0
46  32  2   0.17    27.833133599054975  0.0 1028.0  1020.0
46  34  2   0.17    27.833133599054975  0.0 978.0   1122.0
46  36  2   0.17    27.833133599054975  0.0 1000.0  1080.0
46  38  2   0.18    27.833133599054975  0.0 1010.0  1062.0

Code:

import pandas as pd

# load data as a pandas dataframe
df = pd.read_csv('dummy.txt', sep='\t', lineterminator='\r')

# to find the rows matching one condition ==2
blob2 = []
for index, row in df.iterrows():
    temp = [row['Accum'], row['EdgeThr']]
    if row['NumberOfBlobs']==2:
        blob2.append(temp)
        print(index, row['Accum'], row['EdgeThr'], row['NumberOfBlobs'])
print(blob2)

Upvotes: 0

Views: 50

Answers (1)

JohanC

Reputation: 80329

df[df['NumberOfBlobs'] == 2] selects all rows that fulfill the condition, without an explicit loop.

df[df['NumberOfBlobs'] == 2][['Accum', 'EdgeThr']] then keeps only those two columns of the matching rows.

Here is an example:

import pandas as pd
import numpy as np

N = 10
df = pd.DataFrame({'Accum': np.random.randint(40, 50, N), 'EdgeThr': np.random.randint(0, 50, N),
                   'NumberOfBlobs': np.random.randint(0, 2, N) * 2})
blobs2 = df[df['NumberOfBlobs'] == 2][['Accum', 'EdgeThr']]

Example of df:

   Accum  EdgeThr  NumberOfBlobs
0     42       44              2
1     47       32              0
2     45        9              2
3     48       15              2
4     44        6              0
5     42       24              0
6     46       20              0
7     46        9              0
8     40       36              0
9     41        3              0

blobs2:

   Accum  EdgeThr
0     42       44
2     45        9
3     48       15

Upvotes: 1
