Reputation: 1495
I'm trying to develop a function that checks the length of a VIN and returns an output.
To test it, I have it print the VIN if its length is 17. However, it prints everything out, and I'm not sure what I'm doing wrong.
Code example below.
import pandas as pd

# initialization
df = pd.DataFrame(columns = ["vin"], data = ['LHJLC79U58B001633','SZC84294845693987','LFGTCKPA665700387','L8YTCKPV49Y010001',
                                             'LJ4TCBPV27Y010217','LFGTCKPM481006270','LFGTCKPM581004253','LTBPN8J00DC003107',
                                             '1A9LPEER3FC596536','1A9LREAR5FC596814','1A9LKEER2GC596611','1A9L0EAH9C596099',
                                             '22A000018'])
df['manufacturer'] = ['A','A','A','A','B','B','B','B','B','C','C','D','D']

# develop function
def check_vin(df):
    if len(df['vin'][1]) == 17:
        print(df['vin'])
    else:
        print('nogo')

# test the function
for index, row in df.iterrows():
    check_vin(df)
Upvotes: 1
Views: 44
Reputation: 164663
Don't iterate over rows manually. You can assign a new column using conditional logic:
import numpy as np
df['check'] = np.where(df['vin'].str.len().eq(17), df['vin'], 'nogo')
print(df)
                  vin  manufacturer              check
0   LHJLC79U58B001633             A  LHJLC79U58B001633
1   SZC84294845693987             A  SZC84294845693987
2   LFGTCKPA665700387             A  LFGTCKPA665700387
3   L8YTCKPV49Y010001             A  L8YTCKPV49Y010001
4   LJ4TCBPV27Y010217             B  LJ4TCBPV27Y010217
5   LFGTCKPM481006270             B  LFGTCKPM481006270
6   LFGTCKPM581004253             B  LFGTCKPM581004253
7   LTBPN8J00DC003107             B  LTBPN8J00DC003107
8   1A9LPEER3FC596536             B  1A9LPEER3FC596536
9   1A9LREAR5FC596814             C  1A9LREAR5FC596814
10  1A9LKEER2GC596611             C  1A9LKEER2GC596611
11   1A9L0EAH9C596099             D               nogo
12          22A000018             D               nogo
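For anyone who wants to run this in isolation, here is a minimal self-contained sketch of the same np.where idea. The three-row frame is a shortened sample taken from the question's data, and the intermediate name is_valid is introduced here only for readability.

```python
import numpy as np
import pandas as pd

# Shortened sample: one 17-character VIN, one 16-character, one 9-character
df = pd.DataFrame({"vin": ["LHJLC79U58B001633", "1A9L0EAH9C596099", "22A000018"]})

# Vectorized length check: True where the VIN has exactly 17 characters
is_valid = df["vin"].str.len().eq(17)

# Keep the VIN where the check passes, otherwise use the placeholder 'nogo'
df["check"] = np.where(is_valid, df["vin"], "nogo")
print(df)
```

Because the comparison runs on the whole column at once, no explicit loop over rows is needed.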
Upvotes: 1
Reputation: 6091
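That's because you're checking the length of the same element on every iteration: df['vin'][1] is always the second VIN, so the condition never changes and print(df['vin']) prints the whole column each time.
Change the loop to pass the current row: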
for index, row in df.iterrows():
    check_vin(row)
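And change the function to check the VIN of the row it receives:
def check_vin(r):
    if len(r.vin) == 17:
        print(r.vin)  # print only the VIN, not the whole row
    else:
        print('nogo')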
Output
LHJLC79U58B001633
SZC84294845693987
LFGTCKPA665700387
L8YTCKPV49Y010001
LJ4TCBPV27Y010217
LFGTCKPM481006270
LFGTCKPM581004253
LTBPN8J00DC003107
1A9LPEER3FC596536
1A9LREAR5FC596814
1A9LKEER2GC596611
nogo
nogo
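As a rough sketch of the same fix without printing inside the function: the variant below takes a single VIN string and returns either the VIN or 'nogo', so the results can be collected and checked. The returning signature and the shortened sample list are assumptions for illustration, not part of the original code.

```python
import pandas as pd

def check_vin(vin):
    """Return the VIN if it has exactly 17 characters, otherwise 'nogo'."""
    return vin if len(vin) == 17 else "nogo"

# Shortened sample taken from the question's data
df = pd.DataFrame({"vin": ["LHJLC79U58B001633", "1A9L0EAH9C596099", "22A000018"]})

# Each call receives a different VIN, so the condition is re-evaluated per row
results = [check_vin(v) for v in df["vin"]]
for r in results:
    print(r)
```

The key point is the same as above: the function must look at the value for the current iteration rather than a fixed position in the column.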
Upvotes: 1
Reputation: 3634
You could simply select the rows whose VIN has length 17, as follows:
condition = (df["vin"].str.len() == 17)
print(df[condition])
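A small runnable sketch of this filtering approach, which also shows that the same boolean mask can be inverted with ~ to list the rows that fail the check. The three-row frame is a shortened sample from the question, and the names valid and invalid are introduced here for illustration.

```python
import pandas as pd

# Shortened sample taken from the question's data
df = pd.DataFrame({
    "vin": ["LHJLC79U58B001633", "1A9L0EAH9C596099", "22A000018"],
    "manufacturer": ["A", "D", "D"],
})

# Boolean mask: True for rows whose VIN has exactly 17 characters
condition = df["vin"].str.len() == 17

valid = df[condition]     # rows that pass the length check
invalid = df[~condition]  # rows that fail it

print(valid)
print(invalid)
```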
Upvotes: 0