Reputation: 1093
I have a data frame with 20 columns, two of which are Company1 and Company2. I want a resulting data frame with only those rows in which the lengths of Company1 and Company2 differ by no more than 5 characters. How do I accomplish this using pandas?
Upvotes: 1
Views: 697
Reputation: 214957
You can use .str.len() to get the number of characters in each Company column, then take the difference with vectorized subtraction of the two pandas Series and build a boolean mask from the condition to subset the rows:
df[abs(df.Company1.str.len() - df.Company2.str.len()) <= 5]
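To try this out, here is a minimal sketch on a made-up frame. The sample company names and the `max_diff` variable are assumptions for illustration, not from the question. It applies the same `.str.len()` comparison and also shows what happens when a name is missing:

```python
import pandas as pd

# Hypothetical sample data; only the two company columns matter here.
df = pd.DataFrame({
    "Company1": ["Acme", "Globex Corporation", "Initech", None],
    "Company2": ["Acme Inc", "Globex", "Initrode", "Umbrella"],
})

max_diff = 5  # maximum allowed difference in character count (assumed name)

# Absolute difference of per-row string lengths; a missing value gives NaN.
len_diff = (df["Company1"].str.len() - df["Company2"].str.len()).abs()

# A NaN difference does not satisfy the comparison, so rows with a
# missing name are dropped from the result.
result = df[len_diff <= max_diff]
print(result)
```

Rows with a missing company name are silently excluded here; if you would rather keep them, one option is to call `.fillna("")` on the columns before taking the lengths.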
Upvotes: 2