Reputation:
Input:
DF1:
name, message
adam, hello, i'am
viola, hi, my name is
data:
name, message
adam, hello, i'am
viola, hi, my name
I want to compare the message lengths for the same name in both dataframes (for example, adam in DF1 and adam in data). If the lengths are the same, do nothing; otherwise print that row.
Code:
if df['message'].apply(lambda x: len(x)) == data['name'].apply(lambda x: len(x)):
    pass
else:
    df['message'].apply(lambda x: print(x))
#edit: i can use maybe df.loc[:,'message'] as well i think
But I am receiving:
TypeError: object of type 'float' has no len()
Why?
Upvotes: 1
Views: 48
Reputation: 163
There might be a better way, but this could work for you:
import pandas
dt = pandas.DataFrame([["Adam","Hello, I am Adam"], ["Viola", "How are you"]], columns=["name", "message"])
data = pandas.DataFrame([["Adam","Hello, I am Adam"], ["Viola", "How are ya"]], columns=["name", "message"])
print(dt)
print(data)
data.columns = ["name", "message_data"]
merged = dt.merge(data, on=["name"])
merged[merged.message.str.len() != merged.message_data.str.len()]
First, you need to rename the ["message"] column in data so that it doesn't clash in the merge. Then you merge both dataframes, keeping only the names that exist in both. Finally, you compare the lengths of the strings in ["message"] with those in ["message_data"] and use that to extract the rows of the merged table where the lengths differ.
If you specifically want only the message, you can do:
merged.loc[merged.message.str.len() != merged.message_data.str.len(), "message"]
Printing the result line-by-line should be straightforward.
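For what it's worth, here is a minimal sketch of the line-by-line printing, reusing the sample frames from above. The variable names mismatch and lines and the output format are only my choice for illustration:

```python
import pandas as pd

# Same sample data as above; the second frame already uses the renamed column.
dt = pd.DataFrame(
    [["Adam", "Hello, I am Adam"], ["Viola", "How are you"]],
    columns=["name", "message"],
)
data = pd.DataFrame(
    [["Adam", "Hello, I am Adam"], ["Viola", "How are ya"]],
    columns=["name", "message_data"],
)

# Inner merge on name, then keep only the rows whose message lengths differ.
merged = dt.merge(data, on="name")
mismatch = merged[merged["message"].str.len() != merged["message_data"].str.len()]

# name=None makes itertuples yield plain tuples in column order.
lines = [
    f"{name}: {message!r} vs {message_data!r}"
    for name, message, message_data in mismatch.itertuples(index=False, name=None)
]
for line in lines:
    print(line)
```

With this sample data only the Viola row differs in length, so that is the only row printed.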
Upvotes: 1
Reputation: 206
A better approach would be to merge the two dataframes based on the name.
import pandas as pd
# construct df1
# construct df2
# merge the two dataframes on name; the message columns become message_x and message_y
df = pd.merge(df1, df2, on="name")
# compare the message lengths and keep only the rows where they differ
# (!= is used because ~a == b would be parsed as (~a) == b)
df_diff_length = df[df["message_x"].astype(str).str.len() != df["message_y"].astype(str).str.len()]
print(df_diff_length["name"])
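As for the TypeError in the question: I can't see the real data, but the most likely cause is a missing value in the message column. pandas stores missing values as NaN, which is a float, so the built-in len() fails on it. A small sketch with made-up data (the frame below is an assumption, not the asker's data):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing message.
df = pd.DataFrame({"name": ["adam", "viola"], "message": ["hello", np.nan]})

# The built-in len() raises on the missing value, because NaN is a float.
try:
    df["message"].apply(len)
    raised = False
except TypeError as exc:
    raised = True
    print(exc)

# .str.len() does not raise; the missing value stays missing in the result.
lengths = df["message"].str.len()
print(lengths)
```

That is presumably why the code above converts with astype(str) first. Keep in mind that the conversion can turn a missing value into the literal text "nan" depending on the pandas version, so dropping or filling missing messages before comparing may be the safer choice.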
Upvotes: 1