Amrith Krishna

Reputation: 2853

Convert a pandas column containing tuples to string

I have a pandas DataFrame df with 800 rows, one of whose columns contains tuples:

        conComb             insOrDel    supp
580     ('r', '>', 'ins')   36272       0.199807
449     ('ar', '>', 'ins')  31596       0.174049
594     ('tar', '>', 'ins') 4398        0.024227
529     ('lar', '>', 'ins') 3037        0.016730

df.dtypes gives the following:

conComb      object
insOrDel      int64
supp        float64
dtype: object

I would like to convert the conComb column to strings. However, none of the following changes the type:

df["conComb"] = df["conComb"].astype(str)

df["conComb"] = df["conComb"].astype('|S1')

df["conComb"] = df["conComb"].values.astype(str)

How can the type of the column conComb be changed to a string?

Extension to the question as discussed in the comments

Further, I have another DataFrame confDF with 24,000 rows:

    conComb                     objF    insOrDel
0   ('<ablucar', '>', 'ins')    (a)     11
1   ('<ablucar', '>', 'ins')    (ai)    3
2   ('<ablucar', '>', 'ins')    (ais)   3
3   ('<ablucar', '>', 'ins')    (amos)  2

Applying a join operation between df and confDF raises the following error: ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

confDF["conComb"] = confDF["conComb"].astype(str)
pd.DataFrame.join(df,confDF, on ="conComb")

How can this be rectified?

Upvotes: 1

Views: 3877

Answers (1)

jezrael

Reputation: 862761

I think there is a difference between dtypes and types.

Strings, dicts, tuples and lists all have the same dtype, object.

But each has a different Python type.

To check the dtypes, use:

print(df.dtypes)

To check the types of the values, use:

print(df.iloc[0].apply(type))
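
As a minimal sketch of that distinction, the snippet below builds a small made-up frame (the two rows are illustrative, not the full data from the question) and inspects the Python type of each value before and after astype(str). It assumes a pandas version in which a column of tuples is stored with dtype object; the dtype reported after the conversion can differ between pandas versions, so only the value types are compared here.

```python
import pandas as pd

# Illustrative data only: two rows shaped like the question's df.
df = pd.DataFrame({
    "conComb": [("r", ">", "ins"), ("ar", ">", "ins")],
    "insOrDel": [36272, 31596],
})

# The column-level dtype does not say what the values actually are,
# so look at the Python type of each element instead.
dtype_before = df["conComb"].dtype
types_before = set(df["conComb"].map(type))

# astype(str) converts every value with str(), even if df.dtypes
# still reports a generic dtype for the column afterwards.
df["conComb"] = df["conComb"].astype(str)

types_after = set(df["conComb"].map(type))
first_value = df["conComb"].iloc[0]

print(dtype_before, types_before, types_after, first_value)
```

So the values really are converted to strings; it is only the dtype shown by df.dtypes that may look unchanged.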

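EDIT: I think the error is raised because join matches on index values by default; if the on parameter is specified, that column of the calling DataFrame is matched against the index of the other one. Here that means the object column conComb is compared with the int64 index of confDF.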

So to join the two DataFrames on the conComb column that both of them contain, I think you can use:

confDF["conComb"] = confDF["conComb"].astype(str)
df1 = pd.merge(df, confDF, on="conComb", how='left')

Or:

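confDF["conComb"] = confDF["conComb"].astype(str)
# join matches df's conComb column against the index of the other frame,
# so conComb has to be the index of confDF; rsuffix renames the overlapping insOrDel column
df1 = df.join(confDF.set_index('conComb'), on="conComb", rsuffix='_conf')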
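
A small sketch of both approaches on made-up data (the rows and the reduced set of columns are illustrative assumptions, chosen so that one key matches and one does not). Both key columns are converted to strings first so that they hold the same kind of value. In the join variant the key is set as the index of the right-hand frame, because join matches against the other frame's index.

```python
import pandas as pd

# Illustrative frames only; column names follow the question.
df = pd.DataFrame({
    "conComb": [("r", ">", "ins"), ("ar", ">", "ins")],
    "supp": [0.199807, 0.174049],
})
confDF = pd.DataFrame({
    "conComb": [("r", ">", "ins"), ("r", ">", "ins")],
    "objF": ["(a)", "(ai)"],
})

# Make the key a plain string on both sides.
df["conComb"] = df["conComb"].astype(str)
confDF["conComb"] = confDF["conComb"].astype(str)

# Variant 1: merge on the shared column, keeping every row of df.
merged = pd.merge(df, confDF, on="conComb", how="left")

# Variant 2: join df's conComb column against confDF's index.
joined = df.join(confDF.set_index("conComb"), on="conComb")

print(merged)
print(joined)
```

Rows of df without a match in confDF are kept, with missing values in the columns that come from confDF.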

Upvotes: 3
