Simon Lindgren

Reputation: 2031

Extracting from dataframe

I have a pandas dataframe that looks like this:

letter;Pairs;Count
abandon;frozenset(['abandon', 'dm']);1
abattoir;frozenset(['abattoir', 'year']);1
abbey;frozenset(['abbey', 'mean']);1

I want to write a CSV file that looks like this:

abandon;dm
abattoir;year
abbey;mean

Standard pandas DataFrame selection does not seem to work because the frozenset complicates things.

Upvotes: 0

Views: 846

Answers (3)

jhonsfran

Reputation: 31

You can do this (note that a frozenset is unordered, so list(x)[0] may return either element of the pair):

df["Pairs"].apply(lambda x: list(x)[0]).astype(str)

Upvotes: 1

bunji

Reputation: 5213

I'm assuming that the first line in your data frame is the header line, so that:

print(df)

     letter             Pairs Count
0   abandon     (dm, abandon)     1
1  abattoir  (abattoir, year)     1
2     abbey     (abbey, mean)     1

(The round brackets around the elements in Pairs are how pandas prints frozensets.)

You can change this into a data frame called df2 that looks like this:

     letter Pairs
0   abandon    dm
1  abattoir  year
2     abbey  mean

By doing:

df2 = pd.DataFrame([df['letter'],(df['Pairs']-set(df['letter'])).str.join('')]).T

This works by first taking a set difference: the set of all values in the letter column is subtracted from each frozenset in Pairs, which leaves the element that is not a letter (this assumes the partner word never appears in the letter column itself). You can then create a new DataFrame from this element and the letter column. Finally, you transpose that DataFrame to orient it the way you want.
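If subtracting a set from a whole column does not behave the same way in your pandas version, the same idea can be spelled out row by row. This is only a minimal sketch, not a tested drop-in: the sample frame is rebuilt from the question, the name `partners` is made up for illustration, and it assumes each pair contains the row's own letter plus exactly one other string.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the real frame).
df = pd.DataFrame({
    "letter": ["abandon", "abattoir", "abbey"],
    "Pairs": [
        frozenset(["abandon", "dm"]),
        frozenset(["abattoir", "year"]),
        frozenset(["abbey", "mean"]),
    ],
    "Count": [1, 1, 1],
})

# For each row, remove that row's own letter from its pair and keep what is left.
# "".join() turns the one remaining element into a plain string.
partners = [
    "".join(pair - {letter})
    for letter, pair in zip(df["letter"], df["Pairs"])
]

df2 = pd.DataFrame({"letter": df["letter"], "Pairs": partners})
print(df2)
```

Because each row is only compared with its own letter, this does not depend on the partner word being absent from the rest of the letter column.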

Upvotes: 4

yeharav

Reputation: 441

I think that

print(X.apply(lambda x: ";".join(x[1]),axis=1).to_csv(index=False))

or

print(X.apply(lambda x: ";".join(x.Pairs),axis=1).to_csv(index=False))

where X is your dataframe, might work.
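One caveat with joining the frozenset directly is that its iteration order is not guaranteed, so the letter may not come first. A hedged variation on the same apply idea, assuming X has the letter and Pairs columns from the question and that each pair contains the row's letter, puts the letter first explicitly (the names `lines` and `csv_text` are just for illustration):

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the real frame).
X = pd.DataFrame({
    "letter": ["abandon", "abattoir", "abbey"],
    "Pairs": [
        frozenset(["abandon", "dm"]),
        frozenset(["abattoir", "year"]),
        frozenset(["abbey", "mean"]),
    ],
    "Count": [1, 1, 1],
})

# Build one "letter;other" string per row: the letter first, then whatever
# remains in the pair once the letter is removed (sorted for a stable order).
lines = X.apply(
    lambda row: ";".join([row["letter"]] + sorted(row["Pairs"] - {row["letter"]})),
    axis=1,
)

# With no path given, to_csv returns the text instead of writing a file.
csv_text = lines.to_csv(index=False, header=False)
print(csv_text)
```

Passing a file name as the first argument to to_csv should write the same lines to disk instead of returning them.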

Upvotes: 0
