musbur

Reputation: 679

How to convert a series of tuples to DataFrame?

I'm trying to add some extra data in additional columns to a data frame.

Consider this example code:

import pandas
import numpy

def more_data(d):
    return 1, 2

df = pandas.DataFrame({"A":[1, 2, 3], "B":[17, 16, 15]})

x = df.apply(more_data, axis=1)
df1 = pandas.DataFrame(x)
df2 = pandas.concat([df, df1], axis=1)

print(df2)

The output is:

   A   B       0
0  1  17  (1, 2)
1  2  16  (1, 2)
2  3  15  (1, 2)

This is no surprise: apply() returns a Series of tuples, which is faithfully added to the data frame as a single column of tuples. What I want instead is two more columns holding the values from the tuples. How can I do that?

Upvotes: 0

Views: 412

Answers (3)

Erfan

Reputation: 42916

You were quite close with your own solution. If you convert the result of apply() to a list and then build a new DataFrame from it, naming the columns, it works:

import pandas as pd

def more_data(d):
    return 1, 2

df = pd.DataFrame({"A":[1, 2, 3], "B":[17, 16, 15]})

x = df.apply(more_data, axis=1)
df1 = pd.DataFrame(x.tolist(), columns=['Col1', 'Col2'])  # <-- the only line that differs
df2 = pd.concat([df, df1], axis=1)

print(df2)

   A   B  Col1  Col2
0  1  17     1     2
1  2  16     1     2
2  3  15     1     2

Upvotes: 1

Mike

Reputation: 858

The closest you seem to be able to get is with df.assign:

import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})
df.assign(temp1=0, temp2=5)
#    x  y  temp1  temp2
# 0  1  4      0      5
# 1  2  5      0      5
# 2  3  6      0      5

assign also accepts a Series for each new column, so you first have to get each tuple element into its own Series. That could be as easy as:

s = pd.Series([('a', 'b')] * len(df))
df.assign(s0 = s.apply(lambda x: x[0]), s1 = s.apply(lambda x: x[1]))
#    x  y s0 s1
# 0  1  4  a  b
# 1  2  5  a  b
# 2  3  6  a  b
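As a possible alternative to calling apply once per tuple position, the tuples can be transposed in a single pass with zip. This is only a sketch: it assumes every tuple has exactly two elements and that s is in the same row order as df, because plain lists are assigned by position rather than by index.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
s = pd.Series([("a", "b")] * len(df))

# zip(*s) turns the per-row tuples into one sequence per tuple position.
s0, s1 = (list(col) for col in zip(*s))

# assign returns a new frame; df itself is left unchanged.
out = df.assign(s0=s0, s1=s1)

print(out)
```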

Upvotes: 0

skillsmuggler

Reputation: 1902

Try this:

import pandas
import numpy

def more_data(d):
    return 1, 2

df = pandas.DataFrame({"A":[1, 2, 3], "B":[17, 16, 15]})

x = df.apply(more_data, axis=1)
df1 = pandas.DataFrame(x)
df1 = pandas.concat([df, df1], axis=1)

# Expand the tuple column (named 0) into two new columns
df1[['new_1', 'new_2']] = pandas.DataFrame([list(x) for x in df1[0]])

# Result
print(df1)
   A   B       0  new_1  new_2
0  1  17  (1, 2)      1      2
1  2  16  (1, 2)      1      2
2  3  15  (1, 2)      1      2
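If the intermediate tuple column 0 is not wanted, one option is to let apply do the expansion itself through its result_type="expand" argument (available in pandas 0.23 and later). A minimal sketch, reusing more_data from the question; the column names new_1 and new_2 are just the ones used above:

```python
import pandas as pd

def more_data(row):
    # Placeholder from the question: returns the same pair for every row.
    return 1, 2

df = pd.DataFrame({"A": [1, 2, 3], "B": [17, 16, 15]})

# result_type="expand" turns each returned tuple into separate columns (0, 1).
expanded = df.apply(more_data, axis=1, result_type="expand")
expanded.columns = ["new_1", "new_2"]

# join aligns on the index, which apply preserves from df.
result = df.join(expanded)

print(result)
```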

Upvotes: 0
