Sam

Reputation: 565

How can I improve this Pandas DataFrame construction?

I wrote this ugly piece of code. It does the job, but it is not elegant. Any suggestions for improving it?

Here, function(i, j) returns a dict for a given i and j.

pairs = [dict({"i":i, "j":j}.items() + function(i, j).items()) for i,j in my_iterator]
pairs = pd.DataFrame(pairs).set_index(['i', 'j'])

The dict({...}.items() + function(i, j).items()) expression is meant to merge the two dicts into one, because dict.update() modifies the dict in place and returns None rather than the merged dict.

Upvotes: 3

Views: 128

Answers (1)

Andy Hayden

Reputation: 375685

A favourite trick* for returning a newly created, updated dictionary:

dict(i=i, j=j, **function(i, j))

*and of much debate on whether this is actually "valid"...
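For illustration, here is a minimal sketch of how that expression might slot into the original construction. The body of function and the contents of my_iterator are hypothetical stand-ins, since the question does not show them. Note that this form only works when function returns string keys, and it raises a TypeError if one of those keys is "i" or "j".

```python
import pandas as pd


def function(i, j):
    # Hypothetical stand-in for the asker's function: returns a dict for (i, j).
    return {"total": i + j, "product": i * j}


# Hypothetical input; the question only says my_iterator yields (i, j) pairs.
my_iterator = [(1, 2), (3, 4)]

# Build one merged dict per pair, then index the frame by (i, j).
records = [dict(i=i, j=j, **function(i, j)) for i, j in my_iterator]
pairs = pd.DataFrame(records).set_index(["i", "j"])
```

Each row of pairs is then addressable by its (i, j) tuple, for example pairs.loc[(1, 2)].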

It is perhaps also worth mentioning the DataFrame.from_records method:

In [11]: my_iterator = [(1, 2), (3, 4)]

In [12]: df = pd.DataFrame.from_records(my_iterator, columns=['i', 'j'])

In [13]: df
Out[13]:
   i  j
0  1  2
1  3  4

I suspect there would be a more efficient method by vectorizing your function (but it's hard to say what makes more sense without more specifics of your situation)...
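As a rough sketch of what vectorizing could look like: if the function's outputs can be written as column arithmetic, the per-pair dicts can be skipped entirely. The "total" and "product" columns below are assumptions made purely for illustration; the real function may not reduce to column operations this simply.

```python
import pandas as pd

# Hypothetical input pairs, as in the from_records example.
my_iterator = [(1, 2), (3, 4)]
df = pd.DataFrame.from_records(my_iterator, columns=["i", "j"])

# Assumed stand-in for function(i, j): computed on whole columns at once
# instead of calling a Python function once per pair.
df["total"] = df["i"] + df["j"]
df["product"] = df["i"] * df["j"]

pairs = df.set_index(["i", "j"])
```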

Upvotes: 4
