MattiH

Reputation: 654

How to combine large number of pandas Series into DataFrame?

I have 180,000 pandas Series that I need to combine into one DataFrame. Adding them one by one takes a lot of time, apparently because appending gets increasingly slower as the DataFrame grows. The same problem persists even if I use numpy, which is faster than pandas for this.

What could be an even better way to create a DataFrame from the Series?

Edit: Some more background info. The Series were stored in a list. It is sports data, and the list was called player_library, with 180,000+ items. I didn't realise that it is enough to write just

pd.concat(player_library, axis=1) 

instead of listing all the individual items. Now it works fast and nicely.
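For illustration, a minimal sketch of the same pattern with synthetic data (the player names and values below are made up, standing in for the real player_library):

```python
import pandas as pd

# Build a list of Series, one per player (synthetic stand-in for player_library).
player_library = [
    pd.Series([i, i * 2, i * 3], name=f"player_{i}")
    for i in range(5)
]

# A single concat along axis=1 turns each Series into a column.
df = pd.concat(player_library, axis=1)
print(df.shape)  # (3, 5): 3 rows, one column per Series
```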

Upvotes: 0

Views: 329

Answers (2)

Akshat Sirohi

Reputation: 1

Input-

import pandas as pd

series = pd.Series(["BMW", "Toyota", "Honda"])
series

Output-

0       BMW
1    Toyota
2     Honda
dtype: object

Input-

colours = pd.Series(["Red", "Blue", "White"])
colours

Output-

0      Red
1     Blue
2    White
dtype: object

Input-

car_data = pd.DataFrame({"Car make": series, "Colour": colours})
car_data

Output-

  Car make Colour
0      BMW    Red
1   Toyota   Blue
2    Honda  White

Upvotes: -1

RichieV

Reputation: 5183

You could try pd.concat instead of append.

If you want each Series to be a column, then

df = pd.concat(list_of_series_objects, axis=1)

(Note: pass the list itself, not a list wrapped in another list — pd.concat expects an iterable of Series/DataFrame objects.)

For more detail on why it is expensive to iterate and append read this question
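A minimal sketch of the difference (sizes and names are illustrative): growing the frame inside a loop copies all accumulated data on every iteration, while one concat over the whole list copies each Series once.

```python
import pandas as pd

# 100 tiny Series as a stand-in for a large list of Series.
series_list = [pd.Series([i], name=f"s{i}") for i in range(100)]

# Anti-pattern: re-concatenating inside the loop copies the whole
# accumulated frame each time, so total work grows quadratically.
df_slow = pd.DataFrame()
for s in series_list:
    df_slow = pd.concat([df_slow, s], axis=1)

# Preferred: a single concat over the full list.
df_fast = pd.concat(series_list, axis=1)
print(df_fast.shape)  # (1, 100)
```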

Upvotes: 2
