How to append several DataFrames into one

Question

I have written some code to append several dummy DataFrames into one. After appending, I expect "DataFrame.shape" to be (9, 3), but my code produces an unexpected (6, 3). How can I fix my code?

import pandas as pd


a = [[1,2,4],[1,3,4],[2,3,4]]
b = [[1,1,1],[1,6,4],[2,9,4]]
c = [[1,3,4],[1,1,4],[2,0,4]]
d = [[1,1,4],[1,3,4],[2,0,4]]


df1 = pd.DataFrame(a,columns=["a","b","c"])
df2 = pd.DataFrame(b,columns=["a","b","c"])
df3 = pd.DataFrame(c,columns=["a","b","c"])

for df in (df1, df2, df3):
    df =  df.append(df, ignore_index=True)
print df

I don't want to use "pd.concat" because then I would have to hold all the DataFrames in memory, and my real data set consists of hundreds of very large DataFrames. I just want code that opens one CSV file at a time inside a loop and updates the final DataFrame as the loop progresses.

thanks

EdChum · Accepted Answer

Firstly, use concat to concatenate a bunch of dfs; it's quicker:

In [308]:
df = pd.concat([df1,df2,df3], ignore_index=True)
df

Out[308]:
   a  b  c
0  1  2  4
1  1  3  4
2  2  3  4
3  1  1  1
4  1  6  4
5  2  9  4
6  1  3  4
7  1  1  4
8  2  0  4
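On the memory concern: concat does not need a pre-built list of named DataFrames, since it also accepts a generator. The following is only a sketch under assumptions: the file names (part1.csv and so on) and the temporary directory are made up for illustration and stand in for your real CSV files. Note that the final combined frame still has to fit in memory whichever approach you use.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data standing in for the real CSV files.
samples = {
    "part1.csv": [[1, 2, 4], [1, 3, 4], [2, 3, 4]],
    "part2.csv": [[1, 1, 1], [1, 6, 4], [2, 9, 4]],
    "part3.csv": [[1, 3, 4], [1, 1, 4], [2, 0, 4]],
}

with tempfile.TemporaryDirectory() as tmp:
    # Write the sample files so the example is self-contained.
    paths = []
    for name, rows in samples.items():
        path = Path(tmp) / name
        pd.DataFrame(rows, columns=["a", "b", "c"]).to_csv(path, index=False)
        paths.append(path)

    # The generator reads each file only when concat asks for it,
    # so no df1, df2, ... variables are kept around.
    combined = pd.concat((pd.read_csv(p) for p in paths), ignore_index=True)

print(combined.shape)  # (9, 3)
```

This copies the data once at the end instead of re-copying the growing frame on every iteration.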

Secondly, you're reusing df as both the loop variable and the result, so each iteration overwrites it. The loop ends with df3 appended to itself, which is why you get 6 rows. If you did this it would work:

In [307]:
a = [[1,2,4],[1,3,4],[2,3,4]]
b = [[1,1,1],[1,6,4],[2,9,4]]
c = [[1,3,4],[1,1,4],[2,0,4]]
d = [[1,1,4],[1,3,4],[2,0,4]]


df1 = pd.DataFrame(a,columns=["a","b","c"])
df2 = pd.DataFrame(b,columns=["a","b","c"])
df3 = pd.DataFrame(c,columns=["a","b","c"])

df = pd.DataFrame()

for d in (df1, df2, df3):
    df =  df.append(d, ignore_index=True)
df

Out[307]:
   a  b  c
0  1  2  4
1  1  3  4
2  2  3  4
3  1  1  1
4  1  6  4
5  2  9  4
6  1  3  4
7  1  1  4
8  2  0  4

Here I renamed the loop variable to d and declared an empty df outside the loop:

df = pd.DataFrame()

for d in (df1, df2, df3):
    df =  df.append(d, ignore_index=True)
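One caveat for newer pandas: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so the loop above only runs on older versions. A rough equivalent that keeps the loop structure, offered as a sketch rather than the one right way, is to collect the pieces in a plain Python list and concatenate once after the loop. The name parts is just an illustrative choice.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 4], [1, 3, 4], [2, 3, 4]], columns=["a", "b", "c"])
df2 = pd.DataFrame([[1, 1, 1], [1, 6, 4], [2, 9, 4]], columns=["a", "b", "c"])
df3 = pd.DataFrame([[1, 3, 4], [1, 1, 4], [2, 0, 4]], columns=["a", "b", "c"])

parts = []  # accumulator kept separate from the loop variable
for d in (df1, df2, df3):
    parts.append(d)  # list.append, not the removed DataFrame.append

# A single concat after the loop builds the final frame.
df = pd.concat(parts, ignore_index=True)

print(df.shape)  # (9, 3)
```

As in the loop above, the accumulator and the loop variable have different names, so nothing gets overwritten.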
