Reputation: 2342
I have several dataframes in a list, obtained with np.array_split, and I want to concatenate some of them into a single dataframe. In this example, I want to concatenate the 3 dataframes contained in b (all the pieces of a except the 2nd one, which is the element a[1] in the list):
df = pd.DataFrame({'country':['a','b','c','d'],
'gdp':[1,2,3,4],
'iso':['x','y','z','w']})
a = np.array_split(df,4)
i = 1
b = a[:i]+a[i+1:]
desired_final_df = pd.DataFrame({'country':['a','c','d'],
'gdp':[1,3,4],
'iso':['x','z','w']})
I have tried to create an empty df and then append the elements of b to it in a loop, but none of these attempts worked:
CV = pd.DataFrame()
CV = [CV.append[(b[i])] for i in b] #try1
CV = [CV.append(b[i]) for i in b] #try2
CV = pd.DataFrame([CV.append[(b[i])] for i in b]) #try3
for i in b:
CV.append(b) #try4
I have arrived at a solution that works, but it is not efficient:
CV = pd.DataFrame()
CV = [CV.append(b) for i in b][0]
In this case, CV is built as a list holding the same dataframe (with all the rows) three times, and I keep only the first copy. In my real case, where the datasets are big, building the same dataframe three times would cost much more computation time.
How could I do that without repeating operations?
Upvotes: 0
Views: 85
Reputation: 8903
To concatenate multiple DataFrames and reset the index, use pandas.concat:
pd.concat(b, ignore_index=True)
Output:
country gdp iso
0 a 1 x
1 c 3 z
2 d 4 w
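For reference, here is a self-contained sketch of the whole flow from the question, ending with the pd.concat call above. One substitution, which is my assumption and not part of the original post: the row positions are split with np.array_split and then selected with df.iloc, instead of passing the DataFrame itself to np.array_split, since that tends to be more robust across pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'country': ['a', 'b', 'c', 'd'],
                   'gdp': [1, 2, 3, 4],
                   'iso': ['x', 'y', 'z', 'w']})

# Split the row positions into 4 chunks, then select each chunk of rows.
# (Assumption: this stands in for np.array_split(df, 4) from the question.)
a = [df.iloc[idx] for idx in np.array_split(np.arange(len(df)), 4)]

# Drop the piece at position i, keep the rest.
i = 1
b = a[:i] + a[i + 1:]

# One concat call builds the final frame; ignore_index renumbers rows 0..n-1.
result = pd.concat(b, ignore_index=True)
print(result)
```

Because pd.concat takes the whole list at once, each piece is copied only once, which is what the question asks for.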
Upvotes: 1
Reputation: 7509
According to the docs, DataFrame.append does not work in place the way list.append does. It returns the resulting DataFrame object instead. Assigning that returned object back to your variable should be enough for what you need:
df = pd.DataFrame()
for next_df in list_of_dfs:
df = df.append(next_df)
You may want to pass the keyword argument ignore_index=True in the append call so that the index becomes continuous, instead of restarting from 0 for each appended DataFrame (assuming that the index of every DataFrame in the list starts from 0).
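Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop above only runs on older versions. A minimal sketch of the same idea for current pandas, using a single pd.concat call; list_of_dfs and its contents are made-up illustration data, not from the question:

```python
import pandas as pd

# Hypothetical stand-in for the list of DataFrames to combine.
list_of_dfs = [
    pd.DataFrame({'x': [1, 2]}),
    pd.DataFrame({'x': [3]}),
    pd.DataFrame({'x': [4, 5]}),
]

# concat returns a new DataFrame, so the result must be assigned,
# just as with the return value of append.
combined = pd.concat(list_of_dfs, ignore_index=True)
print(combined)
```

Calling concat once on the full list also avoids re-copying the accumulated rows on every iteration, which is what a loop of appends does.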
Upvotes: 2