Why do I get a different size for a pandas DataFrame after append or concat?

Question

My code looks like this:

import pandas as pd

candle_data = pd.DataFrame()

for fileName in files:
    csv_data = pd.read_csv(fileName, header=None)
    candle_data = pd.concat([candle_data, csv_data])
    #candle_data = candle_data.append(csv_data)  

print(candle_data)
print(candle_data.tail(3))

The result is:

                0      1        2        3        4        5  6
0      2000.05.30  17:27  0.93020  0.93020  0.93020  0.93020  0
1      2000.05.30  17:35  0.93040  0.93050  0.93040  0.93050  0
2      2000.05.30  17:38  0.93040  0.93040  0.93030  0.93030  0
...
29781  2016.04.29  16:55  1.14512  1.14524  1.14503  1.14515  0
29782  2016.04.29  16:56  1.14515  1.14517  1.14491  1.14495  0
29783  2016.04.29  16:57  1.14494  1.14505  1.14482  1.14482  0
29784  2016.04.29  16:58  1.14477  1.14511  1.14457  1.14457  0

[5171932 rows x 7 columns]
                0      1        2        3        4        5  6
29782  2016.04.29  16:56  1.14515  1.14517  1.14491  1.14495  0
29783  2016.04.29  16:57  1.14494  1.14505  1.14482  1.14482  0
29784  2016.04.29  16:58  1.14477  1.14511  1.14457  1.14457  0

Why do I get 5171932 x 7 as the dimensions when printing the whole DataFrame, but 29784 as the last row index? What is the correct way to merge all rows of two DataFrames?

jezrael · Accepted Answer

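I think there are duplicates in the index. Each file read by read_csv gets its own index starting at 0, and concat keeps those labels, so the last label (29784) comes from the last file while the row count (5171932) covers all files.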
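A minimal sketch of how the index ends up with repeated labels. The two small frames below are made-up stand-ins for two CSV files, not the data from the question:

```python
import pandas as pd

# Hypothetical stand-ins for two CSV files; each frame gets its own 0-based index.
first = pd.DataFrame({0: ["a", "b", "c"]})
second = pd.DataFrame({0: ["d", "e"]})

# Default concat keeps the original index labels of every input frame.
combined = pd.concat([first, second])

print(len(combined))             # 5 rows in total
print(combined.index.tolist())   # [0, 1, 2, 0, 1]: labels restart for the second frame
print(combined.index.is_unique)  # False, because labels repeat
```

The last label printed by tail() is therefore only the last label of the last frame, not the number of rows.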

You can add the parameter ignore_index=True to concat if you don't have a meaningful index:

pd.concat([candle_data, csv_data], ignore_index=True)
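For illustration, here is a sketch of the same idea applied to a loop over files. The in-memory files and their contents are assumptions standing in for the real CSV files, and collecting the frames in a list before a single concat is one possible arrangement, not the only one:

```python
import io
import pandas as pd

# Hypothetical in-memory stand-ins for the CSV files in the question.
files = [
    io.StringIO("2000.05.30,17:27,0.93020\n2000.05.30,17:35,0.93040\n"),
    io.StringIO("2016.04.29,16:58,1.14477\n"),
]

# Read every file first, then concatenate once with a fresh 0..n-1 index.
frames = [pd.read_csv(f, header=None) for f in files]
candle_data = pd.concat(frames, ignore_index=True)

print(candle_data.index.tolist())  # [0, 1, 2]
print(candle_data.shape)           # (3, 3)
```

With ignore_index=True the last index label is always the row count minus one. Concatenating once at the end also avoids copying the growing frame on every iteration. Note that DataFrame.append was removed in pandas 2.0, so concat is the option that works across versions.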

Docs
