How to merge many CSV files?

Question

I have about 7,500 CSV files that need to be merged into a single file to produce an easily readable table. The files are laid out as follows:

Each file is named after a stock ticker (for example: AA.csv, AAL.csv, AAPL.csv, etc.).
Every file contains a date and a number in this format:
```
2018-10-11,1
2018-10-12,3
2018-10-15,2
...
```

Now I want to merge them into a single CSV file in which the header row holds the ticker names, the first column holds the dates, and the remaining columns hold the numbers for each ticker (keeping the CSV format, of course).

Example:

Note that some CSV files are empty, some have different starting dates, and some have gaps in their dates.

Ángel Igualada · Accepted Answer

You could do something like this:

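```python
import pandas as pd
import numpy as np
from glob import glob
from pathlib import Path

dfs_list = []
for csv_file in glob('Tickers List/*.csv'):
    # Keep only the file name without folder or extension, e.g. "AAPL"
    stock_ticker = Path(csv_file).stem
    try:
        df = pd.read_csv(csv_file, header=None, names=["date", "num"])
    except pd.errors.EmptyDataError:
        continue  # skip completely empty files
    if df.shape[0] > 0:
        df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
        df["stock_ticker"] = stock_ticker
        dfs_list.append(df)

# One long table with the columns date, num and stock_ticker
final_df = pd.concat(dfs_list)
```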

With `glob('dir/*.csv')` we get the paths of all the CSV files in a folder.
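
As a small illustration of that matching step, the sketch below lists the CSV files in a folder and derives a ticker from each file name. The temporary folder and the file names are made up for the demonstration, and `Path.stem` is used on the assumption that each file is named exactly after its ticker.

```python
import tempfile
from glob import glob
from pathlib import Path

# Build a throwaway folder with a few hypothetical files.
with tempfile.TemporaryDirectory() as folder:
    for name in ("AA.csv", "AAL.csv", "notes.txt"):
        (Path(folder) / name).write_text("2018-10-11,1\n")

    # glob returns full paths, so strip the folder and the extension.
    csv_files = sorted(glob(str(Path(folder) / "*.csv")))
    tickers = [Path(p).stem for p in csv_files]

print(tickers)  # ['AA', 'AAL']
```

Note that glob does not guarantee any particular order, which is why the list is sorted here.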

After this, you will have a DataFrame that looks like this:
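
To give a feel for the shape of that table, the sketch below builds the same kind of long DataFrame from two in-memory "files". The AA rows are the sample rows from the question; the AAL row is invented for the example.

```python
import io
import pandas as pd

# Hypothetical file contents keyed by ticker (the AAL value is invented).
files = {
    "AA": "2018-10-11,1\n2018-10-12,3\n2018-10-15,2\n",
    "AAL": "2018-10-12,5\n",
}

frames = []
for ticker, text in files.items():
    df = pd.read_csv(io.StringIO(text), header=None, names=["date", "num"])
    df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
    df["stock_ticker"] = ticker
    frames.append(df)

# One row per (date, ticker) pair, with columns date, num, stock_ticker.
long_df = pd.concat(frames)
print(long_df)
```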

To reshape this into the format you asked for, pivot the table. Note that the dates are sorted automatically because they are used as the index.

```python
final_df = pd.pivot_table(final_df, values='num', index=['date'],
                          columns=['stock_ticker'], fill_value=np.nan)
```

And you will have a DataFrame that looks like this:
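
As an illustration of the reshaping step, the sketch below pivots a small hand-made long table (the values are invented). It shows that a date missing for one ticker ends up as NaN in that ticker's column.

```python
import pandas as pd

# Invented long-format data: AAL has no row for 2018-10-11.
long_df = pd.DataFrame({
    "date": pd.to_datetime(["2018-10-11", "2018-10-12", "2018-10-12"]),
    "num": [1, 3, 5],
    "stock_ticker": ["AA", "AA", "AAL"],
})

# Dates become the index and each ticker becomes a column.
wide_df = pd.pivot_table(long_df, values="num", index="date",
                         columns="stock_ticker")
print(wide_df)
```

Keep in mind that `pivot_table` aggregates with the mean by default, so if a file ever contained the same date twice, the two numbers would be averaged rather than reported as an error.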

Now you can write this DataFrame to a new csv with:

```python
final_df.to_csv("merged.csv")
```
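
When the merged file is read back later, the dates arrive as plain strings unless pandas is told to parse them. A minimal sketch, with a made-up two-ticker table standing in for `final_df` and a temporary path standing in for `merged.csv`:

```python
import os
import tempfile
import pandas as pd

# Invented wide table: dates as the index, one column per ticker.
wide_df = pd.DataFrame(
    {"AA": [1.0, 3.0], "AAL": [None, 5.0]},
    index=pd.to_datetime(["2018-10-11", "2018-10-12"]).rename("date"),
)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "merged.csv")
    wide_df.to_csv(path)  # missing values are written as empty fields
    # Restore the date index and parse it back into datetimes.
    restored = pd.read_csv(path, index_col="date", parse_dates=True)

print(restored)
```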

FULL CODE

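```python
import pandas as pd
import numpy as np
from glob import glob
from pathlib import Path

dfs_list = []
for csv_file in glob('Tickers List/*.csv'):
    # Keep only the file name without folder or extension, e.g. "AAPL"
    stock_ticker = Path(csv_file).stem
    try:
        df = pd.read_csv(csv_file, header=None, names=["date", "num"])
    except pd.errors.EmptyDataError:
        continue  # skip completely empty files
    if df.shape[0] > 0:
        df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
        df["stock_ticker"] = stock_ticker
        dfs_list.append(df)

# One long table with the columns date, num and stock_ticker
final_df = pd.concat(dfs_list)

# Dates become the index, tickers become the columns
final_df = pd.pivot_table(final_df, values='num', index=['date'],
                          columns=['stock_ticker'], fill_value=np.nan)

final_df.to_csv("merged.csv")
```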

Answers (2)

I'd do something like this with the `csv` module:
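
What follows is only one possible shape for a `csv`-module approach, not necessarily what the answerer had in mind. It collects every file into a dictionary keyed by date and then writes one row per date. The function name `merge_ticker_files` and its parameters are hypothetical.

```python
import csv
from pathlib import Path


def merge_ticker_files(folder, out_path):
    """Merge <ticker>.csv files (rows of date,number) into one wide CSV."""
    by_date = {}   # date string -> {ticker: number}
    tickers = []   # column order; empty files still get an (empty) column
    for path in sorted(Path(folder).glob("*.csv")):
        ticker = path.stem
        tickers.append(ticker)
        with path.open(newline="") as f:
            for row in csv.reader(f):
                if len(row) >= 2:  # skip blank or malformed lines
                    by_date.setdefault(row[0], {})[ticker] = row[1]

    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["date"] + tickers)
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        for date in sorted(by_date):
            writer.writerow([date] + [by_date[date].get(t, "") for t in tickers])
```

A call might look like `merge_ticker_files("Tickers List", "merged.csv")`. Writing the output outside the input folder keeps the merged file from being picked up as a ticker on a second run.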
