Reputation: 472

Read CSVs with selected column headers into one CSV file in Python (read by line)

I would like to iterate over the CSV files in a folder whose names contain e.g. "usr666", load only selected columns from each one into a pandas DataFrame, and merge them into one file, as in the following example:

BT_usr666.csv:

number|size|person|car    |
---------------------------
31    |2   |Ringo |Tesla  |
82    |3   |Paul  |Audi   |
93    |2   |John  |BMW    |
74    |3   |George|MG     |


RS_usr666.csv:

number|color|person|doors|car    |
---------------------------------
33    |black|Mick  |2    |Porsche|
12    |red  |Keith |4    |Saab   |
55    |blue |Ron   |6    |Volvo  |

into FINAL_usr666.csv

person|car    |
---------------
Ringo |Tesla  |
Paul  |Audi   |
John  |BMW    |
George|MG     |
Mick  |Porsche|
Keith |Saab   |
Ron   |Volvo  |

Any ideas?

Upvotes: 1

Answers (2)

null

Reputation: 2137

You can try the following script.

Code

import glob
import os

import pandas as pd

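def get_final_df(files):
    your_columns = ['person', 'car']

    # DataFrame.append was removed in pandas 2.0; collect the frames and concatenate once.
    frames = [pd.read_csv(file, usecols=your_columns) for file in files]
    if not frames:
        return pd.DataFrame(columns=your_columns)

    # usecols keeps each file's own column order, so select explicitly for a stable layout.
    return pd.concat(frames, ignore_index=True)[your_columns]

if __name__ == '__main__':
    wd = os.getcwd() # I've set this as working dir, you can change the path to your files.
    # Match on the file name only, so 'usr666' in a parent directory name does not count.
    files = sorted(glob.glob(os.path.join(wd, '*usr666*.csv')))
    final_df = get_final_df(files)
    final_df.to_csv('final_df.csv', index=False) # Write to file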
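
To try this approach without touching real data, here is a minimal sketch that writes the two sample files from the question into a temporary directory and combines them. It assumes the files are comma-separated on disk (the question only shows them as tables); the helper name `combine_csvs`, the extra `BT_usr777.csv` file and the temporary directory are illustrative and not part of the answer above.

```python
import glob
import os
import tempfile

import pandas as pd


def combine_csvs(folder, name_part, columns):
    """Concatenate `columns` from every CSV in `folder` whose name contains `name_part`."""
    # glob.escape protects against glob metacharacters in the folder path itself.
    pattern = os.path.join(glob.escape(folder), f"*{name_part}*.csv")
    frames = [pd.read_csv(path, usecols=columns) for path in sorted(glob.glob(pattern))]
    if not frames:
        return pd.DataFrame(columns=columns)
    # usecols keeps each file's own column order, so reorder explicitly.
    return pd.concat(frames, ignore_index=True)[columns]


# Sample data from the question, assumed to be comma-separated.
SAMPLES = {
    "BT_usr666.csv": "number,size,person,car\n31,2,Ringo,Tesla\n82,3,Paul,Audi\n"
                     "93,2,John,BMW\n74,3,George,MG\n",
    "RS_usr666.csv": "number,color,person,doors,car\n33,black,Mick,2,Porsche\n"
                     "12,red,Keith,4,Saab\n55,blue,Ron,6,Volvo\n",
    # Hypothetical file for another user; it should be ignored.
    "BT_usr777.csv": "number,size,person,car\n1,2,Other,Fiat\n",
}

with tempfile.TemporaryDirectory() as folder:
    for name, text in SAMPLES.items():
        with open(os.path.join(folder, name), "w", newline="") as handle:
            handle.write(text)
    final_df = combine_csvs(folder, "usr666", ["person", "car"])

print(final_df)
```

Building a list of frames and calling `pd.concat` once avoids re-copying the accumulated data on every iteration, which is what repeated appends would do.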

Upvotes: 1

ArunJose

Reputation: 2159

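This could do it.

This searches "." (i.e. the current directory) for CSV files whose names contain usr666 and does what you ask for:

import os
import pandas as pd

frames = []
for filename in sorted(os.listdir(".")):
    # The sample files are named like BT_usr666.csv, so match anywhere in the name.
    if "usr666" in filename and filename.endswith(".csv"):
        y = pd.read_csv(filename)
        frames.append(y[["person", "car"]])

# DataFrame.append was removed in pandas 2.0, so concatenate once after the loop.
x = pd.concat(frames, ignore_index=True)
x.to_csv('file1.csv', index=False)  # write once, without the index column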
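
For the "read by line" part of the question's title, one pandas-free alternative is to stream rows with the standard library's `csv` module, so that no whole file is held in memory. The sketch below is only an illustration under the assumption that the files are comma-separated; the function name `merge_selected_columns` and the temporary directory are hypothetical.

```python
import csv
import os
import tempfile


def merge_selected_columns(folder, name_part, columns, out_path):
    """Stream rows from matching CSV files into out_path, keeping only `columns`."""
    written = 0
    with open(out_path, "w", newline="") as out_handle:
        # extrasaction="ignore" silently drops every field not listed in `columns`.
        writer = csv.DictWriter(out_handle, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        for filename in sorted(os.listdir(folder)):
            in_path = os.path.join(folder, filename)
            if name_part not in filename or not filename.endswith(".csv"):
                continue
            # Skip the output file itself, e.g. FINAL_usr666.csv in the same folder.
            if os.path.abspath(in_path) == os.path.abspath(out_path):
                continue
            with open(in_path, newline="") as in_handle:
                for row in csv.DictReader(in_handle):  # one row at a time
                    writer.writerow(row)
                    written += 1
    return written


with tempfile.TemporaryDirectory() as folder:
    # Sample data from the question, assumed to be comma-separated.
    samples = {
        "BT_usr666.csv": "number,size,person,car\n31,2,Ringo,Tesla\n82,3,Paul,Audi\n"
                         "93,2,John,BMW\n74,3,George,MG\n",
        "RS_usr666.csv": "number,color,person,doors,car\n33,black,Mick,2,Porsche\n"
                         "12,red,Keith,4,Saab\n55,blue,Ron,6,Volvo\n",
    }
    for name, text in samples.items():
        with open(os.path.join(folder, name), "w", newline="") as handle:
            handle.write(text)

    out_path = os.path.join(folder, "FINAL_usr666.csv")
    row_count = merge_selected_columns(folder, "usr666", ["person", "car"], out_path)
    with open(out_path, newline="") as handle:
        merged_rows = list(csv.reader(handle))

print(row_count, merged_rows)
```

The check against `out_path` matters here because the desired output name, FINAL_usr666.csv, itself contains "usr666" and would otherwise be picked up as an input.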

Upvotes: 1
