Reputation: 53
I have several csv files in one folder. it is about 60th files. So, i want to make a new csv file and using only two columns from each file, like below :
This is an example :
1.csv 2.csv 3.csv 4.csv and so on...
a b c d a b c d a b c d a b c d
1 2 3 4 8 3 5 7 2 9 4 6 3 6 8 3
4 2 8 3 6 3 6 7 9 3 4 5 3 6 6 8
3 9 4 8 9 3 4 2 4 7 4 4 1 8 3 5
I want to add a only two column from each csv file...
I have tried my script below :
import os
import glob
import pandas as pd
import csv
path = "C:/Users/SYIFAAZRA/Documents/belajar_wradlib/Skew-T/"
os.chdir(path)
file = glob.glob("*.csv")
x=0
for files in file:
x=x+1
df = pd.read_csv(files, delimiter=',', skiprows=7, usecols=[11, 22])
df1 = df.merge(df, how='right')
df2 = pd.concat([df1])
print (df2)
I want to make a new column such as below :
a b a b a b a b
1 2 8 3 2 9 3 6
4 2 6 3 9 3 3 6
3 9 9 3 4 7 1 8
can some one help me ?
Upvotes: 1
Views: 54
Reputation: 862641
Because already filtered columns by parameter usecols
create list of DataFrames in loop:
path = "C:/Users/SYIFAAZRA/Documents/belajar_wradlib/Skew-T/"
os.chdir(path)
filenames = glob.glob("*.csv")
dfs = []
for f in filenames:
df = pd.read_csv(f, delimiter=',', skiprows=7, usecols=[11, 22])
dfs.append(df)
Or in list comprehension:
dfs = [pd.read_csv(f, delimiter=',', skiprows=7, usecols=[11, 22]) for f in filenames]
And join together by concat
with axis=1
:
df2 = pd.concat(dfs, axis=1)
print (df2)
Better solution with avoid duplicated columns names is create MultiIndex
with keys
parameter and then flatten it by f-string
s:
df2 = pd.concat(dfs, axis=1, keys=np.arange(len(dfs)))
df2.columns = [f'{a}_{b}' for a, b in df2.columns]
print (df2)
Upvotes: 1