Reputation: 3719
I am trying to concatenate several CSV files by customer group using the code below:
files = glob.glob(file_from + "/*.csv")  # path where the CSV files reside
df_v0 = pd.concat([pd.read_csv(f) for f in files])  # DataFrame that concatenates all CSV files listed above
The problem is that the number of columns in each CSV varies by customer, and the files do not have a header row.
I would like to add a dummy header row with labels such as col_1, col_2, ... depending on the number of columns in that CSV.
Could anyone guide me on how to get this done? Thanks.
Update: I am also trying to search for a specific string in the DataFrame.
Sample DataFrame:
col_1,col_2,col_3
fruit,grape,green
fruit,watermelon,red
fruit,orange,orange
fruit,apple,red
I am trying to select the rows containing the word red, and I expect rows 2 and 4 to be returned.
I tried the code below:
df[~df.apply(lambda x: x.astype(str).str.contains('red')).any(axis=1)]
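For reference, a minimal sketch of that row filter on the sample data above. It assumes the sample is loaded from an in-memory string (the names sample, mask, with_red and without_red are illustrative only). Note that the leading ~ negates the mask, so the expression above keeps the rows that do not contain red; dropping the ~ keeps the rows that do.

```python
import io

import pandas as pd

# The sample DataFrame from the question, loaded from an in-memory string.
sample = (
    "col_1,col_2,col_3\n"
    "fruit,grape,green\n"
    "fruit,watermelon,red\n"
    "fruit,orange,orange\n"
    "fruit,apple,red\n"
)
df = pd.read_csv(io.StringIO(sample))

# True for every row in which at least one cell contains the substring "red".
mask = df.apply(lambda col: col.astype(str).str.contains("red")).any(axis=1)

with_red = df[mask]      # rows containing "red" (2nd and 4th data rows)
without_red = df[~mask]  # the negated mask keeps the remaining rows

print(with_red)
```

The mask is a substring match, so a cell such as "redcurrant" would also count as a hit.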
Upvotes: 1
Views: 1025
Reputation: 862661
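Use the parameter header=None to get the default range column labels 0, 1, 2, ..., and add skiprows=1 only if you need to remove an existing row of column names (if the files really have no header row, omit skiprows=1, otherwise the first data row is dropped):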
df_v0 = pd.concat([pd.read_csv(f, header=None, skiprows=1) for f in files])
If you also want to change the column names, add rename:
dfs = [pd.read_csv(f, header=None, skiprows=1).rename(columns=lambda x: f'col_{x + 1}')
       for f in files]
df_v0 = pd.concat(dfs)
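As a rough illustration of how this behaves when the files have different column counts, here is a self-contained sketch. It uses two made-up in-memory CSVs in place of real files, and the helper name read_headerless is hypothetical. It omits skiprows=1 on the assumption that the files truly have no header row.

```python
import io

import pandas as pd

# Two hypothetical headerless CSVs with different numbers of columns.
csv_a = "fruit,grape,green\nfruit,apple,red\n"
csv_b = "veg,carrot\n"


def read_headerless(source):
    """Read a headerless CSV and label its columns col_1, col_2, ..."""
    # header=None keeps the first row as data; columns become 0, 1, 2, ...
    df = pd.read_csv(source, header=None)
    return df.rename(columns=lambda i: f"col_{i + 1}")


frames = [read_headerless(io.StringIO(text)) for text in (csv_a, csv_b)]

# concat aligns on column labels; columns missing from a file are filled with NaN.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

With real files, io.StringIO(text) would be replaced by the paths returned from glob.glob.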
Upvotes: 1