dark horse

Reputation: 3719

Pandas - Adding dummy header column in csv

I am trying to concatenate several csv files by customer group using the code below:

import glob
import pandas as pd
files = glob.glob(file_from + "/*.csv")  # paths of the csv files under file_from
df_v0 = pd.concat([pd.read_csv(f) for f in files])  # one DataFrame built from all of those csv files

The problem is that the number of columns in the csv files varies by customer, and the files do not have a header row.

I am trying to see if I could add a dummy header row with labels such as col_1, col_2, ... depending on the number of columns in each csv.

Could anyone guide me on how to get this done? Thanks.

Update on trying to search for a specific string in the Dataframe:

Sample Dataframe

col_1,col_2,col_3
fruit,grape,green
fruit,watermelon,red
fruit,orange,orange
fruit,apple,red

I am trying to select the rows that contain the word red, and I expect it to return rows 2 and 4.

Tried the below code:

df[~df.apply(lambda x: x.astype(str).str.contains('red')).any(axis=1)]
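For reference, here is a minimal sketch of that filter on the sample data. The leading ~ in the line above inverts the boolean mask, so it keeps the rows that do not contain red; dropping the ~ keeps the matching rows instead. The names mask, with_red and without_red are illustrative only, and the sample is read from an in-memory string rather than a real file.

```python
import io
import pandas as pd

# Sample data from the question, loaded from an in-memory string
df = pd.read_csv(io.StringIO(
    "col_1,col_2,col_3\n"
    "fruit,grape,green\n"
    "fruit,watermelon,red\n"
    "fruit,orange,orange\n"
    "fruit,apple,red\n"
))

# True for every row where at least one cell contains the substring "red"
mask = df.apply(lambda col: col.astype(str).str.contains("red")).any(axis=1)

with_red = df[mask]       # rows that contain "red"
without_red = df[~mask]   # rows that do not contain "red"

print(with_red)
```

Note that str.contains does a substring match, so a value such as "tired" would also match.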

Upvotes: 1

Views: 1025

Answers (1)

jezrael

Reputation: 862661

Use the parameter header=None to get the default range column names 0, 1, 2, ..., and add skiprows=1 only if you need to remove an original row of column names:

df_v0 = pd.concat([pd.read_csv(f, header=None, skiprows=1) for f in files])
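A small sketch of the difference between the two parameters, assuming a file with no header row. The io.StringIO object is a hypothetical stand-in for a real file, and df_plain and df_skipped are illustrative names.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for a csv file without a header row
raw = io.StringIO("fruit,grape,green\nfruit,apple,red\n")

# header=None keeps the first line as data and labels the columns 0, 1, 2
df_plain = pd.read_csv(raw, header=None)

# skiprows=1 additionally drops the first line of the file
raw.seek(0)
df_skipped = pd.read_csv(raw, header=None, skiprows=1)

print(df_plain.shape)
print(df_skipped.shape)
```

If the files really have no header row, skiprows=1 discards the first data row of each file, so it should be left out in that case.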

If you also want to change the column names, add rename:

dfs = [pd.read_csv(f, header=None, skiprows=1).rename(columns = lambda x: f'col_{x + 1}') 
        for f in files]
df_v0 = pd.concat(dfs)
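To illustrate what happens when the files have different numbers of columns, here is a sketch with two hypothetical in-memory files (three columns and two columns). skiprows=1 is left out because these stand-ins have no header row, and ignore_index=True is an optional addition that is not part of the code above.

```python
import io
import pandas as pd

# Hypothetical stand-ins for two customers' files with different widths
sources = [
    io.StringIO("a,b,c\nd,e,f\n"),
    io.StringIO("g,h\n"),
]

# Read each file without a header and rename 0, 1, 2 to col_1, col_2, col_3
dfs = [
    pd.read_csv(src, header=None).rename(columns=lambda x: f"col_{x + 1}")
    for src in sources
]

# concat aligns on column names; columns missing from a file are filled with NaN
df_all = pd.concat(dfs, ignore_index=True)

print(df_all)
```

Rows from the narrower file get NaN in col_3, which can be handled afterwards with fillna if needed.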

Upvotes: 1
