Reputation: 151
I have a csv where not all column headers are specified.
temp.csv reads,
a, b
1, 2, 3, 4
5, 6, 7, 8
When I try to read this with pandas, I get a multi-index dataframe.
pd.read_csv('temp.csv')
produces the output,
a b
1 2 3 4
5 6 7 8
What I want is for the [1, 5] column header to be 'a', and the [2, 6] column to be 'b'. Explicitly setting index_col=None does not fix the problem. Any ideas?
Edit: Thanks ALollz. I modified your answer slightly so I only read the file once. (I'll be reading a lot of files.)
df = pd.read_csv('temp.csv')
names = df.columns.tolist()
df.reset_index(inplace=True)
df.columns = names + [i for i in range(df.shape[1] - len(names))]
Upvotes: 0
Views: 269
Reputation: 59579
You can ignore the broken header with a combination of header=0 and the names you want to specify:
pd.read_csv('temp.csv', header=0, names=['a', 'b', 'col1', 'col2'])
# a b col1 col2
#0 1 2 3 4
#1 5 6 7 8
If you don't want to specify the names manually, you can read just the first row to get the existing headers, then work out how many additional names you need to supply.
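first = pd.read_csv('temp.csv', nrows=1)
# Assumes a short header: pandas then puts the extra leading columns in the index.
n_missing = first.index.nlevels
names = first.columns.tolist() + [f'col{i}' for i in range(1, n_missing + 1)]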
df = pd.read_csv('temp.csv', header=0, names=names)
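If you would rather not parse the file with pandas twice, one possible alternative is to peek at the first two lines with the standard-library csv module and pad the header yourself. This is only a sketch: build_names and the prefix argument are hypothetical names, and it assumes the first data row has the full number of fields.

```python
import csv


def build_names(path, prefix="col"):
    """Return the header names, padded with placeholders to match the first data row.

    Hypothetical helper: assumes the first data row contains every column.
    """
    with open(path, newline="") as f:
        # skipinitialspace drops the blank after each comma ("a, b" -> "a", "b")
        reader = csv.reader(f, skipinitialspace=True)
        header = next(reader, [])
        first_row = next(reader, [])
    # Number of columns that have data but no header name
    n_missing = max(len(first_row) - len(header), 0)
    return header + [f"{prefix}{i}" for i in range(1, n_missing + 1)]
```

The returned list can then be passed as names= together with header=0, exactly as in the read_csv call above, so pandas only reads each file once.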
Upvotes: 1