Reputation: 3967
Given a CSV file with a duplicate column A, I need to read the file excluding the duplicate column:
A A C
306 306 506
3238 3238 591
4159 4159 366
1847 1847 2898
Available alternatives include usecols and names. However, Pandas version 0.24.1 also has a mangle_dupe_cols parameter, which, if set to False, should merge duplicate columns as mentioned in the docs.
But when I do so, I get a ValueError:
pd.read_csv('file.csv', mangle_dupe_cols=False, engine='python').head()
ValueError: Setting mangle_dupe_cols=False is not supported yet
Pandas version used for this problem - 0.24.1
What are your views on this problem?
Upvotes: 6
Views: 1677
Reputation: 863166
I checked the pandas GitHub and found the issue "ENH: Support mangle_dupe_cols=False in pd.read_csv()".
Unfortunately, this is the question asked in a comment there, followed by the reply it got:
What is the ETA on this issue?
when / if a community pull request happens
One possible solution is to read the file twice, once for the header row and once for the data:
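import pandas as pd

# First pass: read only the header row as data, so pandas does not rename the second "A" to "A.1"
c = pd.read_csv('some.csv', header=None, nrows=1).iloc[0]
# or, with the csv module:
# import csv
# with open('some.csv', newline='') as f:
#     reader = csv.reader(f)
#     c = next(reader)
# Second pass: read the data rows, then restore the original (duplicated) names
df = pd.read_csv('some.csv', header=None, skiprows=1)
df.columns = c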
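After this, df still contains both A columns. If the goal is to exclude the duplicate, one option is to keep only the first occurrence of each name with Index.duplicated(). The following is a minimal sketch, not a built-in pandas feature: the helper name read_without_duplicate_columns is hypothetical, and the sample file is assumed to be comma-separated with the values from the question.

```python
import os
import tempfile

import pandas as pd


def read_without_duplicate_columns(path):
    """Read a CSV and keep only the first column for each repeated header name."""
    # First pass: header row only, read as data so the names are not mangled.
    names = pd.read_csv(path, header=None, nrows=1).iloc[0].tolist()
    # Second pass: data rows only, so numeric columns keep numeric dtypes.
    df = pd.read_csv(path, header=None, skiprows=1)
    df.columns = names
    # duplicated() marks every repeat of a name after its first occurrence.
    return df.loc[:, ~df.columns.duplicated()]


# Demo on a temporary file (assumed layout: comma-separated, header A,A,C).
sample = "A,A,C\n306,306,506\n3238,3238,591\n4159,4159,366\n1847,1847,2898\n"
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write(sample)
    tmp_path = tmp.name

result = read_without_duplicate_columns(tmp_path)
os.remove(tmp_path)
print(result)
```

Note that this keeps the first A and drops the second without checking that their values match, so it only fits cases where the repeated columns really are copies of each other.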
Upvotes: 3