Reputation: 1377
I know it's possible to name columns when using pandas.read_csv()
by passing the optional names=['X', 'Y', 'Z', ...]
parameter. However, my question is: can you name only the first X columns and have the rest named automatically?
Basically, I have a csv with 23 columns that I want to name, and a further 1023 columns that I need to keep in the DataFrame but don't care about what they're called. Here's an image to illustrate the requirement:
Upvotes: 1
Views: 440
Reputation: 6018
I don't see a setting in pandas to do this, so I generate a list of column names and rename the columns in the DataFrame.
This works even if you don't know how many columns to expect at the end.
import pandas
# Read the file without treating the first row as a header
myFile = pandas.read_csv("C:\\python_work_area\\TestFile.csv", header=None)
# Set the known column names
arr_colName = ["MyColName1", "MyColName2", "MyColName3"]
numOfUnknownCols = len(myFile.columns) - len(arr_colName)
# Generate one number per unknown column. Could hard-code numOfUnknownCols if the column count is known
arr_nums = list(range(1, numOfUnknownCols + 1))
# Append numbered placeholder names to arr_colName
for i in arr_nums:
    arr_colName.append("UnnamedColumn" + str(i))
# Rename the columns on the existing object. Assigning to .columns replaces set_axis(..., inplace=True), whose inplace argument was removed in pandas 2.0
myFile.columns = arr_colName
print(myFile)
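The same idea can be wrapped in a small helper. This is only a sketch: the function name read_with_partial_names, the prefix argument, and the in-memory sample data are made up for illustration, and it assumes the file has no header row.

```python
import io

import pandas as pd


def read_with_partial_names(source, known_names, prefix="UnnamedColumn"):
    """Read a header-less CSV, naming the leading columns and numbering the rest.

    Hypothetical helper for illustration; not part of pandas.
    """
    df = pd.read_csv(source, header=None)
    n_unknown = len(df.columns) - len(known_names)
    if n_unknown < 0:
        raise ValueError("more names supplied than columns in the file")
    # Known names first, then prefix1, prefix2, ... for the remaining columns
    df.columns = list(known_names) + [f"{prefix}{i}" for i in range(1, n_unknown + 1)]
    return df


# Made-up sample data standing in for the real file
sample = io.StringIO("1,2,3,4,5\n6,7,8,9,10\n")
df = read_with_partial_names(sample, ["X", "Y", "Z"])
print(df.columns.tolist())
```

For the file in the question you would pass the path and the 23 known names instead of the sample buffer; the remaining 1023 columns would then get numbered placeholder names.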
Upvotes: 1