Reputation: 3677
I am assigning column names to a dataframe in pandas, but the names I pass are creating new columns instead of labelling the existing ones. How do I get around this issue?
What the dataframe looks like now:
abs_subdv_cd abs_subdv_desc
0 A0001A ASHTON ... NaN
1 A0002A J. AYERS ... NaN
2 A0003A NEWTON ALLSUP ... NaN
3 A0004A M. AUSTIN ... NaN
4 A0005A RICHARD W. ALLEN ... NaN
What I want the dataframe to look like:
abs_subdv_cd abs_subdv_desc
0 A0001A ASHTON
1 A0002A J. AYERS
2 A0003A NEWTON ALLSUP
3 A0004A M. AUSTIN
4 A0005A RICHARD W. ALLEN
Code so far:
import pandas as pd
###Declaring path###
path = ('file_path')
###Calling file in folder###
appraisal_abstract_subdv = pd.read_table(path + '/2015-07-28_003820_APPRAISAL_ABSTRACT_SUBDV.txt',
                                         encoding='iso-8859-1', error_bad_lines=False,
                                         names=['abs_subdv_cd', 'abs_subdv_desc'])
print(appraisal_abstract_subdv.head())
-edit-
When I check appraisal_abstract_subdv.shape, the dataframe shows a shape of (4000, 1), whereas the data has two columns.
This is an example of the data I am using:
A0001A ASHTON
A0002A J. AYERS
Thank you in advance.
Upvotes: 1
Views: 49
Reputation: 210982
It looks like your data file uses a different delimiter (not a TAB, which is the default separator for pd.read_table()), so try passing the sep='\s+' or delim_whitespace=True parameter.
To check your columns after reading the data file, run the following (where df is your dataframe):
print(df.columns.tolist())
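One caveat: a pure whitespace separator also splits inside multi-word descriptions such as "J. AYERS". A minimal sketch of one way around that, assuming the code and the description are separated by whitespace and the code itself never contains spaces (the sample string and the column name "line" are made up for illustration):

```python
import io
import pandas as pd

# Hypothetical sample that mirrors the lines shown in the question.
sample = "A0001A ASHTON\nA0002A J. AYERS\nA0003A NEWTON ALLSUP\n"

# With the default TAB separator, each whole line ends up in one column.
raw = pd.read_table(io.StringIO(sample), header=None, names=["line"])
print(raw.shape)

# Split each line only once, at the first run of whitespace,
# so the rest of the line stays together as the description.
parts = raw["line"].str.split(n=1, expand=True)
parts.columns = ["abs_subdv_cd", "abs_subdv_desc"]
print(parts)
```

If the real file uses some other delimiter (for example a fixed-width layout), this would need adjusting; inspecting a few raw lines first is the safest check.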
Upvotes: 1
Reputation: 568
You can first inspect the current column names with:
appraisal_abstract_subdv.columns.values
Then, with those names in hand, use the rename method in pandas to rename the columns appropriately:
df.rename(columns={'OldColumn1': 'Newcolumn1', 'OldColumn2': 'Newcolumn2'}, inplace=True)
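As a rough illustration of that approach, here is a small sketch on a made-up two-column dataframe (the placeholder names OldColumn1 and OldColumn2 are taken from the line above; the data is invented). Note that rename only relabels columns that already exist, so it helps only once the file has been parsed into two columns.

```python
import pandas as pd

# Hypothetical dataframe with placeholder column names.
df = pd.DataFrame({"OldColumn1": ["A0001A", "A0002A"],
                   "OldColumn2": ["ASHTON", "J. AYERS"]})

# Inspect the current names before renaming.
print(df.columns.values)

# Map each old name to its new name; unmapped columns are left unchanged.
df = df.rename(columns={"OldColumn1": "abs_subdv_cd",
                        "OldColumn2": "abs_subdv_desc"})
print(df.columns.tolist())
```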
Upvotes: 1