Reputation: 695
I am appending data to a dataframe using the pandas library and am renaming my columns in the code below:
chunksize = 20000
j = 0
index_start = 1
for df in pd.read_csv('C:/Users/xxx/Desktop/jjj.tsv', chunksize=chunksize, header = None,sep='\t', iterator=True, encoding='utf-8'):
df=df.rename(columns = {0:'screen',1:'user'})
df.to_sql('data', disk_engine, if_exists='append')
index_start = df.index[-1] + 1
j=j+1
if j >=2:
break
else:
count = j*chunksize
print(count)
print(j)
However, when I execute a simple query:
df = pd.read_sql_query('SELECT * FROM data LIMIT 3', disk_engine)
I get:
index 0 1
0 0 293 3785616
1 1 3250207 5540538
2 2 1000052 5542982
Why are my columns not renamed? Should have df=df.rename(columns = {0:'screen',1:'user'})
renamed the columns?
Upvotes: 0
Views: 2416
Reputation: 11
If you need to rename columns, the best strategy is to use the inplace flag,
df.rename(columns = {0:'screen',1:'user'}, inplace=True)
Upvotes: 1
Reputation: 210912
you can rename your columns on the fly while reading CSV by specifying the names
parameter:
pd.read_csv('C:/Users/xxx/Desktop/jjj.tsv',
chunksize=chunksize,
header = None,
sep='\t',
iterator=True,
encoding='utf-8',
names=['screen','user'])
PS most probably you've created your SQL table first time, not renaming your columns - try to drop and recreate it
Upvotes: 1