Reputation: 498
I have a short script to pivot data. The first column is a 9 digit ID number, often beginning with zeros such as 000123456
Here is the script:
df = pd.read_csv('source')
new_df = df.pivot_table(index = 'id', columns = df.groupby('id').cumcount().add(1), values = ['prog_id', 'prog_type'], aggfunc='first').sort_index(axis=1,level=1)
new_df.columns = [f'{x}_{y}' for x,y in new_df.columns]
new_df.to_csv('destination')
print(new_df)
Although the CSV is being read with an id such as 000123456
, the output only contains 123456
Even when setting an explicit dtype, Pandas removes the leading zeros. Is there a work around for telling Pandas to leave the leading zeros?
Upvotes: 4
Views: 7459
Reputation: 21
You could use pandas' zfill()
method right after reading your csv file "source"
. Basically, you would fill the values of your attribute "id"
, with as many zeros as you would like, in this particular case, making the number 9 digits long (3 zeros + 6 original digits). So, we would have:
df = pd.read_csv('source')
df.index = df.index.str.zfill(9)
# (...)
Upvotes: 2
Reputation: 498
Per comment on original post, set dtype as string:
df = pd.read_csv('source', dtype={'id':np.str})
Upvotes: 4