Reputation: 7966
Lots of questions address this, but none of the solutions seem to work exactly as I need.
I have a dataframe with two columns of numbers with 10-20 digits each. These are actually ID #s, and I'd like to concatenate them. It looks like that's best done by first converting the values to strings.
However, when converting with .astype(str), pandas keeps the scientific notation, which won't fly.
Things I've tried:
tried: the dtype arg ('str') or converters (using str()) in read_csv()
outcome: df.dtypes still lists 'object', and values still display in scientific notation
tried: pd.set_option('display.float_format', lambda x: '%.0f' % x)
outcome: displays correctly in df.head(), but reverts to scientific notation upon coercion to string and concatenation using the + operator
tried: coercing to int, str, or str(int(x))
outcome: int works when I coerce one value with int(), but not when I use astype(int). Using .apply() with int() throws an 'invalid literal for long() with base 10' error.
This feels like it should be pretty straightforward, and I'm anxious to figure out what I'm missing.
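A minimal reproduction of the symptom, as a sketch with made-up values (the column names id_a and id_b and the numbers themselves are hypothetical, not my real data):

```python
import pandas as pd

# Hypothetical 20-digit IDs held as float64, as they would be once parsed as floats.
df = pd.DataFrame({"id_a": [1.2345678901234567e19],
                   "id_b": [9.876543210987654e19]})

# astype(str) uses each float's default string form, which is
# scientific notation for magnitudes this large.
combined = df["id_a"].astype(str) + df["id_b"].astype(str)
print(combined.iloc[0])
```

The concatenated value still contains the exponent markers, which is the behavior I'm trying to avoid.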
Upvotes: 2
Views: 2104
Reputation: 3689
What you tried only sets the display format; it does not change how the values are converted to strings. You could instead format each float as a string in the dataframe.
import numpy as np
import pandas as pd
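# Build sample data: two float columns with values on the order of 1e20 to 1e22
df = pd.DataFrame(data={'a': np.random.randint(low=1, high=100, size=10) * 1e20,
                        'b': np.random.randint(low=1, high=100, size=10) * 1e20})
# Format each row's floats as fixed-point strings (no exponent) and join them with '|'
df.apply(lambda x: '{0:20.0f}|{1:20.0f}'.format(x.a, x.b), axis=1)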
Out[34]:
0 9699999999999998951424|4600000000000000000000
1 300000000000000000000|2800000000000000000000
2 9400000000000000000000|9000000000000000000000
3 2100000000000000000000|4500000000000000000000
4 5900000000000000000000|4800000000000000000000
5 7700000000000000000000|6200000000000000000000
6 1600000000000000000000|8000000000000000000000
7 100000000000000000000|400000000000000000000
8 9699999999999998951424|8000000000000000000000
9 4500000000000000000000|3500000000000000000000
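Note that rows 0 and 8 above show 9699999999999998951424 rather than a round number: float64 only carries about 15 to 17 significant decimal digits, so IDs of up to 20 digits can be altered once they pass through a float. If that matters, one alternative is to keep the IDs as text from the start. The following is only a sketch, assuming the file stores the digits in full (not already in scientific notation); the inline CSV text and the column names id_a and id_b are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the real file.
csv_text = (
    "id_a,id_b\n"
    "12345678901234567890,98765432109876543210\n"
    "00000000001234567890,11111111112222222222\n"
)

# dtype=str keeps every column as text, so no float conversion
# (and no rounding or exponent formatting) ever takes place.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# String columns concatenate element-wise with the + operator.
df["combined"] = df["id_a"] + df["id_b"]
print(df["combined"].tolist())
```

Keeping the columns as strings also preserves any leading zeros, which a numeric round trip would drop.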
Upvotes: 2