Reputation: 288
There are lots of posts about object vs. string dtypes in pandas, and I understand that distinction already, for the most part. What I don't understand is the difference between these three options:
some_series.astype(str)
some_series.astype('string')
some_series.astype(pd.StringDtype())
Furthermore, if I check the dtype after calling astype(), the second and third options both return the same output: string[python].
For the sake of simplicity, can I just use astype('string') instead of astype(pd.StringDtype()) and get exactly the same behavior, including when converting series that contain only ints/floats, or the nullable versions of those numeric dtypes? Are both astype('string') and astype(pd.StringDtype()) mapped to StringDtype internally? I could not find a clear answer on this point in the pandas documentation (or in other Stack Overflow posts). Thanks for the help.
Using:
Upvotes: 7
Views: 6282
Reputation: 288
The pandas documentation explains that 'string' is an alias for StringDtype, so astype('string') and astype(pd.StringDtype()) request the same dtype. See the link below:
Pandas dtype aliases
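A quick way to check this yourself is sketched below. It is only an illustration: the sample series and variable names are made up, and the exact dtype repr (for example string[python]) as well as what astype(str) returns can vary with the pandas version and the installed storage backend, so no output is shown for those.

```python
import pandas as pd

# Hypothetical sample data: nullable integers with one missing value.
s = pd.Series([1, 2, None], dtype="Int64")

by_alias = s.astype("string")
by_instance = s.astype(pd.StringDtype())

# The alias is resolved to a StringDtype, so both results share a dtype.
same_dtype = by_alias.dtype == by_instance.dtype
alias_resolves = pd.api.types.pandas_dtype("string") == pd.StringDtype()

# With the string dtype, missing values stay missing instead of becoming text.
missing_count = int(by_alias.isna().sum())

print(same_dtype, alias_resolves, missing_count)

# For comparison only: the result of astype(str) depends on the pandas version.
print(s.astype(str).dtype)
```

If same_dtype and alias_resolves are both True on your install, the two spellings are interchangeable for your purposes, and the shorter astype('string') is fine.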
Upvotes: 5