spitfiredd
spitfiredd

Reputation: 3135

Pandas read_excel explicitly read a column in as text?

I am reading in a bunch of columns from excel spreadsheets, a few of the fields are 'ID' codes which are text that looks like numbers (because they have leading zeros). I would like to keep these ID fields as text/strings, is there a way to either read everything in as text and the convert the number fields to numbers or explicitly read certain fields in as text/string?

Upvotes: 0

Views: 949

Answers (1)

piRSquared
piRSquared

Reputation: 294458

Consider the txt proxy for a csv file

from io import StringIO
import pandas as pd

txt = """col1,col2,col3
1,01,a
2,02,b
3,03,c"""

Create a dictionary where keys are the ordinal positions of the columns and the values are the callables that are called on those columns. The catch is that I need to know the number of columns. Here, I know there are 3.

converters = dict(enumerate([str] * 3))
print(converters)

{0: <class 'str'>, 1: <class 'str'>, 2: <class 'str'>}

Then we pass that to pd.read_csv

df = pd.read_csv(StringIO(txt), converters=converters)

print(df)

  col1 col2 col3
0    1   01    a
1    2   02    b
2    3   03    c

And

print(df.dtypes)

col1    object
col2    object
col3    object
dtype: object

Upvotes: 2

Related Questions