Reputation: 711
I am new to Python and want to determine the type of each column in a data frame. I wrote the code below, but the results are not as expected: I only get 'object' as the type.
This is my data frame (just the first few rows):
IDINDUSANALYSE IDINDUS IDINDUSEFFLUENT DATEANALYSE IDTYPEECHANTILLON IDPRELEVEUR IDLABO IDORIGINEVAL CONFORME CONFCALC IDINDDOSS CONFFORCE
672 635 6740 10/01/13 2 1 3 1 1 1 531 0
673 635 6740 11/01/13 2 1 3 1 1 1 531 0
674 635 6740 14/01/13 2 1 3 1 1 1 531 0
675 635 6740 15/01/13 2 1 3 1 1 1 531 0
676 635 6740 16/01/13 2 1 3 1 1 1 531 0
677 635 6740 18/01/13 2 1 3 1 1 1 531 0
This is my code:
import pandas as pd
import csv
with open("/home/***/Documents/Table3.csv") as f:
    r = csv.reader(f)
    df = pd.DataFrame().from_records(r)
for index, row in df.iterrows():
    print(df.dtypes)
As a result I get this:
0 object
1 object
2 object
3 object
4 object
Please tell me what I did wrong.
Upvotes: 0
Views: 1312
Reputation: 33940
Please show your actual CSV file. If all columns were stored as object, they were probably detected as strings, perhaps because your CSV file quotes each field.
To read in quoted fields in pandas and convert them back to their type (numeric/categorical), do either of the following (the QUOTE_* constants live in the standard csv module, so import csv first):
pd.read_csv(..., quoting=csv.QUOTE_ALL)
pd.read_csv(..., quoting=csv.QUOTE_NONNUMERIC)
and read the section 'quoting' in https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
It is also good practice to explicitly pass pd.read_csv(..., dtype={...}), a dictionary telling it which type to use for each column name, e.g. {'a': np.float64, 'b': np.int32}
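As a rough sketch of the dtype suggestion above: the sample text below is a hypothetical comma-separated excerpt of the question's table (the real file's delimiter and quoting are not shown), and the chosen integer widths are only illustrative. The date column is converted separately with an explicit day-first format, since values such as 14/01/13 can only be day/month/year.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical excerpt of the table from the question, assumed comma-separated.
csv_text = (
    "IDINDUSANALYSE,IDINDUS,DATEANALYSE,CONFORME\n"
    "672,635,10/01/13,1\n"
    "673,635,11/01/13,1\n"
    "674,635,14/01/13,1\n"
)

# Tell read_csv which type to use for each named column.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"IDINDUSANALYSE": np.int64, "IDINDUS": np.int32, "CONFORME": np.int8},
)

# Dates are not covered by the dtype mapping here; parse them explicitly.
df["DATEANALYSE"] = pd.to_datetime(df["DATEANALYSE"], format="%d/%m/%y")

print(df.dtypes)
```

Columns not listed in the dictionary are still inferred by read_csv, so the mapping can cover only the columns you care about.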
Upvotes: 0
Reputation: 165
Try this:
import pandas as pd
df = pd.read_csv("/home/***/Documents/Table3.csv")
types = [df['{0}'.format(i)].dtype for i in df.columns]
print(types)
which results in:
[dtype('float64'), dtype('O'), dtype('O')]
Note that your actual data frame has 4 columns, yet you got object as the result 5 times; that should have been your first hint.
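To illustrate the difference between the two loading approaches, here is a small comparison. The sample text is a hypothetical comma-separated excerpt, not the asker's real file. csv.reader hands back every field as a string (including the header row), so the frame built from it has no numeric columns, whereas read_csv infers a type per column. Depending on the pandas version, string columns are reported as object or as a dedicated string dtype.

```python
import csv
import io

import pandas as pd

# Hypothetical comma-separated sample; the real file's layout is not shown.
csv_text = "IDINDUS,DATEANALYSE,CONFORME\n635,10/01/13,1\n635,11/01/13,1\n"

# csv.reader yields each row as a list of str, header row included.
rows = list(csv.reader(io.StringIO(csv_text)))
raw = pd.DataFrame.from_records(rows)
print(raw.dtypes)     # column labels are 0, 1, 2 and nothing is numeric

# read_csv treats the first row as the header and infers column types.
parsed = pd.read_csv(io.StringIO(csv_text))
print(parsed.dtypes)  # IDINDUS and CONFORME are inferred as integers
```

Printing df.dtypes once is enough; the loop over iterrows() in the question prints the same summary once per row.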
Upvotes: 1
Reputation: 1202
types = df.columns.to_series().groupby(df.dtypes).groups
Then print out types, and you will get all of the column types (grouped by type).
Also, you can read the .csv file directly into a data frame using pd.read_csv(filepath).
If you want a specific column's type, use df.column.dtype or df['column'].dtype.
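A minimal sketch of the grouping idea on a made-up frame (the column names a, b, c are placeholders). This variant converts the dtypes to strings before grouping, which is an assumption on my part to keep the group keys plain and easy to look up; the original one-liner groups by the dtype objects directly.

```python
import numpy as np
import pandas as pd

# Made-up frame with one integer, one float and one text column.
df = pd.DataFrame(
    {
        "a": np.array([1, 2], dtype="int64"),
        "b": np.array([0.5, 1.5], dtype="float64"),
        "c": ["x", "y"],
    }
)

# Group the column names by the string form of each column's dtype.
by_type = df.columns.to_series().groupby(df.dtypes.astype(str)).groups
types = {name: list(cols) for name, cols in by_type.items()}
print(types)

# Type of a single column, by key or by attribute access.
print(df["a"].dtype)
print(df.b.dtype)
```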
Upvotes: 1