Reputation: 5569
I have the following parameters:
import itertools
import pandas as pd

param_grid = dict(par1 = [0.1, 1.1, 1.2],
                  par2 = [3, 4, 5],
                  par3 = [6, 7, 8])
I would like to create a table with all the possible combinations of parameters. I tried the following code:
hyperParamSpace = pd.DataFrame([row for row in itertools.product(*param_grid.values())],
                               columns=param_grid.keys())
When I take a single combination with
hyperParamSpace.iloc[1]
it converts all the parameters to floats:
par3 6.0
par2 3.0
par1 1.1
Name: 1, dtype: float64
How can I keep the integers as integer type?
Upvotes: 1
Views: 607
Reputation: 4290
This happens because each column of a pandas DataFrame is essentially a numpy array. All elements of an array must have the same type; otherwise it loses most of its computational advantages. Therefore, if one of the elements in a column is a float, all of the elements in that column are automatically converted to floats.
You can control the dtype of the array, and by extension of the DataFrame, manually and set it to int, but in that case you will lose the fractional part of your floats.
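As a rough illustration of what forcing an integer dtype costs, the sketch below rebuilds the table from the question and casts it with astype(int). The name truncated is made up for this example, and the cast is only one possible way of setting the dtype.

```python
import itertools
import pandas as pd

param_grid = dict(par1=[0.1, 1.1, 1.2], par2=[3, 4, 5], par3=[6, 7, 8])
hyperParamSpace = pd.DataFrame(list(itertools.product(*param_grid.values())),
                               columns=list(param_grid.keys()))

# Casting every column to int drops the fractional part of par1
# (0.1 becomes 0, while 1.1 and 1.2 both become 1).
truncated = hyperParamSpace.astype(int)
print(truncated["par1"].unique())
```

The integer columns survive the cast unchanged, but the par1 values are no longer usable as hyperparameters.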
However, in your example the columns that contain ints actually have dtype int64 (you can verify it by running hyperParamSpace.par2.dtype). It is only when you select a row with iloc that they are converted to floats in the output, by the same principle: the row is returned as a single array, in which all elements must have the same type.
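The difference between column dtypes and the dtype of a selected row can be checked directly. This is a small sketch using the data from the question; it assumes a Python version where dicts keep insertion order (3.7+), so the column order differs from the output shown in the question.

```python
import itertools
import pandas as pd

param_grid = dict(par1=[0.1, 1.1, 1.2], par2=[3, 4, 5], par3=[6, 7, 8])
hyperParamSpace = pd.DataFrame(list(itertools.product(*param_grid.values())),
                               columns=list(param_grid.keys()))

# Each column keeps its own dtype: par1 is float, par2 and par3 are integer.
print(hyperParamSpace.dtypes)

# A single row mixes floats and ints, so pandas upcasts it to one common dtype.
row = hyperParamSpace.iloc[1]
print(row.dtype)
```

So the stored data is still integer; only the row view is upcast.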
What you can do to avoid the conversion is to specify the dtype of your DataFrame as object:
hyperParamSpace = pd.DataFrame([row for row in itertools.product(*param_grid.values())],
                               columns=param_grid.keys(), dtype=object)
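This makes the DataFrame considerably less efficient, because object columns hold generic Python objects and lose numpy's fast vectorized operations, but since your parameter table is small, it shouldn't be a problem.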
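A quick way to confirm the effect is to inspect a row of the object-typed table. The sketch below again assumes Python 3.7+ dict ordering. The itertuples variant at the end is not part of the suggestion above; it is an alternative that reads rows from an ordinary, non-object DataFrame. The names typed_space and first are made up for this example.

```python
import itertools
import pandas as pd

param_grid = dict(par1=[0.1, 1.1, 1.2], par2=[3, 4, 5], par3=[6, 7, 8])
combos = list(itertools.product(*param_grid.values()))

# With dtype=object nothing is upcast, so a row keeps ints and floats apart.
hyperParamSpace = pd.DataFrame(combos, columns=list(param_grid.keys()), dtype=object)
row = hyperParamSpace.iloc[1]
print(row.dtype, [type(v).__name__ for v in row])

# Alternative sketch: keep the default dtypes and read rows with itertuples,
# which takes each value from its own column instead of building a row array.
typed_space = pd.DataFrame(combos, columns=list(param_grid.keys()))
first = next(typed_space.itertuples(index=False))
print(first)
```

Either approach hands integer values to whatever consumes the parameters; the second keeps the efficient column dtypes.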
Upvotes: 4