Reputation: 48916
I have the following list:
[[1.01782362e-05 1.93798303e-04 7.96163586e-05 5.08812627e-06
1.39600188e-05 3.94912873e-04 2.33748418e-04 1.22856018e-05]]
When I return its type, I get:
<class 'str'>
Is that because of the scientific notation (e.g. e-04)?
If so, how can I convert the above list to integers or floats?
Thanks.
EDIT
The above list snippet comes from this CSV file under the "Feature" column.
Upvotes: 0
Views: 89
Reputation: 16184
It looks a lot like you have NumPy's string representation of an array. As I linked above, there doesn't seem to be a nice way of parsing this back, but in your case it might not matter, because Pandas and NumPy can get there reasonably easily:
import pandas as pd
import numpy as np
# read in the data
df = pd.read_csv("features_thresholds.csv")
# use numpy to parse that column
df.Feature = df.Feature.apply(lambda x: np.fromstring(x[2:-2], sep=' '))
Note that x[2:-2] trims off the leading [[ and trailing ]]; otherwise it's mostly standard Pandas usage that most data science tutorials will go through.
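For anyone without the CSV file at hand, the same trimming idea can be tried with the standard library alone. This is only a sketch, not the Pandas/NumPy approach above: the in-memory CSV is a hypothetical stand-in (only the "Feature" column name comes from the question, and the values are a shortened sample), and plain float() replaces np.fromstring.

```python
import csv
import io

# Hypothetical stand-in for the CSV file: one "Feature" cell holding the
# printed form of an array, wrapped over two lines as in the question.
sample = io.StringIO(
    'Feature\n'
    '"[[1.01782362e-05 1.93798303e-04\n 7.96163586e-05]]"\n'
)

rows = list(csv.DictReader(sample))
cell = rows[0]["Feature"]
print(type(cell))  # <class 'str'>: CSV cells are always read as text

# Trim the leading "[[" and trailing "]]", then split on any whitespace
# (spaces or newlines) and convert each token to a float.
features = [float(token) for token in cell[2:-2].split()]
print(features)
```

The cell comes back as a str regardless of how the numbers are written, which is why the conversion step is needed at all.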
Upvotes: 1
Reputation: 702
We can use Python's ast (Abstract Syntax Trees) module to parse it:
import ast
x = '[[1.01782362e-05 1.93798303e-04 7.96163586e-05 5.08812627e-06 1.39600188e-05 3.94912873e-04 2.33748418e-04 1.22856018e-05]]'
# join on any whitespace so repeated spaces or line breaks don't produce empty items
x = ast.literal_eval(",".join(x.split()))
print(x)
Upvotes: 0
Reputation: 51998
What you posted must be part of a string literal:
s = '[[1.01782362e-05 1.93798303e-04 7.96163586e-05 5.08812627e-06 1.39600188e-05 3.94912873e-04 2.33748418e-04 1.22856018e-05]]'
In which case
list(map(float, s.lstrip('[').rstrip(']').split()))
evaluates to
[1.01782362e-05, 0.000193798303, 7.96163586e-05, 5.08812627e-06, 1.39600188e-05, 0.000394912873, 0.000233748418, 1.22856018e-05]
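As a quick check that the scientific notation is not what makes the value a string, here is a small sketch using only built-ins and a shortened sample of the data. The variable names are illustrative.

```python
# Shortened sample of the text from the question.
s = '[[1.01782362e-05 1.93798303e-04]]'

# Strip the brackets, split on whitespace, convert each token.
values = list(map(float, s.lstrip('[').rstrip(']').split()))

print(type(s))          # <class 'str'>
print(type(values[0]))  # <class 'float'>

# float() accepts the e-04 form directly; both spellings give the same number.
print(values[1] == 0.000193798303)  # True
```

So float() parses the e-notation without trouble; the type is str only because the value was read in as text.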
Upvotes: 0