dumbledad

Reputation: 17527

Converting string containing nan to a numpy float array

I have strings like these:

'[15, 8.0, 5.0, 5.0, nan, nan, nan, nan, nan, nan]'
'[8, 6.0, nan, nan, nan, nan, nan, nan, nan, nan]'

I would like to convert them to NumPy arrays of floats. Is there an easy conversion function?

I have tried json.loads, but that fails at the nan values, even if I string-replace them with np.nan. I have also tried stripping off the brackets and using numpy's fromstring, but that fails too.

Upvotes: 1

Views: 735

Answers (1)

Jussi Nurminen

Reputation: 2408

Quick and dirty solution:

from numpy import nan, array
a = '[15, 8.0, 5.0, 5.0, nan, nan, nan, nan, nan, nan]'
arr = array(eval(a))

This brings nan into the namespace and then evaluates a as a Python expression. The result is a list that can be readily converted into a numpy array.

Be aware that using eval is risky if your strings come from an untrusted source (such as user input), since it may lead to arbitrary code execution. A safer version would be:

import ast
import numpy as np

def convert(s):
    # replace the nan values so that ast.literal_eval() can interpret them
    s = s.replace('nan', 'None')
    # safely evaluate s as a Python literal
    values = ast.literal_eval(s)
    # replace the Nones back with np.nan
    values = [np.nan if x is None else x for x in values]
    return np.array(values, dtype=float)

Then:

a = '[15, 8.0, 5.0, 5.0, nan, nan, nan, nan, nan, nan]'
convert(a)

returns

array([15.,  8.,  5.,  5., nan, nan, nan, nan, nan, nan])
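As for the json.loads attempt mentioned in the question: Python's json module rejects the lowercase token nan, but by default it does accept the capitalised NaN. So another option, sketched below, is to rewrite the token before parsing. The function name convert_json is just an illustrative choice, and the sketch assumes nan only ever appears as a standalone value in your strings.

```python
import json
import re

import numpy as np


def convert_json(s):
    # json.loads understands NaN (capitalised) but not nan,
    # so rewrite standalone nan tokens first
    s = re.sub(r'\bnan\b', 'NaN', s)
    # dtype=float makes the result a float array even if every value is an int
    return np.array(json.loads(s), dtype=float)


arr = convert_json('[8, 6.0, nan, nan]')
```

Like ast.literal_eval, this never executes the string as code, so it is also reasonable for untrusted input.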

Upvotes: 1
