Zed

Reputation: 113

Using numpy genfromtxt to load data with special characters as missing values

I am trying to load data in which "#" marks missing values.


I am using genfromtxt

genfromtxt("data.txt",missing_values="#",filling_values=0)

but I keep getting the following error

    raise ValueError(errmsg)
ValueError: Some errors were detected !
    Line #25 (got 1 columns instead of 5)

As the code above shows, I have tried to fill the missing values, but I still get this error.

Upvotes: 1

Views: 669

Answers (2)

Warren Weckesser

Reputation: 114956

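The default comments argument of genfromtxt is "#", so everything from the first # to the end of a line is discarded as a comment. That is why the affected line appears to have too few columns. Change that argument (for example, comments=None). Also, make filling_values the string "0".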

For example,

In [30]: !cat data_missing.txt 
1.23e+00 4.56e+01 2.00e+00
7.89e+01 #.##e+## 3.00e+00

In [31]: a = np.genfromtxt("data_missing.txt",
   ....:                   comments=None,
   ....:                   missing_values="#.##e+##",
   ....:                   filling_values="0")

In [32]: a
Out[32]: 
array([[  1.23,  45.6 ,   2.  ],
       [ 78.9 ,   0.  ,   3.  ]])
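
The same behaviour can be reproduced without a file on disk. The following is only a minimal sketch: the variable names (text, default_failed, filled) are made up for illustration, the data is the two-row sample above held in a StringIO, and the fill value is passed as the number 0, as in the question, rather than the string "0". It first shows that the default comments="#" truncates the second row and raises the column-count error, then that comments=None lets the placeholder through so it can be filled.

```python
from io import StringIO

import numpy as np

# Same two rows as data_missing.txt above, kept in memory (illustrative only).
text = "1.23e+00 4.56e+01 2.00e+00\n7.89e+01 #.##e+## 3.00e+00\n"

# With the default comments="#", row 2 is cut at the first "#",
# so it has 1 column instead of 3 and genfromtxt raises ValueError.
try:
    np.genfromtxt(StringIO(text), missing_values="#.##e+##", filling_values=0)
    default_failed = False
except ValueError as err:
    default_failed = True
    print("default comments failed:", err)

# With comments=None the "#" characters are kept, the placeholder is
# recognised as missing, and the fill value is substituted for it.
filled = np.genfromtxt(
    StringIO(text),
    comments=None,
    missing_values="#.##e+##",
    filling_values=0,
)
print(filled)
```

If the real file also contains genuine comment lines, comments=None will no longer skip them, so a different comment marker would have to be chosen in that case.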

Upvotes: 1

sashkello

Reputation: 17871

You need to specify the column separator (I assume that in your case it is a tab):

genfromtxt("data.txt", delimiter="\t")

Also, the missing-value marker has to be the full placeholder as it appears in the file, not just "#": missing_values="#.########e+##"
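
As a rough sketch of how these two suggestions fit together, assuming the file really is tab-separated and uses the placeholder "#.########e+##" (the sample rows and the names tab_text and tab_filled are invented for illustration): note that comments=None is still passed here, since otherwise the "#" in the placeholder would be treated as the start of a comment.

```python
from io import StringIO

import numpy as np

# Hypothetical tab-separated sample with one missing entry in row 2.
tab_text = "1.23e+00\t4.56e+01\t2.00e+00\n7.89e+01\t#.########e+##\t3.00e+00\n"

tab_filled = np.genfromtxt(
    StringIO(tab_text),
    delimiter="\t",                    # split columns on tabs only
    comments=None,                     # do not treat "#" as a comment marker
    missing_values="#.########e+##",   # full placeholder text from the file
    filling_values=0,                  # value substituted for missing entries
)
print(tab_filled)
```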

Upvotes: 0
