salma

Reputation: 55

How can I delete missing values in my CSV file?

How can I delete the missing values in my CSV file, whichever of the columns a, b, c, ... they occur in? This is my code:

import numpy as np

FNAME = "C:/Users/lenovo/Desktop/table.csv"

my_data = np.genfromtxt(FNAME, delimiter=',')
a = my_data[:, 0]
b = my_data[:, 1]
c = my_data[:, 2]
d = my_data[:, 3]
e = my_data[:, 4]
f = my_data[:, 5]
g = my_data[:, 6]

An extract of my csv_file:

0,1,135,3,82,4,1
0,1,98,5,82,3,1
21175,1,98,5,82,3,1
9147,2,80,5,82,2,2
1829,2,80,5,82,2,2
3659,2,80,5,82,2,2
10976,2,80,5,82,2,2
0,2,40,2,24,1,2
0,2,40,2,24,1,2
29710,2,40,2,24,1,2
0,1,90,3,31,2,2
0,1,90,3,31,2,2
11434,1,90,3,31,2,2
0,2,85,4,72,3,2
6039,2,85,4,72,3,2
34758,1,100,,52,,
0,1,100,,52,,

Thanks

Upvotes: 0

Views: 3865

Answers (2)

Felix Zumstein

Reputation: 7070

Pandas has a built-in method for this:

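from pandas import read_csv

FNAME = "C:/Users/lenovo/Desktop/table.csv"
# header=None: the file has no header row, so columns are numbered 0, 1, 2, ...
df = read_csv(FNAME, header=None, index_col=None)
# dropna() returns a copy without the rows that contain a missing value
print(df.dropna())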
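
If you still need the separate column arrays from the question afterwards, something along these lines should work. This is only a sketch: the in-memory sample stands in for table.csv, and the names sample, clean and partial are made up for illustration. The subset argument of dropna restricts the check to the listed columns, in case only some of them matter.

```python
from io import StringIO

import pandas as pd

# Hypothetical in-memory sample standing in for table.csv
sample = StringIO(
    "0,1,135,3,82,4,1\n"
    "6039,2,85,4,72,3,2\n"
    "34758,1,100,,52,,\n"
    "0,1,100,,52,,\n"
)

df = pd.read_csv(sample, header=None)

# Drop every row that has at least one missing value
clean = df.dropna()

# Only require columns 0, 1 and 2 to be present; gaps elsewhere are kept
partial = df.dropna(subset=[0, 1, 2])

# Pull individual columns back out as NumPy arrays, as in the question
a = clean[0].to_numpy()
b = clean[1].to_numpy()

print(clean)
print(len(partial))
```

Note that dropna does not modify df; it returns a new DataFrame, so keep the result.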

Upvotes: 1

Greg

Reputation: 12234

It isn't clear from your question whether you want to delete only the missing values from the individual columns (which sounds wrong to me) or to drop the affected rows from all the columns. Either way, it is better to use the power of genfromtxt. I recommend you read this marvellous guide or just the docs.

In there you will find the missing_values argument, which lets you specify how such occurrences are handled when the file is imported. There are many different ways to do this, but one example is to use the fact that genfromtxt replaces missing floats with nan: check each row for nan and discard the row if any is found:

import numpy as np
from io import StringIO  # on Python 2: from StringIO import StringIO

data = """
0,4,1
34758,1,100
52,,
"""

# Empty fields are read as nan
my_data = np.genfromtxt(StringIO(data), delimiter=",")

# Collect the indices of the rows that contain no nan
index_to_use = []
for i, row in enumerate(my_data):
    if not np.isnan(row).any():
        index_to_use.append(i)

print(my_data[index_to_use])

>>>
[[  0.00000000e+00   4.00000000e+00   1.00000000e+00]
 [  3.47580000e+04   1.00000000e+00   1.00000000e+02]]

For readability I have reduced your data sample.
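
The same row filter can also be written without the explicit loop, using a boolean mask. This is a sketch on the same reduced sample; the names keep and clean are made up for illustration.

```python
from io import StringIO

import numpy as np

# Same reduced sample as above; the last row has two empty fields
data = StringIO("0,4,1\n34758,1,100\n52,,\n")
my_data = np.genfromtxt(data, delimiter=",")

# True for every row that contains no nan
keep = ~np.isnan(my_data).any(axis=1)
clean = my_data[keep]

# Unpack the cleaned columns, as in the question
a, b, c = clean.T

print(clean)
```

Filtering the whole array before slicing out a, b, c, ... keeps the columns the same length and aligned row by row, which is not the case if you remove the nan values from each column separately.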

Upvotes: 0
