Reputation: 55
How can I delete rows with missing values in my CSV file, whether the missing value falls in column a, b, c, ...? This is my code:
import numpy as np

FNAME = "C:/Users/lenovo/Desktop/table.csv"
my_data = np.genfromtxt(FNAME, delimiter=',')
a = my_data[:, 0]
b = my_data[:, 1]
c = my_data[:, 2]
d = my_data[:, 3]
e = my_data[:, 4]
f = my_data[:, 5]
g = my_data[:, 6]
An extract of my CSV file:
0,1,135,3,82,4,1
0,1,98,5,82,3,1
21175,1,98,5,82,3,1
9147,2,80,5,82,2,2
1829,2,80,5,82,2,2
3659,2,80,5,82,2,2
10976,2,80,5,82,2,2
0,2,40,2,24,1,2
0,2,40,2,24,1,2
29710,2,40,2,24,1,2
0,1,90,3,31,2,2
0,1,90,3,31,2,2
11434,1,90,3,31,2,2
0,2,85,4,72,3,2
6039,2,85,4,72,3,2
34758,1,100,,52,,
0,1,100,,52,,
Thanks
Upvotes: 0
Views: 3865
Reputation: 7070
Pandas has a built-in method for this:
from pandas import read_csv

FNAME = "C:/Users/lenovo/Desktop/table.csv"
df = read_csv(FNAME, header=None, index_col=None)
print(df.dropna())
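If you only want to drop a row when particular columns are missing, dropna also takes a subset argument (with header=None the column labels are the integer positions). A small sketch using an inline sample in place of the CSV path:

```python
from io import StringIO

import pandas as pd

# Inline sample standing in for table.csv; the last two rows have blanks
csv_text = "0,1,135,3,82,4,1\n34758,1,100,,52,,\n0,1,100,,52,,\n"
df = pd.read_csv(StringIO(csv_text), header=None, index_col=None)

print(df.dropna())             # drops every row containing any NaN
print(df.dropna(subset=[4]))   # drops only rows where column 4 is NaN (none here)
```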
Upvotes: 1
Reputation: 12234
It isn't clear from your question whether you want to delete just the missing values from individual columns (which sounds wrong to me) or whole rows across all the columns. Either way, it is better to use the power of genfromtxt. I recommend you read this marvellous guide or just the docs.
There you will find an argument, missing_values, with which you can specify how such occurrences are handled on import. There are many different ways to do this, but one example uses the fact that genfromtxt replaces missing floats with nan. Checking each row for nan and disregarding it if found:
import numpy as np
from io import StringIO

data = """
0,4,1
34758,1,100
52,,
"""
my_data = np.genfromtxt(StringIO(data), delimiter=",")
index_to_use = []
for i, row in enumerate(my_data):
    if not np.isnan(row).any():
        index_to_use.append(i)
print(my_data[index_to_use])
>>>
[[ 0.00000000e+00 4.00000000e+00 1.00000000e+00]
[ 3.47580000e+04 1.00000000e+00 1.00000000e+02]]
For readability I have reduced your data sample.
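The explicit loop can also be collapsed into a single boolean-mask expression, which is the more idiomatic NumPy style. A sketch using the same reduced data sample as above:

```python
import numpy as np
from io import StringIO

data = """
0,4,1
34758,1,100
52,,
"""
my_data = np.genfromtxt(StringIO(data), delimiter=",")

# np.isnan(my_data).any(axis=1) is True for rows containing any NaN;
# negating it keeps only the complete rows
clean = my_data[~np.isnan(my_data).any(axis=1)]
print(clean)
```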
Upvotes: 0