aloha

Reputation: 4784

Use For loop in an If statement

Problem

I have a table made of 380 rows and 20 columns. I want to remove rows from this table following a certain condition.

To clarify things, let's say I have the list:

names = ['John', 'Amy', 'Daniel']

I want to remove the data of all the people whose name is found in the list names.

For example, suppose my data looks like this:

John    82    3.12    boy
Katy    12    1.12    girl
Amy     42    2.45    girl
Robert  32    1.56    boy
Daniel  47    2.10    boy

I want to remove the data of John, Amy, and Daniel. So the output should be:

Katy    12    1.12    girl
Robert  32    1.56    boy

Attempt to solve it

import csv
import numpy as np

# loading data
data = np.genfromtxt('file.txt', dtype = None)

csvfile = "home/paula/Desktop/test.txt"
with open(csvfile, 'w') as output:
    writer = csv.writer(output, delimiter = '\t')

    for row in range(len(data)):
        if data[row][0] == (i for i in names):
            print 'removing the data of', i, '...'
        else:
            writer.writerow([data[row][0], data[row][1], 
                             data[row][2], data[row][3]])

My code runs without errors, but the rows are not removed. When I open the new test.txt file, the data for John, Amy, and Daniel is still there.

I am fairly sure the bug is in the line if data[row][0] == (i for i in names): but I don't know how to fix it. How can I fix this?

Upvotes: 0

Views: 131

Answers (3)

YXD

Reputation: 32521

The condition should be written:

if data[row][0] in names:

In your current code, (i for i in names) creates a generator object, and you are then testing whether the string is equal to that generator object, which is always False:

>>> (i for i in names)
<generator object <genexpr> at 0x1060564b0>
>>> 'John' == (i for i in names)
False
>>>

Instead, you can test whether an item is in a list as follows:

>>> names = ['John', 'Amy', 'Daniel']
>>> 'John' in names
True
>>> 'Bob' in names
False
>>>

As mentioned in the comments, you can make this check more efficient by converting names to a set before iterating over the rows. Ideally, though, you would use the pandas library to manipulate CSV/table data. See this answer for a similar example. You can negate the condition with df[~df.Name.isin(...)].
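A minimal sketch of the pandas approach mentioned above, assuming the file is whitespace-separated with no header row. The column names (Name, Age, Score, Gender) and the in-memory sample standing in for file.txt are assumptions made for illustration; the question does not name its columns.

```python
import io

import pandas as pd

names = ['John', 'Amy', 'Daniel']

# In-memory stand-in for file.txt (assumed whitespace-separated, no header).
raw = io.StringIO(
    "John    82    3.12    boy\n"
    "Katy    12    1.12    girl\n"
    "Amy     42    2.45    girl\n"
    "Robert  32    1.56    boy\n"
    "Daniel  47    2.10    boy\n"
)

# The column names are hypothetical labels, not taken from the question.
df = pd.read_csv(raw, sep=r"\s+", header=None,
                 names=["Name", "Age", "Score", "Gender"])

# isin() marks the rows whose Name is in the list; ~ negates the mask,
# so only the rows NOT in names are kept.
kept = df[~df["Name"].isin(names)]

# Write the remaining rows as tab-separated text (here to a string).
print(kept.to_csv(sep="\t", header=False, index=False))
```

To write to disk instead, pass a file path as the first argument to to_csv.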

Upvotes: 4

Sakib Ahammed

Reputation: 2480

if data[row][0] == (i for i in names):
    print 'removing the data of', i, '...'

In this snippet, i is used inside (i for i in names) as a variable local to the generator expression. The print line that follows also uses i, but i is not available there.
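The scoping point can be checked with a small sketch. It is written in Python 3 syntax (the question's code uses the Python 2 print statement), and the function name generator_variable_leaks is a hypothetical helper introduced only for this illustration.

```python
def generator_variable_leaks():
    """Report whether the loop variable of a generator expression
    is visible in the enclosing function's scope."""
    names = ['John', 'Amy', 'Daniel']
    matches = (i for i in names)  # i lives only inside the generator
    # locals() lists the names bound in this function: names and matches.
    return 'i' in locals()


leaked = generator_variable_leaks()
print(leaked)
```

Because i is never bound outside the generator expression, the message should be built from the row's own value, such as data[row][0], rather than from i.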

Instead, you can check membership with if data[row][0] in names: and print the row's own name rather than i. You can try something like:

if data[row][0] in names:
    print 'removing the data of', data[row][0], '...'

Upvotes: 0

KSFT

Reputation: 1774

You're checking whether data[row][0] is the same as (i for i in names). What you want to do is check whether it's the same as one of the elements of (i for i in names). You could do that this way:

any([data[row][0]==i for i in names])

You could also do it the non-ridiculous way, with the in operator:

data[row][0] in names

This checks whether any of the elements of names is the same as data[row][0].
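Putting the in check back into the filtering loop might look like the sketch below. It assumes Python 3, reads the sample rows from a string with str.split rather than np.genfromtxt, and writes to an in-memory buffer instead of test.txt. The helper name filter_rows is hypothetical, and the set conversion is an optional optimization rather than a requirement.

```python
import csv
import io


def filter_rows(rows, names):
    """Return the rows whose first field is not in names."""
    excluded = set(names)  # set membership is O(1) on average
    return [row for row in rows if row[0] not in excluded]


names = ['John', 'Amy', 'Daniel']

# Sample data from the question, standing in for file.txt.
sample = """John    82    3.12    boy
Katy    12    1.12    girl
Amy     42    2.45    girl
Robert  32    1.56    boy
Daniel  47    2.10    boy"""

rows = [line.split() for line in sample.splitlines()]
kept = filter_rows(rows, names)

# Write the surviving rows as tab-separated text.
output = io.StringIO()
writer = csv.writer(output, delimiter='\t', lineterminator='\n')
writer.writerows(kept)
print(output.getvalue())
```

To write to a real file, replace the StringIO buffer with open(path, 'w', newline='').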

Upvotes: 0
