user7970547

Reputation: 147

Data Cleanup: Eliminating values contained in csv file from result

I want my final data to exclude elements of the initial test data that I am trying to clean up. Copying and pasting the data into the code has been extremely tedious, and it gets more complicated as more criteria are added.

Original Values:

(1, 2, 3), (1, 2, 4),  (1, 2, 5), (1, 3, 4), (1, 3, 5)
(1, 4, 5), (2, 3, 4),  (2, 3, 5), (2, 4, 5), (3, 4, 5)

I want only the combinations that are not listed in Test.csv, which contains:

(1,2,3),   (2,3,4),     (3,4,5),

Expected Values

(1, 2, 4),
(1, 2, 5),
(1, 3, 4),
(1, 3, 5),
(1, 4, 5),
(2, 3, 5),
(2, 4, 5)

Code Attempt 1

a = [1,2,3,4,5]

import csv

with open('Test.csv', newline='') as myFile:  
    reader = csv.reader(myFile)
    list_a = list(reader)

combo_a = [(p,q,r) for p in a for q in a for r in a
                 if q > p and r > q and r > p
                 and (p,q,r) not in list_a]

print (combo_a)

Code Attempt 2

a = [1,2,3,4,5]

import csv

with open('Test.csv', newline='') as myFile:  
    reader = csv.reader(myFile)
    list_a = list(map(tuple, reader))

combo_a = [(p,q,r) for p in a for q in a for r in a
                 if q > p and r > q and r > p
                 and (p,q,r) not in list_a]

print (combo_a)

Both attempts output the same incorrect result (nothing is excluded):

(1, 2, 3),
(1, 2, 4),
(1, 2, 5),
(1, 3, 4),
(1, 3, 5),
(1, 4, 5),
(2, 3, 4),
(2, 3, 5),
(2, 4, 5),
(3, 4, 5),

Upvotes: 0

Views: 82

Answers (4)

C.Nivs

Reputation: 13106

Part of the problem is that you aren't actually processing ints. You are processing lists of strings, because the commas inside each tuple are treated as delimiters, just like the commas between the tuples:

from io import StringIO
import csv

c = """(1,2,3),   (2,3,4),     (3,4,5),"""

fh = StringIO(c, newline='')

reader = csv.reader(fh)
next(reader)

# ['(1', '2', '3)', '   (2', '3', '4)', '     (3', '4', '5)', '']

This is not a list of tuples, so to get it to be one:

import ast
from io import StringIO # this simulates your file handle

fh = StringIO(c, newline='')

# it's only one line, so call next(fh)
lst = ast.literal_eval(f"[{next(fh)}]")

# [(1, 2, 3), (2, 3, 4), (3, 4, 5)]

Here ast.literal_eval parses the text into native Python data structures. Translated into your code:

import ast

with open('Test.csv', newline='') as fh:
    list_a = ast.literal_eval(f"[{next(fh)}]")

Now list_a is a list of tuples of integers. Then you can just exclude ones in the list:

from itertools import combinations

checked = set()

for c in combinations(list(range(1,6)), 3):
    a = tuple(sorted(c))
    if a not in list_a and a not in checked:
        print(a)
        checked.add(a)
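
For reference, the pieces above can be combined into one small sketch. The helper name remaining_combinations is hypothetical, and the StringIO object stands in for the real file handle. It also assumes the file may have more than one line of tuples. A set is used for the exclusions, so no separate "checked" bookkeeping is needed, because combinations() never repeats a combination for distinct inputs.

```python
import ast
from io import StringIO  # simulates the file handle for Test.csv
from itertools import combinations


def remaining_combinations(fh, values, size=3):
    """Return the size-length combinations of values that are not listed in fh."""
    excluded = set()
    for line in fh:
        line = line.strip()
        if line:
            # Wrap the line in brackets so it parses as a list of tuples.
            excluded.update(ast.literal_eval(f"[{line}]"))
    return [c for c in combinations(sorted(values), size) if c not in excluded]


fh = StringIO("(1,2,3),   (2,3,4),     (3,4,5),\n")
result = remaining_combinations(fh, [1, 2, 3, 4, 5])
print(result)
# [(1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 4, 5), (2, 3, 5), (2, 4, 5)]
```

With the real file, the same function could be called inside a with open('Test.csv') block instead of using StringIO.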



Upvotes: 0

Andrej Kesely

Reputation: 195408

With the following contents in file.csv:

(1,2,3),   (2,3,4),     (3,4,5),

and using csv and ast.literal_eval:

a = [1,2,3,4,5]

import csv
from ast import literal_eval
from itertools import combinations

excluded = set()
with open('file.csv', newline='') as myFile:
    reader = csv.reader(myFile, delimiter=' ')
    for row in reader:
        l = list(map(literal_eval, [val for val in row if val]))
        excluded.update(tuple(i[0]) for i in l)

print(',\n'.join(map(str, sorted(set(combinations(a, 3)) - excluded))))

Prints:

(1, 2, 4),
(1, 2, 5),
(1, 3, 4),
(1, 3, 5),
(1, 4, 5),
(2, 3, 5),
(2, 4, 5)

Upvotes: 2

mujjiga

Reputation: 16856

It looks like your list_a is a list of tuples of strings, not integers. So if your

list_a = [('1', '2', '3'), ('2', '3', '4'), ('3', '4', '5')]

Then convert it into integers using

list_a = [tuple(map(int, i)) for i in list_a]

Once it is in the form of a list of integer tuples then you can proceed with your combo_a operation.
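
A minimal sketch of that conversion, under one loud assumption: the file holds one combination per row without parentheses (for example 1,2,3 on the first line). That is not the single-line layout shown in the question, where int() would fail on strings such as '(1'. The StringIO object is only a stand-in for the real file.

```python
import csv
from io import StringIO

a = [1, 2, 3, 4, 5]

# Assumed layout: one triple per row, no parentheses.
fh = StringIO("1,2,3\n2,3,4\n3,4,5\n", newline="")
raw_rows = [tuple(row) for row in csv.reader(fh) if row]

# csv.reader always yields strings, so an int tuple never matches.
print((1, 2, 3) in raw_rows)  # False

# Convert every field to int before filtering.
list_a = [tuple(map(int, i)) for i in raw_rows]

combo_a = [(p, q, r) for p in a for q in a for r in a
           if p < q < r and (p, q, r) not in list_a]
print(combo_a)
# [(1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 4, 5), (2, 3, 5), (2, 4, 5)]
```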

Upvotes: 1

Matthew Mulhall

Reputation: 53

So you are trying to filter specific values?

What I would do is keep a list of the values you don't want, then check whether each tuple you are filtering is in that list.

So load all values from Test.csv into your list.

dont_want = {(1, 2, 3), (2, 3, 4), (3, 4, 5)}  # example: the tuples loaded from Test.csv


combo_a = [(p,q,r) for p in a for q in a for r in a
                 if (p,q,r) not in dont_want]

Forgive me if I misinterpreted your problem but I think I know what you're asking.
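
One caveat, shown as a small sketch: the filter above only gives the expected result if the ordering condition from the question is kept, otherwise every ordered triple (including repeats such as (1, 1, 1)) passes through. The dont_want values below are hard-coded for illustration and are assumed to be already parsed into integer tuples.

```python
a = [1, 2, 3, 4, 5]

# Illustrative values; in practice these would be parsed from Test.csv.
dont_want = {(1, 2, 3), (2, 3, 4), (3, 4, 5)}

# Without the ordering condition: all 5 * 5 * 5 triples minus the 3 excluded.
unordered = [(p, q, r) for p in a for q in a for r in a
             if (p, q, r) not in dont_want]
print(len(unordered))  # 122

# With the ordering condition from the question (p < q < r).
ordered = [(p, q, r) for p in a for q in a for r in a
           if p < q < r and (p, q, r) not in dont_want]
print(ordered)
# [(1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 4, 5), (2, 3, 5), (2, 4, 5)]
```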

Upvotes: 0
