Python to evaluate duplicate elements in csv files

Question

I have 2 csv files:

csv 1:

CHANNEL
3
3
4
1
2
1
4
5

csv 2:

CHANNEL
1
2
2
3
4
4
4
5

I want to evaluate the state of the channel by finding the duplicate channels. If number of channel > 1, the state is 0, else the state is 1.

output csv:

index  channel 1  channel 2  channel 3  channel 4  channel 5
  1        0          1         0          0           0
  2        1          0         1          0           1

So far I have counted the duplicate channels but for 1 file only. Now I don't know how to read 2 csv files and create the output file.

import csv
import collections

with open("csvfile.csv") as f:
    csv_data = csv.reader(f,delimiter=",")
    next(csv_data)
    count = collections.Counter()
    for row in csv_data:
        channel = row[0]
        count[channel] += 1
    for channel, nb in count.items():
        if nb>1:

Mike67 · Accepted Answer

You can read each file into a list then check the channel counts of each list.

Try this code:

ss1 = '''
CHANNEL
3
3
4
1
2
1
4
5
'''.strip()

ss2 = '''
CHANNEL
1
2
2
3
4
4
4
5
'''.strip()


with open("csvfile1.csv",'w') as f: f.write(ss1)  # write test file 1
with open("csvfile2.csv",'w') as f: f.write(ss2)  # write test file 2

#############################

with open("csvfile1.csv") as f:
   lines1 = f.readlines()[1:]  # skip header
   lines1 = [int(x) for x in lines1] # convert to ints
   
with open("csvfile2.csv") as f:
   lines2 = f.readlines()[1:]  # skip header
   lines2 = [int(x) for x in lines2] # convert to ints

lines = [lines1,lines2] # make list for iteration

state = [[0]*5,[0]*5]  # default zero for each state

for ci in [0,1]: # each file 
   for ch in range(5):  # each channel
      state[ci][ch] = 0 if lines[ci].count(ch+1) > 1 else 1 # check channel count, set state

# write to terminal
print('Index','Channel 1','Channel 2','Channel 3','Channel 4','Channel 5', sep = '  ')
print('  ',1,'     ','          '.join(str(c) for c in state[0]))
print('  ',2,'     ','          '.join(str(c) for c in state[1]))
    
# write to csv
with open('state.csv','w') as f:
   f.write('Index,Channel 1,Channel 2,Channel 3,Channel 4,Channel 5
')
   f.write('1,' + ','.join(str(c) for c in state[0]) + '
')
   f.write('2,' + ','.join(str(c) for c in state[1]) + '
')

Output (terminal)

Index  Channel 1  Channel 2  Channel 3  Channel 4  Channel 5
   1       0          1          0          0          1
   2       1          0          1          0          1

Output (state.csv)

Index,Channel 1,Channel 2,Channel 3,Channel 4,Channel 5
1,0,1,0,0,1
2,1,0,1,0,1

Python to evaluate duplicate elements in csv files

Answers (2)

Related Questions