Reputation: 5344
I have two lists. The first contains names, and the second contains names with corresponding values. The names in the first list are a subset of the names in the second list. Each value is either true or false. I want to find the names that occur in both lists and count the true values. My code:
def return_indices_of_a(a, b):
    # Return the indices of the items in a that also occur in b (the co-occurrences)
    b_set = set(b)
    return [i for i, v in enumerate(a) if v in b_set]

data1 = [line.strip() for line in open("text_files/first_list.txt", 'r')]
ins = open("text_files/second_list.txt", "r")  # the "r" is not really needed - default
parseTable = []
for line in ins:
    row = line.rstrip().split(' ')  # <- note use of rstrip()
    parseTable.append(row)
new_data = []
indexes = []
for index in range(len(parseTable)):
    new_data.append(parseTable[index][0])  # names from the second list
    indexes.append(parseTable[index][1])   # 'true' / 'false' values
# The function must be defined before this call, so it was moved to the top
in1 = return_indices_of_a(new_data, data1)
I read both text files containing the lists and find the co-occurrences. Then, from parseTable[][1], I want to keep only the values at the in1 indices. Am I doing it right? How can I keep the indices I want? My two lists:
['SITNC', 'porkpackerpete', 'teensHijab', '1DAlert', 'IsmodoFashion',....
[['SITNC', 'true'], ['1DFAMlLY', 'false'], ['tibi', 'true'], ['1Dneews', 'false'], ....
Upvotes: 1
Views: 703
Reputation: 8441
Here's a one-liner to get the matches:
matches = [(name, dict(values)[name]) for name in set(names) if name in dict(values)]
and then to get the number of true matches:
len([name for (name, value) in matches if value == 'true'])
Edit
You might want to move dict(values) into a named variable, so the dictionary is built once instead of being rebuilt for every name:
value_map = dict(values)
matches = [(name, value_map[name]) for name in set(names) if name in value_map]
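As a rough end-to-end sketch, the two steps above can be run against the truncated sample lists from the question. The variable true_count is a name introduced here for illustration, and the values are assumed to be the strings 'true' and 'false', as in the sample:

```python
# Sample data: the truncated lists shown in the question.
names = ['SITNC', 'porkpackerpete', 'teensHijab', '1DAlert', 'IsmodoFashion']
values = [['SITNC', 'true'], ['1DFAMlLY', 'false'], ['tibi', 'true'], ['1Dneews', 'false']]

# Build the name -> value lookup once.
value_map = dict(values)

# Names present in both lists, paired with their value.
matches = [(name, value_map[name]) for name in set(names) if name in value_map]

# Count the matches whose value is the string 'true'.
true_count = sum(1 for _, value in matches if value == 'true')

print(matches)
print(true_count)
```

Note that iterating over set(names) drops duplicate names and does not preserve the original order of the first list.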
Upvotes: 2
Reputation: 12713
If you just need the number of true values, use the in operator and a list comprehension:
In [1]: names = ['SITNC', 'porkpackerpete', 'teensHijab', '1DAlert', 'IsmodoFashion']
In [2]: values = [['SITNC', 'true'], ['1DFAMlLY', 'false'], ['tibi', 'true'], ['1Dneews', 'false']]
In [3]: sum_of_true = len([v for v in values if v[0] in names and v[1] == "true"])
In [4]: sum_of_true
Out[4]: 1
To also get the indices of the co-occurrences (positions in names), this one-liner may come in handy:
In [6]: true_indices = [names.index(v[0]) for v in values if v[0] in names and v[1] == "true"]
In [7]: true_indices
Out[7]: [0]
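The indices above are positions in names. The question's in1 holds positions in the second list instead, so a possible sketch for keeping only the values at those positions might look like this. The names name_set, kept and true_count are introduced here for illustration, and the data is the truncated sample from the question:

```python
names = ['SITNC', 'porkpackerpete', 'teensHijab', '1DAlert', 'IsmodoFashion']
values = [['SITNC', 'true'], ['1DFAMlLY', 'false'], ['tibi', 'true'], ['1Dneews', 'false']]

# A set makes each membership test constant time on average.
name_set = set(names)

# Indices into the second list whose name also occurs in the first list.
in1 = [i for i, (name, _) in enumerate(values) if name in name_set]

# Keep only the 'true'/'false' values at those indices.
kept = [values[i][1] for i in in1]

# Count how many of the kept values are the string 'true'.
true_count = kept.count('true')

print(in1, kept, true_count)
```

This assumes every row of the second list has exactly two fields; a row with extra spaces in the file would need different unpacking.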
Upvotes: 1
Reputation: 34292
There are two ways: one is what Andrey suggests (you may want to convert names to a set), or, alternatively, convert the second list into a dictionary:
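mapping = dict(values)
sum_of_true = sum(mapping.get(n) == 'true' for n in names)
The sum works because each comparison yields a bool, and bool is a subclass of int in Python (True == 1). The values in the question are the strings 'true' and 'false', so they are compared rather than summed directly, and mapping.get(n) counts a name missing from the second list as 0 instead of raising KeyError.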
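Summing booleans directly only works if the values really are bool objects, whereas the sample lists in the question hold strings. A minimal sketch, assuming the 'true'/'false' strings from the question and the same truncated sample data, converts them first:

```python
names = ['SITNC', 'porkpackerpete', 'teensHijab', '1DAlert', 'IsmodoFashion']
values = [['SITNC', 'true'], ['1DFAMlLY', 'false'], ['tibi', 'true'], ['1Dneews', 'false']]

# Convert the 'true'/'false' strings into real booleans.
mapping = {name: flag == 'true' for name, flag in values}

# Names missing from the second list are treated as False.
sum_of_true = sum(mapping.get(n, False) for n in names)

print(sum_of_true)
```

A name that appears more than once in names is counted once per occurrence here; iterate over set(names) if that is not wanted.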
Upvotes: 1