Reputation: 397
I have a logic problem involving Python lists, and I don't know whether I need a 2D list. Say I have data retrieved from a database and I want to compare each retrieved row with the others (for example, row1 with row2, then row1 with row3). I think I will need a for loop, with this added condition:
if row1 == row2:
I need to append two values from the row (e.g. row1[1] and row1[2]) to a list that starts out empty. Every time a row matches another retrieved row, the two values are appended to that list, until all retrieved rows have been compared.
If those two values already exist in the list, they should not be appended again.
sample:
emp_arr = []  # empty list
# code here
# if there are matches among the rows retrieved from the database, emp_arr
# would end up something like:
emp_arr = [[2, 3], [5, 9], [3, 7], [2, 5]]
# Note: the same pair must not appear twice (e.g. emp_arr = [[2, 3], [5, 9],
# [3, 7], [2, 3]] should not happen), so I need a condition before appending.
Thanks in advance guys.
Upvotes: 1
Views: 283
Reputation: 104712
It's not clear whether you're asking for values from the matching rows or for the indexes of those rows. I'm assuming you want the indexes, which makes my answer notably different from J.F. Sebastian's; his is probably the best choice if you want the values.
If you do want the indexes, it's not clear how you want to deal with multiple matches. If rows[1] == rows[2] == rows[3], you could get [1,2], [1,3] and [2,3] as matching indexes, or you might want only one of those. I'm assuming that you want only one of them, and that it doesn't especially matter which (both functions below will always provide [1,2] and not the others, though they could be modified to prefer a different pair if necessary).
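To make the ambiguity concrete, here is a small sketch with made-up sample rows (the data and the name all_matching_index_pairs are illustrative assumptions, not from the question). It lists every matching index pair, which is the behavior the functions below deliberately avoid:

```python
from itertools import combinations

def all_matching_index_pairs(rows):
    """Return every [i, j] with i < j where rows[i] == rows[j]."""
    return [[i, j]
            for (i, a), (j, b) in combinations(enumerate(rows), 2)
            if a == b]

# Hypothetical data: rows 1, 2 and 3 are identical.
rows = [(0, "a"), (7, "x"), (7, "x"), (7, "x"), (9, "z")]
print(all_matching_index_pairs(rows))  # [[1, 2], [1, 3], [2, 3]]
```

With three equal rows you get three pairs; the functions below would report only [1, 2] for the same input.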
Here's an approach that explicitly loops over the indexes, skipping any that have been matched already:
def findMatchedRowPairsWithoutDuplicates(rows):
    matched = set()  # indexes that already belong to a reported pair
    result = []
    for i in range(len(rows)):
        if i in matched:
            continue
        for j in range(i+1, len(rows)):
            if j in matched:
                continue
            if rows[i] == rows[j]:
                result.append([i, j])
                matched.add(i)
                matched.add(j)
                break  # can't match with the current i again!
    return result
Here's an alternative implementation that exploits sorting to potentially find the duplicates faster (time complexity O(N log(N)) rather than O(N^2)), but it requires that your row values can be ordered consistently (that is, row1 < row2 must be defined). That's probably true for most kinds of database values, but perhaps not always guaranteed by a given library's implementation. The key to understanding this code is that the indexes of equal rows will always be adjacent in the indexes list after it has been sorted, so we only need to check each adjacent index pair rather than all pairs.
def findMatchedRowPairsWithoutDuplicates2(rows):
    indexes = list(range(len(rows)))
    # sort the indexes by the row each one refers to
    indexes.sort(key=lambda index: rows[index])
    results = []
    i = 0
    while i < len(indexes)-1:
        if rows[indexes[i]] == rows[indexes[i+1]]:
            results.append([indexes[i], indexes[i+1]])
            i += 2  # both indexes are used up, skip past the pair
        else:
            i += 1
    return results
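To see why checking adjacent entries is enough, here is a minimal sketch using made-up rows (the data is an assumption for illustration only). Python's sort is stable, so indexes of equal rows stay in ascending order next to each other:

```python
# Hypothetical sample rows; indexes 0, 2 and 4 hold equal rows, as do 1 and 3.
rows = [(5, 9), (2, 3), (5, 9), (2, 3), (5, 9)]

# Sort the indexes by the row each one points to.
indexes = sorted(range(len(rows)), key=lambda index: rows[index])
print(indexes)  # [1, 3, 0, 2, 4]

# Compare each index with its right-hand neighbour in the sorted list.
adjacent_equal = [rows[a] == rows[b] for a, b in zip(indexes, indexes[1:])]
print(adjacent_equal)  # [True, False, True, True]
```

Note that the pairs come out in sorted-row order rather than in order of first index, so for this data the pair for the (2, 3) rows would be reported before the pair for the (5, 9) rows.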
Upvotes: 1
Reputation: 414235
It seems you'd like to do something like this pseudo-SQL:
SELECT DISTINCT left_tbl.some_column, left_tbl.another_column
FROM table_name left_tbl, table_name right_tbl
WHERE left_tbl.* = right_tbl.*
AND left_tbl.id != right_tbl.id
-- where * is everything except id column
In Python (assuming all retrieved rows are in the iterable rows):
from itertools import combinations

result = set((row1[1], row1[2])
             for row1, row2 in combinations(rows, 2)
             if row1 == row2)
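If you need a list of lists like the emp_arr in the question rather than a set of tuples, a variant along these lines might work. This is only a sketch: the function name collect_unique_pairs and the sample rows are made up, and it assumes the two values are hashable.

```python
from itertools import combinations

def collect_unique_pairs(rows):
    """Collect [row[1], row[2]] from rows that have a duplicate, without repeats."""
    seen = set()   # value pairs that were already appended
    emp_arr = []   # output list, in order of first match
    for row1, row2 in combinations(rows, 2):
        if row1 == row2:
            pair = (row1[1], row1[2])
            if pair not in seen:  # the "already exists" check from the question
                seen.add(pair)
                emp_arr.append(list(pair))
    return emp_arr

# Hypothetical rows: (1, 2, 3) appears three times, (4, 5, 9) twice.
rows = [(1, 2, 3), (4, 5, 9), (1, 2, 3), (4, 5, 9), (6, 3, 7), (1, 2, 3)]
print(collect_unique_pairs(rows))  # [[2, 3], [5, 9]]
```

The extra set keeps the membership test cheap, while the list preserves the order in which matches were first found.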
Upvotes: 1