Reputation: 46909
I have the following two arrays. I am trying to check whether the elements in invalid_id_arr exist in valid_id_arr; any element that doesn't exist there should go into the diff array. With the code below, the diff array contains ['id123', 'id124', 'id125', 'id126', 'id789', 'id666'], but I expect the output to be ["id789","id666"].
What am I doing wrong here?
tag_file= {}
tag_file['invalid_id_arr']=["id123-3431","id124-4341","id125-4341","id126-1w","id789-123","id666"]
tag_file['valid_id_arr']=["id123-12345","id124-1122","id125-13232","id126-12332","id1new","idagain"]
diff = [ele.split('-')[0] for ele in tag_file['invalid_id_arr'] if str(ele.split('-')[0]) not in tag_file['valid_id_arr']]
Current Output:
['id123', 'id124', 'id125', 'id126', 'id789', 'id666']
Expected output:
["id789","id666"]
Upvotes: 1
Views: 156
Reputation: 2770
>>> a = ["id123-3431","id124-4341","id125-4341","id126-1w","id789-123","id666"]
>>> b = ["id123-12345","id124-1122","id125-13232","id126-12332","id1new","idagain"]
>>> c = set(s.split('-')[0] for s in b)  # a set can be tested repeatedly; a generator would be consumed
>>> [ele.split('-')[0] for ele in a if ele.split('-')[0] not in c]
['id789', 'id666']
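A caveat when the prefixes are held in a generator expression rather than a list or set: each `in` test advances the generator and it is never rewound, so the result can depend on the order of the elements. The sketch below uses made-up lists (not the ones from the question) where the order differs, to show the two behaviours side by side.

```python
# Hypothetical sample data: note that id124 appears before id123 in `invalid`.
valid = ["id123-1", "id124-2"]
invalid = ["id124-9", "id123-9", "id789-9"]

# Generator: the first membership test consumes it up to the match,
# and later tests only see whatever is left.
gen = (s.split('-')[0] for s in valid)
via_generator = [e.split('-')[0] for e in invalid if e.split('-')[0] not in gen]

# Set: can be tested any number of times, in any order.
prefixes = set(s.split('-')[0] for s in valid)
via_set = [e.split('-')[0] for e in invalid if e.split('-')[0] not in prefixes]

print(via_generator)  # ['id123', 'id789'] (id123 is wrongly reported)
print(via_set)        # ['id789']
```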
Upvotes: 0
Reputation: 214949
Try sets:
invalid_id_arr = ["id123-3431","id124-4341","id125-4341","id126-1w","id789-123","id666"]
valid_id_arr = ["id123-12345","id124-1122","id125-13232","id126-12332","id1new","idagain"]
set_invalid = set(x.split('-')[0] for x in invalid_id_arr)
print(set_invalid.difference(x.split('-')[0] for x in valid_id_arr))
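Since a set has no defined order, the printed result may not match the order of the input list. A minimal sketch, assuming a stable (sorted) order is acceptable; the helper name `prefix` and the variable `missing` are made up for illustration:

```python
invalid_id_arr = ["id123-3431", "id124-4341", "id125-4341", "id126-1w", "id789-123", "id666"]
valid_id_arr = ["id123-12345", "id124-1122", "id125-13232", "id126-12332", "id1new", "idagain"]

def prefix(s):
    # Hypothetical helper: the part of the id before the first hyphen.
    return s.split('-')[0]

# The `-` operator on two sets is the same operation as set.difference.
missing = set(map(prefix, invalid_id_arr)) - set(map(prefix, valid_id_arr))

print(sorted(missing))  # ['id666', 'id789']
```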
Upvotes: 3
Reputation: 1336
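Using a set is more efficient, but your main problem is that you weren't stripping the part after the hyphen from the elements in valid_id_arr, so a prefix like 'id123' was being compared against full strings like 'id123-12345' and never matched.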
invalid_id_arr=["id123-3431","id124-4341","id125-4341","id126-1w","id789-123","id666"]
valid_id_arr=["id123-12345","id124-1122","id125-13232","id126-12332","id1new","idagain"]
valid_id_set = set(ele.split('-')[0] for ele in valid_id_arr)
diff = [ele for ele in invalid_id_arr if ele.split('-')[0] not in valid_id_set]
print(diff)
output:
['id789-123', 'id666']
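Note that this output keeps the full element ('id789-123'), while the question expected the bare prefix. A minimal sketch of a variant that emits only the prefixes and keeps the original order, assuming that is the desired result; `valid_prefixes` is a name introduced here for illustration:

```python
invalid_id_arr = ["id123-3431", "id124-4341", "id125-4341", "id126-1w", "id789-123", "id666"]
valid_id_arr = ["id123-12345", "id124-1122", "id125-13232", "id126-12332", "id1new", "idagain"]

# Prefixes of the valid ids, stored in a set for fast membership tests.
valid_prefixes = {ele.split('-')[0] for ele in valid_id_arr}

diff = []
for ele in invalid_id_arr:
    pre = ele.split('-')[0]  # split once, reuse for both the test and the result
    if pre not in valid_prefixes:
        diff.append(pre)

print(diff)  # ['id789', 'id666']
```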
Upvotes: 4