Reputation: 3
I'm writing a small API listening program, and I'm trying to figure out when something new has been published. I've figured out most of it, but I'm having a problem on the last step -- where I want to print out something new. I can compare the two lists of items as sets and get the set of letters that's in the right answer, but I can't seem to get the actual strings to print.
Here's the code I wrote to compare the two lists (both new_revised_stuff and old_revised_stuff are lists of strings, like "Bob likes to eat breakfast at http://bobsburgers.com", with a few dozen items per list).
new_stuff = set(new_revised_stuff) - set(old_revised_stuff).intersection(new_revised_stuff)
Which returns:
set('b','o','l'...)
I can get rid of the 'set' notation by writing:
list(new_stuff)
But that doesn't really help. I'd really like it to print out "Bob likes..." if that's a new line.
I've also tried:
new_stuff = []
for a in new_revised_stuff:
    for b in old_revised_stuff:
        if a != b:
            ''.join(a)
            new_stuff.append(a)
Which results in an actual stack overflow, so it's obviously bad code.
Upvotes: 0
Views: 125
Reputation: 365835
If you want to join any iterable of single characters into a string, you do that with ''.join(new_stuff). For example:
>>> new_stuff = ['b','o','l']
>>> ''.join(new_stuff)
'bol'
However, there are two problems here that are inherent in your design. First, a set can't hold duplicates: if the new string is "Hello, Bob", there's only going to be one o and one l in the set of diffs. Second, a set has no order: if the new string is "Bob likes", converting that into a set and then back to a string will get you something like 'k iboeBls'.
If either of those is a problem (and I suspect they are), you need to rethink your algorithm. You can solve the second one by using an OrderedSet (there's a recipe for that in the collections docs), but the first one is going to be more of a problem.
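To see both problems concretely, here is a small sketch. The sample strings are only illustrations, and sorted() is used just to make the scrambled result repeatable:

```python
text = "Hello, Bob"

# A set keeps one copy of each character, so the repeated 'l' and 'o' vanish.
unique_chars = set(text)
print(len(text), len(unique_chars))

# A set has no order, so joining it back into a string loses the original
# sequence. Sorting here only makes the scrambled result deterministic.
scrambled = ''.join(sorted(set("Bob likes")))
print(repr(scrambled))
```

The second print shows the same characters as "Bob likes", just not in a useful order.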
So, how could you do this?
Well, you don't really need new_revised_stuff to be a set; if you iterate over the characters and keep only the ones that aren't in old_revised_stuff, as long as old_revised_stuff is a set, that's just as efficient as intersecting two sets.
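A minimal sketch of that idea, assuming for illustration that the old data is a single short string turned into a set of characters (the sample values are made up):

```python
old_revised_stuff = set(" to eat breakfast")
new_revised_stuff = "Bob likes to eat breakfast"

# Walk the new string in order and keep only the characters that never
# appear in the old set. Membership tests on a set are constant time on average.
kept = [ch for ch in new_revised_stuff if ch not in old_revised_stuff]
result = ''.join(kept)
print(result)
```

Order is preserved, but any character that appears even once in the old data is dropped everywhere, so most of "Bob likes" disappears.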
But making old_revised_stuff a set will also eliminate any duplicates there, which I don't think you want. What you really want is a "multiset". In Python, the best way to represent that is usually a Counter.
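As a quick illustration of how a Counter behaves as a multiset (the strings here are arbitrary examples), subtracting one Counter from another keeps only the occurrences that are left over:

```python
from collections import Counter

old_counts = Counter("aab")      # a: 2, b: 1
new_counts = Counter("aaabbc")   # a: 3, b: 2, c: 1

# Counter subtraction drops anything whose count falls to zero or below,
# leaving one extra 'a', one extra 'b', and the new 'c'.
extra = new_counts - old_counts
print(sorted(extra.elements()))
```

Note that a Counter doesn't remember where each character sat in the original string, so keeping the output in order takes an explicit loop over the new string.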
So, I think what you want (maybe) is something like this:
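import collections

old_string = ' to eat breakfast at http://bobsburgers.com'
new_string = 'Bob likes to eat breakfast at http://bobsburgers.com'

# Count how many times each character appears in the old string.
old_chars = collections.Counter(old_string)
new_chars = []
for ch in new_string:
    if old_chars[ch]:
        # The old string still has this character available: use one up.
        old_chars[ch] -= 1
    else:
        # No occurrences left in the old string, so this character is new.
        new_chars.append(ch)
new_string = ''.join(new_chars)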
Upvotes: 1