Reputation: 8988
I need to check if a string can be created with a list of characters and return True or False.
I am using different solutions with list.count or collections.Counter.
I am also using this solution which I dont need to read through the list of characters:
def check(message, characters):
try:
[list(characters).remove(m) for m in message]
return True
except:
return False
Is there a fastest way? for a very very very big list of characters. Counter and list count seems slower. Dont know if there is a fast pythonic way to do this.
Example:
message = "hello"
characters = "hheellooasdadsfgfdgfdhgfdlkgkfd"
check(message, characters) # this should return True or False
# characters can be a veeeeery long string
Duplicates matter so for example characters = "hheloo" would not work for message = "hello"
Upvotes: 3
Views: 125
Reputation: 1975
There is maybe a faster way of doing this, apparently due to the cost of creating the all() generator (Why is Python's 'all' function so slow?) perhaps a for loop is faster, Expanding on @eugene y's answer:
from collections import Counter
import time
message = "hello"
characters = "hheeooasdadsfgfdgfdhgfdlkgkfd"
def check1(message,characters):
c1 = Counter(characters)
c2 = Counter(message)
c1.subtract(c2)
return all(v > -1 for v in c1.values())
def check2(message,characters):
c1 = Counter(characters)
c2 = Counter(message)
c1.subtract(c2)
for v in c1.values():
if v < 0:
return False
return True
st = time.time()
for i in range(350000):
check1(message,characters)
end = time.time()
print ("all(): "+str(end-st))
st = time.time()
for i in range(350000):
check2(message,characters)
end = time.time()
print ("for loop: "+str(end-st))
results:
all(): 5.201688051223755
for loop: 4.864434719085693
Upvotes: 1
Reputation: 575
Here is another solution compared to eugene's solution and jbndlr's solution.
def test1(input_word, alphabet):
alp_set = set(list(alphabet))
in_set = set(list(input_word))
return in_set.issubset(alp_set)
def test2(input_word, alphabet):
c1 = collections.Counter(alphabet)
c2 = collections.Counter(input_word)
c1.subtract(c2)
return all(v >= 0 for v in c1.values())
def check(msg, chars):
c = list(chars) # Creates a copy
try:
for m in msg:
c.remove(m)
except ValueError:
return False
return True
input_word = "hello"
alphabet = "hheellooasdadsfgfdgfdhgfdlkgkfd"
start_time = time.time()
for i in range(10000):
test1(input_word,alphabet)
print("--- %s seconds ---" % (time.time() - start_time))
start_time = time.time()
for i in range(10000):
test2(input_word,alphabet)
print("--- %s seconds ---" % (time.time() - start_time))
start_time = time.time()
for i in range(10000):
check(input_word,alphabet)
print("--- %s seconds ---" % (time.time() - start_time))
>> --- 0.03100299835205078 seconds ---
>> --- 0.24402451515197754 seconds ---
>> --- 0.022002220153808594 seconds ---
⇒ jbndlr's solution is the fastest - for this test case.
Another testcase:
input_word = "hellohellohellohellohellohellohellohellohellohellohellohellohello"
alphabet =
"hheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfdhheellooasdadsfgfdgfdhgfdlkgkfd"
>> --- 0.21964788436889648 seconds ---
>> --- 0.518169641494751 seconds ---
>> --- 1.3148927688598633 seconds ---
⇒ test1 is fastest
Upvotes: 1
Reputation: 5210
This is not feasible in linear time, as the length of both strings matter and they need to be iterated for each character. Without having checked its actual implementation, I assume remove()
is logarithmic.
def check(msg, chars):
c = list(chars) # Creates a copy
try:
for m in msg:
c.remove(m)
except ValueError:
return False
return True
if __name__ == '__main__':
print(check('hello', 'ehlo'))
print(check('hello', 'ehlol'))
print(check('hello', 'ehloijin2oinscubnosinal'))
Upvotes: 1
Reputation: 149813
You could use collections.Counter()
. Just build two counters and use the subtract()
method to check if there're any negative counts:
>>> c1 = Counter(characters)
>>> c2 = Counter(message)
>>> c1.subtract(c2)
>>> all(v >= 0 for v in c1.values())
False
This should work in linear time.
Upvotes: 7