Reputation: 889

speed up python nested loop

Here is my code .It takes more than minute to process large string input.Is there any way to fasten this

q=int(input())
x=input()
count=0
for i in range(q):
    for j in range(q):
        k= j+i+1
        if k<=q:
            u=x[i:k]
            if u[0]==u[-1] or len(u)==1:
                count+=1


print(count)

Upvotes: 0

Answers (6)

user3842449

Reputation:

Here is an interesting solution that is extremely efficient. It looks at the problem from quite a different perspective.

q=int(input())
x=input()

y = x[0: q]

alphabet = {'a': 0, 'b': 0, 'c': 0, 'd': 0, 'e': 0, 'f': 0, 'g': 0, 'h': 0, 'i': 0, 'j': 0,
        'k': 0, 'l': 0, 'm': 0, 'n': 0, 'o': 0, 'p': 0, 'q': 0, 'r': 0, 's': 0, 't': 0,
        'u': 0, 'v': 0, 'w': 0, 'x': 0, 'y': 0, 'z': 0, ' ': 0}

letters = set(list(y))
for letter in letters:
    alphabet[letter] += y.count(letter)

repeats = [i for i in list(alphabet.values()) if i > 1]
singles = len(y)

count = singles
for repeat in repeats:
    count += ((repeat*(repeat - 1))/2)

What is going on here? Look at an artificial example where y = 'abbda efba hia jkla mbnop' and consider the character 'a'. How does it get counted?

1) It is counted each time it appears in the string. This is 5 times, and this is captured in the 'alphabet' dictionary. There is one pass through the unique characters within the string to get the count. This is less then or equal to 27 characters. I have included the space character. That has to be counted too!

2) The character 'a' is also counted in other ways by the original algorithm. It is counted when it appears on both ends of a sub string. In this case, for the first 'a' that appears we get the sub strings:

   abba 
   abba efba 
   abba efba hia 
   abba efba hia jkla

For the second 'a' that appears there are three more such sub strings:

   a efba 
   a efba hia 
   a efba hia jkla

And so on down the line. So we count 'a' 5 times for each time it appears in the string, but also count the 4 + 3 + 2 + 1 sub strings where it is on the ends. This is just r(r-1)/2 more counts for the letter 'a'. No need to loop when there is an analytic expression for the result. This is the basis for the algorithm's efficiency.

Upvotes: 1

Pynchia

Reputation: 11606

Well, you can avoid the slicing, which makes an unnecessary copy:

q = int(input())
x = input()
count = 0
for i in range(q):
    for j in range(q):
        k = j+i+1
        if k<=q:
            if x[i]==x[k] or i==k:
                count+=1

print(count)

Even better, you can get rid of the inner loop altogether

q = int(input())
x = input()
count = q
for i in range(q-1):
    if x[i]==x[-i]:
        count+=1

print(count)

I haven't tested it, I am using a mobile phone...

Upvotes: 0

Kavin Eswaramoorthy

Reputation: 1625

You don't need two for loops to do this you can do it just by counting repeated characters cumulatively. such that you can find result in O(n) time but will result in extra space of O(n).

q=int(input())
x=input()
from collections import defaultdict
d1 = defaultdict(int)
count = 0
for i in x:
    d1[i] += 1
    count += d1[i]

print count

Upvotes: 1

Abhijit

Reputation: 63787

Optimization of algorithm would require you to determine flabs and cutting it out. Flabs stands out clearly if you see a condition with loop variables within a loop without an else block and we see one clearly. We are definitely wasting iteration and that should be trimmed out

Clearly instead of recalculating K, you need to rerun the second loop in a manner such that you can drop the intermediate variable j

As k = j+i+1 where 0 <=j < q, we can rewrite the range as i + 1 <= k < q + i + 1

but we also know k <= q so the above condition can be written as i + 1 <= k < q + 1

so our new function would look like

def bar():
    q=int(raw_input())
    x=raw_input()
    count=0
    for i in range(q):
        for k in range(i + 1, q + 1):
            u=x[i:k]
            if u[0]==u[-1] or len(u)==1:
                count+=1
    return count

The second condition u[0]==u[-1] or len(u)==1 is interesting. You are creating a slice and checking the prefix and suffix along with the length

surely len(u) == k - i == 1, so you need not reevaluate a slice and invoke a function call, instead we can write as k == i + 1

def bar():
    q=int(raw_input())
    x=raw_input()
    count=0
    for i in range(q):
        for k in range(i + 1, q + 1):
            if x[i]==x[k - 1] or k  == i + 1:
                count+=1
    return count

Upvotes: 0

LittleQ

Reputation: 1925

If I got your point, here is the code

length = int(input())
s = input()
count = 0
# i: the head index of substring
# j: the tail index of substring, and len(substring) <=length
for i in range(len(s)):
    for j in range(min(length, len(s) - i)):
        if s[i] == s[i + j]:
            count += 1

print(count)

Upvotes: 0

John Coleman

Reputation: 52008

At least two issues:

1) You are looping through too many values where your k is > q. The inner loop should be over a range which depends on i

2) what is the point of all those slices (u = x[i:k])?. Why check u[0] and u[-1] instead of x[i] and x[k-1] directly? Also -- you can check the length of your u by doing arithmetic on i and k. Those slices are unnecessary and are probably the main culprit.

In any event -- chances are pretty good that there is a vastly simpler way to do what you want to do, one that avoids nested loops altogether.

Upvotes: 1

speed up python nested loop

Answers (6)

Related Questions