user3641381

Reputation: 1076

Python: get the distance between pairs

In our datasets we have large sets of sequences, e.g. "aismeorisityou", and we would like to get the distance between two occurrences of the same adjacent pair of letters. In this case the two 'is' pairs start 6 positions apart (indices 1 and 7). What's the best way to go about this?

This is as far as we got:

def pair_distance(x):
    count = 0
    for i in range(1, len(x)):
        if x[i] == x[i-1]:
            # true when two neighbouring characters are equal;
            # the distance still has to be counted here
            pass
    return None

The output should be the distance, 6.

Upvotes: 1

Views: 104

Answers (2)

Autonomous_Vehicle

Reputation: 49

If the sequences are strings, as in your example "aismeorisityou":

s = 'aismeorisityou'

you can use str.find (or str.index) to locate the first occurrence of the substring 'is' and str.rfind (or str.rindex) to locate the last one, then work with both positions.

>>> s.index('is')
1
>>> s.rindex('is')
7
>>> s.find('is')
1
>>> s.rfind('is')
7

Wrap this in a function and return the difference between the two indices.
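A minimal sketch of such a function, assuming the substring is known in advance and that only its first and last occurrences matter. The name pair_distance is hypothetical and does not come from the question.

```python
def pair_distance(s, sub):
    """Return the index distance between the first and last occurrence of sub.

    Returns None when sub occurs fewer than two times.
    """
    first = s.find(sub)   # lowest index, or -1 if sub is absent
    last = s.rfind(sub)   # highest index, or -1 if sub is absent
    if first == -1 or first == last:
        return None
    return last - first


print(pair_distance('aismeorisityou', 'is'))  # 6
```

If a pair can occur more than twice, this only reports the span from the first to the last occurrence, so it may not be what you want for longer sequences.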

Note what the docs say about the failure case: rfind returns -1, while rindex raises ValueError:

 |  rfind(...)
 |      S.rfind(sub[, start[, end]]) -> int
 |      
 |      Return the highest index in S where substring sub is found,
 |      such that sub is contained within S[start:end].  Optional
 |      arguments start and end are interpreted as in slice notation.
 |      
 |      Return -1 on failure.
 |  
 |  rindex(...)
 |      S.rindex(sub[, start[, end]]) -> int
 |      
 |      Return the highest index in S where substring sub is found,
 |      such that sub is contained within S[start:end].  Optional
 |      arguments start and end are interpreted as in slice notation.
 |      
 |      Raises ValueError when the substring is not found.
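To illustrate the difference described in the docs, here is a short sketch that searches for a substring that is not present ('xy' is just an arbitrary example value):

```python
s = 'aismeorisityou'

# rfind signals a missing substring by returning -1
missing = s.rfind('xy')
print(missing)  # -1

# rindex raises ValueError for the same lookup
raised = False
try:
    s.rindex('xy')
except ValueError:
    raised = True
print(raised)  # True
```

Which one to use is mostly a matter of taste: check for -1 with find/rfind, or catch the exception with index/rindex.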

Upvotes: 0

Gamopo

Reputation: 1598

You'll need a second inner loop:

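x = 'aismeorisityou'
# i and j are the start indices of the two pairs being compared;
# start i at 0 so a pair at the very beginning is not skipped
for i in range(len(x) - 1):
    for j in range(i + 1, len(x) - 1):
        if x[i] == x[j] and x[i + 1] == x[j + 1]:
            print(x[i] + x[i + 1])
            print('separated by: ' + str(j - i))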

This prints:

is
separated by: 6
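Since the question mentions big datasets, the nested loops (quadratic in the length of the string) may get slow. As a possible alternative, not the approach above, here is a single-pass sketch that remembers where each pair was last seen in a dictionary. The function name repeated_pair_distances is made up for illustration, and it assumes you want the distance between consecutive occurrences of each pair rather than every combination.

```python
def repeated_pair_distances(x):
    """Map each repeated two-character pair to the distances
    between its consecutive occurrences."""
    last_seen = {}   # pair -> start index of its most recent occurrence
    distances = {}   # pair -> list of distances between consecutive occurrences
    for i in range(len(x) - 1):
        pair = x[i:i + 2]
        if pair in last_seen:
            distances.setdefault(pair, []).append(i - last_seen[pair])
        last_seen[pair] = i
    return distances


print(repeated_pair_distances('aismeorisityou'))  # {'is': [6]}
```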

I hope it helps!

Upvotes: 2
