Darthvazor

Reputation: 53

Extract specific characters from the string

i= "March 31st 2013 ntp[22123] Time server offset -.00354 sec"

i= "March 1st 2013 ntp[22485] Time server offset -.0070 sec"

The strings look the same, but once in a while the character count differs, so extracting only the last part of the string, "-.0070 sec", with i = i[-11:] does not work reliably.

Is it possible to search for the word "offset", find its position in the string, and use that to drop everything before the value, keeping only "-.00354 sec" or "-.0070 sec"?

For example, there are 46 characters in "March 31st 2013 Time server offset -.00354 sec", and "offset" starts at index 28 of the string (counting from 0). The first 34 characters of the string, up to and including "offset", would be removed.

Upvotes: 2

Views: 10681

Answers (4)

unutbu

Reputation: 880717

text.rfind returns the index of the last occurrence of 'offset':

In [162]: text = "March 1st 2013 ntp[22485] Time server offset -.0070 sec"

In [181]: text.rfind('offset')
Out[181]: 38

So you can cut the string after 'offset ' like this:

In [183]: text[text.rfind('offset ')+len('offset '):]
Out[183]: '-.0070 sec'

Or, you could use str.rpartition to chop text into three pieces, and pick off the third (and last) piece:

In [179]: text.rpartition('offset ')
Out[179]: ('March 1st 2013 ntp[22485] Time server ', 'offset ', '-.0070 sec')
In [169]: text.rpartition('offset ')[-1]
Out[169]: '-.0070 sec'

Or, you could use str.rsplit to split the string on the last occurrence of 'offset ':

In [180]: text.rsplit('offset ', 1)
Out[180]: ['March 1st 2013 ntp[22485] Time server ', '-.0070 sec']
In [172]: text.rsplit('offset ', 1)[1]
Out[172]: '-.0070 sec'

The 1 in text.rsplit('offset ', 1) tells rsplit to split text in at most 1 location.


rfind, rsplit and rpartition each operate on the string from the right. So even if text contains the substring 'offset ' twice, they will still find the last occurrence of the substring.
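One difference between these methods shows up when 'offset ' is missing from the line. rfind returns -1, so the slice above silently returns the wrong piece of text, while rpartition returns an empty separator that is easy to test for. A minimal sketch, assuming you want None for lines without the marker (the helper name after_last is made up for illustration):

```python
def after_last(text, marker="offset "):
    """Return the part of text after the last marker, or None if absent."""
    head, sep, tail = text.rpartition(marker)
    # rpartition returns ('', '', text) when the marker is not found,
    # so an empty separator means there was no match.
    return tail if sep else None


line = "March 1st 2013 ntp[22485] Time server offset -.0070 sec"
print(after_last(line))                  # -.0070 sec
print(after_last("no marker here"))      # None
print(after_last("offset 1 offset 2"))   # 2
```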

Upvotes: 5

icanc

Reputation: 3577

You could use regular expressions like this:

>>> import re
>>> i = "March 31st 2013 ntp[22123] Time server offset -.00354 sec"
>>> pattern = re.compile('(offset)(.+)$')
>>> offset = pattern.findall(i)[0][1]
>>> print(offset)
 -.00354 sec

Upvotes: 0

dawg

Reputation: 104092

You can use a regex:

import re

strings=['March 31st 2013 ntp[22123] Time server offset -.00354 sec', 
        'March 1st 2013 ntp[22485] Time server offset -.0070 sec']

for s in strings:
    print(re.search(r'offset -(\.\d+) sec$', s).group(1))

Prints:

.00354
.0070

Or move the parentheses if you want to include the -:

print(re.search(r'offset (-\.\d+) sec$', s).group(1))

Or, if it is an optional sign, do something like this:

strings=['March 31st 2013 ntp[22123] Time server offset -.00354 sec', 
        'March 1st 2013 ntp[22485] Time server offset -.0070 sec',
        'March 1st 2013 ntp[22485] Time server offset .0070 sec']

for s in strings:
    print(re.search(r'offset ((?:-)?\.\d+) sec$', s).group(1))

Because of the $ anchor, the pattern only matches an offset at the very end of the string.
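If you need the offset as a number, note that re.search returns None when a line does not match, so calling .group(1) directly raises AttributeError on such lines. A rough sketch of a safer wrapper follows; the name parse_offset is hypothetical, and the pattern is widened with \d* on the assumption that offsets of one second or more (e.g. 1.25) may also appear in the log:

```python
import re

# Optional minus sign, optional integer digits, a fractional part,
# then " sec" at the end of the line. The \d* is an assumption that
# is not present in the patterns above.
OFFSET_RE = re.compile(r'offset (-?\d*\.\d+) sec$')


def parse_offset(line):
    """Return the offset in seconds as a float, or None if the line has none."""
    match = OFFSET_RE.search(line)
    if match is None:
        return None
    return float(match.group(1))


print(parse_offset("March 31st 2013 ntp[22123] Time server offset -.00354 sec"))  # -0.00354
print(parse_offset("March 1st 2013 ntp[22485] Time server started"))              # None
```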

Upvotes: 0

Martijn Pieters

Reputation: 1124548

Split the string on the word offset, including the trailing space:

line.split('offset ', 1)[-1]

This takes everything following that word.

Demo:

>>> text = "March 1st 2013 ntp[22485] Time server offset -.0070 sec"
>>> text.split('offset ', 1)[-1]
'-.0070 sec'
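Two edge cases are worth keeping in mind. split works from the left, so with a maxsplit of 1 it cuts at the first 'offset ' in the line, whereas rsplit cuts at the last one. And if the marker is missing, both return the whole line unchanged. A small illustration, using a made-up line in which the word appears twice:

```python
# Hypothetical log line containing 'offset ' twice.
text = "offset check: Time server offset -.0070 sec"

first = text.split('offset ', 1)[-1]    # cut at the first occurrence
last = text.rsplit('offset ', 1)[-1]    # cut at the last occurrence
missing = "no marker here".split('offset ', 1)[-1]  # marker absent

print(first)    # check: Time server offset -.0070 sec
print(last)     # -.0070 sec
print(missing)  # no marker here
```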

Upvotes: 0
