Praful Bagai

Reputation: 17412

Python Split Operator

I want to split the following string:

a = '"64 213121\r\n\r\n64 40771494536\r\n\r\n64 91547531\r\n\r\n64 40771494536\r\n\r\n"'

I want only [213121, 40771494536, 91547531], i.e. I want to split on \r\n\r\n plus the leading number 64. This 64 could be some other integer as well.

I'm currently doing it like this:

    a = a.split('\r\n\r\n')
    temp_a = []
    for i in a:
        # try/except because split sometimes returns '', which cannot be
        # split further, so there is nothing at index 1
        try:
            i = i.split(' ')[1]
            temp_a.append(i)
        except IndexError:
            pass

Is there a better, more Pythonic solution?

Upvotes: 0

Views: 308

Answers (4)

Hackaholic

Reputation: 19771

Pythonic way:

>>> a = "64 213121\r\n\r\n64 40771494536\r\n\r\n64 91547531\r\n\r\n64 40771494536\r\n\r\n"
>>> list(set(a.replace('\r\n\r\n',' ').split(' ')[1::2]))
['91547531', '213121', '40771494536']

Using regex:

>>> import re
>>> a = "64 213121\r\n\r\n64 40771494536\r\n\r\n64 91547531\r\n\r\n64 40771494536\r\n\r\n"
>>> [x for x in re.findall(r'\d+', a) if len(x) > 2]
['213121', '40771494536', '91547531', '40771494536']

Or simply:

>>> import re
>>> re.findall(r'\d{3,}', a)
['213121', '40771494536', '91547531', '40771494536']

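Here I am using {n,m} in the regex, which matches from n to m repetitions.
For example, a{2,} matches two or more repetitions of a, such as the aa in aab. Note that this relies on the leading number (64 here) having fewer than three digits.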
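
To make the quantifier behaviour concrete, here is a small sketch. The three-digit prefix 640 is a made-up input, not from the question, used only to show where the \d{3,} approach would stop working.

```python
import re

# {n,} means "n or more"; {n,m} means "between n and m", matched greedily.
two_or_more = re.findall(r'a{2,}', 'aab')
two_to_three = re.findall(r'a{2,3}', 'aaaa')

# Hypothetical input where the leading number has three digits:
# the prefix now satisfies \d{3,} as well and leaks into the result.
leaky = re.findall(r'\d{3,}', '640 213121\r\n\r\n640 91547531\r\n\r\n')

print(two_or_more)   # ['aa']
print(two_to_three)  # ['aaa']
print(leaky)         # ['640', '213121', '640', '91547531']
```

So the length filter is fine for a short prefix like 64, but it is probably not a safe choice if the prefix can be an arbitrary integer.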

Upvotes: 0

tianwei

Reputation: 1879

Maybe what you want is just a more Pythonic version?

print([x.split(' ')[1] for x in a.split('\r\n\r\n') if len(x) > 1])

result:

['213121', '40771494536', '91547531', '40771494536']

This just uses your split method, written in a more Pythonic way.

If you do not need the duplicate numbers, use this:

print(list(set([x.split(' ')[1] for x in a.split('\r\n\r\n') if len(x) > 1])))

result:

['91547531', '213121', '40771494536']
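
Since a set does not keep the original order, the result above can come out shuffled. If the order from the question matters, one possible sketch is to deduplicate with dict.fromkeys instead, which preserves insertion order on Python 3.7+. The function name unique_second_fields is made up for illustration.

```python
def unique_second_fields(text):
    """Return the second field of each 'prefix number' record, first-seen order."""
    fields = []
    for chunk in text.split('\r\n\r\n'):
        parts = chunk.split()
        # Skip empty or malformed chunks (e.g. '' after the last separator).
        if len(parts) == 2:
            fields.append(parts[1])
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(fields))


a = '"64 213121\r\n\r\n64 40771494536\r\n\r\n64 91547531\r\n\r\n64 40771494536\r\n\r\n"'
print(unique_second_fields(a))  # ['213121', '40771494536', '91547531']
```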

Upvotes: 2

lakshmen

Reputation: 29094

a = '"64 213121\r\n\r\n64 40771494536\r\n\r\n64 91547531\r\n\r\n64 40771494536\r\n\r\n"'
b = [int(s) for s in a.split() if (s.isdigit() and s != '64')]

This will help you achieve what you want.

Explanation:

It splits the string on whitespace, keeps each piece that consists only of digits and is not equal to '64', and converts it to an int.

Upvotes: 0

xecgr

Reputation: 5193

There are two optional parts, the leading and the trailing 64, so each is followed by the optional quantifier ?. The middle group should capture a number (\d+) followed by some trailing line breaks ([\r\n]+).

Try this:

>>> import re
>>> test = '"64 213121\r\n\r\n64 40771494536\r\n\r\n64 91547531\r\n\r\n64 40771494536\r\n\r\n"'
>>> re.findall(r'[64\s]?(\d+?)[\r\n]+[64]?', test)
['213121', '40771494536', '91547531', '40771494536']
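
One thing worth knowing about the pattern above: [64\s] is a character class, so it matches a single character (6, 4, or whitespace) rather than the literal 64. A possible alternative, sketched here on the assumption that every record looks like "integer, space, number, line break", spells the prefix out as \d+ so it also works when the prefix is not 64. The input '7 111...' is invented for the example.

```python
import re

test = '"64 213121\r\n\r\n64 40771494536\r\n\r\n64 91547531\r\n\r\n64 40771494536\r\n\r\n"'

# A character class matches exactly one character from the set.
assert re.fullmatch(r'[64\s]', '6') is not None
assert re.fullmatch(r'[64\s]', '64') is None

# Any integer prefix, one space, then the captured number and a line break.
pattern = r'\d+ (\d+)\r\n'
pairs = re.findall(pattern, test)
print(pairs)  # ['213121', '40771494536', '91547531', '40771494536']

# Hypothetical input with a different prefix.
other_prefix = re.findall(pattern, '7 111\r\n\r\n7 222\r\n\r\n')
print(other_prefix)  # ['111', '222']
```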

Upvotes: 0
