vinay

Reputation: 61

Extracting a particular number from a line containing several numbers using Python

I have lines that each contain several numbers, but I need to extract only the first 7-digit number from each line.

import re
out=['DOT/R9.4x                 4616542  rtpbuild   x. : 20171111184750 p4 p4burtd review','DOT/dex                 4609974  build      ~. : 20171108044757 p4 p4burtd review']
for item in out:
    line=re.findall(r'\d{7}',item)
    print(line)

I'm getting this output:

['4616542', '2017111', '1184750']
['4609974', '2017110', '8044757']

But I actually need only the first 7-digit number from each line:

[4616542]
[4609974]

I don't need the remaining numbers, which are just pieces of the longer 14-digit number.

Upvotes: 1

Views: 44

Answers (4)

Yorian

Reputation: 2062

findall returns a list. If you need only the first item, index it with [0]:

line=re.findall(r'\d{7}',item)[0]

Upvotes: 0

Aaditya Ura

Reputation: 12669

findall() finds all the matches and returns them as a list of strings, one string per match. You can check this by iterating over the result:

import re
out=['DOT/R9.4x                 4616542  rtpbuild   x. : 20171111184750 p4 p4burtd review','DOT/dex                 4609974  build      ~. : 20171108044757 p4 p4burtd review']
for item in out:
    line=re.findall(r'\d{7}',item)
    for i in line:
        print(i)

Output:

4616542
2017111
1184750
4609974
2017110
8044757

So instead of findall, use re.search, which returns only the first match.

One-line solution:

print([re.search(r'\d{7}',item).group() for item in out])

Output:

['4616542', '4609974']
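
One caveat with the one-liner: re.search returns None when a line contains no run of seven digits, so calling .group() on it would raise an AttributeError. A minimal sketch of a guarded loop follows. The third input line is a hypothetical addition for illustration and is not part of the question's data, and converting to int is an assumption based on the desired output shown in the question.

```python
import re

out = [
    'DOT/R9.4x                 4616542  rtpbuild   x. : 20171111184750 p4 p4burtd review',
    'DOT/dex                 4609974  build      ~. : 20171108044757 p4 p4burtd review',
    'a line without any long number',  # hypothetical extra line, not in the question
]

numbers = []
for item in out:
    match = re.search(r'\d{7}', item)  # first run of seven digits, or None
    if match is not None:              # skip lines with no match
        numbers.append(int(match.group()))

print(numbers)
```

Lines without a match are silently skipped here; raising an error or recording a placeholder would be equally reasonable, depending on how the data is used.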

Upvotes: 0

Carles Mitjans

Reputation: 4866

You should use search instead of findall if you are looking for the first occurrence:

for item in out:
    line=re.search(r'\b\d{7}\b',item)
    print(line.group(0))

Notice that re.search returns a match object (or None if nothing matches), so you need .group(0) to get the matched text. Another option would be to extract the first value from the list returned by findall.

Edited: Using \b will avoid matching bigger numbers (thanks to @Jean).
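
To make the effect of \b concrete, here is a small sketch. The first string is a sample line from the question. The second, `reordered`, is a hypothetical line (not from the question) in which the 14-digit number comes first, which is the case where the plain \d{7} pattern would return the wrong value.

```python
import re

sample = 'DOT/R9.4x                 4616542  rtpbuild   x. : 20171111184750 p4 p4burtd review'
# Hypothetical line where the long number appears before the 7-digit one.
reordered = '20171111184750 rtpbuild 4616542'

plain = r'\d{7}'        # any seven consecutive digits, even inside a longer number
bounded = r'\b\d{7}\b'  # exactly seven digits with a word boundary on each side

print(re.findall(plain, sample))    # includes slices of the 14-digit number
print(re.findall(bounded, sample))  # only the standalone 7-digit number

print(re.search(plain, reordered).group(0))    # a slice of the long number
print(re.search(bounded, reordered).group(0))  # the standalone 7-digit number
```

With the sample data from the question both patterns happen to agree on the first match, because the 7-digit number appears before the longer one; the bounded pattern only matters once that ordering is not guaranteed.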

Upvotes: 3

Van Peer

Reputation: 2167

import re
out=['DOT/R9.4x                 4616542  rtpbuild   x. : 20171111184750 p4 p4burtd review','DOT/dex                 4609974  build      ~. : 20171108044757 p4 p4burtd review']
for item in out:
    line=re.findall(r'\d{7}',item)[0]
    print([int(line)])

Upvotes: 0
