Data Mastery

Reputation: 2095

Extract date from inside a string with Python

I have the following string. The leading letters can differ, and there can be two, three, or four of them.

PR191030.213101.ABD

I want to extract the 191030 and convert that to a valid date.

I tried:

filename_without_ending.split(".")[0][-6:]

This solution is not valid, since the rest of the name can also differ from time to time. Sometimes it looks like this:

PZA191030_392001_USB

The only reliable pattern is the first six digits.

How do I do this?

Thank you!

Upvotes: 1

Views: 71

Answers (4)

SRG

Reputation: 345

Since the leading letters can differ, we have to ignore the letters and extract the digits.

Using the re module (for regular expressions), apply a regex pattern to the string; it returns the parts of the string that match.

\d matches a digit ([0-9] for ASCII input), and the + quantifier matches one or more of them.

findall() returns all occurrences of the pattern in the string, while search() finds only the first occurrence.

import re

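# Avoid naming the variable str, which would shadow the built-in type.
filename = "PR191030.213101.ABD"

# findall() returns every run of digits; [0] picks the first one.
print(re.findall(r"\d+", filename)[0])

# search() stops at the first run of digits.
print(re.search(r"\d+", filename).group())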
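To make the difference between findall() and search() concrete, here is a minimal sketch run against both filenames from the question. The helper names digit_runs and first_digit_run are made up for illustration and are not part of the re module.

```python
import re

def digit_runs(name):
    """Return every run of consecutive digits in name."""
    return re.findall(r"\d+", name)

def first_digit_run(name):
    """Return the first run of digits, or None when there is none."""
    match = re.search(r"\d+", name)
    return match.group() if match else None

for name in ["PR191030.213101.ABD", "PZA191030_392001_USB"]:
    print(digit_runs(name), first_digit_run(name))
```

Note that \d+ takes the whole run of digits, so this only yields six digits when the date is followed by a non-digit character, as it is in both samples.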

Upvotes: 0

The fourth bird

Reputation: 163632

You could get the first 6 digits using a pattern and a capturing group:

^[A-Z]{2,4}(\d{6})\.
  • ^ Start of string
  • [A-Z]{2,4} Match 2, 3 or 4 uppercase chars
  • ( Capture group 1
    • \d{6} Match 6 digits
  • )\. Close group and match trailing dot


For example

import re

regex = r"^[A-Z]{2,4}(\d{6})\."
test_str = "PR191030.213101.ABD"
matches = re.search(regex, test_str)

if matches:
    print(matches.group(1))

Output

191030
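The question also asks for a valid date and mentions names such as PZA191030_392001_USB, where the digits are followed by an underscore instead of a dot. A possible variation, sketched here under the assumption that the six digits are in YYMMDD order, drops the trailing \. from the pattern and parses the captured group with datetime.strptime. The names DATE_PREFIX and extract_date are hypothetical.

```python
import re
from datetime import datetime

# Letters at the start, then exactly six digits; no separator is required.
DATE_PREFIX = re.compile(r"^[A-Z]{2,4}(\d{6})")

def extract_date(name):
    """Return the leading six digits as a date, or None if there is no match."""
    match = DATE_PREFIX.search(name)
    if match is None:
        return None
    # Assumption: the digits are YYMMDD; strptime raises ValueError otherwise.
    return datetime.strptime(match.group(1), "%y%m%d").date()

print(extract_date("PR191030.213101.ABD"))   # 2019-10-30
print(extract_date("PZA191030_392001_USB"))  # 2019-10-30
```

If the digits turn out to be in another order, only the format string passed to strptime needs to change.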

Upvotes: 3

Maximilian Schaller

Reputation: 26

This can also be done by:

filename_without_ending.split(".")[0][2::]

This takes the part before the first dot and slices it from the 3rd character to the end, so it only works when the prefix is exactly two letters.

Upvotes: 1

Aryerez

Reputation: 3495

You can do:

a = 'PR191030.213101.ABD'
int(''.join([c for c in a if c.isdigit()][:6]))

Output:

191030
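One caveat with int(): it drops leading zeros, which matters if the digits are later parsed as a date. A small sketch that keeps the digits as a string instead; the function name first_six_digits and the sample filename with a leading zero are made up for illustration.

```python
def first_six_digits(name):
    """Return the first six digit characters of name as a string."""
    digits = [c for c in name if c.isdigit()]
    return "".join(digits[:6])

sample = "PR051030.213101.ABD"  # hypothetical filename with a leading zero
print(int(first_six_digits(sample)))  # 51030
print(first_six_digits(sample))       # 051030
```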

Upvotes: 3
