winnie99

Reputation: 123

How do I use regex to get the key information from a string in Python?

I have a string from which I want to extract the key information:

gbk_kings_common_20171201_20180131_66000.0k_2017-12-01_TO_2018-01-31_id12_1277904128.csv

Namely, I would like to find the following:

  1. File identifier, e.g. gbk_kings_common_20171201_20180131
  2. Size, e.g. 66000.0k
  3. Date, e.g. 2017-12-01_TO_2018-01-31
  4. Type of id, e.g. id12_1277904128

But I'm having difficulty writing the regex because the file identifier can vary in length, although the rest of the information is fairly fixed when the name is split on underscores.

Upvotes: 0

Views: 57

Answers (1)

Sunitha

Reputation: 12015

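You can use the pattern r'(.*)_(.*)_([\d-]+_TO_[\d-]+)_(id[\d_]*)' to search your string. The first (.*) is greedy, so it absorbs the variable-length file identifier and leaves only the last underscore-free chunk before the date range for the size.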

>>> import re
>>> s = "gbk_kings_common_20171201_20180131_66000.0k_2017-12-01_TO_2018-01-31_id12_1277904128.csv"
>>> sre = re.search(r'(.*)_(.*)_([\d-]+_TO_[\d-]+)_(id[\d_]*)', s)
>>> file_id, size, date, type_id = sre.groups()
>>> print(file_id, size, date, type_id)
gbk_kings_common_20171201_20180131 66000.0k 2017-12-01_TO_2018-01-31 id12_1277904128
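
If you want the match to fail loudly on names that do not follow the layout, a stricter variant with named groups may help. This is only a sketch: it assumes the size always ends in "k", the dates are always YYYY-MM-DD, and the extension is always .csv. The names FILENAME_RE and parse_filename are made up for illustration.

```python
import re

# Hypothetical pattern and group names, chosen for this illustration only.
# Assumes: size ends in "k", dates are YYYY-MM-DD, extension is ".csv".
FILENAME_RE = re.compile(
    r'^(?P<file_id>.+)'                                   # variable-length identifier
    r'_(?P<size>\d+(?:\.\d+)?k)'                          # size such as 66000.0k
    r'_(?P<date>\d{4}-\d{2}-\d{2}_TO_\d{4}-\d{2}-\d{2})'  # date range
    r'_(?P<type_id>id\d+_\d+)'                            # id type and number
    r'\.csv$'
)


def parse_filename(name):
    """Return the four fields as a dict, or None if the name does not match."""
    match = FILENAME_RE.match(name)
    return match.groupdict() if match else None


s = "gbk_kings_common_20171201_20180131_66000.0k_2017-12-01_TO_2018-01-31_id12_1277904128.csv"
fields = parse_filename(s)
print(fields["file_id"], fields["size"], fields["date"], fields["type_id"])
```

Because the pattern is anchored at both ends, a name that is missing any of the fixed parts returns None instead of producing a partial or misaligned match.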

Upvotes: 4
