How to extract text within a time range

Question

I have the text below. How can I extract the text within a given time range? I already have code that extracts all the values.

s = '''00:00:14,099 --> 00:00:19,100
a classic math problem a

00:00:17,039 --> 00:00:28,470
will come from an unexpected place

00:00:18,039 --> 00:00:19,470

00:00:20,039 --> 00:00:21,470

00:00:22,100 --> 00:00:30,119
binary numbers first I'm going to give

00:00:30,119 --> 00:00:35,430
puzzle and then you can try to solve it

00:00:32,489 --> 00:00:37,170
like I said you have a thousand bottles'''

Can I extract the text from 00:00:17,039 --> 00:00:28,470 through 00:00:30,119?

Code to print all the values:

import re
lines = s.split('\n')
subtitles = {}  # maps each timing line to the text that follows it
current_key = None

for line in lines:
    # A timing line looks like "00:00:14,099 --> 00:00:19,100"
    is_key_match_obj = re.search(r'([\d:,]{12})(\s-->\s)([\d:,]{12})', line)
    if is_key_match_obj:
        current_key = is_key_match_obj.group()
        print(current_key)
        continue

    if current_key:
        if current_key in subtitles:
            if not line:
                subtitles[current_key] += '\n'
            else:
                subtitles[current_key] += line
        else:
            subtitles[current_key] = line

print(subtitles.values())

Expected output, from 00:00:17,039 --> 00:00:28,470 to 00:00:30,119 --> 00:00:35,430:

dict_values(['will come from an unexpected place ', '', '', 'binary numbers first I'm going to give', ' puzzle and then you can try to solve it'])

Karmveer Singh · Accepted Answer

No need to iterate line by line. Try the code below; it builds the dictionary you wanted, keyed by timing line.

import re
# Capture each timing line and the (possibly empty) text line right after it
subtitles = dict(re.findall(r'(\d{2}:\d{2}.*)\n(.*)', s))
print(subtitles.values())

Output

dict_values(['a classic math problem a', 'will come from an unexpected place', '', '', "binary numbers first I'm going to give", 'puzzle and then you can try to solve it', 'like I said you have a thousand bottles'])
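
The output above still contains every entry, so the time-range part of the question needs one more step. The following is a minimal sketch of one way to do it, not part of the original answer. The function name `extract_between` and the pattern `CUE_RE` are made up for illustration. It assumes every timestamp is zero-padded as HH:MM:SS,mmm, so that comparing the strings gives the same order as comparing the times, and that each entry has at most one line of text.

```python
import re

# Sample text copied from the question.
s = '''00:00:14,099 --> 00:00:19,100
a classic math problem a

00:00:17,039 --> 00:00:28,470
will come from an unexpected place

00:00:18,039 --> 00:00:19,470

00:00:20,039 --> 00:00:21,470

00:00:22,100 --> 00:00:30,119
binary numbers first I'm going to give

00:00:30,119 --> 00:00:35,430
puzzle and then you can try to solve it

00:00:32,489 --> 00:00:37,170
like I said you have a thousand bottles'''

# Hypothetical helper pattern: start time, end time, then the line after them.
CUE_RE = re.compile(
    r'(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\n(.*)'
)


def extract_between(text, first_start, last_start):
    """Return the text of each entry whose start time is in [first_start, last_start]."""
    # Zero-padded timestamps of equal length sort the same way as strings
    # and as times, so no conversion to numbers is needed here.
    return [
        body
        for start, _end, body in CUE_RE.findall(text)
        if first_start <= start <= last_start
    ]


print(extract_between(s, '00:00:17,039', '00:00:30,119'))
```

Both bounds are inclusive and are compared against each entry's start time only. If the entries can span several lines, or the timestamps are not zero-padded, the pattern and the comparison would need to be adapted.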
