Victor Sanchez

Reputation: 31

How can I delete this part of the text with regex?

I have a problem that I hope somebody can help me with. I want to delete some parts of the text in a Notepad++ document using regex. If there is other software I can use to delete this part of the text, please let me know. I am a complete beginner with regex.

My document looks like this:

1
00:00:00,859 --> 00:00:03,070
text over here

2
00:00:03,070 --> 00:00:09,589
text over here

3
00:00:09,589 --> 00:00:10,589
some numbers here

4
00:00:10,589 --> 00:00:12,709
Text over here

5
00:00:12,709 --> 00:00:18,610
More text with numbers here

What I want to learn is how to delete the first two lines of numbers in every block throughout the document, so that only the text parts (the "text over here" parts) remain.

I would really appreciate any kind of help!

Upvotes: 2

Views: 89

Answers (4)

buræquete

Reputation: 14688

Simplest solution:

\d+(\r\n|\r|\n)\d{2}:\d{2}.*(\r\n|\r|\n)

This matches a line with some number (\d+) together with its line break (\r\n|\r|\n), plus the next line, which starts with two 2-digit numbers separated by a colon (\d{2}:\d{2}), followed by the rest of that line (.*) and its line break. There is no need to match the whole timestamp: by that point we are already on the correct line, because a subtitle file has a well-defined, predictable structure.

Put this in the Find what: field under Search -> Replace... in Notepad++, set Search Mode to Regular expression, and leave Replace with: empty. Make sure ". matches newline" is unchecked, otherwise .* will run past the end of the line. This gives the expected result: the text lines, with an empty line between each.

See it in action on regex101.
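
For anyone who would rather script this than use the Replace dialog, here is a minimal sketch of the same substitution in Python. It is not part of the original answer; the names HEADER, strip_headers and sample are made up for illustration, and the sample uses plain \n line endings.

```python
import re

# Pattern from the answer above: a run of digits and its line break, then a
# line starting with two 2-digit numbers and a colon, plus its line break.
HEADER = re.compile(r"\d+(\r\n|\r|\n)\d{2}:\d{2}.*(\r\n|\r|\n)")


def strip_headers(srt_text: str) -> str:
    """Remove index and timestamp lines, keeping the subtitle text."""
    return HEADER.sub("", srt_text)


# Hypothetical sample in the same shape as the question's document.
sample = (
    "1\n"
    "00:00:00,859 --> 00:00:03,070\n"
    "text over here\n"
    "\n"
    "2\n"
    "00:00:03,070 --> 00:00:09,589\n"
    "text over here\n"
)

print(strip_headers(sample))
```

As in Notepad++, the blank line that separates the subtitle blocks is left in place, so the text lines come out separated by one empty line.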

Upvotes: 1

Marcelo Scofano Diniz

Reputation: 669

I'm going for a less specific regex:

 ^[0-9]*\n[0-9:,]*\s-->\s[0-9:,]*

Demo @ regex101
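
A quick way to see what this pattern does and does not consume is to run it in Python. This is only a sketch, not something from the answer: LOOSE, strip_headers_loose and the sample strings are illustrative names, and re.MULTILINE is added so that ^ matches at the start of each line, which the pattern relies on.

```python
import re

# The pattern from the answer, with ^ anchored per line via re.MULTILINE.
LOOSE = re.compile(r"^[0-9]*\n[0-9:,]*\s-->\s[0-9:,]*", re.MULTILINE)


def strip_headers_loose(srt_text: str) -> str:
    """Remove the index line and the timestamp line matched by LOOSE."""
    return LOOSE.sub("", srt_text)


# Hypothetical sample with Unix (\n) line endings.
lf_sample = "1\n00:00:00,859 --> 00:00:03,070\ntext over here\n"
# The same sample with Windows (\r\n) line endings.
crlf_sample = lf_sample.replace("\n", "\r\n")

# The line break after the timestamp is not part of the match,
# so an empty line is left in front of the text.
print(repr(strip_headers_loose(lf_sample)))

# The pattern contains a literal \n only, so with \r\n endings
# nothing matches and the text comes back unchanged.
print(repr(strip_headers_loose(crlf_sample)))
```

If the file uses Windows line endings, replacing the \n in the pattern with \r?\n (and appending another \r?\n to swallow the trailing line break) is one possible adjustment.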

Upvotes: 0

Tyl

Reputation: 5252

For subtitles, you can use this more accurate pattern:

\d+(\r\n|\n|\r)(\d\d:){2}\d\d,\d{3}\s*-->\s*(\d\d:){2}\d\d,\d{3}(\r\n|\n|\r)

Select Regular expression as the search mode, put this in Find what, and leave Replace with empty.
Regex Demo

SRT subtitles follow a fixed, ordered structure, and it is better to be accurate than to lose text.

\d : a single digit.
+ : one or more occurrences of the preceding character or group.
\r\n : carriage return and line feed (newline).
* : zero or more occurrences of the preceding character or group.
| : or, match either alternative.
{3} : match the preceding character or group exactly three times.
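
To illustrate why the stricter pattern can matter, here is a small Python sketch (not from the answer itself). The subtitle block is a made-up edge case whose text happens to look like a number followed by a time, and LOOSE is a hypothetical looser pattern that only checks for two 2-digit numbers and a colon at the start of the second line.

```python
import re

# The strict pattern from the answer: full hh:mm:ss,mmm --> hh:mm:ss,mmm line.
STRICT = re.compile(
    r"\d+(\r\n|\n|\r)(\d\d:){2}\d\d,\d{3}\s*-->\s*(\d\d:){2}\d\d,\d{3}(\r\n|\n|\r)"
)
# A looser pattern for comparison (assumption, for illustration only).
LOOSE = re.compile(r"\d+(\r\n|\r|\n)\d{2}:\d{2}.*(\r\n|\r|\n)")

# Made-up block whose text lines resemble an index and a time.
cue = (
    "3\n"
    "00:00:09,589 --> 00:00:10,589\n"
    "42\n"
    "10:30 tomorrow\n"
)

strict_result = STRICT.sub("", cue)
loose_result = LOOSE.sub("", cue)

print(repr(strict_result))  # the two text lines survive
print(repr(loose_result))   # the text lines are removed as well
```

The strict pattern requires the full timestamp and the --> arrow, so "42" followed by "10:30 tomorrow" is not mistaken for a header.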

Upvotes: 0

Ibrahim

Reputation: 6088

My solution:

^[\s\S]{1,5}\d{1,3}:\d{1,3}:\d{1,3},\d{1,5}\s-->\s*?\d{1,3}:\d{1,3}:\d{1,3},\d{1,5}\s

This solution matches both layouts: either all the data on one line, or the number on one line and the rest on the next.

Demo: https://regex101.com/r/nKD0DQ/1/
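
The two layouts can be checked with a short Python sketch. This is an illustration under assumptions, not part of the answer: BOTH, strip_both and the sample strings are hypothetical, and re.MULTILINE is used so that ^ matches at the start of each line.

```python
import re

# The pattern from the answer, with ^ anchored per line via re.MULTILINE.
BOTH = re.compile(
    r"^[\s\S]{1,5}\d{1,3}:\d{1,3}:\d{1,3},\d{1,5}\s-->\s*?"
    r"\d{1,3}:\d{1,3}:\d{1,3},\d{1,5}\s",
    re.MULTILINE,
)


def strip_both(srt_text: str) -> str:
    """Remove index and timestamps in either layout."""
    return BOTH.sub("", srt_text)


# Layout 1: number on one line, timestamps on the next (made-up sample).
two_line = (
    "1\n00:00:00,859 --> 00:00:03,070\ntext over here\n"
    "\n"
    "2\n00:00:03,070 --> 00:00:09,589\ntext over here\n"
)
# Layout 2: number, timestamps and text all on one line (made-up sample).
one_line = (
    "1 00:00:00,859 --> 00:00:03,070 text over here\n"
    "2 00:00:03,070 --> 00:00:09,589 more text\n"
)

print(repr(strip_both(two_line)))
print(repr(strip_both(one_line)))
```

Note that [\s\S]{1,5} matches any characters, including line breaks, so in the two-line sample the blank line between the blocks is removed along with the header of the second block.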

Upvotes: 1
