Reputation: 4577
Given a file in the following format (where X is any text, without newlines):
01st December 2019
0100 X
0200 X
0300 X
1745 X
02nd December 2019
0015 X
1555 X
2335 X
What regex would transform it so that the date is at the start of each line and the lines that are just the date are removed, e.g.:
01st December 2019 0100 X
01st December 2019 0200 X
01st December 2019 0300 X
01st December 2019 1745 X
02nd December 2019 0015 X
02nd December 2019 1555 X
02nd December 2019 2335 X
I know I can find the lines that start with dates by searching for [0-3][0-9][snrt][tdh], and the start of a line by searching for ^, but how can I say "find ^ and replace with the previous match for the date"?
Upvotes: 0
Views: 455
Reputation: 91430
If the number of lines to be joined is not too high, you can do the following, where I've limited it to 6 time lines per date (7 capture groups, counting the date itself):
Find:
^(\d\d(?:st|nd|rd|th) \w+ \d{4})$\R(^\d{4} .+$)(?:\R(^\d{4} .+$))?(?:\R(^\d{4} .+$))?(?:\R(^\d{4} .+$))?(?:\R(^\d{4} .+$))?(?:\R(^\d{4} .+$))?
Replace with:
$1 $2(?3\n$1 $3)(?4\n$1 $4)(?5\n$1 $5)(?6\n$1 $6)(?7\n$1 $7)
Uncheck ". matches newline", so that .+ stops at the end of each line.
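If the number of lines per date is not bounded, a short script may be simpler than one long regex. Below is a minimal sketch in Python, assuming the whole text fits in memory. The names `DATE_LINE` and `prefix_dates` are made up for this example, and the date pattern is the same one used for group 1 above.

```python
import re

# A date-only line such as "01st December 2019" (same shape as group 1 above).
DATE_LINE = re.compile(r"^\d\d(?:st|nd|rd|th) \w+ \d{4}$")


def prefix_dates(text: str) -> str:
    """Prefix every non-date line with the most recent date line; drop date lines."""
    current_date = None
    out = []
    for line in text.splitlines():
        if DATE_LINE.match(line):
            # Remember the date and do not emit the line itself.
            current_date = line
        elif current_date is not None:
            out.append(f"{current_date} {line}")
        else:
            # No date seen yet: keep the line unchanged.
            out.append(line)
    return "\n".join(out)
```

Unlike the regex, this handles any number of lines under each date, at the cost of leaving the editor.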
Explanation:
^ # beginning of line
( # group 1
\d\d # 2 digits (the day)
(?:st|nd|rd|th) # any of st or nd or rd or th
\w+ # 1 or more word characters (the month)
\d{4} # 4 digits (the year)
) # end group
$ # end of line
\R # any kind of linebreak
( # group 2
^ # beginning of line
\d{4} # 4 digits (the time)
.+ # 1 or more of any character except newline
$ # end of line
) # end group 2
(?: # non capture group
\R # any kind of linebreak
(^\d{4} .+$) # group 3, same pattern as in group 2
)? # end group, optional
(?:\R(^\d{4} .+$))? # same as above for group 4
(?:\R(^\d{4} .+$))? # same as above for group 5
(?:\R(^\d{4} .+$))? # same as above for group 6
(?:\R(^\d{4} .+$))? # same as above for group 7
You can add more groups if you need them.
Replacement:
$1 $2 # content of group 1, space, content of group 2
(?3 # if group 3 exists:
\n # linefeed
$1 $3 # content of group 1, space, content of group 3
) # end condition
(?4\n$1 $4) # same as above for group 4
(?5\n$1 $5) # same as above for group 5
(?6\n$1 $6) # same as above for group 6
(?7\n$1 $7) # same as above for group 7
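Since adding groups by hand is repetitive, the two strings could be generated instead. This is only a sketch in Python: `build_patterns` and `max_lines` are hypothetical names, and with `max_lines=6` it reproduces the search and replacement strings given above. Group numbers of 10 or more may need a different replacement syntax, which this sketch does not handle.

```python
def build_patterns(max_lines: int) -> tuple[str, str]:
    """Return (find, replace) strings allowing up to max_lines time lines per date."""
    time_line = r"(^\d{4} .+$)"
    # Group 1 is the date line; group 2 is the first (mandatory) time line.
    find = r"^(\d\d(?:st|nd|rd|th) \w+ \d{4})$\R" + time_line
    replace = "$1 $2"
    # Groups 3 .. max_lines + 1 are the optional extra time lines.
    for group in range(3, max_lines + 2):
        find += r"(?:\R" + time_line + ")?"
        # Conditional replacement: emit "\n<date> <line>" only if the group matched.
        replace += f"(?{group}\\n$1 ${group})"
    return find, replace


find, replace = build_patterns(6)
print(find)
print(replace)
```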
Upvotes: 1