William

Reputation: 124

Move (or copy) regex match from line to the beginning of same line

Is there a way with sed or awk (or some other tool I'm not aware of) to move or copy a regex match from where it appears on a line to the beginning of that same line? I'm comfortable with the regex itself (I'm using /2021-06-26T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z/ because I'm specifically matching timestamps on 2021-06-26); I'm just not sure how to prepend the match to the line it's found on.

For instance given this:

hello 2021-06-26T20:45:20.111Z
hello 2021-06-26T20:45:20.111Z hi
hi 2021-06-26T20:35:20.111Z yes
hola yes yes 2021-06-27T20:45:20.111Z
hello 2021-06-26T22:45:20.111Z
hey 2021-06-26T23:45:20.111Z
yo 2021-06-26T20:45:20.111Z no
salut 2021-06-26T20:45:20.111Z random words here
bonjour 2021-06-26T20:45:20.111Z

Is there a way to copy these timestamps to the beginning of each line? e.g.

2021-06-26T20:45:20.111Z hello 2021-06-26T20:45:20.111Z
2021-06-26T20:45:20.111Z hello 2021-06-26T20:45:20.111Z hi
2021-06-26T20:35:20.111Z hi 2021-06-26T20:35:20.111Z yes
2021-06-27T20:45:20.111Z hola yes yes 2021-06-27T20:45:20.111Z
2021-06-26T22:45:20.111Z hello 2021-06-26T22:45:20.111Z
2021-06-26T23:45:20.111Z hey 2021-06-26T23:45:20.111Z
2021-06-26T20:45:20.111Z yo 2021-06-26T20:45:20.111Z no
2021-06-26T20:45:20.111Z salut 2021-06-26T20:45:20.111Z random words here
2021-06-26T20:45:20.111Z bonjour 2021-06-26T20:45:20.111Z

EDIT: Googling led me to believe that & represents the matched text in a sed replacement. The answers since have taught me about \1, \2, and so on, which refer to numbered groups captured with ().

Things I tried that failed:

sed -E '/[2021-06-26T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z/' ~/sedtest
sed '/2021-06-26T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z/' ~/sedtest
sed -E '/2021-06-26T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z/' ~/sedtest
sed -E -n '/2021-06-26T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z/' ~/sedtest

Upvotes: 1

Views: 330

Answers (4)

user14473238

Reputation:

If you're happy with your current regex and only need to handle one occurrence of it per line, then you just need .* plus a single subexpression:

sed -E 's/.*(2021-06-26T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z)/\1 &/' file
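To illustrate the "one occurrence" caveat, here is a small sketch of what happens when a line carries two timestamps. The sample line is made up for the demonstration and is not from the question's data. Because .* is greedy, the group ends up capturing the last timestamp on the line, and & stands for everything matched up to and including it.

```shell
# Hypothetical sample line with two timestamps (not from the question).
line='a 2021-06-26T01:02:03.004Z b 2021-06-26T05:06:07.008Z c'

# Same regex as in the question, kept in a variable for readability.
ts='2021-06-26T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z'

# Greedy .* swallows the first timestamp, so \1 holds the LAST one;
# & re-emits the whole matched text, and the unmatched tail is left alone.
result=$(printf '%s\n' "$line" | sed -E "s/.*($ts)/\1 &/")
printf '%s\n' "$result"
# prints: 2021-06-26T05:06:07.008Z a 2021-06-26T01:02:03.004Z b 2021-06-26T05:06:07.008Z c
```

If the first timestamp is the one you want, this form is not enough on its own.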

Upvotes: 2

RavinderSingh13

Reputation: 133458

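With awk's match function, you could also try the following code.

awk '
# If the line contains a timestamp, match() sets RSTART and RLENGTH.
match($0,/[0-9]+(-[0-9]+){2}T[0-9]+(:[0-9]+){2}\.[0-9]+Z/){
  # Prepend the matched text, separated by OFS (a space by default).
  $0=substr($0,RSTART,RLENGTH) OFS $0
}
# Print every line, edited or not.
1
' Input_file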

Explanation: awk's match function tests each line against the regex [0-9]+(-[0-9]+){2}T[0-9]+(:[0-9]+){2}\.[0-9]+Z, which matches the date and time as they appear in the samples shown. On a match, the line is reassigned to the matched text followed by the original line. The final 1 then prints every line, edited or not.
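To see what match leaves behind, a quick sketch like the one below prints RSTART and RLENGTH for one of the sample lines. The regex here spells out the repetitions instead of using {2}, purely as a precaution in case the awk at hand lacks interval expressions; that rewrite is my assumption and not part of the code above.

```shell
# One of the sample lines from the question.
line='hi 2021-06-26T20:35:20.111Z yes'

info=$(printf '%s\n' "$line" | awk '
match($0, /[0-9]+-[0-9]+-[0-9]+T[0-9]+:[0-9]+:[0-9]+\.[0-9]+Z/) {
  # RSTART is the 1-based index where the match begins,
  # RLENGTH is the number of characters matched.
  print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
}')
printf '%s\n' "$info"
# prints: 4 24 2021-06-26T20:35:20.111Z
```

substr($0, RSTART, RLENGTH) is therefore exactly the timestamp, which is what gets prepended.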

Upvotes: 1

anubhava

Reputation: 784998

Your regex matches only the date 2021-06-26.

To match any date, the following regex should work for you:

sed -E 's/^(.*([0-9]{4}(-[0-9]{2}){2}T([0-9]{2}:){2}[0-9]{2}\.[0-9]{3}Z))/\2 \1/' file

2021-06-26T20:45:20.111Z hello 2021-06-26T20:45:20.111Z
2021-06-26T20:45:20.111Z hello 2021-06-26T20:45:20.111Z hi
2021-06-26T20:35:20.111Z hi 2021-06-26T20:35:20.111Z yes
2021-06-27T20:45:20.111Z hola yes yes 2021-06-27T20:45:20.111Z
2021-06-26T22:45:20.111Z hello 2021-06-26T22:45:20.111Z
2021-06-26T23:45:20.111Z hey 2021-06-26T23:45:20.111Z
2021-06-26T20:45:20.111Z yo 2021-06-26T20:45:20.111Z no
2021-06-26T20:45:20.111Z salut 2021-06-26T20:45:20.111Z random words here
2021-06-26T20:45:20.111Z bonjour 2021-06-26T20:45:20.111Z

Upvotes: 1

Socowi

Reputation: 27205

sed can do this using groups () and back-references such as \1 and \2:

sed -E 's/(.*)(2021-06-26T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z)/\2 \1\2/'
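The question also asks about moving rather than copying. A possible variation, offered as a sketch rather than a tested solution for every input, is to leave the second \2 out of the replacement and consume one optional space after the timestamp. The move_ts function name is made up for this example.

```shell
# Same regex as in the question, kept in a variable for readability.
ts='2021-06-26T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z'

# Hypothetical helper: move (not copy) the timestamp to the front of the line.
# " ?" eats the single space that followed the timestamp, if there was one,
# so the remaining words are not left with a double space between them.
move_ts() {
  sed -E "s/(.*)($ts) ?/\2 \1/"
}

printf '%s\n' 'hello 2021-06-26T20:45:20.111Z hi' | move_ts
# prints: 2021-06-26T20:45:20.111Z hello hi
```

Note that when the timestamp is the last thing on the line, the space that preceded it stays behind as a trailing space. Lines without a matching timestamp pass through unchanged.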

Upvotes: 2
