Reuse patterns in awk program

Question

I want to write a fairly long awk program, so I would like to make my code more readable and easier to maintain. The first code snippet works, but it is hard to read and harder to maintain.

/\(..-av-es\/.*\)/ {
    split($0, arr, /\(..-av-es\/.*\)/)
}

Therefore I would like to define the regex once in a variable and use that variable. $0 ~ PATTERN {...} works, but split($0, arr, PATTERN) doesn't. What exactly am I doing wrong?

BEGIN { PATTERN="\(..-av-es\/.*\)" }

$0 ~ PATTERN {
    split($0, arr, PATTERN)
}

EDIT: I have a file structured like this:

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
abc (fd-av-es/key1) value1sdfsdaff
jjjjjjjjjjjjjjjjjjjjjjjjjjj
(sd-av-es/key2) value2sdfsdaff

My final goal is to have an array of strings: "key1:value1", "key2:value2".

This snippet

/\(..-av-es\/.*\)/ {
    split($0, arr, /\(..-av-es\/.*\)/)
    for (i in arr) { print NR arr[i] }
}

returns the following, which brings me a little closer to value1 and value2:

2abc
2 value1afjskhslakjhf
4
4 value2jkalshfkjkl

but this version

BEGIN { PATTERN="\(..-av-es\/.*\)" }
$0 ~ PATTERN {
    split($0, arr, PATTERN)
    for (i in arr) { print NR arr[i] }
}

returns:

2abc (
2
4(
4

Thanks

Ed Morton · Accepted Answer

What you have in your question is a regexp, so call it that instead of the highly ambiguous "pattern". See "How do I find the text that matches a pattern?" for more info on that topic.

You don't need to provide the regexp twice, just do this instead:

split($0, arr, /\(..-av-es\/.*\)/) > 1 {
    ...
}
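
As a rough illustration of the single-regexp form, the shell sketch below feeds the sample lines from the question to awk. It assumes the regexp is meant to match the parenthesized (xx-av-es/...) token. The printf format and the square brackets are my own additions, there only to make leading and trailing whitespace visible.

```shell
# Sample input copied from the question.
input='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
abc (fd-av-es/key1) value1sdfsdaff
jjjjjjjjjjjjjjjjjjjjjjjjjjj
(sd-av-es/key2) value2sdfsdaff'

out=$(printf '%s\n' "$input" | awk '
# split() returns the number of pieces. A result above 1 means the
# separator regexp matched somewhere in the line, so the call doubles
# as the condition and the regexp is written only once.
split($0, arr, /\(..-av-es\/.*\)/) > 1 {
    printf "%d [%s] [%s]\n", NR, arr[1], arr[2]
}')
printf '%s\n' "$out"
```

Only lines 2 and 4 contain the token, so only those two are printed; on lines without a match split() returns 1 and the action is skipped.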

If for some reason you did want to do what you're trying to do then you should do this with GNU awk for strongly typed regexp constants:

BEGIN {
    regexp = @/\(..-av-es\/.*\)/
}

$0 ~ regexp {
    split($0, arr, regexp)
    ...
}

Or, with any other awk, use a dynamic regexp. That is a string which awk parses twice, first as a string literal when it reads the script and then as a regexp when it is used, so you need to double the escapes:

BEGIN {
    regexp = "\\(..-av-es\\/.*\\)"
}

$0 ~ regexp {
    split($0, arr, regexp)
    ...
}
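
To make the double parsing a bit more concrete, here is a small sketch that prints the text held in the variable before using it. It assumes the intent is to match literal parentheses around the xx-av-es/... token. The "regexp text:" label and the bracketed printf output are illustrative additions, and the slash is left unescaped here because it is not a delimiter inside a string.

```shell
out=$(printf '%s\n' 'abc (fd-av-es/key1) value1sdfsdaff' | awk '
BEGIN {
    # String parsing collapses each doubled backslash to a single one,
    # so the variable holds \( and \) and the regexp engine then
    # treats both as literal parentheses rather than as a group.
    regexp = "\\(..-av-es/.*\\)"
    print "regexp text: " regexp
}
$0 ~ regexp {
    n = split($0, arr, regexp)
    printf "%d [%s] [%s]\n", n, arr[1], arr[2]
}')
printf '%s\n' "$out"
```

The first output line shows the regexp as the engine sees it, with one backslash before each parenthesis. The second shows the line split into the text before and after the token.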

See https://www.gnu.org/software/gawk/manual/gawk.html#Using-Constant-Regexps and https://www.gnu.org/software/gawk/manual/gawk.html#Computed-Regexps for more info on the difference between dynamic regexps, constant regexps, and strongly typed regexp constants.
