Reputation: 151
I have multiple files with different job names. The job name is specified as follows.
#SBATCH --job-name=01_job1 #Set the job name
I want to use sed/awk/grep to automatically get the name, that is to say, what follows '--job-name=' and precedes the comment '#Set the job name'. For the example above, I want to get 01_job1. The job name could be longer for several files, and there are multiple = signs in following lines in the file.
I have tried using
grep -oP "job-name=\s+\K\w+" file
and got an empty output. I suspect that this doesn't work because there is no space between 'name=' and '01_job1', so they must be understood as a single word.
I also unsuccessfully tried using
awk '{for (I=1;I<NF;I++) if ($I == "name=") print $(I+1)}' file
attempting to find the characters after 'name='.
Lastly, I also unsuccessfully tried
sed -e 's/name=\(.*\)#Set/\1/' file
to find the characters between 'name=' and the beginning of the comment '#Set'. I receive the whole file as my output when I attempt this.
I appreciate any guidance. Thank you!!
Upvotes: 2
Views: 282
Reputation: 184965
You were close. This is a corrected version of your grep -oP attempt (the main issue is that you were trying to match whitespace after the = character, and there is none):
$ grep -oP -- '--job-name=\K\S+' file
01_job1
Node | Explanation |
---|---|
job-name= | 'job-name=' |
\K | resets the start of the match (what is Kept) as a shorter alternative to using a look-behind assertion: perlmonks look arounds and Support of K in regex |
\S+ | non-whitespace (all but \n, \r, \t, \f, and " ") (1 or more times (matching the most amount possible)) |
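
The \K row above describes it as a shorter alternative to a look-behind. A minimal sketch of that equivalence, assuming GNU grep built with PCRE support (-P); the function names with_k and with_lookbehind are made up for illustration:

```shell
#!/bin/sh
line='#SBATCH --job-name=01_job1 #Set the job name'

# \K form: match the prefix, then drop it from the reported match.
with_k() { printf '%s\n' "$1" | grep -oP -- '--job-name=\K\S+'; }

# Look-behind form: assert the prefix without consuming it.
with_lookbehind() { printf '%s\n' "$1" | grep -oP -- '(?<=--job-name=)\S+'; }

with_k "$line"
with_lookbehind "$line"
```

Both calls should print the same job name for the sample line. The \K form tends to be easier to extend, since a look-behind in PCRE needs a fixed-length pattern while the text before \K does not.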
Upvotes: 2
Reputation: 103714
You can use a lookbehind and lookahead with GNU grep to get exactly what you describe:
grep -oP '(?<=--job-name=)\S+(?=\s+#Set the job name)' file
Or with awk:
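awk '/^#SBATCH[[:space:]]+--job-name=/ &&
     /#Set the job name$/ {
    sub(/^[^=]*=/,"")               # remove everything up to the first =
    sub(/[[:space:]]*#[^#]*$/,"")   # remove the comment and the whitespace before it
    print
}' file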
Or perl:
perl -lnE 'say $1 if /(?<=--job-name=)(\S+)(?=\s+#Set the job name)/' file
Any of these prints:
01_job1
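
Since the question mentions several files, here is a rough sketch of how the awk approach could be run over all of them at once, printing each file name next to its job name. The helper name job_names and the two sample scripts are assumptions made up for this illustration, and it assumes an awk that supports POSIX character classes:

```shell
#!/bin/sh
# Hypothetical helper: print "file: job name" for every file given.
job_names() {
    awk '/^#SBATCH[[:space:]]+--job-name=/ {
        sub(/^[^=]*=/, "")            # drop everything up to the first =
        sub(/[[:space:]]*#.*$/, "")   # drop the comment and the spaces before it
        print FILENAME ": " $0
    }' "$@"
}

# Demo with two made-up job scripts in a temporary directory.
tmp=$(mktemp -d)
printf '#!/bin/bash\n#SBATCH --job-name=01_job1 #Set the job name\n#SBATCH --ntasks=4\n' > "$tmp/a.sh"
printf '#!/bin/bash\n#SBATCH --job-name=02_longer_name #Set the job name\n' > "$tmp/b.sh"
job_names "$tmp"/a.sh "$tmp"/b.sh
```

awk reads all files named on the command line in turn, so no shell loop is needed; FILENAME holds the file currently being read.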
Upvotes: 1
Reputation: 133428
1st solution: In GNU awk, with your shown samples, please try the following awk code.
awk -v RS=' --job-name=\\S+' 'RT && split(RT,arr,"="){print arr[2]}' Input_file
OR a non-one-liner form of the above GNU awk code would be:
awk -v RS=' --job-name=\\S+' '
RT && split(RT,arr,"="){
print arr[2]
}
' Input_file
2nd solution: Using any awk, please try the following code.
awk -F'[[:space:]]+|--job-name=' '{print $3}' Input_file
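
To see why the job name ends up in $3 with that field separator, here is a small sketch that prints every field awk produces for the sample line. The helper name show_fields is made up for illustration, and it assumes an awk that supports POSIX character classes:

```shell
#!/bin/sh
line='#SBATCH --job-name=01_job1 #Set the job name'

# Print each field produced by the two-alternative field separator.
show_fields() {
    printf '%s\n' "$1" |
        awk -F'[[:space:]]+|--job-name=' '{
            for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i
        }'
}

show_fields "$line"
```

The space after #SBATCH and the literal --job-name= are two adjacent separators, so $2 is empty and the job name lands in $3. Note that the one-liner prints $3 for every line of the file, not only for the job-name line.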
3rd solution: Using GNU grep, please try the following code with your shown samples, using the non-greedy .*? approach in the regex.
grep -oP '^.*?--job-name=\K\S+' Input_file
Upvotes: 2
Reputation: 26471
Similar to the answer of Gilles Quenot:
grep -oP -- '--job-name=\K.*?(?= *# *Set the job name)' file
This adds a look-ahead to ensure that the string is followed by #Set the job name
Upvotes: 2
Reputation: 626689
You need to match the whole string with sed and capture just what you need to get, and use the -n option with the p flag:
sed -n 's/.*name=\([^[:space:]]*\).*/\1/p'
See the online demo:
#!/bin/bash
s='#SBATCH --job-name=01_job1 #Set the job name'
sed -n 's/.*name=\([^[:space:]]*\).*/\1/p' <<< "$s"
# => 01_job1
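
To illustrate why the attempt in the question printed the whole file, here is a small comparison on a made-up two-line input. The function names question_attempt and corrected are hypothetical and only serve this sketch:

```shell
#!/bin/sh
# Hypothetical two-line input: the job-name line plus one unrelated directive.
input='#SBATCH --job-name=01_job1 #Set the job name
#SBATCH --ntasks=4'

# Attempt from the question: no -n, and only part of the line is matched,
# so every line is printed and the unmatched text stays in place.
question_attempt() { sed -e 's/name=\(.*\)#Set/\1/'; }

# -n plus the p flag print only substituted lines; .* on both sides of the
# group makes the match cover the whole line, so only the capture remains.
corrected() { sed -n 's/.*name=\([^[:space:]]*\).*/\1/p'; }

printf '%s\n' "$input" | question_attempt
printf '%s\n' "$input" | corrected
```

The first command still emits both input lines (the first one altered, the second untouched), while the second emits only the captured job name.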
Details:
- -n - suppresses default line output
- .* - any text
- name= - a literal name= string
- \([^[:space:]]*\) - Group 1 (\1): any zero or more chars other than whitespace
- .* - any text
- p - print the result of the successful substitution.
Upvotes: 2