Reputation: 3114
I have a file input.txt which stores information in KEY:VALUE form. I'm trying to read GOOGLE_URL from this input.txt, but my script prints only https because the separator is :. What is the problem with my grep command, and how should I print the entire URL?
SCRIPT
$> cat script.sh
#!/bin/bash
URL=`grep -e '\bGOOGLE_URL\b' input.txt | awk -F: '{print $2}'`
printf " $URL \n"
INPUT_FILE
$> cat input.txt
GOOGLE_URL:https://www.google.com/
OUTPUT
https
DESIRED_OUTPUT
https://www.google.com/
Upvotes: 0
Views: 2900
Reputation: 203393
Take your pick:
$ sed -n 's/^GOOGLE_URL://p' file
https://www.google.com/
$ awk 'sub(/^GOOGLE_URL:/,"")' file
https://www.google.com/
The above will work using any sed or awk in any shell on every UNIX box.
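As a rough sketch of how the sed variant could slot into the asker's script.sh (the temporary file is only a stand-in for input.txt, and the `$(...)` and `printf '%s\n'` forms are my own choices, not part of the original script):

```shell
#!/bin/bash
# Hypothetical rewrite of script.sh; the temp file stands in for input.txt.
input=$(mktemp)
printf '%s\n' 'EXAMPLE_URL:http://www.example.com/' \
              'GOOGLE_URL:https://www.google.com/' > "$input"

# Strip the "GOOGLE_URL:" prefix and print only lines where it was found.
URL=$(sed -n 's/^GOOGLE_URL://p' "$input")

# Pass the value as an argument rather than as the printf format string.
printf '%s\n' "$URL"

rm -f "$input"
```

Passing the value through `%s` avoids surprises if the URL ever contains a `%` or a backslash.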
Upvotes: 1
Reputation: 2761
Yet another awk alternative:
gawk -F'(^[^:]*:)' '/^GOOGLE_URL:/{ print $2 }' infile
Upvotes: 0
Reputation: 36390
I would use GNU AWK the following way for this task. Let the content of file.txt be:
EXAMPLE_URL:http://www.example.com/
GOOGLE_URL:https://www.google.com/
KEY:GOOGLE_URL:
Then:
awk 'BEGIN{FS="^GOOGLE_URL:"}{if(NF==2){print $2}}' file.txt
will output:
https://www.google.com/
Explanation: in GNU AWK, FS may be a regular expression, so I set it to GOOGLE_URL: anchored (^) to the beginning of the line, so that GOOGLE_URL: in the middle or at the end of a line is not treated as a separator (consider the 3rd line of the input). With this FS there are either 1 or 2 fields in each line; the latter is the case only if the line starts with GOOGLE_URL:, so I check the number of fields (NF) and, if it is 2, print the 2nd field ($2), since the first field is empty in that case.
(tested in gawk 4.2.1)
Upvotes: 0
Reputation: 785108
Since there are multiple : characters in your input, printing $2 will not work in awk because it gives you only the 2nd field. You actually need an equivalent of cut -d: -f2-, but you also need to check the key name that comes before the first :.
This awk should work for you:
awk -F: '$1 == "GOOGLE_URL" {sub(/^[^:]+:/, ""); print}' input.txt
https://www.google.com/
Or this non-regex awk approach, which lets you pass the key name from the command line:
awk -F: -v k='GOOGLE_URL' '$1==k{print substr($0, length(k FS)+1)}' input.txt
Or using GNU grep:
grep -oP '^GOOGLE_URL:\K.+' input.txt
https://www.google.com/
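The cut -d: -f2- equivalent mentioned in this answer could be sketched as below. This is only an illustration: pairing cut with an anchored grep for the key check is my assumption, and the temporary file stands in for input.txt.

```shell
#!/bin/bash
# Stand-in for input.txt, including a line where the key appears mid-line.
input=$(mktemp)
printf '%s\n' 'KEY:GOOGLE_URL:' \
              'GOOGLE_URL:https://www.google.com/' > "$input"

# Keep only lines whose key is GOOGLE_URL, then print field 2 onward.
# cut re-joins the remaining fields with ":", so the URL stays intact.
URL=$(grep '^GOOGLE_URL:' "$input" | cut -d: -f2-)
printf '%s\n' "$URL"

rm -f "$input"
```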
Upvotes: 1
Reputation: 133468
Could you please try the following, written and tested with the shown samples in GNU awk. It looks for lines starting with GOOGLE_URL: and captures the http or https URL that follows; if you need only https, change http[s]? to https in the solution below.
awk '/^GOOGLE_URL:/{match($0,/http[s]?:\/\/.*/);print substr($0,RSTART,RLENGTH)}' Input_file
Explanation: a commented version of the above.
awk ' ##Start the awk program.
/^GOOGLE_URL:/{ ##If the line starts with GOOGLE_URL: then do the following.
match($0,/http[s]?:\/\/.*/) ##Use match to find http:// or https:// (the s is optional) through the end of the line.
print substr($0,RSTART,RLENGTH) ##Print the matched substring, using the RSTART and RLENGTH set by match.
}
' Input_file ##Name of the input file.
2nd solution: In case you need everything that comes after the first :, try the following.
awk '/^GOOGLE_URL:/{match($0,/:.*/);print substr($0,RSTART+1,RLENGTH-1)}' Input_file
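To see why the RSTART+1 and RLENGTH-1 offsets are needed, here is a small sketch that feeds the sample line straight to awk (the variable names `line` and `value` are only illustrative):

```shell
#!/bin/bash
line='GOOGLE_URL:https://www.google.com/'

# match() sets RSTART to the position of the first ":" and RLENGTH to the
# length of the match, which runs to the end of the line. Starting one
# character later and taking one character fewer drops the leading colon.
value=$(printf '%s\n' "$line" |
  awk '{ if (match($0, /:.*/)) print substr($0, RSTART + 1, RLENGTH - 1) }')

printf '%s\n' "$value"
```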
Upvotes: 1