Reputation: 91
I am trying to use awk to do two things: split a list into three separate lists, and convert one or two columns of each list into a regular expression. When I pipe awk into itself, i.e. select the items from my list with one awk call and then do the substitutions with a second, it appends 1s to the list items.
I figure I should not pipe awk into itself and should instead do all of this in a single call to awk.
AH??0*,*,ARRAY RESISTIVITY,RESISTIVITY
AHD*,*,MEASURED DEPTH,REFERENCE
AI*,*,ACOUSTIC IMPEDANCE COMPRESSIONAL,GEOPHYSICAL SYNTHETICS
AI_AVG_HOR_SIG,*,ACOUSTIC IMPEDANCE,ACOUSTIC
*,FOO,BAR,BLEH
List one would hold lines like line 4, which have no wildcards in column 1; there I only need to replace the wildcards in column 2.
List two would hold lines 1, 2 and 3, and needs substitutions on columns 1 and 2.
Lastly, I need to do a similar thing for line 5, in a third list.
I am able to get these lists like this:
Line 4: awk -F \, '$1!~/([\*\?])/' file.txt
Lines 1-3: awk -F \, '$1~/([\*\?])/' file.txt
Line 5: awk -F \, '$1~/^\*$/' file.txt
My subs are * => .* and ? => [0-9].
When I attempt to use gsub like this:
awk -F \, 'gsub(/\*/,".*",$2) $1!~/([\*\?])/' OFS=, file.txt
the list comes back funky, with unexpected results. I feel as though there is something fundamental I don't understand about awk with regard to stacking operations.
Halp!
Upvotes: 0
Views: 461
Reputation: 5347
What I write here is not a solution to your question. It is just an exercise in reorganizing your versions (for you to complete :). Some of @Etan's wise suggestions are still missing. (Stylistic concerns can save us lots of time.)
awk (or any one-liner solution) gets confusing if it exceeds some 30 chars. Quotes, etc. become difficult.
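One thing worth checking while untangling the one-liners: gsub() returns the number of substitutions it made, and the modified text ends up in its third argument. If that return value gets printed or assigned somewhere, it shows up as stray numbers in the output. Whether that explains the 1s you saw is only a guess, since the piped commands are not shown. A tiny sketch (the variable n and the wrapper function show_gsub_return are just names for illustration):

```shell
# gsub() hands back a count; the edited text lives in the field itself.
show_gsub_return() {
  printf '%s\n' 'AI_AVG_HOR_SIG,*,ACOUSTIC IMPEDANCE,ACOUSTIC' |
  awk -F, -v OFS=, '{
    n = gsub(/\*/, ".*", $2)      # n is the number of replacements
    print "replacements: " n
    print                         # $0 was rebuilt with the new $2
  }'
}

show_gsub_return
```

So the test belongs in the pattern and the gsub calls in the action, with a plain print at the end.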
You can (should?) write it in a file (a.awk) with proper indentation, comments, vertical symmetries:
#!/usr/bin/gawk -f
BEGIN { FS = "," ; OFS = "," }

# column 1 contains a wildcard but is not a lone "*"
$1 ~ /[*?]/ && $1 !~ /^\*$/ {
    gsub(/\*/, ".*",    $1)
    gsub(/\?/, "[0-9]", $1)
    gsub(/\*/, ".*",    $2)
    print
}
and use it as awk -f a.awk inputfile
The current behavior (with the script saved as /tmp/a1) is:
echo 'AH??0*,*,ARRAY RESISTIVITY,RESISTIVITY
AHD*,*,MEASURED DEPTH,REFERENCE
AI*,*,ACOUSTIC IMPEDANCE COMPRESSIONAL,GEOPHYSICAL SYNTHETICS
AI_AVG_HOR_SIG,*,ACOUSTIC IMPEDANCE,ACOUSTIC
*,FOO,BAR,BLEH' | awk -f /tmp/a1
AH[0-9][0-9]0.*,.*,ARRAY RESISTIVITY,RESISTIVITY
AHD.*,.*,MEASURED DEPTH,REFERENCE
AI.*,.*,ACOUSTIC IMPEDANCE COMPRESSIONAL,GEOPHYSICAL SYNTHETICS
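To complete the exercise, here is one way all three lists could come out of a single awk call. Treat it as a sketch under assumptions: the output names list1.txt, list2.txt and list3.txt and the helper to_regex are names I made up, and I am assuming line 5 gets the same substitutions as the other lines.

```shell
# Sample input from the question, written to file.txt.
cat > file.txt <<'EOF'
AH??0*,*,ARRAY RESISTIVITY,RESISTIVITY
AHD*,*,MEASURED DEPTH,REFERENCE
AI*,*,ACOUSTIC IMPEDANCE COMPRESSIONAL,GEOPHYSICAL SYNTHETICS
AI_AVG_HOR_SIG,*,ACOUSTIC IMPEDANCE,ACOUSTIC
*,FOO,BAR,BLEH
EOF

# Classify each line once, rewrite the wildcards, write it to its list.
split_lists() {
  awk '
    BEGIN { FS = OFS = "," }

    # Turn the glob-style wildcards in field n into regex fragments.
    function to_regex(n) {
      gsub(/\*/, ".*", $n)
      gsub(/\?/, "[0-9]", $n)
    }

    {
      # Pick the destination from the original $1, before it is rewritten.
      if ($1 == "*")         out = "list3.txt"   # column 1 is a lone wildcard
      else if ($1 ~ /[*?]/)  out = "list2.txt"   # column 1 contains wildcards
      else                   out = "list1.txt"   # column 1 is literal

      to_regex(1)
      to_regex(2)
      print > out
    }
  ' "$1"
}

split_lists file.txt
```

The point of the if/else chain is that every line is tested exactly once, so no line can land in two lists, which is what happens when the "has a wildcard" test and the "is a lone *" test run as independent patterns.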
Upvotes: 1