Reputation: 198
I have a test.csv
#cat test.csv
a.b.c.d
a.b
a.b.c
a-a.b.c
a-a.b
(1) I am trying to print everything after the first dot, and (2) no trailing dot should be printed.
I tried the command below, but the output contains spaces. The actual file has about 1 billion records. Any idea how I can print this without the spaces and the trailing dot?
#cat test.csv | awk -F. '{print $2,".",$3}'
b . c
b .
b . c
b . c
b .
Desired output
b.c.d
b
b.c
b.c
b
Upvotes: 1
Views: 255
Reputation: 203522
The spaces in your output are because you're telling awk to add spaces. Each comma (,) in the print statement is you telling awk to add the value of the OFS variable (a single blank char by default) in that position in the output. Instead of:
awk -F. '{print $2,".",$3}'
Try either of these:
awk -F. '{print $2"."$3}'
awk 'BEGIN{FS=OFS="."} {print $2,$3}'
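As a quick check of the point about OFS, the sketch below runs the three print variants on a single sample line. The variable names (line, with_comma, concatenated, with_ofs) are only illustrative and do not come from the question.

```shell
#!/bin/sh
# One sample row taken from the question's test.csv.
line='a.b.c'

# Commas in print insert OFS (a single space by default) between the arguments.
with_comma=$(printf '%s\n' "$line" | awk -F. '{print $2,".",$3}')

# String concatenation inserts nothing between the pieces.
concatenated=$(printf '%s\n' "$line" | awk -F. '{print $2"."$3}')

# Setting OFS to "." makes the comma insert a dot instead of a space.
with_ofs=$(printf '%s\n' "$line" | awk 'BEGIN{FS=OFS="."} {print $2,$3}')

printf '%s\n' "$with_comma"    # b . c
printf '%s\n' "$concatenated"  # b.c
printf '%s\n' "$with_ofs"      # b.c
```

On a two-field line such as a.b, $3 is empty, so both corrected variants still print b. with a trailing dot.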
To get the output you want with awk, though, you would use:
awk '{sub(/[^.]*\./,"")}1'
but I'd really suggest you use the tool designed for this task, cut:
cut -d'.' -f2-
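A minimal sketch for comparing the two commands on the sample rows from the question. The data is fed through a shell variable rather than test.csv so the snippet is self-contained; the variable names are illustrative.

```shell
#!/bin/sh
# Sample rows copied from the question.
sample='a.b.c.d
a.b
a.b.c
a-a.b.c
a-a.b'

# cut: keep everything from the 2nd "."-delimited field onward.
cut_out=$(printf '%s\n' "$sample" | cut -d'.' -f2-)

# awk: delete the text up to and including the first dot, then print the line.
awk_out=$(printf '%s\n' "$sample" | awk '{sub(/[^.]*\./,"")}1')

printf '%s\n' "$cut_out"

# Report whether the two approaches agree on this sample.
if [ "$cut_out" = "$awk_out" ]; then
  echo "cut and awk produce the same output"
fi
```

Both commands read the input in a single pass, line by line, so either should be usable on a very large file; timing them on a slice of the real data would show which is faster on a given system.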
Upvotes: 3
Reputation: 4004
$ sed 's|[^.]*\.||' test.csv
b.c.d
b
b.c
b.c
b
[^.] means anything but a . character. \. is the . character (needs to be escaped because it has a special meaning in regexes).
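The substitution has no g flag, so sed removes only the first match on each line, which is what keeps the later dots in place. A small sketch of the difference, using one row from the question (variable names are illustrative):

```shell
#!/bin/sh
# One sample row from the question.
line='a-a.b.c'

# Without the g flag: only the first "non-dots followed by a dot" run is removed.
first_only=$(printf '%s\n' "$line" | sed 's|[^.]*\.||')

# With the g flag: every such run is removed, leaving only the last field.
all_matches=$(printf '%s\n' "$line" | sed 's|[^.]*\.||g')

printf '%s\n' "$first_only"   # b.c
printf '%s\n' "$all_matches"  # c
```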
Upvotes: 2
Reputation: 133518
Could you please try the following, written and tested with the shown samples.
awk 'BEGIN{FS=OFS="."} NF>=3{print $2,$3;next} NF==2{print $2}' Input_file
Explanation: Adding detailed explanation for above code.
awk ' ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section of this awk program from here.
FS=OFS="." ##Setting FS and OFS as DOT(.) here.
}
NF>=3{ ##Checking condition if number of fields is 3 or more then do following.
print $2,$3 ##Printing 2nd and 3rd field values here (any later fields are not printed).
next ##next will skip all further statements for this line.
}
NF==2{ ##Checking if number of fields is exactly 2 then do following.
print $2 ##Printing 2nd field here.
}
' Input_file ##Mentioning Input_file name here.
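The program above prints at most the 2nd and 3rd fields, so a row such as a.b.c.d comes out as b.c. One possible variation, which is not part of the original answer, keeps the same FS/OFS setup but empties the first field and strips the leading dot that results, so it works for any number of fields. The sample rows are piped in directly here instead of reading Input_file.

```shell
#!/bin/sh
# Sample rows copied from the question, piped straight into awk.
out=$(printf '%s\n' a.b.c.d a.b a.b.c a-a.b.c a-a.b |
  awk 'BEGIN{FS=OFS="."}   # split and rejoin fields on "."
       {
         $1=""             # empty the 1st field; $0 is rebuilt as ".b.c.d"
         sub(/^\./,"")     # drop the leading dot left behind by the empty field
       }
       1')                 # print the modified line

printf '%s\n' "$out"
```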
Upvotes: 2