Reputation: 41
I have a kind of daily record (sample below). The format of each value does not change, but the position of the fields/columns within a record keeps changing, which causes problems when using awk, sed, or grep.
Filename.txt (millions of records):
abcd D20140624 Useragent username userid
abcd D20140625 Useragent username1 userid1
D20140626 Useragent username2 userid2
The result should be:
D20140624 username userid
D20140625 username1 userid1
D20140626 username2 userid2
If I use cat Filename.txt|awk -f ' ' '{print $2,$4,$5}' I get an invalid result.
Similarly, sed gives an invalid result.
Can anyone help me with this?
Upvotes: 1
Views: 114
Reputation: 35208
Using a Perl one-liner that indexes the fields from the end of each line:
perl -lane 'print "@F[-4,-2,-1]"' file
Or with more explicit logic:
perl -lane 'print @F == 5 ? "@F[1,3,4]" : "@F[0,2,3]"' file
Switches:
-l: Enables line-ending processing and specifies the line terminator.
-a: Splits the line on whitespace and loads the fields into the array @F.
-n: Creates a while(<>){..} loop for each "line" in your input file.
-e: Tells perl to execute the code on the command line.
Upvotes: 0
Reputation: 204458
-f is the argument that tells awk to read its script from a file, so when you say "invalid result" I assume you're getting an error message like can't open source file ' '.
I think you were probably trying to use -F, but ' ' is the default FS value, so there's no need to set it explicitly.
Once you're past that issue, getting the output you want from that input file is just:
$ awk '{print $(NF-3), $(NF-1), $NF}' file
D20140624 username userid
D20140625 username1 userid1
D20140626 username2 userid2
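To make the two points above concrete, here is a minimal sketch (not part of the original answer). The sample records are copied from the question; the function names sample, pick_default and pick_explicit are hypothetical and exist only for this illustration. It suggests that -F ' ' behaves like the default field splitting, and that counting back from NF works for both the 4-field and 5-field records.

```shell
# Sample records copied from the question.
sample() {
  printf '%s\n' \
    'abcd D20140624 Useragent username userid' \
    'abcd D20140625 Useragent username1 userid1' \
    'D20140626 Useragent username2 userid2'
}

# Default field splitting: runs of blanks separate the fields.
pick_default() { awk '{ print $(NF-3), $(NF-1), $NF }'; }

# Explicit -F ' ' (capital F): a single space is the default FS value,
# so this splits the same way as pick_default.
pick_explicit() { awk -F ' ' '{ print $(NF-3), $(NF-1), $NF }'; }

sample | pick_default
sample | pick_explicit
```

Both calls should print the same three lines, because the fields are addressed relative to the end of the record rather than the start.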
Upvotes: 0
Reputation: 5092
You can also use the sed command:
sed -r 's/.*(D[0-9]+) \w+ (.*)/\1 \2/g' file_name
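The -r switch and \w in the command above are GNU sed extensions. As a hedged alternative, the sketch below assumes a sed that accepts -E (GNU and BSD sed both do) and replaces \w+ with [^ ]+, which also skips a Useragent value containing non-word characters. The function name strip_with_sed is hypothetical.

```shell
# Keep the D<digits> field, skip the one field after it, keep the rest.
# Assumes single spaces between fields, as in the question's sample.
strip_with_sed() {
  sed -E 's/^.*(D[0-9]+) [^ ]+ (.*)$/\1 \2/' "$@"
}

printf '%s\n' \
  'abcd D20140624 Useragent username userid' \
  'D20140626 Useragent username2 userid2' | strip_with_sed
```

Like the original, this relies on the date field being the only field that starts with D followed by digits.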
Upvotes: 0
Reputation: 174826
Through GNU sed,
$ sed -r 's/^.*(D\S*).*(usern\S*).*(useri\S*).*/\1 \2 \3/g' file
D20140624 username userid
D20140625 username1 userid1
D20140626 username2 userid2
Upvotes: 0
Reputation: 195229
awk '{for(i=1;i<=NF;i++)if($i~/^D[0-9]{8}$/){n=i;break}}
{print $n,$(NF-1),$NF}' file
gives:
D20140624 username userid
D20140625 username1 userid1
D20140626 username2 userid2
It searches for the first column that matches D followed by eight digits, no matter where it is, and prints it along with the last two columns. You didn't specify the rule in detail, so I came up with this.
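One caveat with the command above: n is never reset, so a record without a date field would reuse the position found on an earlier line. A minimal sketch of a guarded variant follows. The wrapper name pick_by_date and the rule "skip records without a date" are assumptions, not something the question specifies. The eight digits are spelled out so the pattern does not depend on interval expressions like {8}, which some older awk implementations lack.

```shell
pick_by_date() {
  awk '{
    n = 0                                  # reset the match position for every record
    for (i = 1; i <= NF; i++)
      if ($i ~ /^D[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/) { n = i; break }
    if (n && NF - n >= 2)                  # need the date plus at least two later fields
      print $n, $(NF-1), $NF
  }' "$@"
}

printf '%s\n' \
  'abcd D20140624 Useragent username userid' \
  'no date here at all' \
  'D20140626 Useragent username2 userid2' | pick_by_date
```

The middle line has no date field, so it is skipped instead of being printed with a stale column index.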
Upvotes: 0
Reputation: 5424
Use this:
awk '{ if(NF==5) print $2,$4,$5; else print $1,$3,$4; }'
Upvotes: 1
Reputation: 41460
You can do it like this with awk:
awk '!/^D20[0-9][0-9]/ {$1="";sub(/^ /,"")}1'
D20140624 Useragent username userid
D20140625 Useragent username1 userid1
D20140626 Useragent username2 userid2
If the line does not start with D followed by a year, remove the first field and the extra leading space.
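Note that the output above still contains the Useragent column, while the question asks for only three columns. One possible extension, shown as a sketch under the assumption that every record has the same four fields once the optional prefix is removed (the function name normalize_and_pick is hypothetical):

```shell
normalize_and_pick() {
  awk '
    # Drop the leading field when the record does not start with D20YY.
    # Assigning $1 rebuilds $0 with a leading space; sub() removes it,
    # and changing $0 makes awk split the fields again.
    !/^D20[0-9][0-9]/ { $1 = ""; sub(/^ /, "") }
    # After normalizing, the date is $1 and Useragent ($2) is skipped.
    { print $1, $3, $4 }
  ' "$@"
}

printf '%s\n' \
  'abcd D20140624 Useragent username userid' \
  'D20140626 Useragent username2 userid2' | normalize_and_pick
```

Normalizing first keeps the column positions fixed, so the final print can use plain field numbers.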
Upvotes: 1