ocslegna

Reputation: 113

Use awk to output formatted fields from each line

Let a be a text file and b an unl file.

In a, the data is separated by spaces/tabs, and the last column itself contains spaces.

For example:

30714931330     1.0000  201608  10 X 10 S.A.
30594465497  E  0.0044  201608  1 SOYORRO S.A.

Here, "10 X 10 S.A." and "1 SOYORRO S.A." are the last column.


What I need to do is:

Output some fields from each line of a to b, in a format that depends on whether "E" (the second column) is present. Each field should be followed by a semicolon ";", including the last one.

The output format will be:

20160727;30714931330; ;1.0000;201608;
20160727;30594465497;E;0.0044;201608;

The first field is the date of issue in YYYYMMDD format (it is not in the file). How can I get it and put it there?

I tried a few things and ended up with:

awk '{if($2 == "E") {print $issueDate ";" $1 ";" "E;" $3 ";" $4 ";" > "b.unl"} else {print $issueDate ";" $1 ";" " ;" $2 ";" $3 ";" > "b.unl"}}' a.txt

Or

awk '{if($2 == "E") {print $issueDate ";" $1 ";" "E;" $3 ";" $4 ";"} else {print $issueDate ";" $1 ";" " ;" $2 ";" $3 ";"}' a > b

Is this a correct way to implement it? If not, how should I do this? Would using sed help?

Thanks.

Upvotes: 0

Views: 92

Answers (3)

James K. Lowden

Reputation: 7837

The current time is always available from date(1). Grab it once at the beginning. To separate your output fields with ";", use the OFS variable:

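BEGIN {
    FS  = "\t"                  # assumes tab-separated input
    OFS = ";"                   # output field separator
    cmd = "date +'%Y%m%d'"
    cmd | getline date          # read today's date once
    close(cmd)
}

{ e = " " }                     # default: no "E" flag

$2 == "E" {                     # flag present: shift the fields left
    e  = "E"
    $2 = $3
    $3 = $4
}

{ print date, $1, e, $2, $3 ";" }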

Invoke as:

$ awk -f E.awk E.txt 

20160816;30714931330; ;1.0000;201608;
20160816;30594465497;E;0.0044;201608;

Upvotes: 1

karakfa

Reputation: 67467

With gawk, you can use fixed field widths:

$ awk -v OFS=';' -v d="$issueDate" 'BEGIN{FIELDWIDTHS="11 2 1 2 6 2 6 35"}
                                         {print d,$1,$3,$5,$7}' file

20160727;30714931330; ;1.0000;201608
20160727;30594465497;E;0.0044;201608

The date is passed in as an awk variable with -v.
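
If the column widths are not guaranteed to be fixed, the same -v idea can be combined with default whitespace splitting. The following is only a sketch under a few assumptions: the file names a.txt and b.unl, the hard-coded issueDate value, and the helper variable n are illustrative, and the two sample lines are copied from the question. It also emits the trailing ";" that the question asks for.

```shell
# Illustrative sample input: the two lines from the question.
printf '%s\n' \
  '30714931330     1.0000  201608  10 X 10 S.A.' \
  '30594465497  E  0.0044  201608  1 SOYORRO S.A.' > a.txt

# Assumed to be set by the caller; hard-coded here for the example.
issueDate=20160727

awk -v d="$issueDate" -v OFS=';' '
    { e = " "; n = 2 }                  # no flag: the amount is field 2
    $2 == "E" { e = "E"; n = 3 }        # flag present: the amount is field 3
    { print d, $1, e, $n, $(n + 1), "" }  # empty last field gives the final ";"
' a.txt > b.unl
```

Because the trailing last column is never referenced, the spaces inside it do not matter. To use today's date instead of a fixed one, issueDate could be set with issueDate=$(date +%Y%m%d) before calling awk.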

Upvotes: 2

jil

Reputation: 2691

Did you mean that the first field of the output should be the current timestamp? If so, you can use the functions strftime() and systime() (available in gawk).

I would use a guard expression (a pattern) instead of if, and shift the fields so that only one print statement is needed, but this is just a matter of style.

awk '
    { e = " " }                                   # reset the flag for every line
    $2 == "E" { e = "E"; $2 = $3; $3 = $4 }       # flag present: shift the fields left
    { print strftime("%Y%m%d", systime()) ";" $1 ";" e ";" $2 ";" $3 ";" }
'

Upvotes: 1
