HagridV

Reputation: 111

How to apply a program to a column in a text in bash / awk?

I have a text file that contains something like this:

column1 column2 column3 column4
text1.1 text1.2 text1.3 text1.4
text2.2 text2.2 text2.3 text3.4

I want to execute a program that transforms all the text in column 2 to a new text. The program takes stdin and returns stdout, so it is called like this: echo "text-to-transform" | myprogram, and returns "transformed-text" to stdout.

What would be the easiest way to apply myprogram to column 2 and display the output in bash?

The output would look something like this:

column1 column2 column3 column4
text1.1 transformed-text1.2 text1.3 text1.4
text2.2 transformed-text2.2 text2.3 text3.4

I'm guessing awk is the way, but I don't know enough about it.

Thanks

Upvotes: 0

Views: 101

Answers (3)

Ed Morton

Reputation: 203684

$ cat tst.awk
BEGIN { myprogram = "tr [:lower:] [:upper:]" }
NR>1 {
    cmd = "printf \047%s\n\047 \047" $2 "\047 | " myprogram
    if ( (cmd | getline line) > 0 ) {
        $2 = line
    }
    close(cmd)
}
{ print }

$ awk -f tst.awk file
column1 column2 column3 column4
text1.1 TEXT1.2 text1.3 text1.4
text2.2 TEXT2.2 text2.3 text3.4

Replace myprogram = "tr [:lower:] [:upper:]" with myprogram = "<whatever your real program is called>". You can even parametrize it if you like:

$ cat tst.awk
NR>1 {
    cmd = "printf \047%s\n\047 \047" $col "\047 | " myprogram
    if ( (cmd | getline line) > 0 ) {
        $col = line
    }
    close(cmd)
}
{ print }

$ awk -v myprogram='tr [:lower:] [:upper:]' -v col=2 -f tst.awk file
column1 column2 column3 column4
text1.1 TEXT1.2 text1.3 text1.4
text2.2 TEXT2.2 text2.3 text3.4

$ awk -v myprogram='wc -c' -v col=2 -f tst.awk file
column1 column2 column3 column4
text1.1        8 text1.3 text1.4
text2.2        8 text2.3 text3.4

$ awk -v myprogram="sed 's/x/X/' | tr 't' '#'" -v col=3 -f tst.awk file
column1 column2 column3 column4
text1.1 text1.2 #eX#1.3 text1.4
text2.2 text2.2 #eX#2.3 text3.4
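
One caveat with the scripts above: the field is spliced into the shell command between single quotes, so a value that itself contains a single quote would break the command. Here is a minimal sketch of one way around that, assuming a POSIX shell and awk. The names apply_to_col and shquote are hypothetical helpers, not something from the scripts above, and tr again stands in for the real program:

```shell
#!/bin/sh
# apply_to_col COL PROGRAM  (hypothetical helper; reads the table on stdin)
apply_to_col() {
    awk -v col="$1" -v myprogram="$2" '
    # shquote: return s wrapped in single quotes, with every embedded
    # single quote (\047) escaped so the shell sees the original text
    function shquote(s) {
        gsub(/\047/, "\047\\\047\047", s)
        return "\047" s "\047"
    }
    NR > 1 {
        # Build: printf "%s\n" <quoted field> | myprogram
        cmd = "printf \"%s\\n\" " shquote($col) " | " myprogram
        if ( (cmd | getline line) > 0 ) {
            $col = line
        }
        close(cmd)
    }
    { print }
    '
}

# Example: the second field of the data line contains a single quote
printf '%s\n' 'column1 column2' "text1.1 it's" |
    apply_to_col 2 'tr "[:lower:]" "[:upper:]"'
```

Note that awk -v interprets backslash escapes in the assigned value, so a program string containing backslashes would need extra care.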

Upvotes: 2

Mark Setchell

Reputation: 207540

Here's an ugly way, just using sed to do a simple transform to column 2:

paste -d' ' <(cut -f1 -d' ' file) <(cut -f2 -d' ' file | sed 's/text/TEXT/') <(cut -f3,4 -d' ' file)

Output

column1 column2 column3 column4
text1.1 TEXT1.2 text1.3 text1.4
text2.2 TEXT2.2 text2.3 text3.4

It is essentially pasting 3 files together side-by-side, so read it as:

paste file1 file2 file3

where file1 is what you get when you cut the first field from your input file, file2 is what you get when you cut and transform the second field of your input file and file3 is what you get when you cut fields 3 and 4 from your input file.


Or plain bash:

#!/bin/bash

# -r stops read from mangling backslashes in the data
while read -r c1 c2 rest ; do
     c2trans=$(printf '%s\n' "$c2" | ./transformer)
     printf '%s\n' "$c1 $c2trans $rest"
done < file
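
The loop above also feeds the header line through the transformer. Here is a rough sketch of a variant that copies the header through unchanged, as in the expected output in the question. The name transform_col2 is hypothetical, and tr stands in for the real program:

```shell
#!/bin/bash

# transform_col2 CMD [ARGS...]  (hypothetical helper; reads the table on stdin)
transform_col2() {
    local header c1 c2 rest c2trans
    # Copy the first line (the header) through untouched.
    IFS= read -r header && printf '%s\n' "$header"
    # The || test keeps a final line that lacks a trailing newline.
    while read -r c1 c2 rest || [ -n "$c1" ]; do
        # Run the given command on column 2 only.
        c2trans=$(printf '%s\n' "$c2" | "$@")
        printf '%s %s %s\n' "$c1" "$c2trans" "$rest"
    done
}

# Example with the sample data from the question
transform_col2 tr '[:lower:]' '[:upper:]' <<'EOF'
column1 column2 column3 column4
text1.1 text1.2 text1.3 text1.4
text2.2 text2.2 text2.3 text3.4
EOF
```

Like the loop it is based on, this starts the program once per line, so it will be slow on large files.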

Upvotes: 1

David C. Rankin

Reputation: 84569

With awk you can simply concatenate a prefix to the second field, e.g.

awk 'FNR > 1 && NF > 1 {$2="transformed-"$2}1' file

This skips the header (FNR > 1), checks that the line has at least 2 fields (NF > 1), and then prepends the prefix "transformed-" to the second field, for every line from the second line of the file to the end. The trailing 1 is an always-true pattern that prints each line.

Example Use/Output

Using a simple heredoc to provide the input to awk you could do:

$ cat << eof | awk 'FNR > 1 && NF > 1 {$2="transformed-"$2}1'
> column1 column2 column3 column4
> text1.1 text1.2 text1.3 text1.4
> text2.2 text2.2 text2.3 text3.4
> eof
column1 column2 column3 column4
text1.1 transformed-text1.2 text1.3 text1.4
text2.2 transformed-text2.2 text2.3 text3.4

Upvotes: 1
