Manipulate nth column of a .csv file with awk or sed

Question

I have a .csv file with 6 columns:

source  raised_time cleared_time    cause   pcause  sproblem
source1 rtime1  ctime1  cause1  communicationsSubsystemFailure#model.route.1.2  oMCIFailure#model.route.1.2
source2 rtime2  ctime2  cause2  equipmentMalfunction#model.route.1.2    deviceNotActive#model.route.1.2

I want to manipulate the 5th and 6th columns of the .csv file with below rules:

Convert the first letters of 5th and 6th columns to upper case
Keep the strings upto the character: "#" and remove the trailing part (which are coming after the # char)
Put a space between the lower case and upper case letters

So the wanted format is:

source  raised_time cleared_time    cause   pcause  sproblem
source1 rtime1  ctime1  cause1  Communication Subsystem Failure OMCI Failure
source2 rtime2  ctime2  cause2  Equipment Malfunction   Device Not Active

How can I do that with awk or sed command?

I tried to start with converting the first letter to upper case with the command:

awk 'BEGIN {$5 = toupper(substr($5,1,1))
    substr($5, 2)}1' input_file

but it did not work.

Ed Morton · Accepted Answer

You said your input is CSV (Comma-Separated Values) but there's are no commas in it while it does have apparently random spacing between fields so I assume you actually meant TSV (Tab-Separated Values). If so then this should do what you want:

$ cat tst.awk
BEGIN { FS=OFS="	" }
NR > 1 {
    for (i=5; i<=NF; i++) {
        new = ""
        old = $i
        sub(/#.*/,"",old)
        while ( match(old,/[[:upper:]][[:lower:]]+/) ) {
            new = new substr(old,1,RSTART-1) " " substr(old,RSTART,RLENGTH)
            old = substr(old,RSTART+RLENGTH)
        }
        new = new old
        $i = toupper(substr(new,1,1)) substr(new,2)
    }
}
{ print }

.

$ awk -f tst.awk file
source  raised_time     cleared_time    cause   pcause  sproblem
source1 rtime1  ctime1  cause1  Communications Subsystem Failure        OMCI Failure
source2 rtime2  ctime2  cause2  Equipment Malfunction   Device Not Active

Manipulate nth column of a .csv file with awk or sed

Answers (2)

Related Questions