concatenating string with multiple array

Question

I'm trying to rearrange the values from a set of strings into their respective columns. Here is the input:

String 1:  47/13528 
String 2:  55(s) 
String 3:   
String 4:  114(n) 
String 5:  225(s), 26/10533-10541 
String 6:  103/13519 
String 7:  10(s), 162(n) 
String 8:  152/12345,12346
(d=dead, n=null, s=strike)

The letter in each value is its flag (d=dead, n=null, s=strike). The string number is appended to each value after a "c", so the value 47 from "String 1" becomes 47c1, and so on:

String 1:  47/13528 
A value without any flag is sorted into the null column, along with the values tagged (n).
String 1 (the string number is concatenated with the value 47/13528)


Sorted : 
null
47c1@SP13528;114c4;103c6@SP13519;162c7


String 2:  55(s)
Values flagged with (s) are sorted into the strike column.

Sorted :
strike
55c2;225c5;26c5@SP10533-10541;162c7

I tried to parse it by modifying some earlier code, but had no luck:

{
    for (i=1; i<=NF; i++) {
        num  = $i+0
        abbr = $i
        gsub(/[^[:alpha:]]/,"",abbr)
        list[abbr] = list[abbr] num " c " val ORS
    }
}
END {
    n = split("dead null strike",types)
    for (i=1; i<=n; i++) {
        name = types[i]
        abbr = substr(name,1,1)
        printf "name,list[abbr]\n"
    }
}

Expected Output (sorted into csv) :

dead,null,strike
,47c1@SP13528;114c4; 26c5@SP10533-10541;103c6@SP13519;162c7, 152c8@SP12345;152c8@SP12346,55c2;225c5;162c7;10c7

Breakdown for crosscheck purpose:

dead
none 

null
47c1@SP13528;114c4;103c6@SP13519;162c7;152c8@SP12345;152c8@SP12346;26c5@SP10533-10541;;162c7

strike
55c2;225c5;10c7

thanasisp · Accepted Answer

Here is an awk script for parsing your file.

BEGIN {
    types["d"]; types["n"]; types["s"]
    deft = "n"; OFS = ","; sep = ";"
}

$1=="String" {
    gsub(/[)(]/,""); gsub(",", " ")    # general line subs
    for (i=3;i<=NF;i++) {
        if (!gsub("/","c"$2+0"@SP", $i)) $i = $i"c"$2+0    # make all subs on items
        for (t in types) { if (gsub(t, "", $i)) { x=t; break }; x=deft } #find type
        items[x] = items[x]? items[x] sep $i: $i    # append for type found
    }
}

END {
    print "dead" OFS "null" OFS "strike"
    print items["d"] OFS items["n"] OFS items["s"]
}

Input:

String 1:  47/13528 
String 2:  55(s) 
String 3:   
String 4:  114(n) 
String 5:  225(s), 26/10533-10541 
String 6:  103/13519 
String 7:  10(s), 162(n) 
String 8:  152/12345,12346
(d=dead, n=null, s=strike)

Output:

> awk -f tst.awk file
dead,null,strike
,47c1@SP13528;114c4;26c5@SP10533-10541;103c6@SP13519;162c7;152c8@SP12345;12346c8,55c2;225c5;10c7

Your description kept changing on important details, such as how the type of an item is decided and how items are separated, and so far your input and expected outputs are not consistent with it. Still, I think you can easily follow what this script does. Keep in mind that gsub() returns the number of substitutions it made, in addition to performing them, so it is often convenient to use it as a condition.
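
As a small illustration of using gsub() as a condition, here is a minimal sketch on two made-up items (the replacement `@` and the variable `item` are only for the demo, not part of the script above):

```shell
# gsub() returns the number of replacements made (0 if none),
# so the call itself can serve as the condition of an if statement.
out=$(printf '%s\n' '47/13528' '55s' | awk '{
    item = $0
    if (gsub("/", "@", item))       # nonzero: at least one "/" was replaced
        print item " (replaced)"
    else                            # zero: nothing matched, item is unchanged
        print item " (no slash)"
}')
printf '%s\n' "$out"
```

The first item contains a slash, so gsub() returns 1 and the first branch runs; the second has none, so gsub() returns 0 and the else branch runs.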
