Reputation: 59
I have the following CSV file:
data.csv
Chart #,Ticker,Industry,Last Price,Multiple
2,AFL,Accident & Health Insurance,60.9,0.82
3,UNM,Accident & Health Insurance,32.97,1.52
4,CNO,Accident & Health Insurance,19.33,2.59
2,OMC,Advertising Agencies,71.71,0.7
3,IPG,Advertising Agencies,21.24,2.35
4,ADS,Advertising Agencies,278.18,0.18
2,UPS,Air Delivery & Freight Services,103.8,0.48
3,FDX,Air Delivery & Freight Services,152.11,0.33
4,EXPD,Air Delivery & Freight Services,50.725,0.99
5,CHRW,Air Delivery & Freight Services,72.3,0.69
6,FWRD,Air Delivery & Freight Services,42.86,1.17
I'd like to use Awk or the best linux command line tool to make the date in the file look like this:
output.txt
Accident & Health Insurance
2*0.82,3*1.52,4*2.59
Advertising Agencies
2*0.7,3*2.35,4*0.18
Air Delivery & Freight Services
2*0.48,3*0.33,4*0.99,5*0.69,6*1.17
I'm basically taking all "Chart #" & multiplying them by the multiple and outputing the "Industry" on one line, all charts with multiples comma separated, then a blank space on a third line...then it processes the entire list.
Can someone point me in the right direction on how to do this? Would Awk be the best tool for this task or do I have to create a bash script to process it all?
Upvotes: -1
Views: 412
Reputation: 203617
$ awk -F, -v OFS='\n' -v ORS='\n\n' '
NR==1 { next }
(NR>2) && ($3!=prevKey) { print prevKey, prevRec; prevRec="" }
{ prevKey=$3; prevRec=(prevRec==""?"":prevRec",") $1"*"$NF }
END { print prevKey, prevRec }
' file
Accident & Health Insurance
2*0.82,3*1.52,4*2.59
Advertising Agencies
2*0.7,3*2.35,4*0.18
Air Delivery & Freight Services
2*0.48,3*0.33,4*0.99,5*0.69,6*1.17
The functional differences between the above and @A-Ray's answer are that:
Upvotes: 4
Reputation: 1456
awk -F, '{a[$3]=a[$3]?a[$3]","$1"*"$NF:$1"*"$NF}END{for(i in a)print i"\n"a[i]}' filename
Air Delivery & Freight Services
2*0.48,3*0.33,4*0.99,5*0.69,6*1.17
Advertising Agencies
2*0.7,3*2.35,4*0.18
Accident & Health Insurance
2*0.82,3*1.52,4*2.59
Upvotes: 4