Reputation: 103
I have a file delimited with the cedilla character (Ç), but all the records are on a single line. I need to convert it to a multi-line file with one record per line.
Sample input:
P002365Ç1200ÇMastercardÇcarolinaÇBasildonÇEnglandÇUnited kingdomÇP002368Ç2100ÇVisaÇGouyaÇEchucaÇVictoriaÇAustraliaÇP002373Ç3600ÇMastercardÇRenee ElisabethÇTel AvivÇTel AvivÇIsraelÇP002382Ç6300ÇDinersÇbarbaraÇHyderabadÇAndhra PradeshÇIndia
This needs to be converted to:
P002365Ç1200ÇMastercardÇcarolinaÇBasildonÇEnglandÇUnited kingdom
P002368Ç2100ÇVisaÇGouyaÇEchucaÇVictoriaÇAustralia
P002373Ç3600ÇMastercardÇRenee ElisabethÇTel AvivÇTel AvivÇIsrael
P002382Ç6300ÇDinersÇbarbaraÇHyderabadÇAndhra PradeshÇIndia
Can this be achieved with awk?
Upvotes: 1
Views: 738
Reputation: 58391
This might work for you (GNU sed):
sed 's/Ç/\n/7;P;D' file
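This replaces every 7th Ç with a newline. The s command substitutes the 7th Ç in the pattern space, P prints up to that newline, and D deletes the printed part and restarts the cycle on the remainder without reading new input.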
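To make the P;D loop easier to follow, here is a shortened sketch that splits after every 3rd Ç instead of the 7th. It assumes GNU sed (for \n in the replacement) and that the script and the data use the same encoding; the function name split_every_3 is made up for this illustration.

```shell
# Hypothetical helper: turn every 3rd Ç on stdin into a line break (GNU sed assumed).
split_every_3() {
  # s/Ç/\n/3  replace the 3rd Ç left in the pattern space with a newline
  # P         print the pattern space up to that newline
  # D         delete what was printed and rerun the script on the rest
  sed 's/Ç/\n/3;P;D'
}

printf 'a1Çb1Çc1Ça2Çb2Çc2Ça3Çb3Çc3\n' | split_every_3
```

On the last pass there are fewer than three Ç left, so the substitution does nothing, P prints the remaining text, and D ends the cycle.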
Upvotes: 0
Reputation: 74605
You could use something like this:
awk -FÇ '{for (i=1;i<=NF;++i) printf "%s%s", $i, (i%7==0?RS:FS)}' file
Output:
P002365Ç1200ÇMastercardÇcarolinaÇBasildonÇEnglandÇUnited kingdom
P002368Ç2100ÇVisaÇGouyaÇEchucaÇVictoriaÇAustralia
P002373Ç3600ÇMastercardÇRenee ElisabethÇTel AvivÇTel AvivÇIsrael
P002382Ç6300ÇDinersÇbarbaraÇHyderabadÇAndhra PradeshÇIndia
A breakdown of what's going on here:
-FÇ - This command line argument sets the FS variable (Field Separator) to the Ç character.
The for loop visits every field, from 1 up to NF (Number [of] Fields).
For each field, a printf is executed that prints two strings (%s%s), the first being the content of the current field ($i) and the second being one of two options:
- If the field number is a multiple of 7*, RS (Record Separator, a newline by default) is printed.
- Otherwise, FS is printed. (Defined as the Ç character).
* The number 7 is used "arbitrarily" because it was your definition for splitting the records according to the example output you supplied.
Upvotes: 4