Reputation: 821
Does anyone know how to split a file based on a key and name each output file after its key? Thanks in advance.
Input
>mail9.country1(+):38689378-38709400
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>father
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>mother
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>son
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>daughter
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>mailX.countryX(+):000000-3111111110
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>father
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>mother
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>son
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>daughter
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
The output files should look like the following, each with its respective content:
mail9.country1(+):38689378-38709400.mail
>mail9.country1(+):38689378-38709400
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>father
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>mother
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>son
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>daughter
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
mailX.countryX(+):000000-3111111110.mail
>mailX.countryX(+):000000-3111111110
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>father
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>mother
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>son
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
>daughter
XXXXXXXXXXXXXXXXX-----------HHHHHHHH------
Upvotes: 0
Views: 142
Reputation: 774
Also consider an approach using csplit on macOS. Set up the input variables, then run the code line by line in a bash terminal.
#input variables
filename=input_file
split_on="mail"
suffix=".mail"
#this regex is used to prepare the renaming, the crux of the approach
#see how it will work by running the first fgrep line. The second fgrep line actually does the renaming.
regex_for_extracting_filename="s/^(xx[^:]+).+>(mail[^(]+)\(.+/mv \1 \2${suffix}/gi"
splitcount=$(fgrep -c "${split_on}" "${filename}")
splitnumber=$(( splitcount - 1 ))
echo -e "The following number of files will be created: \n${splitcount}\n"
#actually does the splitting work
csplit -n${#splitcount} -ks "${filename}" "/${split_on}/" "{${splitnumber}}"
fgrep "${split_on}" xx* | perl -pe "${regex_for_extracting_filename}" #shows the filename rename command
fgrep "${split_on}" xx* | perl -pe "${regex_for_extracting_filename}" | bash -
Upvotes: 0
Reputation: 85865
One way with awk:
$ awk -F'>' '$2~/^mail/{f=$2".mail";gsub(/[)(]/,"_",f)}{print > f}' file
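If the output names should keep the key exactly as it appears in the header (parentheses included), a variant along these lines might work. It is a sketch, not a tested drop-in: the sample file name sample_input and the shortened sequence lines are made up, and it assumes every group starts with a ">mail" header and that keys are unique. It also closes each output file before opening the next one, which may help when the input has many groups.

```shell
# Work in a scratch directory so the generated files are easy to inspect
workdir=$(mktemp -d)
cd "${workdir}" || exit 1

# Small sample in the question's format (hypothetical data)
printf '%s\n' \
  '>mail9.country1(+):38689378-38709400' \
  'XXXX----HHHH' \
  '>father' \
  'XXXX----HHHH' \
  '>mailX.countryX(+):000000-3111111110' \
  'XXXX----HHHH' \
  '>son' \
  'XXXX----HHHH' > sample_input

awk '
  /^>mail/ {                       # header line that starts a new group
    if (out != "") close(out)      # release the previous output file
    out = substr($0, 2) ".mail"    # key without the leading ">", plus the suffix
  }
  out != "" { print > out }        # write every line to the current group file
' sample_input

ls
```

Here the file name is taken from the whole header line with substr rather than from a field split on ">", so nothing in the key is altered.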
Upvotes: 1