Reputation: 1402
I have a single file (input.txt) that looks as follows:
# STOCKHOLM 1.0
#=GF AC RF00001
#=GF ID 5S_rRNA
ghgjg---jkhkjhkjhk
## STOCKHOLM 1.0
#=GF AC RF00002
#=GF ID 6S_rRNA
hhhjkjhk---kjhkjhkj
## STOCKHOLM 1.0
#=GF AC RF00005
#=GF ID 12S_rRNA
hkhjhkjhkjuuwww
I have to split the file at each line that equals ## STOCKHOLM 1.0 and name each output file after the value on the second line of the block, e.g. RF00001_full.txt. So for this input file I should get 3 separate files, as follows:
RF00001_full.txt
# STOCKHOLM 1.0
#=GF AC RF00001
#=GF ID 5S_rRNA
ghgjg---jkhkjhkjhk
RF00002_full.txt
## STOCKHOLM 1.0
#=GF AC RF00002
#=GF ID 6S_rRNA
hhhjkjhk---kjhkjhkj
RF00005_full.txt
## STOCKHOLM 1.0
#=GF AC RF00005
#=GF ID 12S_rRNA
hkhjhkjhkjuuwww
The code I have tried so far is as follows:
while read -r p; do
if [[ $p == "## STOCKHOLM 1.0"* ]]; then
: #what should I do here to sort the line by OS ?
fi
done <input.txt
Upvotes: 0
Views: 247
Reputation: 133428
Could you please try the following? It was written and tested with the provided samples.
awk '
/STOCKHOLM/{
close(file)
file=count=""
}
(/STOCKHOLM/ || !NF) && !file{
val=(val?val ORS:"")$0
count++
next
}
count==2{
count=""
file=$NF"_full.txt"
if(val){
print val > (file)
val=""
}
next
}
file{
print >> (file)
}
' Input_file
Explanation: a commented version of the same program.
awk ' ##Start of the awk program.
/STOCKHOLM/{ ##If the line contains the string STOCKHOLM, do the following.
close(file) ##Close the previous output file so that too many files are not left open.
file=count="" ##Reset the variables file and count.
}
(/STOCKHOLM/ || !NF) && !file{ ##If the line contains STOCKHOLM or is empty (no fields), and file is not yet set, do the following.
val=(val?val ORS:"")$0 ##Append the current line to val, separated by ORS (a newline by default).
count++ ##Increment count by 1.
next ##Skip the remaining rules for this line.
}
count==2{ ##If count is 2 (two lines have been buffered in val), do the following.
count="" ##Reset count.
file=$NF"_full.txt" ##Build the output file name from the last field of this line plus the suffix _full.txt.
if(val){ ##If val is not empty, do the following.
print val > (file) ##Write the buffered lines in val to the output file.
val="" ##Reset val.
}
next ##Skip the remaining rules for this line.
}
file{ ##If file is set, do the following.
print >> (file) ##Append the current line to the output file.
}
' Input_file ##Name of the input file.
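In the program above, count reaches 2 only when an empty line follows the STOCKHOLM header, so it depends on that layout. As a hedged alternative, here is a minimal sketch that instead takes the file name from the #=GF AC line, wherever it appears in the record. The function name split_stockholm and the variable buf are hypothetical (not part of the program above). The sketch assumes every record has a #=GF AC line whose last field is the accession; records without one are silently skipped.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a concatenated Stockholm file into one
# <accession>_full.txt file per record, in the current directory.
split_stockholm() {
  awk '
    /STOCKHOLM/ {                 # header line: a new record starts here
      if (file) close(file)       # finish the previous output file
      file = ""
      buf = $0                    # hold lines until the file name is known
      next
    }
    !file && $1 == "#=GF" && $2 == "AC" {   # accession line names the file
      file = $NF "_full.txt"
      print buf > (file)          # write the buffered header line(s) first
      print > (file)              # then the accession line itself
      next
    }
    !file { buf = buf ORS $0; next }   # still before the accession line
    { print > (file) }                 # rest of the current record
  ' "$1"
}

# Usage: split_stockholm input.txt
```

Within one awk run, > truncates a file only when it is first opened, so later prints to the same name append until close(file) is called.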
Upvotes: 2