Cebs

Reputation: 180

Using sed to split a file by a sequence (bash)

I want to cut a file of 211,548,559 lines into 10 smaller files. The first file, for example, would contain lines 1 through 21,154,856.

I would like to write a for loop with seq to automate the process.

I tried to create a function first and then a loop with seq.

run_sed(){
    sed -n $1p Bar08_depth_chr1.txt > Bar8_d_c1_$1.txt
}
for pos in seq 1 10 211548559
do
    run_sed $pos
done

This script didn't work. I believe it's because of the $1 in sed -n $1p, but I don't know how to solve it.

Upvotes: 2

Views: 336

Answers (1)

Dennis Williamson

Reputation: 360325

For GNU split:

split -nl/10 --additional-suffix=.txt -d Bar08_depth_chr1.txt Bar8_d_c1_

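This creates 10 files named Bar8_d_c1_00.txt through Bar8_d_c1_09.txt, which will likely not need to be renamed. Note that l/10 divides the file into 10 chunks of roughly equal size without splitting any line, so the line counts may differ slightly between files.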

For split under macOS:

split -l $(( (211548559 + 9) / 10 )) Bar08_depth_chr1.txt Bar8_d_c1_

This creates 10 files named Bar8_d_c1_aa through Bar8_d_c1_aj, which can be renamed to the name pattern you need.
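
The renaming mentioned above could be done with a small loop. This is only a minimal sketch, assuming the pieces kept split's default two-letter suffixes and that a .txt extension is the pattern wanted; the function name add_txt_suffix is hypothetical, not part of split:

```bash
#!/usr/bin/env bash
# Hypothetical helper: append ".txt" to every piece named PREFIX + two letters
# (aa, ab, ...), which is the default suffix scheme of split.
add_txt_suffix() {
    local prefix=$1 piece
    for piece in "$prefix"[a-z][a-z]; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$piece" ] || continue
        mv -- "$piece" "$piece.txt"
    done
}
```

Calling add_txt_suffix Bar8_d_c1_ in the directory holding the pieces would turn Bar8_d_c1_aa into Bar8_d_c1_aa.txt, and so on. Files that already end in .txt do not match the pattern, so running it twice is harmless.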

The calculation in the split -l command rounds the number of lines per file up (ceiling division), in order to avoid a very small 11th file.
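
For comparison, the loop from the question can be repaired with the same rounding-up arithmetic. Two things stop the original from working: `for pos in seq 1 10 211548559` has no command substitution, so it iterates over the four literal words, and `sed -n $1p` prints one line rather than a range. The following is only a sketch under assumptions: the function name split_with_sed and the numbered output names are hypothetical, and a bash arithmetic loop is used in place of seq. It rereads the input once per piece, so split is likely to be faster on a file this large.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write consecutive line ranges of FILE to
# PREFIX1.txt, PREFIX2.txt, ... producing at most PARTS files.
split_with_sed() {
    local file=$1 prefix=$2 parts=$3
    local total per_file start end n=1
    # The arithmetic expansion also strips padding some wc versions print.
    total=$(( $(wc -l < "$file") ))
    [ "$total" -gt 0 ] || return 1
    # Ceiling division: round lines-per-file up so no tiny extra file appears.
    per_file=$(( (total + parts - 1) / parts ))
    for (( start = 1; start <= total; start += per_file )); do
        end=$(( start + per_file - 1 ))
        # Print lines start..end, then quit so sed does not scan the rest.
        sed -n -e "${start},${end}p" -e "${end}q" "$file" > "${prefix}${n}.txt"
        n=$(( n + 1 ))
    done
}
```

With the file from the question, split_with_sed Bar08_depth_chr1.txt Bar8_d_c1_ 10 would use 21154856 lines per file and write Bar8_d_c1_1.txt through Bar8_d_c1_10.txt, the last one being a single line shorter.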

Upvotes: 2
