wangtianye

Reputation: 306

Split files by line content

I have a file with the following content.

aaaa
bbbb
cccc
1111
qqqq
1111
aaaa
dddd

I want to split it into multiple smaller files, using the 1111 lines as separators. This is what I tried:

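#!/bin/bash
i=0
# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line
do
        if [[ $line =~ 1111 ]]; then
                # separator line: move on to the next output file
                ((i++))
        else
                printf '%s\n' "$line" >> "$i.txt"
        fi
done < data.txt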

This splits the input into the following files:

0.txt
aaaa
bbbb
cccc

1.txt
qqqq

2.txt
aaaa
dddd

Is there a more concise way to do this?

Upvotes: 0

Views: 67

Answers (2)

John1024

Reputation: 113994

There is a utility built just for this. Try:

csplit -f '' -b'%d.txt' --suppress-matched data.txt /1111/ '{*}'

How it works:

  • -f '' -b'%d.txt'

    These two options set an empty file name prefix and a printf-style suffix, so csplit names the output files 0.txt, 1.txt, 2.txt, and so on.

  • --suppress-matched

    This tells csplit to omit the divider lines.

  • data.txt

    This is the file to divide up.

  • /1111/

    This is the regex pattern to use as a divider.

  • {*}

    This tells csplit to divide as many times as it finds a divider line.
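
To check the command against the sample data from the question, a minimal sketch along these lines should work. It assumes GNU csplit, since both --suppress-matched and {*} are GNU extensions. The scratch directory and the -s flag, which only silences the byte counts csplit normally prints, are additions for illustration.

```shell
#!/bin/bash
# Work in a throwaway directory so no existing files are touched (illustrative).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample input from the question.
printf '%s\n' aaaa bbbb cccc 1111 qqqq 1111 aaaa dddd > data.txt

# Same command as above, plus -s to suppress the per-file byte counts.
csplit -s -f '' -b '%d.txt' --suppress-matched data.txt /1111/ '{*}'

# Show each piece under its file name (the glob skips data.txt).
for f in [0-9]*.txt; do
    echo "== $f"
    cat "$f"
done
```

With the sample input this should leave 0.txt, 1.txt and 2.txt next to data.txt, matching the files listed in the question.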

Upvotes: 3

tink

Reputation: 15238

Does this work for you?

awk 'BEGIN{num=0} /^1111/{num++} !/^1111/{print $0 >> (num ".txt")}' wantianye

I named the input file after your username, and it does what you ask with your sample data. The same command, with comments:

awk 'BEGIN{num=0}                   # initialise num to 0
/^1111/{num++}                      # if the line begins with 1111, increment num
!/^1111/{print $0 >> (num ".txt")}  # if the line does not begin with 1111, append it to <num>.txt
' wantianye
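
If the input has many separators, or the script may be run more than once, a variation along these lines might be safer. It is only a sketch: the file name is built in a variable (out, a name chosen here) before it is used as a redirection target, > truncates each output file on its first write instead of appending to leftovers from an earlier run, and close() releases each file once its section ends. The scratch directory and the data.txt file name are assumptions for illustration.

```shell
#!/bin/bash
# Work in a throwaway directory so no existing files are touched (illustrative).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample input from the question.
printf '%s\n' aaaa bbbb cccc 1111 qqqq 1111 aaaa dddd > data.txt

awk '
BEGIN   { num = 0; out = "0.txt" }                    # start with the first output file
/^1111/ { close(out); num++; out = num ".txt"; next } # separator: close the current file, switch to the next
        { print > out }                               # any other line goes to the current file
' data.txt
```

Closing each file as soon as its section ends keeps the number of simultaneously open files at one, which matters for awk implementations that limit open file handles.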

Upvotes: 1
