Stepan Yakovenko

Reputation: 9216

How to split mailbox into single file per message?

I'd like to split my inbox into separate files (one file per message) using a bash command, or maybe a simple Java program. How can I do it?

Best regards, and thanks.

Upvotes: 9

Views: 11777

Answers (4)

thesame

Reputation: 133

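If you, like me, are trying to split an mbox file from Google Takeout, this is what I used. It names each file after the second field of the "From " line and drops that line from the output: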

awk '/^From / { if (file) close(file); file = $2 ".eml"; next } file { print > file }' ../download.mbox

Upvotes: 1

mivk

Reputation: 14999

The old mailbox files I have seen have messages separated by a line starting with "From ", followed by:

  • either "???@??? " and a date for old Eudora files:
    From ???@??? Fri Oct 16 10:49:27 1998
  • or just "- " and a date for Thunderbird files:
    From - Tue Jul 31 13:23:45 2007

So you can use this Perl one-liner:

perl -pe 'open STDOUT, ">out".++$n if /^From (-|\?{3}\@\?{3}) /' < $IN

Or, to get 6-digit zero-padded numbers (enough if your mailbox holds fewer than 1 million messages) and an ".eml" extension:

perl -pe 'open STDOUT, sprintf(">m%06d.eml", ++$n) if /^From (-|\?{3}\@\?{3}) /' < $IN
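
For readers who would rather have a small program than a one-liner, the same separator-based split can be sketched in Python. This is only a sketch under assumptions: the function name `split_mbox` is hypothetical, the `m%06d.eml` naming is borrowed from the command above, and skipping any lines that appear before the first separator is my choice rather than something the one-liner does.

```python
import re
from pathlib import Path

# Separator lines described above: "From - <date>" (Thunderbird)
# or "From ???@??? <date>" (old Eudora).
SEPARATOR = re.compile(rb"^From (-|\?{3}@\?{3}) ")


def split_mbox(mbox_path, out_dir, pattern=SEPARATOR):
    """Write each message to out_dir/mNNNNNN.eml and return the message count."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    out = None
    try:
        with open(mbox_path, "rb") as src:
            for line in src:
                if pattern.match(line):
                    # A separator line starts a new message: switch files.
                    if out is not None:
                        out.close()
                    count += 1
                    out = open(out_dir / f"m{count:06d}.eml", "wb")
                if out is not None:
                    # The separator line itself is kept, as in the one-liner.
                    # Lines before the first separator are skipped (assumption).
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return count
```

Calling `split_mbox("Inbox", "messages")` (both paths are placeholders) would write `messages/m000001.eml`, `messages/m000002.eml`, and so on. Because the pattern is strict, body lines that merely start with "From " do not trigger a split.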

Upvotes: 7

mop

Reputation: 347

There is also a specialized git command for this:

mkdir messages
git mailsplit -omessages mbox

Upvotes: 1

Igor Chubin

Reputation: 64603

Just use formail. formail is a program that can process a mailbox, run an action for each message in it, separate the messages, and so on.

More info: http://www.manpagez.com/man/1/formail/

If you just want to split a mailbox into separate files, I would suggest this solution:

$ cat $MAIL | formail -ds sh -c 'cat > msg.$FILENO'

From man:

   FILENO
        While splitting, formail assigns the message number currently
        being output to this variable. By presetting FILENO, you can
        change the initial message number being used and the width of the
        zero-padded output. If FILENO is unset it will default to 000.
        If FILENO is non-empty and does not contain a number, FILENO
        generation is disabled.
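
If formail is not installed, the numbering described in the man excerpt can be imitated with Python's standard `mailbox` module. This is a rough sketch, not formail's implementation: the function name `split_like_formail` and its `fileno` parameter are hypothetical, and I am assuming that a preset such as "0010" means "start at 10, pad to 4 digits", which is how I read the excerpt. Unlike the `cat > msg.$FILENO` command, this version does not write the leading "From " line into each file.

```python
import mailbox
from pathlib import Path


def split_like_formail(mbox_path, out_dir, fileno="000"):
    """Write each message to out_dir/msg.<number>; return the list of file names.

    `fileno` plays the role of a preset FILENO (assumption): its numeric
    value is the first message number and its length is the padding width.
    """
    start, width = int(fileno), len(fileno)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = []
    # create=False: raise an error instead of creating a missing mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for offset, key in enumerate(box.iterkeys()):
            name = f"msg.{start + offset:0{width}d}"
            # get_bytes() returns the message without its "From " line.
            (out_dir / name).write_bytes(box.get_bytes(key))
            names.append(name)
    finally:
        box.close()
    return names
```

With the default preset the files come out as `msg.000`, `msg.001`, and so on. Passing `fileno="0010"` would start at `msg.0010`. Note that `mailbox.mbox` treats every line starting with "From " as a message boundary, so it suits mailboxes where such body lines have been escaped.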

Note: formail ships as part of the procmail package: https://github.com/BuGlessRB/procmail

Upvotes: 14
