Reputation: 323
I have a data frame containing emails. There is a column named "message" that looks like this:
> > dataset$message[1]
> [1] Message-ID: ...
> Date: ...
> From: ...
> To: ...
> Subject: ...
> Mime-Version: ...
> Content-Type: ...
> Content-Transfer-Encoding: ...
> X-From: ...
> X-To: ...
> X-cc: ...
> X-bcc: ...
> X-Folder: ...
> X-Origin: ...
> X-FileName: ...
> > Some message text
In other words, each entry contains 15 lines of headers and then the text. What I want is to remove these 15 lines from each row and be left only with the text, so that
>dataset$message[1]
looks like this:
> Some message text
Upvotes: 0
Views: 148
Reputation: 11480
Something like this would work:
sub("^(?:.*\\n){15}", "", multiline_string_mail, perl = TRUE)
#[1] "Super secret message"
Example data (you should always provide usable example data with your question):
multiline_string_mail =
"hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
Super secret message"
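Since `sub()` is vectorised over its input, the same call should work on the whole column at once. Here is a minimal sketch, assuming every entry in `dataset$message` starts with exactly 15 header lines separated by `\n`. The `dataset` built below is a made-up stand-in for your real data, and the `Header-1` ... `Header-15` names are placeholders, not the actual header fields.

```r
# Hypothetical stand-in for the real data: 15 placeholder header lines per message.
headers <- paste0("Header-", 1:15, ": value", collapse = "\n")
dataset <- data.frame(
  message = c(
    paste0(headers, "\nSome message text"),
    paste0(headers, "\nAnother message\nwith two lines")
  ),
  stringsAsFactors = FALSE
)

# "^(?:.*\\n){15}" matches, from the start of the string, 15 repetitions of
# "anything up to and including a newline", i.e. the first 15 lines.
# sub() applies the replacement to every element of the column.
dataset$message <- sub("^(?:.*\\n){15}", "", dataset$message, perl = TRUE)

dataset$message[1]
#[1] "Some message text"
```

Because `.` does not match a newline here, the pattern stops after exactly 15 lines, so a message body that itself spans several lines is left intact. If some messages have a different number of header lines, this fixed count will cut too much or too little, and you would need a different rule for finding where the headers end.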
Upvotes: 1