sebastian.klotz

Reputation: 125

How to split a text using a string delimiter with R's strsplit?

Let's say I have a text file of a book that contains multiple chapters, each with some text.

x <- "Chapter 1 Text. Text. Chapter 2 Text. Text. Chapter 3 Text. Text."

I would like to split up this text and get a separate file for each chapter.

"Chapter 1 Text. Text." "Chapter 2 Text. Text." "Chapter 3 Text. Text."

Ideally, I would like to save each file according to the chapter, so Chapter 1, Chapter 2 and Chapter 3.

I have tried the following:

unlist(strsplit(x, "Chapter", perl = TRUE))

Unfortunately, this deletes the delimiter, which I would like to keep.

I have also tried the following:

unlist(strsplit(x, "(?<=Chapter)", perl=TRUE))

Unfortunately, this only seems to work for a single character but not for a string.

Many thanks for your help!

Upvotes: 1

Views: 344

Answers (1)

akrun

Reputation: 887048

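We can split on the whitespace that precedes each "Chapter" by using a regex lookahead. The lookahead (?=Chapter) only asserts that "Chapter" follows; it does not consume it, so the delimiter stays at the start of each piece.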

strsplit(x, "\\s(?=Chapter)", perl = TRUE)[[1]]
#[1] "Chapter 1 Text. Text." "Chapter 2 Text. Text." "Chapter 3 Text. Text."
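To also save each chapter to its own file, as the question asks, something along these lines might work. This is only a sketch: the output directory (tempdir()), the ".txt" extension, and the variable names chapters, titles and paths are assumptions for illustration, and it assumes every piece starts with "Chapter" followed by a number.

```r
x <- "Chapter 1 Text. Text. Chapter 2 Text. Text. Chapter 3 Text. Text."

# Split on the whitespace before each "Chapter", keeping the delimiter
chapters <- strsplit(x, "\\s(?=Chapter)", perl = TRUE)[[1]]

# Pull "Chapter N" from the start of each piece to use as the file name
# (assumes every piece begins with "Chapter" and a number)
titles <- regmatches(chapters, regexpr("^Chapter [0-9]+", chapters))

# Assumed output location and extension; change to suit your project
paths <- file.path(tempdir(), paste0(titles, ".txt"))

# Write one file per chapter
for (i in seq_along(chapters)) {
  writeLines(chapters[i], paths[i])
}
```

If some pieces do not start with "Chapter N", regmatches() drops them and titles will be shorter than chapters, so it is worth checking that the two lengths agree before writing.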

Upvotes: 1
