Galadude

Reputation: 253

Separating columns using regex in R

I have a data set where the columns are separated by long runs of white space, so that the columns line up when you open the file in a text editor.

The problem is that I can't read this file using white space as the separator, because one of the columns contains sentences that have spaces. I was wondering if I could somehow read the file into R using a regex separator, like \s{2,}.

I've tried sep='\s{2,}', but that doesn't work.

Upvotes: 1

Views: 69

Answers (2)

sgibb

Reputation: 25736

You could use readLines to read all lines and strsplit+rbind to create your data.frame afterwards:

ll <- readLines(
  textConnection("Column1          Column2
Stupid sentence  Stupid sentence 2
foobar           foobar 2"))

l <- strsplit(ll, " {2,}")

df <- as.data.frame(do.call(rbind, l[-1]))
colnames(df) <- l[[1]]
df
#          Column1           Column2
#1 Stupid sentence Stupid sentence 2
#2          foobar          foobar 2
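
If it helps, the same steps can be wrapped in a small function. This is only a sketch: read_aligned is a made-up name, and it assumes that every line splits into the same number of fields and that the first line holds the column names. It stops with an error on uneven lines, because rbind would otherwise recycle the shorter ones with only a warning.

```r
# Hypothetical helper (not part of base R): build a data.frame from lines
# whose columns are separated by runs of two or more spaces.
read_aligned <- function(lines) {
  lines <- gsub("^ +| +$", "", lines)   # drop leading/trailing padding
  lines <- lines[nzchar(lines)]         # skip empty lines
  parts <- strsplit(lines, " {2,}")     # single spaces stay inside a field

  # Every line must yield the same number of fields.
  if (length(unique(lengths(parts))) != 1L) {
    stop("lines do not all split into the same number of fields")
  }

  df <- as.data.frame(do.call(rbind, parts[-1]), stringsAsFactors = FALSE)
  colnames(df) <- parts[[1]]
  df
}

# Sample lines standing in for readLines() on the real file.
ll <- c("Column1          Column2",
        "Stupid sentence  Stupid sentence 2",
        "foobar           foobar 2")
df <- read_aligned(ll)
```

For a real file you would pass the result of readLines() on its path instead of ll. Note that all columns come back as character; convert numeric ones afterwards, e.g. with as.numeric or type.convert.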

Upvotes: 1

Ravindra

Reputation: 2271

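You can use a regex to remove the extra white space between the columns, for example by replacing each run of two or more spaces with a single delimiter character, and then read the result as usual.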
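
A minimal sketch of that idea, assuming a tab never occurs in the data itself and that columns are always separated by at least two spaces. The sample lines stand in for readLines() on the real file.

```r
# Sample lines standing in for readLines() on the real file.
ll <- c("Column1          Column2",
        "Stupid sentence  Stupid sentence 2",
        "foobar           foobar 2")

# Drop trailing padding so it does not turn into an empty last column.
ll <- sub(" +$", "", ll)

# Replace every run of two or more spaces with a single tab.
tabbed <- gsub(" {2,}", "\t", ll)

# Read the tab-separated text; single spaces inside sentences are untouched.
# quote = "" and comment.char = "" keep apostrophes and '#' in sentences literal.
df <- read.table(text = tabbed, sep = "\t", header = TRUE,
                 quote = "", comment.char = "", stringsAsFactors = FALSE)
df
```

Unlike splitting with strsplit, this lets read.table guess the column types, so numeric columns come back as numbers.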

Upvotes: 0
