NBK
NBK

Reputation: 905

r - consolidate rows in dataframe by removing empty spaces

This question is similar to this here, but not the same. I have an extremely large and messy data frame from an OCR'd old document. It's full of empty values that I need to collapse to form consolidated rows. Take the following example:

DF <- data.frame(
  Col1 = c("Egg", "", ""),
  Col2 = c("", "Flour", ""),
  Col3 = c("", "", "Bread"),
  Col4 = c("4", "", ""),
  Col5 = c("", "6", "8")
)

> DF
  Col1  Col2  Col3 Col4 Col5
1  Egg                4     
2      Flour               6
3            Bread         8

How do I remove the empty values of the dataframe to form consolidated rows? The desired output looks like this:

> DF
  Col1    Col2
1  Egg    4     
2  Flour  6
3  Bread  8

The actual dataframe is very large with rows containing varied number of empty values.

Upvotes: 1

Views: 82

Answers (2)

mtoto
mtoto

Reputation: 24188

Here is an option with coalesce() from dplyr:

library(dplyr)
DF[DF==""] <- NA # replace empty strings by NA
data.frame(Col1 = coalesce(!!! DF), Col2 = coalesce(!!! rev(DF)))
#   Col1 Col2
#1   Egg    4
#2 Flour    6
#3 Bread    8

Upvotes: 2

Akarsh Jain
Akarsh Jain

Reputation: 1000

yet, another approach to implement the same

library(dplyr)

df<-do.call("paste", DF)%>%as.data.frame()

df<- df%>%
  transmute(x1= df[[1]]%>%str_extract("[A-z]+"),
            x2 = df[[1]]%>%str_extract("\\d+"))
df

#     x1 x2
#1   Egg  4
#2 Flour  6
#3 Bread  8

@mtoto still worth more credits

Upvotes: 0

Related Questions