Rilcon42

Reputation: 9765

merging data into existing column

I am trying to combine data from a dataframe into one value separated by && using merge (for no particular reason). Can someone explain what I am missing with this command?

news<-data.frame(c("2016-05-20","2016-05-19","2016-05-19"),c("x","y","z"))
data<-data.frame(c("2016-05-20","2016-05-21","2016-05-22"),c(1,2,3))

#bind news with the same date into value separated by &&
    news<-merge(news,by.x=news[,1])
    #Error in as.data.frame(y) : argument "y" is missing, with no default

Bonus Question:

#merge news with data based on matching date
    merge(news,data,by.x=news[,1],by.y=data[,1])
    #Error in fix.by(by.x, x) : 'by' must specify uniquely valid columns

GOAL:

1                                  2016-05-20          1      x
2                                  2016-05-19          NA     y && z
3                                  2016-05-21          2      NA
4                                  2016-05-22          3      NA

Upvotes: 2

Views: 168

Answers (2)

RoyalTS

Reputation: 10203

A plyr/dplyr-based solution:

library(dplyr)

news <- data.frame(date=c("2016-05-20","2016-05-19","2016-05-19"),
                   letters=c("x","y","z"), stringsAsFactors = FALSE)
data <- data.frame(date=c("2016-05-20","2016-05-21","2016-05-22"),
                   numbers=c(1,2,3), stringsAsFactors = FALSE)

df <- plyr::rbind.fill(news,data)

# across() replaces the deprecated summarize_each()/funs() pair
df.combined <- df %>%
  group_by(date) %>%
  summarize(across(letters:numbers, ~ paste(na.omit(.x), collapse = " && ")))

Upvotes: 1

lmo

Reputation: 38500

This produces the output you want, though it is a two-step process.

# get data with some nice names
news <- data.frame(date=c("2016-05-20","2016-05-19","2016-05-19"), lets=c("x","y","z"))
data <- data.frame(date=c("2016-05-20","2016-05-21","2016-05-22"), nums=c(1,2,3))

# combine observations with the same date
newsC <- aggregate(lets~date, data=news, FUN=paste, collapse=" && ")
merge(data, newsC, by="date", all=TRUE)

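The first error occurs because merge needs two data.frames (x and y) and you only pass one. The second error occurs because by.x and by.y expect column names or positions, not the column values themselves, so by.x=news[,1] hands merge a vector of dates instead of a column identifier.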
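
As a rough sketch of how the calls from the question could be repaired without renaming the original columns, the version below refers to columns by position. The names newsC and result, the stringsAsFactors = FALSE argument, and the final renaming to date/nums/lets are choices made here for illustration, not part of the question.

```r
# The asker's original data frames, left with their auto-generated column names
news <- data.frame(c("2016-05-20","2016-05-19","2016-05-19"), c("x","y","z"),
                   stringsAsFactors = FALSE)
data <- data.frame(c("2016-05-20","2016-05-21","2016-05-22"), c(1,2,3),
                   stringsAsFactors = FALSE)

# Collapse the news values (column 2) that share a date (column 1)
newsC <- aggregate(news[, 2], by = list(news[, 1]), FUN = paste, collapse = " && ")

# by.x / by.y take column names or positions; all = TRUE keeps unmatched dates
result <- merge(data, newsC, by.x = 1, by.y = 1, all = TRUE)

# Illustrative names only, to make the result easier to read
names(result) <- c("date", "nums", "lets")
result
```

Dates that appear in only one of the two inputs end up with NA in the column coming from the other, which matches the layout in the GOAL section apart from row order.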

Upvotes: 4
