Reputation: 402
I have a data frame, foo_dataframe (see below), that I want to convert to transaction data:
foo_dataframe <- data.frame(replicate(50,1:4))
foo_dataframe
# X1 X2 X3 X4 X5 X6 X7 X8 X9...................X50
#1 1 1 1 1 1 1 1 1 1
#2 2 2 2 2 2 2 2 2 2
#3 3 3 3 3 3 3 3 3 3
#4 4 4 4 4 4 4 4 4 4
The transaction data I expect is shown below; each cell must be the column name concatenated with that cell's value:
# X1 X2 X3 X4 ................X50
#1 X1 1 X2 1 X3 1 X4 1 X50 1
#2 X1 2 X2 2 X3 2 X4 2 X50 2
#3 X1 3 X2 3 X3 3 X4 3 X50 3
#4 X1 4 X2 4 X3 4 X4 4 X50 4
I can concatenate each column and its values with this code:
m <- paste(colnames(foo_dataframe)[1], foo_dataframe[[1]], "")
n <- paste(colnames(foo_dataframe)[2], foo_dataframe[[2]], "")
o <- paste(colnames(foo_dataframe)[3], foo_dataframe[[3]], "")
p <- paste(colnames(foo_dataframe)[4], foo_dataframe[[4]], "")
I can then join them using data.frame(m,n,o,p) to produce:
# X1 X2 X3 X4
#1 X1 1 X2 1 X3 1 X4 1
#2 X1 2 X2 2 X3 2 X4 2
#3 X1 3 X2 3 X3 3 X4 3
#4 X1 4 X2 4 X3 4 X4 4
To save time, I think this can be done dynamically with the apply family of functions, because I have many columns to process. However, when I tried lapply with the code below:
c <- 1:length(colnames(foo_dataframe))
t <- foo_dataframe
transactionData <- function(t, c){ # t = dataframe; c = column no.
  paste(colnames(t)[c], t[[c]], "")
}
foo_transactionData <- lapply(t, transactionData, c)
I got the following error:
Error in t[[c]] : attempt to select more than one element in vectorIndex
I have searched Stack Overflow for a solution but have not found one. Any help will be appreciated. Thanks.
Upvotes: 0
Views: 160
Reputation: 389135
We can use Map:
foo_dataframe[] <- Map(paste, names(foo_dataframe), foo_dataframe)
foo_dataframe[, 1:4]
# X1 X2 X3 X4
#1 X1 1 X2 1 X3 1 X4 1
#2 X1 2 X2 2 X3 2 X4 2
#3 X1 3 X2 3 X3 3 X4 3
#4 X1 4 X2 4 X3 4 X4 4
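To see what Map is doing here, a minimal sketch on a smaller data frame may help (small_df and pasted are made-up names used only for illustration). Map calls paste once per column, pairing each column name with that column's values, and assigning into small_df[] keeps the data frame structure and column names instead of replacing it with a plain list.

```r
# small_df is a hypothetical 3-row, 2-column example
small_df <- data.frame(X1 = 1:3, X2 = 1:3)

# Map returns a list with one character vector per column
pasted <- Map(paste, names(small_df), small_df)
str(pasted)

# Assigning into small_df[] keeps the data.frame class and column names
small_df[] <- pasted
small_df
```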
Using lapply, we can loop over the column indices or their names:
foo_dataframe[] <- lapply(names(foo_dataframe), function(x)
  paste(x, foo_dataframe[[x]]))
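The index-based variant can be sketched as below. It also hints at why the attempt in the question failed: lapply(t, f, c) hands each column to f as a plain vector, so colnames() on it is NULL and t[[c]] tries to pull several elements out of that vector at once. Looping over seq_along(df) instead passes one column index per call. (df and paste_col are illustrative names, not from the question.)

```r
df <- data.frame(replicate(5, 1:4))

# paste_col is a hypothetical helper: i is a single column index, d the data frame
paste_col <- function(i, d) paste(names(d)[i], d[[i]])

# seq_along(df) yields 1, 2, ..., ncol(df), so each call sees exactly one index
df[] <- lapply(seq_along(df), paste_col, d = df)
df
```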
The equivalent options using purrr are:
library(purrr)
imap_dfc(foo_dataframe, ~paste(.y, .x))
map2_dfc(foo_dataframe, names(foo_dataframe), ~paste(.y, .x))
map_dfc(names(foo_dataframe), ~paste(.x, foo_dataframe[[.x]]))
EDIT
To avoid pasting NA values, we can do:
foo_dataframe[] <- Map(function(x, y) ifelse(is.na(y), "",paste(x, y)),
names(foo_dataframe), foo_dataframe)
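A small check of the difference, assuming a made-up data frame na_df with one missing value per column: plain paste turns NA into the string "NA", while the ifelse version leaves an empty string in that cell.

```r
# na_df is a hypothetical example with one missing value in each column
na_df <- data.frame(X1 = c(1, NA, 3), X2 = c("a", "b", NA))

# Plain paste converts the missing value into the text "NA"
paste("X1", na_df$X1)
# "X1 1"  "X1 NA" "X1 3"

# The ifelse version writes "" wherever the original value was NA
na_df[] <- Map(function(x, y) ifelse(is.na(y), "", paste(x, y)),
               names(na_df), na_df)
na_df
```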
Upvotes: 2