Reputation: 197
I have a .csv table named mailing.csv
as below. It consist a receiver, subject and sender.
Receiver subject sender
1 Adrian Cole RE: [WHIRR-117] Composable services Tom White
2 Adrian Cole RE: [WHIRR-117] Composable services Tom White
3 Adrian Cole RE: [WHIRR-117] Composable services Adrian Cole
4 Adrian Cole RE: [WHIRR-117] Composable services Adrian Cole
5 Adrian Cole RE: [WHIRR-117] Composable services Tom White
6 Adrian Cole RE: [WHIRR-117] Composable services Adrian Cole
7 Adrian Cole RE: [WHIRR-117] Composable services Tom White
8 Adrian Cole RE: [WHIRR-117] Composable services Tom White
9 Adrian Cole RE: [WHIRR-117] Composable services Adrian Cole
10 Adrian Cole RE: [WHIRR-117] Composable services Adrian Cole
11 Adrian Cole RE: [WHIRR-117] Composable services Tom White
12 Adrian Cole RE: [WHIRR-117] Composable services Tom White
13 Adrian Cole RE: [WHIRR-117] Composable services Tom White
14 Adrian Cole RE: [WHIRR-117] Composable services Tom White
15 Patrick Hunt RE: [WHIRR-123] Cassandra integration Tom White
16 Patrick Hunt RE: [WHIRR-123] Cassandra integration Andrei Savu
17 Patrick Hunt RE: [WHIRR-123] Cassandra integration Andrei Savu
18 Patrick Hunt RE: [WHIRR-123] Cassandra integration Tom White
19 Patrick Hunt RE: [WHIRR-123] Cassandra integration Tom White
20 Patrick Hunt RE: [WHIRR-123] Cassandra integration Adrian Cole
21 Patrick Hunt RE: [WHIRR-123] Cassandra integration Tom White
22 Patrick Hunt RE: [WHIRR-123] Cassandra integration Patrick Hunt
What I would like to do is to update/map the information from table above to .csv template (named AC_template.csv
), and save it in a separate file using subject details in backet as file name (for instance AC_WHIRR-117
). As for table above, it should create two new files name as AC_WHIRR-117
and AC_WHIRR-123
.
sample .csv template (AC_template.csv
) is as below:
Adrian.Cole Patrick.Hunt Andrei.Savu Bruno.Dumon Edward.J..Yoon Eugene.Koontz Jakob.Homan Kelvin.Kakugawa Tom.White
Adrian Cole 0 0 0 0 0 0 0 0 0
Patrick Hunt 0 0 0 0 0 0 0 0 0
Andrei Savu 0 0 0 0 0 0 0 0 0
Bruno Dumon 0 0 0 0 0 0 0 0 0
Edward J. Yoon 0 0 0 0 0 0 0 0 0
Eugene Koontz 0 0 0 0 0 0 0 0 0
Jakob Homan 0 0 0 0 0 0 0 0 0
Kelvin Kakugawa 0 0 0 0 0 0 0 0 0
Tom White 0 0 0 0 0 0 0 0 0
Lars George 0 0 0 0 0 0 0 0 0
Soren Macbeth 0 0 0 0 0 0 0 0 0
Lars.George Soren.Macbeth
Adrian Cole 0 0
Patrick Hunt 0 0
Andrei Savu 0 0
Bruno Dumon 0 0
Edward J. Yoon 0 0
Eugene Koontz 0 0
Jakob Homan 0 0
Kelvin Kakugawa 0 0
Tom White 0 0
Lars George 0 0
Soren Macbeth 0 0
Sample output for this question is as below:
sample output for AC_WHIRR-117:
Adrian.Cole Patrick.Hunt Andrei.Savu Bruno.Dumon Edward.J..Yoon Eugene.Koontz Jakob.Homan Kelvin.Kakugawa Tom.White
Adrian Cole 0 0 0 0 0 0 0 0 9
Patrick Hunt 0 0 0 0 0 0 0 0 0
Andrei Savu 0 0 0 0 0 0 0 0 0
Bruno Dumon 0 0 0 0 0 0 0 0 0
Edward J. Yoon 0 0 0 0 0 0 0 0 0
Eugene Koontz 0 0 0 0 0 0 0 0 0
Jakob Homan 0 0 0 0 0 0 0 0 0
Kelvin Kakugawa 0 0 0 0 0 0 0 0 0
Tom White 9 0 0 0 0 0 0 0 0
Lars George 0 0 0 0 0 0 0 0 0
Soren Macbeth 0 0 0 0 0 0 0 0 0
Lars.George Soren.Macbeth
Adrian Cole 0 0
Patrick Hunt 0 0
Andrei Savu 0 0
Bruno Dumon 0 0
Edward J. Yoon 0 0
Eugene Koontz 0 0
Jakob Homan 0 0
Kelvin Kakugawa 0 0
Tom White 0 0
Lars George 0 0
Soren Macbeth 0 0
Sample output for AC_WHIRR-123
Adrian.Cole Patrick.Hunt Andrei.Savu Bruno.Dumon Edward.J..Yoon Eugene.Koontz Jakob.Homan Kelvin.Kakugawa Tom.White
Adrian Cole 0 1 0 0 0 0 0 0 0
Patrick Hunt 1 0 2 0 0 0 0 0 4
Andrei Savu 0 2 0 0 0 0 0 0 0
Bruno Dumon 0 0 0 0 0 0 0 0 0
Edward J. Yoon 0 0 0 0 0 0 0 0 0
Eugene Koontz 0 0 0 0 0 0 0 0 0
Jakob Homan 0 0 0 0 0 0 0 0 0
Kelvin Kakugawa 0 0 0 0 0 0 0 0 0
Tom White 0 4 0 0 0 0 0 0 0
Lars George 0 0 0 0 0 0 0 0 0
Soren Macbeth 0 0 0 0 0 0 0 0 0
Lars.George Soren.Macbeth
Adrian Cole 0 0
Patrick Hunt 0 0
Andrei Savu 0 0
Bruno Dumon 0 0
Edward J. Yoon 0 0
Eugene Koontz 0 0
Jakob Homan 0 0
Kelvin Kakugawa 0 0
Tom White 0 0
Lars George 0 0
Soren Macbeth 0 0
dput(head) for mailing.csv
as below:
structure(list(Receiver = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Adrian Cole",
"Patrick Hunt"), class = "factor"), subject = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Label = c("RE: [WHIRR-117] Composable services",
"RE: [WHIRR-123] Cassandra integration "), class = "factor"),
sender = structure(c(4L, 4L, 1L, 1L, 4L, 1L), .Label = c("Adrian Cole",
"Andrei Savu", "Patrick Hunt", "Tom White"), class = "factor")), .Names = c("Receiver",
"subject", "sender"), row.names = c(NA, 6L), class = "data.frame")
dput(head) for AC_template.csv
:
structure(list(Adrian.Cole = c(0L, 0L, 0L, 0L, 0L, 0L), Patrick.Hunt = c(0L,
0L, 0L, 0L, 0L, 0L), Andrei.Savu = c(0L, 0L, 0L, 0L, 0L, 0L),
Bruno.Dumon = c(0L, 0L, 0L, 0L, 0L, 0L), Edward.J..Yoon = c(0L,
0L, 0L, 0L, 0L, 0L), Eugene.Koontz = c(0L, 0L, 0L, 0L, 0L,
0L), Jakob.Homan = c(0L, 0L, 0L, 0L, 0L, 0L), Kelvin.Kakugawa = c(0L,
0L, 0L, 0L, 0L, 0L), Tom.White = c(0L, 0L, 0L, 0L, 0L, 0L
), Lars.George = c(0L, 0L, 0L, 0L, 0L, 0L), Soren.Macbeth = c(0L,
0L, 0L, 0L, 0L, 0L)), .Names = c("Adrian.Cole", "Patrick.Hunt",
"Andrei.Savu", "Bruno.Dumon", "Edward.J..Yoon", "Eugene.Koontz",
"Jakob.Homan", "Kelvin.Kakugawa", "Tom.White", "Lars.George",
"Soren.Macbeth"), row.names = c("Adrian Cole", "Patrick Hunt",
"Andrei Savu", "Bruno Dumon", "Edward J. Yoon", "Eugene Koontz"
), class = "data.frame")
sample output for WHIRR-117 as below:
structure(list(Adrian.Cole = c(0L, 0L, 0L, 0L, 0L, 0L), Patrick.Hunt = c(0L,
0L, 0L, 0L, 0L, 0L), Andrei.Savu = c(0L, 0L, 0L, 0L, 0L, 0L),
Bruno.Dumon = c(0L, 0L, 0L, 0L, 0L, 0L), Edward.J..Yoon = c(0L,
0L, 0L, 0L, 0L, 0L), Eugene.Koontz = c(0L, 0L, 0L, 0L, 0L,
0L), Jakob.Homan = c(0L, 0L, 0L, 0L, 0L, 0L), Kelvin.Kakugawa = c(0L,
0L, 0L, 0L, 0L, 0L), Tom.White = c(9L, 0L, 0L, 0L, 0L, 0L
), Lars.George = c(0L, 0L, 0L, 0L, 0L, 0L), Soren.Macbeth = c(0L,
0L, 0L, 0L, 0L, 0L)), .Names = c("Adrian.Cole", "Patrick.Hunt",
"Andrei.Savu", "Bruno.Dumon", "Edward.J..Yoon", "Eugene.Koontz",
"Jakob.Homan", "Kelvin.Kakugawa", "Tom.White", "Lars.George",
"Soren.Macbeth"), row.names = c("Adrian Cole", "Patrick Hunt",
"Andrei Savu", "Bruno Dumon", "Edward J. Yoon", "Eugene Koontz"
), class = "data.frame")
sample output for WHIRR-123 as below:
structure(list(Adrian.Cole = c(0L, 1L, 0L, 0L, 0L, 0L), Patrick.Hunt = c(1L,
0L, 2L, 0L, 0L, 0L), Andrei.Savu = c(0L, 2L, 0L, 0L, 0L, 0L),
Bruno.Dumon = c(0L, 0L, 0L, 0L, 0L, 0L), Edward.J..Yoon = c(0L,
0L, 0L, 0L, 0L, 0L), Eugene.Koontz = c(0L, 0L, 0L, 0L, 0L,
0L), Jakob.Homan = c(0L, 0L, 0L, 0L, 0L, 0L), Kelvin.Kakugawa = c(0L,
0L, 0L, 0L, 0L, 0L), Tom.White = c(0L, 4L, 0L, 0L, 0L, 0L
), Lars.George = c(0L, 0L, 0L, 0L, 0L, 0L), Soren.Macbeth = c(0L,
0L, 0L, 0L, 0L, 0L)), .Names = c("Adrian.Cole", "Patrick.Hunt",
"Andrei.Savu", "Bruno.Dumon", "Edward.J..Yoon", "Eugene.Koontz",
"Jakob.Homan", "Kelvin.Kakugawa", "Tom.White", "Lars.George",
"Soren.Macbeth"), row.names = c("Adrian Cole", "Patrick Hunt",
"Andrei Savu", "Bruno Dumon", "Edward J. Yoon", "Eugene Koontz"
), class = "data.frame")
Appreciate help from the expert...
Upvotes: 0
Views: 153
Reputation: 13363
Some use of the plyr
package with the base function table
. Some assembly may be required. This should get you most of the way there.
#load template
template <- structure(list(Adrian.Cole = c(0L, 0L, 0L, 0L, 0L, 0L), Patrick.Hunt = c(0L,
0L, 0L, 0L, 0L, 0L), Andrei.Savu = c(0L, 0L, 0L, 0L, 0L, 0L),
Bruno.Dumon = c(0L, 0L, 0L, 0L, 0L, 0L), Edward.J..Yoon = c(0L,
0L, 0L, 0L, 0L, 0L), Eugene.Koontz = c(0L, 0L, 0L, 0L, 0L,
0L), Jakob.Homan = c(0L, 0L, 0L, 0L, 0L, 0L), Kelvin.Kakugawa = c(0L,
0L, 0L, 0L, 0L, 0L), Tom.White = c(0L, 0L, 0L, 0L, 0L, 0L
), Lars.George = c(0L, 0L, 0L, 0L, 0L, 0L), Soren.Macbeth = c(0L,
0L, 0L, 0L, 0L, 0L)), .Names = c("Adrian.Cole", "Patrick.Hunt",
"Andrei.Savu", "Bruno.Dumon", "Edward.J..Yoon", "Eugene.Koontz",
"Jakob.Homan", "Kelvin.Kakugawa", "Tom.White", "Lars.George",
"Soren.Macbeth"), row.names = c("Adrian Cole", "Patrick Hunt",
"Andrei Savu", "Bruno Dumon", "Edward J. Yoon", "Eugene Koontz"
), class = "data.frame")
#the rownames of this data frame hold the names of senders/receivers
#that we are interested in
names.to.search <- rownames(template)
#load data frame
mailing <- structure(list(Receiver = structure(c(1L, 1L, 1L, 1L, 1L, 1L),
.Label = c("Adrian Cole", "Patrick Hunt"), class = "factor"),
subject = structure(c(1L, 1L, 1L, 1L, 1L, 1L),
.Label = c("RE: [WHIRR-117] Composable services",
"RE: [WHIRR-123] Cassandra integration "), class = "factor"),
sender = structure(c(4L, 4L, 1L, 1L, 4L, 1L), .Label = c("Adrian Cole",
"Andrei Savu", "Patrick Hunt", "Tom White"), class = "factor")), .Names = c("Receiver",
"subject", "sender"), row.names = c(NA, 6L), class = "data.frame")
names(mailing) <- tolower(names(mailing))
#get topic to sort by
mailing$topic <- gsub(".*\\[(.*)\\].*","\\1",mailing$subject)
#restrict to rows that have sender and receiver in names list
mailing <- mailing[mailing$receiver %in% names.to.search &
mailing$sender %in% names.to.search,]
library(plyr)
fn <- function(x) {
with(x, {
#add NA-name and name-NA to the sender and receiver lists
#so that the resulting table is of the right dimension
receiver <- append(as.character(receiver),
c(names.to.search, rep(NA,times=length(names.to.search))))
sender <- append(as.character(sender),
c(rep(NA,times=length(names.to.search)),names.to.search))
#create the table
y <- table(receiver,sender)
#write table to csv file
write.csv(y,file=paste0("AC_",topic[1],".csv"))
})
}
#perform fn on each section of data frame by topic
d_ply(mailing,.(topic),fn)
Upvotes: 2