Lauren Dahlin

Reputation: 159

if match, list in vector

I have a data frame built from vectors in a format like the following:

ID <- c("ID1", "ID1", "ID1", "ID2", "ID2", "ID3")  
ModNum <- c(1, 2, 3, 1, 2, 0)  
Amnt <- c(2.00, 3.00, 2.00, 5.00, 1.00, 5.00)  
df <- data.frame(ID, ModNum, Amnt)  

My desired output is a new column "Mod" in the data frame, which would look something like this:

ID   Mod  
ID1 ((1,2.00), (2, 3.00), (3, 2.00))  
ID2 ((1, 5.00), (2, 1.00))  
ID3 ((0, 5.00))  

Then I would delete the redundant IDs.

I have considered using tapply and looping over the IDs to append to a list, but I am a bit confused about how to go about this.

How to add variable key/value pair to list object?

`tapply()` to return data frame

Upvotes: 1

Views: 241

Answers (3)

Marek

Reputation: 50704

Another solution, using the plyr package:

df$Mod <- sprintf("(%i, %.2f)", df$ModNum, df$Amnt) # prepare format

library(plyr)
ddply(df, .(ID), summarise, Mod=paste(Mod, collapse=", "))
#    ID                             Mod
# 1 ID1 (1, 2.00), (2, 3.00), (3, 2.00)
# 2 ID2            (1, 5.00), (2, 1.00)
# 3 ID3                       (0, 5.00)
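
For readers who do not have plyr installed, here is a rough base R sketch of the same idea using tapply(), which the question mentions. The data frame is rebuilt from the question; the names `mod` and `res` are only illustrative, and `as.integer()` is added on the assumption that ModNum always holds whole numbers.

```r
# Rebuild the example data from the question
df <- data.frame(
  ID     = c("ID1", "ID1", "ID1", "ID2", "ID2", "ID3"),
  ModNum = c(1, 2, 3, 1, 2, 0),
  Amnt   = c(2.00, 3.00, 2.00, 5.00, 1.00, 5.00),
  stringsAsFactors = FALSE
)

# Format each row as "(ModNum, Amnt)"
df$Mod <- sprintf("(%d, %.2f)", as.integer(df$ModNum), df$Amnt)

# Paste the formatted pairs together within each ID
mod <- tapply(df$Mod, df$ID, paste, collapse = ", ")

# Turn the named result back into a data frame with one row per ID
res <- data.frame(ID = names(mod), Mod = as.vector(mod),
                  stringsAsFactors = FALSE)
res
```

This should give one row per ID with the same strings as the ddply() call above.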

Upvotes: 1

Carl Witthoft

Reputation: 21502

I would recommend organizing the output a little differently: make Mod a list with three elements named ID1, ID2, and ID3, where each element is a matrix with two columns. So ID2 would be

1 5.00
2 1.00

Edit: using split() as in the other answer is much cleaner.

then,

Rgames> df<-as.list(1:length(unique(ID))) 
Rgames> names(df)<-unique(ID) 
Rgames> df$ID1<-cbind(ModNum[ID=="ID1"],Amnt[ID=="ID1"]) 
Rgames> df 
$ID1 
     [,1] [,2] 
[1,]    1    2 
[2,]    2    3 
[3,]    3    2 

$ID2
[1] 2

$ID3
[1] 3

And of course you could do a loop or lapply to fill in all the ID slots.
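
As a minimal sketch of the lapply() variant mentioned above (the names `ids` and `mods` are made up for illustration, and the vectors are the ones from the question):

```r
# Example vectors from the question
ID     <- c("ID1", "ID1", "ID1", "ID2", "ID2", "ID3")
ModNum <- c(1, 2, 3, 1, 2, 0)
Amnt   <- c(2.00, 3.00, 2.00, 5.00, 1.00, 5.00)

ids <- unique(ID)

# Build one two-column matrix per ID instead of filling slots by hand
mods <- lapply(ids, function(id) {
  cbind(ModNum = ModNum[ID == id], Amnt = Amnt[ID == id])
})
names(mods) <- ids

mods$ID2  # the two-row matrix for ID2
```

Storing the result under a new name rather than `df` keeps the original data frame intact.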

Upvotes: 0

flodel

Reputation: 89057

Here is a solution using split().

> ID.split <- split(df[-1], df$ID)
> ID.split
$ID1
  ModNum Amnt
1      1    2
2      2    3
3      3    2

$ID2
  ModNum Amnt
4      1    5
5      2    1

$ID3
  ModNum Amnt
6      0    5

> 
> flat.list <- lapply(ID.split, function(x)as.vector(t(x)))
> df <- data.frame(ID = names(flat.list))
> df$Mod <- flat.list
> df
   ID              Mod
1 ID1 1, 2, 2, 3, 3, 2
2 ID2       1, 5, 2, 1
3 ID3             0, 5

In my opinion, the output of split() (what I called ID.split above) is a much better data structure to work with from a programming point of view than the final output you asked for.
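
To illustrate why the split() output can be convenient, here is a small sketch that computes per-ID summaries directly from it. The summaries themselves (`totals`, `n.mods`) are hypothetical examples, not something the question asked for.

```r
# Rebuild the example data and the split from above
df <- data.frame(
  ID     = c("ID1", "ID1", "ID1", "ID2", "ID2", "ID3"),
  ModNum = c(1, 2, 3, 1, 2, 0),
  Amnt   = c(2.00, 3.00, 2.00, 5.00, 1.00, 5.00)
)
ID.split <- split(df[-1], df$ID)

# Each list element is a small data frame, so per-ID work is one sapply() away
totals <- sapply(ID.split, function(x) sum(x$Amnt))  # total Amnt per ID
n.mods <- sapply(ID.split, nrow)                     # number of rows per ID

totals
n.mods
```

Both results are vectors named by ID, so they can be indexed as `totals["ID2"]`.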

Upvotes: 1
