David

Reputation: 315

R Populate a vector by matching names to df column values

I have a named vector filled with zeros:

toy1 <- rep(0, length(37:45))
names(toy1) <- 37:45

I want to populate the vector with count data from a data frame:

size    count
37      1.181
38      0.421
39      0.054
40      0.005
41      0.031
42      0.582
45      0.024

I need help finding a way to match each size value to a vector name and then insert the corresponding count value at that position in the vector.

Upvotes: 2

Views: 933

Answers (3)

Mankind_2000

Reputation: 2208

Let's say your data frame is df. Then you can update toy1 directly for the sizes that are present in the data frame:

toy1[as.character(df$size)] <- df$count

Edit: To check for a match before updating, compute m, which holds, for each name in toy1, the index of the matching row in the size column of df (NA where there is no match):

m <- match(names(toy1), as.character(df$size))

Then the positions in toy1 that have a match can be updated as below:

toy1[which(!is.na(m))] <- df$count[m[!is.na(m)]]
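
Put together, the match-based update can be sketched as follows. This is a minimal example that rebuilds the question's data; the data frame name df follows this answer, and has_match is a helper name introduced here only for readability.

```r
# Rebuild the question's data (df is the name assumed in this answer)
toy1 <- rep(0, length(37:45))
names(toy1) <- 37:45
df <- data.frame(size  = c(37, 38, 39, 40, 41, 42, 45),
                 count = c(1.181, 0.421, 0.054, 0.005, 0.031, 0.582, 0.024))

# For each name in toy1, find the row of df with that size (NA if none)
m <- match(names(toy1), as.character(df$size))
has_match <- !is.na(m)   # helper name introduced for this sketch

# Fill only the positions that have a matching row; 43 and 44 stay at 0
toy1[has_match] <- df$count[m[has_match]]
toy1
```

Because the update runs over the names of toy1 rather than over the rows of df, any size in df that is absent from toy1 is simply ignored.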

PS: An efficient way would be to define toy1 as a data frame and perform an outer join on the size column.

Upvotes: 2

IRTFM

Reputation: 263301

Might be as simple as:

toy1[ as.character(dat$size) ] <- dat$count
toy1

#   37    38    39    40    41    42    43    44    45 
#1.181 0.421 0.054 0.005 0.031 0.582 0.000 0.000 0.024 

R's indexing for assignments accepts character values, in which case elements are located by name. If you had just tried to index with the raw column:

toy1[ dat$size ] <- dat$count

You would have gotten (as did I initially):

> toy1
   37    38    39    40    41    42    43    44    45                                                             
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA 

   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA 1.181 0.421 

0.054 0.005 0.031 0.582    NA    NA 0.024 

That happened because the numeric column was used for positional indexing, and R by default extended the vector to accommodate positions up to 45.
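
The difference between the two kinds of index can be seen on a smaller vector. This is only an illustrative sketch; x, y and z are made-up names that do not appear in the question.

```r
x <- c(a = 1, b = 2, c = 3)

# Numeric index: position 6 does not exist yet, so R grows the vector
y <- x
y[6] <- 9
length(y)   # 6; positions 4 and 5 are filled with NA
names(y)    # the three new elements get empty names

# Character index: the element is looked up by name, so nothing grows
z <- x
z["b"] <- 9
length(z)   # 3
```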

With a version of the data frame that contained a size outside the range 37:45, using match with nomatch = 0 gave a warning, but I still got the expected results:

toy1[ match( as.character( dat$size), names(toy1) , nomatch=0) ] <- dat$count
#------------
Warning message:
In toy1[match(as.character(dat$size), names(toy1), nomatch = 0)] <- dat$count :
  number of items to replace is not a multiple of replacement length
> toy1
   37    38    39    40    41    42    43    44    45 
1.181 0.421 0.054 0.005 0.031 0.582 0.000 0.000 0.000 

The match function is at the core of the merge function, but this application would be much faster than a merge of data frames.
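
A note on the warning in the match example above: a zero index is dropped from the left-hand side, but the corresponding count stays on the right-hand side, so the two sides no longer line up. The result is still correct when the unmatched size is the last row, but an unmatched size earlier in the data frame would shift the later counts by one position. A sketch of one way to avoid both the warning and the shift is to filter both sides with the same logical vector. The data below are hypothetical (size 99 is invented for illustration), and idx and keep are names introduced here.

```r
toy1 <- rep(0, length(37:45))
names(toy1) <- 37:45

# Hypothetical data: size 99 has no counterpart in names(toy1)
dat <- data.frame(size  = c(37, 99, 38),
                  count = c(1.181, 5, 0.421))

idx  <- match(as.character(dat$size), names(toy1))  # NA where there is no match
keep <- !is.na(idx)

# Index and replacement now have the same length, so no warning is raised
# and the count for 38 is not displaced by the unmatched row
toy1[idx[keep]] <- dat$count[keep]
toy1[c("37", "38")]
```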

Upvotes: 3

Mark

Reputation: 4537

First, let's get the data loaded in.

toy1 <- rep(0, length(37:45))
names(toy1) <- 37:45
df = read.table(text="37      1.181
38      0.421
39      0.054
40      0.005
41      0.031
42      0.582
45      0.024")
names(df) = c("size","count")

Now, I present a really ugly solution. We only update toy1 where the name of toy1 appears in df$size. We get the values from df$count by finding the index of each match in df; I use sapply to get those indices back as a vector. On both sides of the assignment we only look at places where names(toy1) appears in df$size.

toy1[names(toy1) %in% df$size] = df$count[sapply(names(toy1)[names(toy1) %in% df$size],function(x){which(x == df$size)})]

But this isn't very elegant. Instead, you could turn toy1 into a data.frame.

toydf = data.frame(toy1 = toy1,name = names(toy1),stringsAsFactors = FALSE)

Now, we can use merge to get the values.

updated = merge(toydf, df, by.x = "name", by.y = "size", all.x = TRUE)

This returns a three-column data.frame. You can then extract the count column from it, replace NA with 0, and you're done.

updated$count[is.na(updated$count)] = 0
updated$count
#> [1] 1.181 0.421 0.054 0.005 0.031 0.582 0.000 0.000 0.024

Upvotes: 1
