D.C. the III

Reputation: 340

Appending entries to a vector in R produces a list instead of the expected vector

I was attempting to fill an empty vector of values using a for loop and then reference specific entries from these vectors to fill a data frame. After a little research, it appears my vector got converted to a list, but I'm not sure how that occurred. The code is pretty simple, unless there is an underlying nuance that I'm missing. Here is the code I used:

simul_conf_int_mean_612 = tibble("mean_estimate" = vector(), "lwr" = vector(), "upr" = vector()) #initialized/created an empty data frame

fit_mean_res_612 = vector()
lwr_mean_res_612 = vector()
upr_mean_res_612 = vector()

for (i in 1:5){
  
  fit_mean_res_612[i] = Grocery_Retailer_mean_response_612[i,5]
  lwr_mean_res_612[i] = Grocery_Retailer_mean_response_612[i,5] - Bonferroni_Grocery_Retailer_mean_612 * sqrt(estimated_var_matrix_mean_response_612[i,i])
  upr_mean_res_612[i] = Grocery_Retailer_mean_response_612[i,5] + Bonferroni_Grocery_Retailer_mean_612 * sqrt(estimated_var_matrix_mean_response_612[i,i])
  
}

To check that I was doing things correctly, I tried filling a column of my initialized data frame:

> simul_conf_int_mean_612[,1] = fit_mean_res_612
Error: Can't recycle `fit_mean_res_612` (size 5) to size 1.
Run `rlang::last_error()` to see where the error occurred

This is where the motivation for my whole question came from.

To see what the vector I created looked like, I printed it and got a list:

> fit_mean_res_612
[[1]]
     [,1]
1 4292.79

[[2]]
      [,1]
2 4245.293

[[3]]
      [,1]
3 4279.424

[[4]]
      [,1]
4 4333.203

[[5]]
      [,1]
5 4917.418

Before I get chastised about the inefficiency of writing for loops: I just read up on it, so I'll "append" that knowledge to further applications. But what am I missing in this simple code?

EDIT: Since the example is so small, I'm also able to provide the outputs from dput for the variables used in my loop:

> dput(head(Grocery_Retailer_mean_response_612, 5))
structure(list(Total_Labour_hrs = c(0, 0, 0, 0, 0), Cases_Shipped = c(302000, 
245000, 280000, 350000, 295000), Labour_Hrs_Cost = c(7.2, 7.4, 
6.9, 7, 6.7), Holiday = c(0, 0, 0, 0, 1), Estimated_mean_response = structure(c(4292.7901491276, 
4245.29336352498, 4279.42418648312, 4333.2032112936, 4917.41807678638
), .Dim = c(5L, 1L), .Dimnames = list(c("1", "2", "3", "4", "5"
), NULL))), row.names = c(NA, -5L), class = c("tbl_df", "tbl", 
"data.frame"))
> dput(head(estimated_var_matrix_mean_response_612,5))
structure(c(9364732.79459537, 8929134.32888918, 9755196.52837781, 
9778318.18418078, 1361250.54834372, 8929134.3288892, 18113464.4689885, 
11908157.9211221, 1147591.80945451, 862671.162787766, 9755196.52837787, 
11908157.9211222, 12268422.3003328, 8083016.63097916, 4987857.12818105, 
9778318.18418069, 1147591.80945441, 8083016.63097903, 17183253.6748362, 
2100840.51542087, 1361250.54834367, 862671.162787751, 4987857.12818098, 
2100840.51542089, 80202096.5334524), .Dim = c(5L, 5L), .Dimnames = list(
    c("1", "2", "3", "4", "5"), c("1", "2", "3", "4", "5")))

Bonferroni_Grocery_Retailer_mean_612 is just a constant:

> Bonferroni_Grocery_Retailer_mean_612
[1] 2.682204

I was able to get a working solution by using unlist, but now I'm curious as to why my data ended up being coerced to a list instead of an atomic vector. It isn't a problem for such a small data set, but I could envision issues when things are scaled up.

EDIT 2:

I've added some code to the initial post to give an idea of exactly what I was attempting to accomplish. I also tried building the tibble directly from the vectors after creating them, but that doesn't solve the issue either, because the tibble ended up looking as follows:

> test_conf_bounds = tibble("point_estimate" = fit_pred_613, "lwr" = lwr_pred_613, "upr" = upr_pred_613)
> test_conf_bounds
# A tibble: 4 x 3
  point_estimate    lwr       upr      
  <list>            <list>    <list>   
1 <dbl[,1] [1 x 1]> <dbl [1]> <dbl [1]>
2 <dbl[,1] [1 x 1]> <dbl [1]> <dbl [1]>
3 <dbl[,1] [1 x 1]> <dbl [1]> <dbl [1]>
4 <dbl[,1] [1 x 1]> <dbl [1]> <dbl [1]>

Upvotes: 0

Views: 411

Answers (1)

IceCreamToucan

Reputation: 28685

The fifth column of your input data, Estimated_mean_response, is a one-column (5 × 1) matrix rather than a plain numeric vector. So each element your loop stores is a 1 × 1 matrix, and the result is a list of matrices. I'm not sure how the column ended up that way, but I would just convert it to a regular numeric column, and everything should work as you expect from there.

str(Grocery_Retailer_mean_response_612)
#> tibble [5 × 5] (S3: tbl_df/tbl/data.frame)
#>  $ Total_Labour_hrs       : num [1:5] 0 0 0 0 0
#>  $ Cases_Shipped          : num [1:5] 302000 245000 280000 350000 295000
#>  $ Labour_Hrs_Cost        : num [1:5] 7.2 7.4 6.9 7 6.7
#>  $ Holiday                : num [1:5] 0 0 0 0 1
#>  $ Estimated_mean_response: num [1:5, 1] 4293 4245 4279 4333 4917
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:5] "1" "2" "3" "4" ...
#>   .. ..$ : NULL

Grocery_Retailer_mean_response_612$Estimated_mean_response <- 
  as.numeric(Grocery_Retailer_mean_response_612$Estimated_mean_response)

str(Grocery_Retailer_mean_response_612)
#> tibble [5 × 5] (S3: tbl_df/tbl/data.frame)
#>  $ Total_Labour_hrs       : num [1:5] 0 0 0 0 0
#>  $ Cases_Shipped          : num [1:5] 302000 245000 280000 350000 295000
#>  $ Labour_Hrs_Cost        : num [1:5] 7.2 7.4 6.9 7 6.7
#>  $ Holiday                : num [1:5] 0 0 0 0 1
#>  $ Estimated_mean_response: num [1:5] 4293 4245 4279 4333 4917

Created on 2021-10-03 by the reprex package (v2.0.1)
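
As a rough illustration of where the list can come from, here is a minimal sketch on made-up data. The names toy, V and B are hypothetical stand-ins, not the objects from the question, and the tibble package is assumed to be installed. Single-bracket indexing on a tibble returns another tibble (which is a list underneath), so assigning that into an element of a vector coerces the whole vector to a list. Pulling the column out first and indexing it avoids that, and the loop can be dropped entirely by working on whole vectors.

```r
library(tibble)  # assumption: tibble is installed

# Hypothetical stand-in for the question's data: a tibble whose second
# column is a one-column matrix, like Estimated_mean_response.
toy <- tibble(id = 1:3, est = matrix(c(10, 20, 30), ncol = 1))

# toy[i, j] keeps the tibble class, so the value is list-like and the
# assignment turns `out` into a list.
out <- vector()
for (i in 1:3) out[i] <- toy[i, 2]
is.list(out)

# Extracting the column first and indexing it yields plain doubles.
fixed <- numeric(nrow(toy))
for (i in seq_len(nrow(toy))) fixed[i] <- toy$est[i]
is.list(fixed)

# Vectorised version of the interval calculation, no loop needed.
V <- diag(c(4, 9, 16))        # hypothetical variance matrix
B <- 2                        # hypothetical multiplier (the "constant")
est <- as.numeric(toy$est)    # drop the matrix dimensions
se <- sqrt(diag(V))           # standard errors from the diagonal
bounds <- tibble(mean_estimate = est,
                 lwr = est - B * se,
                 upr = est + B * se)
bounds
```

With this approach the three columns of bounds should come out as ordinary numeric columns rather than list columns, which is presumably what the original tibble was meant to hold.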

Upvotes: 1
