Sakura

Reputation: 57

Add a new column on multiple dataframes using mutate

I am new to R.

I have a bunch of data frames like this:

        Date     ID Value conversion
1 2018-07-16 123450   617         10
2 2018-07-23 123450   476         20
3 2018-07-30 123450  44.6         30
4 2018-08-06 123450   248         10
5 2018-08-13 123450   177         30

All the data frames follow this naming pattern:

df_1
df_2
df_3

I need to add a column holding the "weight of the conversion": the conversion in each row divided by the total sum of the conversion column (which is 100 in the example above). My intention is to calculate the weighted standard deviation of each data frame later. So, ideally, the result should look like this, and hopefully each result can be stored as a data frame in the environment as well (it can also replace the original one):

        Date     ID Value conversion weight
1 2018-07-16 123450   617         10    0.1
2 2018-07-23 123450   476         20    0.2
3 2018-07-30 123450  44.6         30    0.3
4 2018-08-06 123450   248         10    0.1
5 2018-08-13 123450   177         30    0.3

I tried to adapt the approach from "Apply changes (group by) on multiple dataframes using for loop", but I received the error: "no applicable method for 'mutate_' applied to an object of class list". I wasn't sure how to do this with dplyr, a for loop, or a list of data frames.

Thank you!

EDIT

Once I had my result (a list of data frames), I ran list2env(list, envir = .GlobalEnv), but I got this error:

  names(x) must be a character vector of the same length as x 

Does anyone know how I can resolve this? Thanks a lot!

Upvotes: 1

Views: 833

Answers (2)

TimTeaFan

Reputation: 18561

Let's assume these are your data frames:

df1 <- data.frame(Date = c("2018-07-16","2018-07-23","2018-07-30","2018-08-06","2018-08-13"),
                  ID = c(123450,123450,123450,123450,123450),
                  Value = c(617,467,44.6,248,177),
                  conversion = c(10,20,20,10,40))

df2 <- df1

df3 <- df1

Then it would be best to put those data frames in a list, like this:

df_ls <- list(df1, df2, df3)

Then you can do this to get your desired output.

library(dplyr)
library(purrr)
df_ls %>% map(~ mutate(., weight = conversion/sum(conversion)))

If you do not have the data frames in a list, just create a character vector containing their names, like this:

df_ls1 <- c("df1", "df2", "df3")

Then you can do this:

df_ls1 %>% map(~ mutate(get(., envir = .GlobalEnv), weight = conversion/sum(conversion)))

Both ways yield the same output - a list of data frames:

[[1]]
        Date     ID Value conversion weight
1 2018-07-16 123450 617.0         10    0.1
2 2018-07-23 123450 467.0         20    0.2
3 2018-07-30 123450  44.6         20    0.2
4 2018-08-06 123450 248.0         10    0.1
5 2018-08-13 123450 177.0         40    0.4

[[2]]
        Date     ID Value conversion weight
1 2018-07-16 123450 617.0         10    0.1
2 2018-07-23 123450 467.0         20    0.2
3 2018-07-30 123450  44.6         20    0.2
4 2018-08-06 123450 248.0         10    0.1
5 2018-08-13 123450 177.0         40    0.4

[[3]]
        Date     ID Value conversion weight
1 2018-07-16 123450 617.0         10    0.1
2 2018-07-23 123450 467.0         20    0.2
3 2018-07-30 123450  44.6         20    0.2
4 2018-08-06 123450 248.0         10    0.1
5 2018-08-13 123450 177.0         40    0.4
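
Regarding the EDIT in the question: list2env() needs a named list, and a list built with list(df1, df2, df3) has no names, which is what triggers that "names(x) must be a character vector" error. Below is a minimal base R sketch of one way around it. The names df_1 and df_2 follow the question's pattern, the data are small stand-ins, and target_env is a hypothetical environment used here so that nothing gets overwritten by accident.

```r
# Small stand-in data frames named after the question's pattern (assumed data).
df_1 <- data.frame(conversion = c(10, 20, 20, 10, 40))
df_2 <- data.frame(conversion = c(5, 15))

# mget() looks the objects up by name and returns a *named* list,
# unlike list(df_1, df_2), which drops the names.
df_named <- mget(c("df_1", "df_2"), envir = environment())

# Add the weight column to every data frame; lapply() preserves the names.
df_weighted <- lapply(df_named, function(df) {
  df$weight <- df$conversion / sum(df$conversion)
  df
})

# Copy each list element into an environment under its own name.
# A fresh environment is used here; pass .GlobalEnv instead to
# replace the original data frames.
target_env <- new.env()
list2env(df_weighted, envir = target_env)
```

If the list already exists without names, names(your_list) <- c("df_1", "df_2") before calling list2env() should have the same effect. With many data frames, mget(ls(pattern = "^df_[0-9]+$")) is one way to collect them all by name pattern.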

Upvotes: 3

Parfait

Reputation: 107632

Since you are new to R, consider the base package, which ships with every R installation and is loaded in every R session. In fact, library is itself a base function!

With this package, you can run your simple arithmetic and handle new column assignment with transform or within.

df_list <- list(df_1, df_2, df_3)

new_df_list <- lapply(df_list, function(df) 
    within(df, conv_weight <- conversion / sum(conversion))
    # transform(df, conv_weight = conversion / sum(conversion))  # EQUIVALENT CALL
)
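
Since the question mentions a weighted standard deviation as the next step, here is a rough sketch of how the new column could feed into it. Note that weighted_sd is a hypothetical helper, not a base R function, and it uses the simple population-style definition (square root of the weighted mean of squared deviations) with no bias correction. Other definitions exist, so check which one you actually need. The data frames are stand-ins built from the example values.

```r
# Stand-in data frames (assumed data, named after the question's pattern).
df_1 <- data.frame(Value = c(617, 467, 44.6, 248, 177),
                   conversion = c(10, 20, 20, 10, 40))
df_2 <- df_1

# Same step as above: add the conversion weight to each data frame.
new_df_list <- lapply(list(df_1 = df_1, df_2 = df_2), function(df)
  within(df, conv_weight <- conversion / sum(conversion))
)

# Hypothetical helper: population-style weighted standard deviation.
weighted_sd <- function(x, w) {
  w <- w / sum(w)              # make sure the weights sum to 1
  mu <- sum(w * x)             # weighted mean
  sqrt(sum(w * (x - mu)^2))    # root of the weighted mean squared deviation
}

# One weighted SD of Value per data frame, as a named numeric vector.
sd_per_df <- sapply(new_df_list, function(df)
  weighted_sd(df$Value, df$conv_weight)
)
```

Base R's weighted.mean() could replace the manual weighted mean inside the helper; it is spelled out here only to keep each step visible.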

Upvotes: 2
