GIORIGO

Reputation: 59

Create new columns in R

I am carrying out an analysis on some Italian regions. I have a dataset similar to the following:

mydata <- data.frame(date = c(2020, 2021, 2020, 2021, 2020, 2021),
                     Region = c('Sicilia', 'Sicilia', 'Sardegna', 'Sardegna', 'Campania', 'Campania'),
                     Number = c(20, 30, 50, 70, 90, 69))

Now I need to create two new columns. The first, called 'Total Population', should contain a fixed number for each region (for example, every row for Sicilia will have 'Total Population' = 250). The second should contain the percentage ratio between the value in the 'Number' column and the corresponding 'Total Population' value (for example, for Sicilia the first value will be 20/250, and so on). I hope I have explained myself clearly. Thank you very much.

Upvotes: 0

Views: 42

Answers (1)

Sirius

Reputation: 5429

Like this, perhaps:



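library(dplyr)
library(magrittr)  # provides the %<>% assignment pipe
# Per-region sum of Number, and each row's share of that sum as a percentage string
mydata %<>% group_by( Region ) %>%
    mutate(
        `Total Population` = sum(Number),
        `Ratio of Total` = sprintf( "%.1f%%",100 * Number / sum(Number)) )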


mydata is now:

> mydata
# A tibble: 6 x 5
# Groups:   Region [3]
   date Region   Number `Total Population` `Ratio of Total`
  <dbl> <chr>     <dbl>              <dbl> <chr>           
1  2020 Sicilia      20                 50 40.0%           
2  2021 Sicilia      30                 50 60.0%           
3  2020 Sardegna     50                120 41.7%           
4  2021 Sardegna     70                120 58.3%           
5  2020 Campania     90                159 56.6%           
6  2021 Campania     69                159 43.4%         
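
The code above takes 'Total Population' to be the sum of Number within each region. If the totals are instead fixed figures known in advance, as the question's example of 250 for Sicilia suggests, a named lookup vector may be closer to what is needed. The following is only a sketch in base R: the `population` vector and the `Ratio (%)` column name are my own choices, and the values for Sardegna and Campania are made-up placeholders, since the question gives no figures for them.

```r
mydata <- data.frame(date = c(2020, 2021, 2020, 2021, 2020, 2021),
                     Region = c('Sicilia', 'Sicilia', 'Sardegna', 'Sardegna', 'Campania', 'Campania'),
                     Number = c(20, 30, 50, 70, 90, 69))

# Fixed total per region. 250 for Sicilia is taken from the question;
# the other two values are placeholders for illustration only.
population <- c(Sicilia = 250, Sardegna = 400, Campania = 500)

# Look up each row's total by region name. as.character() guards against
# Region being a factor, which would otherwise index by integer code.
mydata$`Total Population` <- unname(population[as.character(mydata$Region)])

# Percentage of Number relative to the fixed total, kept numeric
mydata$`Ratio (%)` <- 100 * mydata$Number / mydata$`Total Population`
```

With these numbers the first Sicilia row gets 100 * 20 / 250 = 8. Keeping the ratio numeric leaves it usable for further calculations; sprintf() can be applied afterwards if a formatted string is preferred.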

Upvotes: 1
