thundercat
thundercat

Reputation: 45

How to take to two columns of data, perform a calculation and create a new column with that data

I am not sure if the way I hardcoded this table is the best way of doing it. But anyways, rn I am trying to take the total points column and dividing it by the time needed to get the points per minute then create a new column for that calculation but I can't seem to make it work.

table <- data.frame(list(Question=c("Q1", "Q2", "Q3", "Q4" , "Q5" , "Q6" , "Q7" , "Q8", "Q9"), 
                        Total_Points=c("21","5","10","14","5","5","10","5","5"), Time_needed=c("24","7","15","12","4","3","10","5","6")))
table <- transform(table, Points_per_min = table$Total_Points / table$Time_needed)

Gives me an error "Warning message: In Ops.factor(table$Total_Points, table$Time_needed) : ‘/’ not meaningful for factors"

Thanks so much for help!

Upvotes: 2

Views: 104

Answers (3)

ladylala
ladylala

Reputation: 221

@Chase already explained why you were not able to create a new column before, but in case you wanted a solution involving fewer lines of code, you can use the dplyr package (I'm using dplyr version 0.7.6).

As @Ozgur Yigit mentioned, you do not need to pass in your data as a list. You can actually just pass in each column as an argument for data.frame(...).

EDITED

library(dplyr)

table <- data.frame(Question = c("Q1", "Q2", "Q3", "Q4" , "Q5" , "Q6" , "Q7" , "Q8", "Q9"), 
                    Total_Points = c("21","5","10","14","5","5","10","5","5"), 
                    Time_needed = c("24","7","15","12","4","3","10","5","6"))
table %>% 
  mutate_at(vars(Total_Points, Time_needed), 
            funs(as.numeric(as.character(.)))) %>% # converts the `Total_Points` and `Time_needed` cols to character first (b/c factors coded as integers under the hood) then to numeric
  mutate(Points_per_min = Total_Points / Time_needed)

For more information on mutate(): https://r4ds.had.co.nz/transform.html#add-new-variables-with-mutate

PS Apologies if my response is unhelpful or outside the scope of SO responses. I am still learning how to write answers on SO.

Upvotes: 0

DeduciveR
DeduciveR

Reputation: 1702

you can use tidyverse but only need one mutate statement - little different from the other answerer:

table2 <- table %>%
     mutate(Total_Points = as.numeric(as.character(Total_Points)),
     Time_needed  = as.numeric(as.character(Time_needed)),
     Points_per_min = Total_Points / Time_needed)

  Question Total_Points Time_needed Points_per_min
1       Q1           21          24      0.8750000
2       Q2            5           7      0.7142857
3       Q3           10          15      0.6666667
4       Q4           14          12      1.1666667
5       Q5            5           4      1.2500000
6       Q6            5           3      1.6666667
7       Q7           10          10      1.0000000
8       Q8            5           5      1.0000000
9       Q9            5           6      0.8333333

Upvotes: 0

Chase
Chase

Reputation: 69241

The warning is informative here, which you can see by examining the str() of table. It tells you that all your variables are of type factor. Check here for a quick intro on different data types.

Convert your data to numeric then do your math:

table <- data.frame(list(Question=c("Q1", "Q2", "Q3", "Q4" , "Q5" , "Q6" , "Q7" , "Q8", "Q9"), 
                         Total_Points=c("21","5","10","14","5","5","10","5","5"), Time_needed=c("24","7","15","12","4","3","10","5","6")))
str(table)
#> 'data.frame':    9 obs. of  3 variables:
#>  $ Question    : Factor w/ 9 levels "Q1","Q2","Q3",..: 1 2 3 4 5 6 7 8 9
#>  $ Total_Points: Factor w/ 4 levels "10","14","21",..: 3 4 1 2 4 4 1 4 4
#>  $ Time_needed : Factor w/ 9 levels "10","12","15",..: 4 9 3 2 6 5 1 7 8
table$Total_Points <- as.numeric(as.character(table$Total_Points))
table$Time_needed <- as.numeric(as.character(table$Time_needed))
table$Points_per_min <- table$Total_Points / table$Time_needed
table
#>   Question Total_Points Time_needed Points_per_min
#> 1       Q1           21          24      0.8750000
#> 2       Q2            5           7      0.7142857
#> 3       Q3           10          15      0.6666667
#> 4       Q4           14          12      1.1666667
#> 5       Q5            5           4      1.2500000
#> 6       Q6            5           3      1.6666667
#> 7       Q7           10          10      1.0000000
#> 8       Q8            5           5      1.0000000
#> 9       Q9            5           6      0.8333333

Created on 2019-02-10 by the reprex package (v0.2.1)

Upvotes: 3

Related Questions