simplycoding

Reputation: 2977

How do I define and run a function over a dataframe?

I have the following function, which I copied into another piece of code, but there it wasn't running directly over a dataframe. I now need to run it on a dataframe, with a slight change, and can't figure out the proper syntax.

The function is simply: function(x) ifelse(x>0, paste0("+", x), x)

And the change is that I want to run it on every column except for the first column. So after the first column, this function should iterate over all the cells in the dataframe and prepend a + sign to any positive value.

And I'd like to run the modified function over dataframe df. Is there a way to do this inline?

Sample data to play with:

structure(list(data_2018 = c(3.2, 3, 3.2), data_2017 = c(2.825, 
0, -0.425), pilot = c(0.51578947368421, -0.0526315789473699, 
0.41052631578947), all = c(0.42222222222222, -0.18518518518519, 
0.27407407407407), general = c(0.40833333333333, -0.0833333333333299, 
0.36666666666667)), class = "data.frame", row.names = c(NA, -3L
))

Upvotes: 0

Views: 61

Answers (2)

utubun

Reputation: 4505

There are a few approaches you can use:


base

daf[, 2:5] <- lapply(daf[, 2:5], fu)

dplyr

#library(dplyr)

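# mutate_at() is superseded since dplyr 1.0.0; mutate(daf, across(-1, fu)) is the current equivalent
mutate_at(daf, vars(data_2017:general), fu)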

data.table

#library(data.table)

dat <- data.table(daf)

dat[, 
    (colnames(dat)[-1]) := lapply(.SD, fu), 
    .SDcols = -1
    ]

data

daf <- structure(
  list(data_2018 = c(3.2, 3, 3.2), 
       data_2017 = c(2.825, 0, -0.425), 
       pilot = c(0.51578947368421, -0.0526315789473699, 0.41052631578947), 
       all = c(0.42222222222222, -0.18518518518519, 0.27407407407407), 
       general = c(0.40833333333333, -0.0833333333333299, 0.36666666666667)
  ), 
  class = "data.frame", row.names = c(NA, -3L)
)

function

fu <- function(x) ifelse(x>0, paste0("+", x), x)

output

  data_2018 data_2017               pilot               all             general
1       3.2    +2.825   +0.51578947368421 +0.42222222222222   +0.40833333333333
2       3.0         0 -0.0526315789473699 -0.18518518518519 -0.0833333333333299
3       3.2    -0.425   +0.41052631578947 +0.27407407407407   +0.36666666666667

The output is shown for the lapply call only. Note that the modified columns are now character, not numeric.
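For convenience, here is the base approach above put in run order (data, then function, then the call) as a minimal sketch. It uses only three of the sample columns to stay short, and the copy named `res` is my own addition so the original `daf` stays untouched.

```r
# Three columns of the sample data from the question
daf <- data.frame(
  data_2018 = c(3.2, 3, 3.2),
  data_2017 = c(2.825, 0, -0.425),
  pilot     = c(0.51578947368421, -0.0526315789473699, 0.41052631578947)
)

# Prepend "+" to positive values; zero and negative values are left as they are
fu <- function(x) ifelse(x > 0, paste0("+", x), x)

# Work on a copy (res is a name introduced here for illustration)
res <- daf

# Negative indexing drops the first column, so this works for any number of columns
res[-1] <- lapply(res[-1], fu)

print(res)

# The first column is still numeric; the modified ones are character
print(sapply(res, class))
```

Using `res[-1]` instead of `res[, 2:5]` avoids hard-coding the column count.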


Upvotes: 0

DeduciveR

Reputation: 1702

This seems to lose a trailing zero in the first column, but it works when your example data is named df:

library(dplyr)  # if_else() comes from dplyr

df2 <- as.data.frame(apply(df, 2, function(x) if_else(substr(as.character(x), 1, 1) == "-" | as.character(x) == "0",
                                                  as.character(x),
                                                  paste0("+", as.character(x)))))

I took a different approach - I looked for the minus sign or a zero as characters and then added the + from there.

UPDATE - simplified code below with dplyr

library(dplyr)
df2 <- df %>%
  mutate_all(as.character) %>% 
  apply(2, function(x) if_else(substr(x, 1, 1) == "-" | x == "0",
                           x,
                           paste0("+", x))) %>% 
  as.data.frame()
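If you want the same string-matching idea without dplyr, and only on the columns after the first (as the question asks), a rough base R sketch could look like the following. The helper name `add_plus` is hypothetical, and only two of the sample columns are used to keep it short.

```r
# Two columns of the sample data from the question
df <- data.frame(
  data_2018 = c(3.2, 3, 3.2),
  data_2017 = c(2.825, 0, -0.425)
)

# Hypothetical helper: convert to character, then add "+" unless the
# string starts with "-" or is exactly "0"
add_plus <- function(x) {
  s <- as.character(x)
  ifelse(startsWith(s, "-") | s == "0", s, paste0("+", s))
}

# Apply to every column except the first
df2 <- df
df2[-1] <- lapply(df2[-1], add_plus)

print(df2)

# as.character() drops the trailing zero, which is why 3.0 shows up as "+3"
print(add_plus(c(3.2, 3, 3.2)))
```

Because the first column is skipped here, it stays numeric and keeps its original printing.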

Upvotes: 1
