xbarvazx

Reputation: 57

How can I conditionally replace data frame values with the value from another column?

I have a data frame in which I would like to check whether a cell equals a specific value and, if it does, replace it with the value from another column in the same row. In the example below I would like to replace every "0/0" with the value from the 4th column, so that each "0/0" becomes "A" in lines 1 and 2 and "C" in line 3.

table example:

chr1A   63248   .   A   G   0/0 0/0 0/0 ./. 0/0
chr1A   80950   .   A   C   1/1 0/0 ./. 0/0 0/0
chr1A   81080   .   C   G   0/0 0/0 0/0 ./. 0/0
chr1A   81084   .   C   T   0/1 0/0 0/0 ./. 0/0 

I tried using this code:

for(i in names(df)) {
  if(df[,i] == "0/0") {df[,i]<-df$V4}
}

but it doesn't replace every "0/0" in the data frame.

Thanks a lot for any help, Raz

Upvotes: 2

Views: 37

Answers (2)

MKR

Reputation: 20095

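One option is to use dplyr::mutate_at (note that funs() has been deprecated since dplyr 0.8.0 and mutate_at() has been superseded by across() since dplyr 1.0.0):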

library(dplyr)

df %>% mutate_at(vars(6:10), funs(ifelse(.=="0/0",df[,4],.)))

#      V1    V2 V3 V4 V5  V6 V7  V8  V9 V10
# 1 chr1A 63248  .  A  G   A  A   A ./.   A
# 2 chr1A 80950  .  A  C 1/1  A ./.   A   A
# 3 chr1A 81080  .  C  G   C  C   C ./.   C
# 4 chr1A 81084  .  C  T 0/1  C   C ./.   C

Data:

df <- read.table(text =
                 "chr1A   63248   .   A   G   0/0 0/0 0/0 ./. 0/0
                 chr1A   80950   .   A   C   1/1 0/0 ./. 0/0 0/0
                 chr1A   81080   .   C   G   0/0 0/0 0/0 ./. 0/0
                 chr1A   81084   .   C   T   0/1 0/0 0/0 ./. 0/0",
                 stringsAsFactors = FALSE)
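
With dplyr 1.0.0 or later, the same replacement can be sketched with across() in place of mutate_at()/funs(). This is a minimal sketch rather than part of the original answer; it assumes the default V1..V10 column names that read.table assigns, and the result name `out` is only illustrative.

```r
library(dplyr)

# Same example data as above; read.table names the columns V1..V10
df <- read.table(text = "
chr1A 63248 . A G 0/0 0/0 0/0 ./. 0/0
chr1A 80950 . A C 1/1 0/0 ./. 0/0 0/0
chr1A 81080 . C G 0/0 0/0 0/0 ./. 0/0
chr1A 81084 . C T 0/1 0/0 0/0 ./. 0/0",
stringsAsFactors = FALSE)

# For each of V6..V10, take V4 from the same row wherever the cell is "0/0";
# .x is the current column, and V4 is looked up in the data frame itself
out <- df %>%
  mutate(across(V6:V10, ~ ifelse(.x == "0/0", V4, .x)))
```

Referring to V4 by name inside the lambda avoids reaching back to the outer df object, so the pipeline also works when df has been filtered or renamed earlier in the chain.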

Upvotes: 0

akrun

Reputation: 887128

Since only columns 6 to 10 need to change, loop over just those columns and replace each "0/0" with the 4th-column value from the same row:

df[6:10] <- lapply(df[6:10], function(x) ifelse(x == "0/0", df[[4]], x))

This can also be done without a loop: build a logical matrix ('i1') that marks the "0/0" cells, repeat the 4th column so that it lines up with every cell of columns 6 to 10, and assign the repeated values at the positions where 'i1' is TRUE:

i1 <- df[6:10] == "0/0"
df[6:10][i1]  <- df$V4[row(df[6:10])][i1]
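
To make the alignment step easier to follow, here is the same idea sketched with named intermediate objects. The names `geno`, `hit`, and `ref` are illustrative and not from the original answer; the data are the example rows from the question.

```r
# Example data from the question; read.table names the columns V1..V10
df <- read.table(text = "
chr1A 63248 . A G 0/0 0/0 0/0 ./. 0/0
chr1A 80950 . A C 1/1 0/0 ./. 0/0 0/0
chr1A 81080 . C G 0/0 0/0 0/0 ./. 0/0
chr1A 81084 . C T 0/1 0/0 0/0 ./. 0/0",
stringsAsFactors = FALSE)

geno <- as.matrix(df[6:10])   # 4 x 5 character matrix of the genotype columns
hit  <- geno == "0/0"         # logical matrix, TRUE where a cell must be replaced

# row(geno) holds each cell's row number, so indexing V4 with it yields
# one V4 value per cell, in the same column-major order as geno
ref <- df$V4[row(geno)]

geno[hit] <- ref[hit]         # overwrite only the matching cells
df[6:10] <- as.data.frame(geno, stringsAsFactors = FALSE)  # write the columns back
```

Because both `ref` and `hit` follow column-major order, `ref[hit]` picks exactly the row-matched V4 value for each flagged cell.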

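In the OP's code the comparison is used inside if, which expects a single TRUE or FALSE, but df[,i] == "0/0" has one element per row. Older R versions used only the first element (with a warning), and R 4.2.0 and later raise an error. The vectorized ifelse is the right tool here, restricted to columns 6 to 10: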

for (i in names(df)[6:10]) {
  df[, i] <- ifelse(df[, i] == "0/0", df[[4]], df[, i])
}
df
#     V1    V2 V3 V4 V5  V6 V7  V8  V9 V10
#1 chr1A 63248  .  A  G   A  A   A ./.   A
#2 chr1A 80950  .  A  C 1/1  A ./.   A   A
#3 chr1A 81080  .  C  G   C  C   C ./.   C
#4 chr1A 81084  .  C  T 0/1  C   C ./.   C
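
The difference between if and ifelse can be seen on a small vector. This is a minimal sketch with made-up values (`x`, `ref`, and `res` are illustrative names), not data from the question.

```r
x   <- c("0/0", "1/1", "0/0")   # values to test
ref <- c("A", "C", "G")         # replacement candidates, one per element

# x == "0/0" is a logical vector of length 3, so
#   if (x == "0/0") ...
# cannot work element by element: it is an error in R >= 4.2.0
# and a warning (first element only) in older versions.

# ifelse() evaluates the test per element and picks from ref or x accordingly
res <- ifelse(x == "0/0", ref, x)
res
# [1] "A"   "1/1" "G"
```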

Upvotes: 1
