Taylorrrr
Taylorrrr

Reputation: 21

Change value in column R

I have a data set like the subset below

    col1   col2   col3
     y      
            y
     y      y
                   y
            y
     y

In reality, my data set is describing drug overdoses. There is a "y" in the column for the drug that was in each persons system when they died. What I'm trying to do is change the "y" to the cooresponding column. for example, I want the example data set above to appear as

    col1   col2   col3
    col1      
           col2
    col1   col2
                  col3
           col2
    col1

So that I can later combine these three columns to know what each person died of, but only looking in one column. Data set is called "Data1", so I know that I need at some point to put

    Data1$col1
    Data1$col2
    Data1$col3

Someone please help.. I'm very new at this.

Upvotes: 1

Views: 95

Answers (4)

xirururu
xirururu

Reputation: 5508

There are already some brillant answers, but I still write down here my solution inclusive the loop of columns:

df <- data.frame(col1 = c("y", NA, "y", NA, NA, "y"),
                 col2 = c(NA, "y", "y", NA, "y", NA),
                 col3 = c(NA, NA, NA, "y", NA, NA), stringsAsFactors=FALSE)

Notice: The stringAsFactors=FALSE is important!

for (name in colnames(df)){
  df[name][df[name]=="y"] = name
}

Then, you will get what you want:

> df

  col1 col2 col3
1 col1 <NA> <NA>
2 <NA> col2 <NA>
3 col1 col2 <NA>
4 <NA> <NA> col3
5 <NA> col2 <NA>
6 col1 <NA> <NA>

Upvotes: 1

Taylorrrr
Taylorrrr

Reputation: 21

I was able to figure it out:

    Data1$Col1 <- as.character(Data1$Col1)
    Data1$Col1[Data1$Col1 %in% "y"] <- "Col1"

Thank you for all of your help!

Upvotes: 1

tbradley
tbradley

Reputation: 2290

If you add a column for a patient ID, you can reformat to a long format using data.table's melt function:

df <- data.frame(id = 1:6, col1 = c("y", NA, "y", NA, NA, "y"),
                 col2 = c(NA, "y", "y", NA, "y", NA),
                 col3 = c(NA, NA, NA, "y", NA, NA))

df1 <- melt(setDT(df), id.vars = "id", measure.vars = c("col1", "col2", "col3"))
df1 <- df1[order(id)][is.na(value) == FALSE]

then that will give you this:

> df1
   id variable value
1:  1     col1     y
2:  2     col2     y
3:  3     col1     y
4:  3     col2     y
5:  4     col3     y
6:  5     col2     y
7:  6     col1     y

Upvotes: 2

Calbers
Calbers

Reputation: 409

In a column, you can check where values are and get back a vector of logicals that you next use to index said column. In our case:

Data1$col1 == "y"

would get you a vector like

## [1] TRUE FALSE TRUE FALSE FALSE TRUE

This you can use to access the "y"s in the column:

Data1$col1[Data1$col1 == "y"]
## [1] "y" "y" "y"

It reads out the entries of the column only where there is an "y", a bit boring in this case, but very useful in general, like when you use values in one column to manipulate another one. The line above is also a two-way-street: You can read out the values (like above), our you can use an assignment operator to assign the specified entries a new value:

Data1$col1[Data1$col1 == "y"] = "col1"

You can then repeat the code for each column and be done.

Upvotes: 1

Related Questions