Combining multiple columns of values together in R

Question

So I scripted a code that allows me to bin values in a single column. However, the problem I have is that my data is contained in multiple columns, all of varying length and with NAs in it. The script I have for binning goes like this:-

bin <- seq(min(data[, 1]), max(data[, 1]), by = 0.0005)
binnedData <- tapply(data[, 1], cut(data[, 1], breaks = bin), median)

I was wondering whether there was a way I could combine all the values from all columns into one massive long column so I can basically run this, or whether there's a way in which I can tweak this so that I can run it on my 1000 x 1000 matrix

This is taken from part of my data:-

102.23144 123.23242 102.23145
103.23144 123.23242 102.36563
103.83637        NA 102.36356
104.23225        NA 102.23423
105.87890        NA        NA

And the expected outcome with just combining column values:-

And outcome with binning:-

(102.0000 - 102.0005) - Median of all values that fall into bin
(102.0005 - 102.0010) - Median of all values that fall into bin
(102.0015 - 102.0020) - Median of all values that fall into bin

Thanks

amitm · Accepted Answer

have you tried using melt function from reshape2 package.

Here is some of your test data:

test <- data.frame(V1=c(102.2314,103.2314,103.8364,104.2322,105.8789),
                   V2=c(123.2324,123.2324,NA,NA,NA),
                   V3=c(102.2314,102.3656,102.3636,102.2342,NA)
                  )

> test
        V1       V2       V3
1 102.2314 123.2324 102.2314
2 103.2314 123.2324 102.3656
3 103.8364       NA 102.3636
4 104.2322       NA 102.2342
5 105.8789       NA       NA

and then use melt function -

test_m <- melt(test)

But there are NAs.

> test_m
   variable    value
1        V1 102.2314
2        V1 103.2314
3        V1 103.8364
4        V1 104.2322
5        V1 105.8789
6        V2 123.2324
7        V2 123.2324
8        V2       NA
9        V2       NA
10       V2       NA
11       V3 102.2314
12       V3 102.3656
13       V3 102.3636
14       V3 102.2342
15       V3       NA

So, now same step with this filter -

test_m<- melt(test)[which(!(is.na(melt(test)[,2]))),]

> test_m
   variable    value
1        V1 102.2314
2        V1 103.2314
3        V1 103.8364
4        V1 104.2322
5        V1 105.8789
6        V2 123.2324
7        V2 123.2324
11       V3 102.2314
12       V3 102.3656
13       V3 102.3636
14       V3 102.2342

So, NAs removed. You can select only the 2nd col of the data remove the variable name col.

Combining multiple columns of values together in R

Answers (2)

data

Related Questions