GCGM

Reputation: 1073

Arrange dataframe format for ggplot - R

I want to reshape my data from wide to long format so that I can use ggplot to create graphs, but I am having trouble arranging the data properly. I start with a list of 27 data frames (only the first 10 are shown):

> str(NDVI_stat)
List of 27
 $ :'data.frame':   10 obs. of  2 variables:
  ..$ NDVI 1 mean: num [1:10] 0.1796 0.3105 0.1422 0.0937 0.1711 ...
  ..$ NDVI 1 sd  : num [1:10] 0.1117 0.05845 0.00743 0.02754 0.01506 ...
 $ :'data.frame':   10 obs. of  2 variables:
  ..$ NDVI 2 mean: num [1:10] 0.0819 0.5954 0.1328 0.0953 0.1492 ...
  ..$ NDVI 2 sd  : num [1:10] 0.00872 0.10508 0.00863 0.01878 0.02303 ...
 $ :'data.frame':   10 obs. of  2 variables:
  ..$ NDVI 3 mean: num [1:10] 0.0634 0.681 0.2108 0.0151 0.179 ...
  ..$ NDVI 3 sd  : num [1:10] 0.0344 0.076 0.0361 0.0638 0.0428 ...
 $ :'data.frame':   10 obs. of  2 variables:
  ..$ NDVI 4 mean: num [1:10] 0.0971 0.6885 0.2326 0.1157 0.3219 ...
  ..$ NDVI 4 sd  : num [1:10] 0.00991 0.07509 0.02054 0.02793 0.0303 ...
 $ :'data.frame':   10 obs. of  2 variables:
  ..$ NDVI 5 mean: num [1:10] 0.0817 0.4825 0.2754 0.1003 0.4155 ...
  ..$ NDVI 5 sd  : num [1:10] 0.00998 0.05034 0.02781 0.03248 0.04056 ...
 $ :'data.frame':   10 obs. of  2 variables:
  ..$ NDVI 6 mean: num [1:10] 0.1119 0.7667 0.582 0.0997 0.4426 ...
  ..$ NDVI 6 sd  : num [1:10] 0.023 0.0672 0.0649 0.0331 0.0557 ...
 $ :'data.frame':   10 obs. of  2 variables:
  ..$ NDVI 7 mean: num [1:10] 0.1997 0.6567 0.5111 0.0988 0.3307 ...
  ..$ NDVI 7 sd  : num [1:10] 0.0671 0.0756 0.0435 0.0288 0.0457 ...
 $ :'data.frame':   10 obs. of  2 variables:
  ..$ NDVI 8 mean: num [1:10] 0.3626 0.7356 0.6304 0.0954 0.335 ...
  ..$ NDVI 8 sd  : num [1:10] 0.1454 0.0888 0.0502 0.0298 0.038 ...
 $ :'data.frame':   10 obs. of  2 variables:
  ..$ NDVI 9 mean: num [1:10] 0.541 0.748 0.637 0.089 0.577 ...
  ..$ NDVI 9 sd  : num [1:10] 0.0968 0.0721 0.0396 0.0276 0.0656 ...
 $ :'data.frame':   10 obs. of  2 variables:
  ..$ NDVI 10 mean: num [1:10] 0.6691 0.4377 0.6713 0.0942 0.6827 ...
  ..$ NDVI 10 sd  : num [1:10] 0.088 0.0698 0.033 0.0316 0.0688 ...
 $ :'data.frame':   10 obs. of  2 variables:

I am using rbindlist from the data.table package to merge everything into a single data frame:

newdf<-rbindlist(NDVI_stat, use.names = TRUE, fill = TRUE)

The code runs without errors, but it does not produce the structure I need. The output is a data frame with 270 observations (27 data frames * 10 rows each) and 54 variables (27 data frames * 2 columns each).

image of newdf

As you can see in the image of newdf, it has 270 rows, but what I want is 10 rows (and so no NA values).

Any help on that?

This question is similar to this one: Plot dataframe with ggplot2 - R

The difference is that I changed the way I produce my input, and now I don't know how to arrange the data frame properly so that I can later use

NDVIdf_forplot <- gather(NDVIdf, key = statistic, value = value, -ID)

and then use ggplot to create my graph

Any help on that?

Upvotes: 1

Views: 1000

Answers (2)

Dries

Reputation: 514

The problem is that the variable names are different in each df of the list. Once that is solved, the rest is as you imagine it to be.

An example with dplyr/tidyr:

df1<-data.frame(mean1=c(2,3),
                sd1 = c(1,2))

df2<-data.frame(mean2=c(4,5),
                sd2 = c(3,4))

listdf<-list(df1,df2)
str(listdf)

Gives

List of 2
 $ :'data.frame':   2 obs. of  2 variables:
  ..$ mean1: num [1:2] 2 3
  ..$ sd1  : num [1:2] 1 2
 $ :'data.frame':   2 obs. of  2 variables:
  ..$ mean2: num [1:2] 4 5
  ..$ sd2  : num [1:2] 3 4

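To give all data frames the same column names and bind them together row by row:

library(tidyverse)

# rename_() is deprecated in current dplyr; set_names() assigns names by
# position, so it also works when the original names contain spaces
listdf %>%
  map(~ set_names(.x, c("mean", "sd"))) %>%
  bind_rows()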

gives

  mean sd
    2  1
    3  2
    4  3
    5  4
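If the end goal is a long table for ggplot, it probably helps to record which list element each row came from before stacking. The following is only a sketch on the toy list above, not something tested on the real NDVI_stat; the column names `period` and `ID` are made up for illustration.

```r
library(dplyr)
library(purrr)

# Toy stand-in for NDVI_stat: same shape, different column names per element
listdf <- list(
  data.frame(mean1 = c(2, 3), sd1 = c(1, 2)),
  data.frame(mean2 = c(4, 5), sd2 = c(3, 4))
)

long_df <- listdf %>%
  # give every data frame the same column names plus a row identifier
  map(~ set_names(.x, c("mean", "sd")) %>% mutate(ID = row_number())) %>%
  # stack them; .id stores the position of each data frame in the list
  bind_rows(.id = "period") %>%
  mutate(period = as.integer(period))

long_df
```

With the real data this should give 270 rows and four columns (period, mean, sd, ID) without any NA padding, which could then go into something like ggplot(long_df, aes(period, mean, group = ID)) + geom_line().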

Upvotes: 1

Eumenedies

Reputation: 1688

I think you're asking how to column-bind the data frames. As far as I'm aware, data.table doesn't have a cbindlist function, so you could try do.call("cbind", NDVI_stat). That's not quite the same, though, and it will fail if the data frames don't all have the same number of rows.
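A small sketch of what that looks like, using a made-up two-element list (`NDVI_toy`, with toy values) in place of the real NDVI_stat:

```r
# Toy stand-in for NDVI_stat, with column names containing spaces as in the question
NDVI_toy <- list(
  data.frame(`NDVI 1 mean` = c(0.18, 0.31), `NDVI 1 sd` = c(0.11, 0.06),
             check.names = FALSE),
  data.frame(`NDVI 2 mean` = c(0.08, 0.60), `NDVI 2 sd` = c(0.01, 0.11),
             check.names = FALSE)
)

# Column-bind all elements: the row count stays at 2, the columns grow to 4
wide_df <- do.call("cbind", NDVI_toy)

dim(wide_df)
names(wide_df)
```

Applied to the real list, this should give 10 rows and 54 columns, with the original column names kept.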

Upvotes: 1
