george willy

Reputation: 1713

grouping data in R

I have a data frame of daily data: for every day, it records the free space for each file system. I would like to graph this. I was thinking of putting each file system in its own column and then creating the R graphs from that. How can I go about doing this? And without moving them to their own columns, can I create a chart for each file system for each day?

     Date     fileSystem FreeSpace
2011-12-03     /var          99.785
2011-12-03     /opt          30.494
2011-12-03     /tmp          55.643
2011-12-03     /data         37.846
2011-12-03     /ora          0.578
2011-12-04     /var          99.785
2011-12-04     /opt          30.494
2011-12-04     /tmp          55.643
2011-12-04    /data         37.846
2011-12-04     /ora          0.578

Upvotes: 1

Views: 576

Answers (3)

IRTFM

Reputation: 263301

With lattice::xyplot you have many options:

require(lattice)
xyplot(FreeSpace ~ Date + fileSystem, data=df1)
xyplot(FreeSpace ~ Date | fileSystem, data=df1)
xyplot(FreeSpace ~ Date, groups = fileSystem, data = df1)
xyplot(FreeSpace ~ Date, groups = fileSystem, data = df1, type = "b")

The lattice equivalent of the base barplot is barchart:

barchart(FreeSpace ~ Date | fileSystem, data=df1)
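
As a minimal sketch of the grouped variant, the block below rebuilds the question's data under the name df1 (a stand-in, since the answer does not show how df1 was created) and adds a legend through auto.key. It assumes Date is stored as a Date and fileSystem as a factor.

```r
library(lattice)

# Stand-in for the question's data frame (assumed name: df1)
df1 <- data.frame(
  Date = as.Date(rep(c("2011-12-03", "2011-12-04"), each = 5)),
  fileSystem = factor(rep(c("/var", "/opt", "/tmp", "/data", "/ora"), times = 2)),
  FreeSpace = rep(c(99.785, 30.494, 55.643, 37.846, 0.578), times = 2)
)

# One line per file system in a single panel, with a legend on top
p <- xyplot(FreeSpace ~ Date, groups = fileSystem, data = df1,
            type = "b", auto.key = list(columns = 5))
print(p)
```

Swapping `groups = fileSystem` for `| fileSystem` in the formula should give one panel per file system instead.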

Upvotes: 2

Brian Diggs

Reputation: 58825

Your data is hard to read into R in that format; here is a reproducible version:

DF <-
structure(list(Date = structure(c(15311, 15311, 15311, 15311, 
15311, 15312, 15312, 15312, 15312, 15312), class = "Date"), fileSystem = structure(c(5L, 
2L, 4L, 1L, 3L, 5L, 2L, 4L, 1L, 3L), .Label = c("/data", "/opt", 
"/ora", "/tmp", "/var"), class = "factor"), FreeSpace = c(99.785, 
30.494, 55.643, 37.846, 0.578, 99.785, 30.494, 55.643, 37.846, 
0.578)), .Names = c("Date", "fileSystem", "FreeSpace"), row.names = c(NA, 
-10L), class = "data.frame")

I'll also show examples with ggplot2:

library("ggplot2")
library("scales")

This uses grid faceting rather than wrapping like in @EDi's answer. One is not more right than the other; it depends on what you want.

ggplot(DF, aes(x=Date, y=FreeSpace)) +
  geom_point() +
  geom_line() +
  scale_x_date(breaks=date_breaks("1 day")) +
  facet_grid(fileSystem~.)


Your other question was how to reshape the data.

library("reshape2")

DF.wide <- dcast(DF, Date~fileSystem, value.var="FreeSpace")

which gives

> DF.wide
        Date  /data   /opt  /ora   /tmp   /var
1 2011-12-03 37.846 30.494 0.578 55.643 99.785
2 2011-12-04 37.846 30.494 0.578 55.643 99.785
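
If reshape2 is not installed, base R's reshape() can produce a similar wide layout. This is only a sketch of that alternative, not what the answer above uses; the data frame is rebuilt here so the block runs on its own, and DF.wide2 is a name chosen for illustration. Note that the resulting columns are prefixed with the value column's name (for example FreeSpace./var).

```r
# Rebuild the long data from the question (same values as DF above)
DF <- data.frame(
  Date = as.Date(rep(c("2011-12-03", "2011-12-04"), each = 5)),
  fileSystem = factor(rep(c("/var", "/opt", "/tmp", "/data", "/ora"), times = 2)),
  FreeSpace = rep(c(99.785, 30.494, 55.643, 37.846, 0.578), times = 2)
)

# Long to wide: one row per Date, one FreeSpace column per file system
DF.wide2 <- reshape(DF, idvar = "Date", timevar = "fileSystem",
                    direction = "wide")
print(DF.wide2)
```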

The individual columns can then be plotted as desired.
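
For instance, one way to plot the wide columns is base R's matplot(), which draws one line per column. The sketch below recreates the wide table by hand so it is self-contained; the colours, axis labels, and legend position are arbitrary choices rather than anything from the question.

```r
# Wide table as printed above, rebuilt by hand so the block runs on its own
DF.wide <- data.frame(
  Date = as.Date(c("2011-12-03", "2011-12-04")),
  `/data` = c(37.846, 37.846),
  `/opt`  = c(30.494, 30.494),
  `/ora`  = c(0.578, 0.578),
  `/tmp`  = c(55.643, 55.643),
  `/var`  = c(99.785, 99.785),
  check.names = FALSE  # keep the slashes in the column names
)

fs <- setdiff(names(DF.wide), "Date")   # the file system columns
x  <- as.numeric(DF.wide$Date)          # plot dates as numbers, label them below

# One line per file system column; suppress the default numeric x axis
matplot(x, as.matrix(DF.wide[, fs]), type = "b", pch = 1, lty = 1,
        col = seq_along(fs), xaxt = "n", xlab = "Date", ylab = "FreeSpace")
axis(1, at = x, labels = format(DF.wide$Date))
legend("topright", legend = fs, col = seq_along(fs), lty = 1, pch = 1)
```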

Upvotes: 1

EDi

Reputation: 13280

There are many possibilities for this in R. Something like this? However, if you want a separate plot for each file system and each day, there would be only one bar per plot (I don't know if that is very useful).

df <- read.table(header = TRUE, text = "Date     fileSystem FreeSpace
                 2011-12-03     /var          99.785
                 2011-12-03     /opt          30.494
                 2011-12-03     /tmp          55.643
                 2011-12-03     /data         37.846
                 2011-12-03     /ora          0.578
                 2011-12-04     /var          99.785
                 2011-12-04     /opt          30.494
                 2011-12-04     /tmp          55.643
                 2011-12-04    /data         37.846
                 2011-12-04     /ora          0.578
                 ")

## using ggplot (dates are faceted)
require(ggplot2)
ggplot(df, aes(x = fileSystem, y = FreeSpace)) +
  geom_bar(stat = "identity") +  # bar heights come from FreeSpace, not from row counts
  facet_wrap(~Date)


Edit: or as a line chart. Nearly everything is possible in R, but you must think about what kind of plot you want...

df$Date <- as.Date(df$Date, format = "%Y-%m-%d")  # Date class works with ggplot2's date scales
ggplot(df, aes(x = Date, y = FreeSpace)) +
  geom_line() +
  facet_wrap(~fileSystem)
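
If a single panel is preferred over facets, mapping fileSystem to colour is another option that keeps the data in long format. This is a sketch only; the data frame is rebuilt here so the block runs on its own, and it assumes a ggplot2 version that provides geom_line() and geom_point() as used below.

```r
library(ggplot2)

# Rebuild the question's data so the sketch is self-contained
df <- data.frame(
  Date = as.Date(rep(c("2011-12-03", "2011-12-04"), each = 5)),
  fileSystem = factor(rep(c("/var", "/opt", "/tmp", "/data", "/ora"), times = 2)),
  FreeSpace = rep(c(99.785, 30.494, 55.643, 37.846, 0.578), times = 2)
)

# One coloured line per file system, all in the same panel
p <- ggplot(df, aes(x = Date, y = FreeSpace, colour = fileSystem)) +
  geom_line() +
  geom_point()
print(p)
```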


Edit2: Perhaps this? Here I make a plot for every file system with a for loop. The plots are stored in a list.

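# one line chart per file system, stored in a named list
df$Date <- as.Date(df$Date)
# unique() works whether fileSystem is a factor or a character column
fileSystems <- unique(as.character(df$fileSystem))
plotlist <- list()
for (i in fileSystems) {
  tempdf <- df[df$fileSystem == i, ]
  plotlist[[i]] <- ggplot(tempdf, aes(x = Date, y = FreeSpace)) +
    geom_line() +
    ggtitle(i)  # replaces opts(title = i), which was removed from ggplot2
}
# print individual plots by file system name
plotlist[["/data"]]
plotlist[["/var"]]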

Upvotes: 2
