Reputation: 1

Plot In R with Multiple Lines Based On A Particular Variable?

I have an accelerometer dataset with 30 subjects, and each subject has some number n of observations of body acceleration over time.

I want a plot with one line per subject, each in a different color, with body acceleration on the y axis and a plain observation index on the x axis. I tried this:

ggplot(data = filtered_data_walk, aes(x = seq_along(filtered_data_walk$'body-acceleration-mean-y-time'), y = filtered_data_walk$'body-acceleration-mean-y-time')) + 
  geom_line(aes(color = filtered_data_walk$subject))

The problem is that the 30 lines are not superimposed; they run alongside each other instead. In other words, I end up with n1 + n2 + n3 + ... + n30 index points on the x axis instead of max{n1, n2, ..., n30}. This is my first time posting, so I hope this makes sense (I know my formatting is bad).

One solution I thought of was to create a new variable which gives a value of 1 to n for all the observations of each subject. So, for example, if I had 6 observations for subject1, 4 observations for subject2, and 9 observations for subject3, this new variable would be sequenced like:

1 2 3 4 5 6 1 2 3 4 1 2 3 4 5 6 7 8 9

Is there an easy way to do this? Please help, ty.

Upvotes: 0

Answers (2)

juod

Reputation: 450

That's pretty quick to do if you are OK with using dplyr. Use group_by to keep a separate counter for each subject and mutate to add the actual counter; your ggplot should then work once x is mapped to that counter. Example with the iris dataset:

library(dplyr)
library(ggplot2)
group_by(iris, Species) %>%
  mutate(index = seq_along(Petal.Length)) %>%  # counter restarts within each Species
  ggplot() + geom_line(aes(x = index, y = Petal.Length, color = Species))
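Applied to data shaped like the question's, the same idea might look like the sketch below. The data frame `walk` is a made-up stand-in for filtered_data_walk (only the column names subject and body-acceleration-mean-y-time come from the question), and row_number() is used in place of seq_along(), which is equivalent here.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for filtered_data_walk: 3 subjects with 6, 4 and 9 rows
set.seed(1)
walk <- data.frame(
  subject = rep(c(1, 2, 3), times = c(6, 4, 9)),
  `body-acceleration-mean-y-time` = rnorm(19),
  check.names = FALSE  # keep the hyphenated column name as-is
)

indexed <- walk %>%
  group_by(subject) %>%
  mutate(index = row_number()) %>%  # restarts at 1 for every subject
  ungroup()

# Refer to columns by bare (backtick-quoted) name inside aes(), not with $
p <- ggplot(indexed, aes(x = index,
                         y = `body-acceleration-mean-y-time`,
                         color = factor(subject))) +
  geom_line() +
  labs(color = "subject")
```

Printing p draws the plot. Wrapping subject in factor() is a choice made for this sketch: it gives each subject its own discrete color and its own line even when subject is stored as a number.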

Upvotes: 0

Matt Tyers

Reputation: 2215

Assuming your data is formatted as a data.frame or matrix, for a toy dataset like

x <- data.frame(replicate(5, rnorm(10)))
x
#             X1          X2         X3           X4         X5
# 1  -1.36452272 -1.46446475  2.0444381  0.001585876 -1.1085990
# 2  -1.41303046 -0.14690269  1.6179084 -0.310162018 -1.5528733
# 3  -0.15319554 -0.18779791 -0.3005058  0.351619212  1.6282955
# 4  -0.38712167 -0.14867239 -1.0776359  0.106694311 -0.7065382
# 5  -0.50711166 -0.95992916  1.3522922  1.437085757 -0.7921355
# 6  -0.82377208  0.50423328 -0.5366513 -1.315263679  1.0604499
# 7  -0.01462037 -1.15213287  0.9910678  0.372623508  1.9002438
# 8   1.49721113 -0.84914197  0.2422053  0.337141898  1.2405208
# 9   1.95914245 -1.43041783  0.2190829 -1.797396822  0.4970690
# 10 -1.75726827 -0.04123615 -0.1660454 -1.071688768 -0.3331887

...you might be able to get there with something like

plot(x[,1], type='l', xlim=c(1, nrow(x)), ylim=c(min(x), max(x)))
for(i in 2:ncol(x)) lines(x[,i], col=i)

You could play with formatting some more, of course, do things with lty= and lwd= and maybe a color ramp of your own choosing, etc.

If your data is in the format below...

x <- data.frame(id=c("A","A","A","B","B","B","B","C","C"), acc=rnorm(9))
x
#   id        acc
# 1  A  0.1796964
# 2  A  0.8770237
# 3  A -2.4413527
# 4  B  0.9379746
# 5  B -0.3416141
# 6  B -0.2921062
# 7  B  0.1440221
# 8  C -0.3248310
# 9  C -0.1058267

...you could get there with

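# longest series among all ids, used to size the x axis
maxn <- max(with(x, tapply(acc, id, length)))
ids <- sort(unique(x$id))
# draw the first id, then overlay the remaining ones
plot(x$acc[x$id==ids[1]], type='l', xlim=c(1,maxn), ylim=c(min(x$acc),max(x$acc)))
# seq_along(ids)[-1] is empty when there is only one id, unlike 2:length(ids)
for(i in seq_along(ids)[-1]) lines(x$acc[x$id==ids[i]], col=i)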
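If you would rather have the 1-to-n counter from the question as an actual column, base R can build it too. This is a minimal sketch on made-up data with group sizes 6, 4 and 9, matching the example in the question; the column name index is just for illustration.

```r
# Toy data in the same long format as above (values are made up)
x <- data.frame(id = rep(c("A", "B", "C"), times = c(6, 4, 9)),
                acc = rnorm(19))

# ave() applies seq_along() within each id, so the counter restarts per group
x$index <- ave(seq_along(x$id), x$id, FUN = seq_along)
x$index
# 1 2 3 4 5 6 1 2 3 4 1 2 3 4 5 6 7 8 9

# If the group sizes are already known, sequence() gives the same vector
sequence(c(6, 4, 9))
```

ave() does not need the rows to be sorted by id, since it counts within each group wherever its rows sit, whereas sequence() assumes each group's rows are contiguous.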

Hope this helps, and that I interpreted your problem right.

Upvotes: 1
