Pascal

Reputation: 1246

Understanding indexing of data frame with colon operator

I have a data.frame that is from the predictive maintenance R Notebook example by MS.

Now they show how to subset this data.frame like this (showing some of the first rows and some of the last with just one line of code instead of using head() or tail()):

> errors[c(1:3, nrow(errors)-3:1),]
                datetime machineID errorID
1    2015-01-03 07:00:00         1  error1
2    2015-01-03 20:00:00         1  error3
3    2015-01-04 06:00:00         1  error5
3916 2015-12-04 02:00:00       100  error1
3917 2015-12-08 06:00:00       100  error2
3918 2015-12-08 06:00:00       100  error3

They want to output the first few and the last few rows.

I do understand what the colon operator does in general, but I do not really understand what it does in this example. While 4:1 returns

> 4:1
[1] 4 3 2 1

and

> nrow(errors):1
   [1] 3919 3918 3917 3916 3915 3914 3913 3912 3911 3910 3909 3908 3907 3906 3905 3904 3903
  [18] 3902 3901 3900 3899 3898 3897 3896 3895 3894 3893 3892 3891 3890 3889 3888 3887 3886
...

Then the following does not return what I would expect:

> nrow(errors)-3:1
[1] 3916 3917 3918

I would have expected that it returns the same long list as before, but starting with the index at nrow(errors)-3. So something like:

> nrow(errors)-3:1   # what I expected, not the actual output
       [1] 3916 3915 3914 3913 3912 3911 3910 3909 3908 3907 3906 3905 3904 3903
    ...

What do I understand wrong here? Thanks in advance!

Upvotes: 0

Views: 420

Answers (1)

Pascal

Reputation: 1246

Thanks to @markus and @Aaron Hayman and @G Grothendieck

The colon operator has higher precedence than binary minus, so 3:1 is evaluated first:

> 3:1
[1] 3 2 1

nrow(errors) returns 3919. Subtracting 3:1 from it is vectorized, so the result is c(3919-3, 3919-2, 3919-1), i.e. 3916 3917 3918.
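
To make the precedence visible, here is a small sketch that compares the expression with and without parentheses. The variable n is just a stand-in I made up for nrow(errors), using the value 3919 from above:

```r
# n is a hypothetical stand-in for nrow(errors)
n <- 3919

# `:` binds tighter than binary `-`, so this is n - c(3, 2, 1)
a <- n - 3:1
print(a)

# Parentheses force the subtraction first, which gives the long
# descending sequence from n - 3 down to 1
b <- (n - 3):1
print(head(b))
print(length(b))
```

The first result has three elements, the second has 3916, which is the long vector the question expected.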

And by thinking this over again, I realize it should be:

> nrow(errors)-2:0
[1] 3917 3918 3919

to really get the last three lines, like in the following:

> errors[c(1:3, nrow(errors)-2:0),]
                datetime machineID errorID
1    2015-01-03 07:00:00         1  error1
2    2015-01-03 20:00:00         1  error3
3    2015-01-04 06:00:00         1  error5
3917 2015-12-08 06:00:00       100  error2
3918 2015-12-08 06:00:00       100  error3
3919 2015-12-22 03:00:00       100  error3
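
As a cross-check, the same rows can be selected with head() and tail(). This is only a sketch on a small made-up data frame (df is not the real errors data), assuming three rows are wanted from each end:

```r
# Made-up data frame standing in for `errors` (10 rows)
df <- data.frame(id = 1:10, value = letters[1:10])

# Index-based selection: first three rows and last three rows
by_index <- df[c(1:3, nrow(df) - 2:0), ]

# The same selection built from head() and tail()
by_head_tail <- rbind(head(df, 3), tail(df, 3))

print(by_index)

# Both approaches should pick the same rows (compared via the id column)
print(all(by_index$id == by_head_tail$id))
```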

This helped me understand. Thanks!

Upvotes: 1
