Reputation: 1107
I have the following data.table
(in fact my data.table
is much bigger (more groups and more other variables)):
Data <- data.table(Group = rep(c("a", "b"), each = 3),
Var = 1:6)
> print(Data)
Group Var
1: a 1
2: a 2
3: a 3
4: b 4
5: b 5
6: b 6
Now I want to sort the data.table
based on the variable Group
but only if Group == "a"
.
My poor attempt was the following:
> Data[Group == "a", .SD[.N:1]]
Group Var
1: a 3
2: a 2
3: a 1
I know why this is wrong but I can not think of a solution that leads to my desired output:
Group Var
1: a 3
2: a 2
3: a 1
4: b 4
5: b 5
6: b 6
Upvotes: 0
Views: 323
Reputation: 388817
Your attempt would work with
library(data.table)
Data[Group == "a"] <- Data[Group == "a", .SD[.N:1]]
Data
# Group Var
#1: a 3
#2: a 2
#3: a 1
#4: b 4
#5: b 5
#6: b 6
But the above just reverses the rows if you want to sort the rows in decreasing order based on Var
you could do
Data[Group == "a"] <- Data[Group == "a", .SD[order(-Var)]]
If you have multiple columns you can do
cols <- c("a", "b")
Data[Group %in% cols] <- Data[Group %in% cols,.SD[order(-Var)],Group]
A better approach with data.table
is to update by reference as suggested by @markus
Data[Group %in% cols, Var := .SD[order(-Var)]$Var]
Upvotes: 2
Reputation: 2320
Without using the .SD
notation:
> Data[Group == "a", Var := sort(Var, decreasing = TRUE)]
> Data
Group Var
1: a 3
2: a 2
3: a 1
4: b 4
5: b 5
6: b 6
As you have a large data.table and more groups, you may wish to consider the use of the .I
notation (see the article by @nathaneastwood) as it does result in better performance in some situations. Here, .I
will identify the row numbers of interest. Let us change the example so that we are interested in two groups:
Data <- data.table(Group = rep(c("a", "b", "c"), each = 3), Var = 10:18)
Then:
> Data[Data[, .I[Group %in% c("a", "c")]], Var := sort(Var, decreasing = TRUE), by = Group]
> Data
Group Var
1: a 12
2: a 11
3: a 10
4: b 13
5: b 14
6: b 15
7: c 18
8: c 17
9: c 16
For completeness the basic idea is contained in:
Data[Group %in% c("a", "c"), Var:= sort(Var, decreasing = TRUE), by = Group]
Upvotes: 2