Reputation: 15
arrange() in dplyr produces an incorrect result.
library(dplyr)
x <- as.data.frame(cbind(name=c("A","B","C","D"), val=c(0.032, 0.077, 0.4, 0.0001)))
x.1 <- x %>% arrange(val)
x.2 <- x %>% arrange(desc(val))
The outputs are:
> x
name val
1 A 0.032
2 B 0.077
3 C 0.4
4 D 1e-04
> x.1
name val
1 A 0.032
2 B 0.077
3 C 0.4
4 D 1e-04
> x.2
name val
1 D 1e-04
2 C 0.4
3 B 0.077
4 A 0.032
Both the ascending and the descending sort produce incorrect output. I am not sure what I am doing wrong here. Thank you.
Upvotes: 0
Views: 274
Reputation: 99331
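as.data.frame(cbind()) is what you are doing wrong there. cbind() builds a matrix, and a matrix holds a single type, so everything is converted to character in cbind() and then to factor in as.data.frame() (in R versions before 4.0.0, where stringsAsFactors defaulted to TRUE; later versions leave the column as character, which sorts as text just the same). Have a look ...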
str(x)
# 'data.frame': 4 obs. of 2 variables:
# $ name: Factor w/ 4 levels "A","B","C","D": 1 2 3 4
# $ val : Factor w/ 4 levels "0.032","0.077",..: 1 2 3 4
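The coercion can also be seen directly on the matrix that cbind() returns, before as.data.frame() is involved. The following is a minimal sketch in base R; the variable names m, sorted_as_text and sorted_as_numbers are only illustrative.

```r
# cbind() of a character vector and a numeric vector yields a character matrix
m <- cbind(name = c("A", "B", "C", "D"), val = c(0.032, 0.077, 0.4, 0.0001))

is.character(m)   # the numeric values have been turned into strings
m[, "val"]        # 0.0001 is stored as the string "1e-04"

# Sorting the strings compares them character by character,
# so "1e-04" comes last because it starts with "1"
sorted_as_text <- sort(m[, "val"])

# Converting back to numeric restores the expected numeric ordering
sorted_as_numbers <- sort(as.numeric(m[, "val"]))
```

This is why both sorts in the question look wrong: the values are being ordered as text, not as numbers.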
I don't know where people are learning this method of creating data frames, but it's terrible practice and should never be used.
Use data.frame() to create data frames, that's why it's there (or when using dplyr, there is data_frame() as well, since deprecated in favor of tibble()).
library(dplyr)
x <- data.frame(name=c("A","B","C","D"), val=c(0.032, 0.077, 0.4, 0.0001))
x.1 <- x %>% arrange(val)
x.2 <- x %>% arrange(desc(val))
x.1
# name val
# 1 D 0.0001
# 2 A 0.0320
# 3 B 0.0770
# 4 C 0.4000
x.2
# name val
# 1 C 0.4000
# 2 B 0.0770
# 3 A 0.0320
# 4 D 0.0001
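If the data frame already exists and cannot easily be rebuilt, the column can be converted in place instead. This is a sketch rather than the only way to do it, and x_fixed is just an illustrative name. It goes through as.character() first, so it should behave the same whether val arrived as a factor or as a character column.

```r
library(dplyr)

# Same construction as in the question: val ends up as text or factor, not numeric
x <- as.data.frame(cbind(name = c("A", "B", "C", "D"), val = c(0.032, 0.077, 0.4, 0.0001)))

# as.character() first, so that a factor yields its labels and not its integer codes
x_fixed <- x %>% mutate(val = as.numeric(as.character(val)))

x_fixed %>% arrange(val)         # ascending by numeric value
x_fixed %>% arrange(desc(val))   # descending by numeric value
```

Calling as.numeric() directly on a factor would return the internal level codes (1, 2, 3, 4 here), which is why the intermediate as.character() step matters.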
Upvotes: 3