bob123

Reputation: 163

R Sorting of Multi-Column Data

I'm having a peculiar problem sorting the following data frame (sds45) by the column stat0:

>sds45

           icntr     iexpt angle overlap Specified.Shot.Width          V6 mcsp             stat0
DD.Sigma2      3   1R50_50    45       0                   50 rectangular  1.5  3.62075986666667
DD.Sigma5      6   1R50_35    45      15                   50 rectangular  1.5  1.07005992333333
DD.Sigma8      9   1R50_40    45      10                   50 rectangular  1.5        1.36916201
DD.Sigma11    12   1R50_30    45      20                   50 rectangular  1.5 0.951408239333333
DD.Sigma14    15  1R100_75    45      25                  100 rectangular  1.5  11.6972803333333
DD.Sigma17    18  1R100_80    45      20                  100 rectangular  1.5  13.4350596666667
DD.Sigma20    21  1R100_90    45      10                  100 rectangular  1.5         16.654366
DD.Sigma31    32 1R100_150    45      50                  100 rectangular  1.5  2.19166406666667
DD.Sigma34    35 1R100_160    45      40                  100 rectangular  1.5         5.4822418
DD.Sigma39    40  1C200_25    45      75                  100    circular  1.5       0.704197414
DD.Sigma42    43  1C200_50    45      50                  100    circular  1.5  1.03405964333333
DD.Sigma45    46  1C200_75    45      25                  100    circular  1.5  7.03481966666667
DD.Sigma48    49  1C200_80    45      20                  100    circular  1.5  9.19375816666667

My first approach was this:

test<-sds45[order(sds45$stat0),]

... which did nothing.

I also tried this:

library(doBy)  # orderBy() is assumed here to come from the doBy package
test <- orderBy(~stat0, data=sds45)

I must have a basic concept problem. I would appreciate a small bit of education on this.

Upvotes: 1

Views: 395

Answers (1)

Richie Cotton

Reputation: 121057

When the data frame was created, there were probably some non-numeric characters in stat0, so that column was converted to a factor. When you sort on a factor, you are sorting by the underlying integer level codes, not by the numbers the labels appear to represent. If those codes were assigned in the order in which the values appeared, the row order doesn't change.
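To see the effect in isolation, here is a minimal sketch on made-up values (not the sds45 data). It assumes the levels were assigned in order of appearance, which is forced here with `levels = unique(vals)`; the name `vals` and the stray "n/a" entry are purely illustrative.

```r
# Hypothetical toy values, including one non-numeric entry
vals <- c("3.62", "11.70", "1.07", "n/a")

# Assumption: levels are assigned in order of appearance
f <- factor(vals, levels = unique(vals))

cls   <- class(f)       # the column is a factor, not numeric
codes <- as.integer(f)  # the integer level codes behind the labels
ord   <- order(f)       # order() works on those codes, not on the labels

print(cls)
print(codes)
print(ord)
```

Because the codes already run 1, 2, 3, 4 in row order, indexing with `order(f)` returns the rows unchanged. Running `class(sds45$stat0)` or `str(sds45)` on the real data should confirm whether stat0 is a factor.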

The solution is to convert that column to numeric, as you intended it to be, using:

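# Convert a factor whose labels are numbers into a numeric vector.
# as.numeric(f) on its own would return the integer level codes instead;
# labels that are not valid numbers become NA, with a warning.
factor_to_numeric <- function(f)
{
  as.numeric(levels(f))[as.integer(f)]
}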

sds45$stat0 <- factor_to_numeric(sds45$stat0)
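Once the column is numeric, the original `order()` call should behave as expected. The following is a small sketch on a made-up data frame (`df` and its `id` column are hypothetical stand-ins for sds45), with the helper repeated so the example runs on its own.

```r
# Same helper as above, repeated so this example is self-contained
factor_to_numeric <- function(f)
{
  as.numeric(levels(f))[as.integer(f)]
}

# Hypothetical stand-in for sds45: stat0 is a factor with numeric-looking labels
df <- data.frame(
  id    = c("a", "b", "c"),
  stat0 = factor(c("3.62", "11.70", "1.07"),
                 levels = c("3.62", "11.70", "1.07")),
  stringsAsFactors = FALSE
)

df$stat0 <- factor_to_numeric(df$stat0)  # now a numeric column
sorted   <- df[order(df$stat0), ]        # sorts by the actual values

print(sorted)
```

The rows come back in the order c, a, b, that is, 1.07, 3.62, 11.70.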

It is also very important to check your dataset and find those non-numeric characters. If that column contains dirty data, then the rest of your dataset may also need cleaning.
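One possible way to hunt for the offending entries is to attempt the conversion and list whatever fails. This is only a sketch on made-up strings; `raw` is a hypothetical vector, and the bad values shown ("1,07", "n/a") are invented examples of what dirty data might look like.

```r
# Hypothetical raw strings, two of which are not valid numbers
raw <- c("3.62", "11.70", "1,07", "n/a", "0.95")

# as.numeric() returns NA (with a warning) for anything it cannot parse
parsed <- suppressWarnings(as.numeric(raw))

# Keep the entries that failed to parse but were not missing to begin with
bad <- raw[is.na(parsed) & !is.na(raw)]

print(bad)
```

For a factor column, the same check can be applied to its labels, for example by using `levels(sds45$stat0)` in place of `raw`.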

Upvotes: 3
