Reputation: 3584
I have pasted the important parts of my code below. I am creating a data.frame in which two columns should contain numeric values and one column contains factors.
I am trying to convert the "Location" column to numeric, but once I do, the Location values get scrambled.
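library(data.table)  # provides fread()
library(stringi)     # provides stri_length()
f <- fread("ABC.txt", header = FALSE, skip = 1)$V1
f <- paste(f, collapse = "")
vector <- 1:stri_length(f)
# Interleave each base with its position; rbind() coerces the positions to character
fillmatrix <- c(rbind(strsplit(f, "")[[1]], vector))
A <- data.frame(1, matrix(fillmatrix, ncol = 2, byrow = TRUE))
A <- A[c(1, 3, 2)]
colnames(A) <- c("Track", "Location", "Base")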
class(A$Track)
# [1] "factor"
A[1:15,] # Before as.numeric
# Track Location Base
# 1 1 1 A
# 2 1 2 C
# 3 1 3 G
# 4 1 4 G
# 5 1 5 A
# 6 1 6 A
# 7 1 7 T
# 8 1 8 A
# 9 1 9 A
# 10 1 10 A
# 11 1 11 A
# 12 1 12 T
# 13 1 13 T
# 14 1 14 C
# 15 1 15 C
a <- transform(A, Location = as.numeric(Location), Track = as.numeric(Track))
a[1:15,] # After as.numeric
# Track Location Base
# 1 1 1 A
# 2 1 112 C
# 3 1 223 G
# 4 1 334 G
# 5 1 445 A
# 6 1 556 A
# 7 1 667 T
# 8 1 679 A
# 9 1 690 A
# 10 1 2 A
# 11 1 13 A
# 12 1 24 T
# 13 1 35 T
# 14 1 46 C
# 15 1 57 C
The A data frame is fairly long, about 700 rows. Is the way I'm creating the data.frame the issue, or am I overlooking a small mistake?
Thank you for your help
Upvotes: 0
Views: 1497
Reputation: 56935
A reproducible example would help.
I suspect the cause is that A$Location is a factor, not a character vector (check with class(A$Location)). In that case you need as.numeric(as.character(Location)) to get the numbers you expect. R stores a factor as integer codes 1:nlevels(your.factor), and those codes are assigned after sorting the levels as strings rather than as numbers, so "10" comes before "2". Calling as.numeric() directly on the factor returns those codes, not the original values.
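To make the factor behaviour concrete, here is a minimal sketch. The vector loc is a made-up stand-in for your Location column (positions 1 to 12 only), not your actual data.

```r
# Hypothetical stand-in for A$Location: positions 1..12 stored as a factor
loc <- factor(as.character(1:12))

# Levels are sorted as strings, so "10", "11", "12" come before "2"
levels(loc)

# Direct conversion returns the internal level codes, not the original numbers
codes <- as.numeric(loc)

# Going through character first recovers the original values
values <- as.numeric(as.character(loc))

print(codes)
print(values)
```

With your ~700 levels the same string sort puts "1", "10", "100" to "109", "11", and so on ahead of "2", which is why position 2 shows up as 112 in your output.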
You could also set stringsAsFactors=FALSE in your data.frame call, which keeps the column as character so that as.numeric() works on it directly. In your fillmatrix <- ... line you seem to be converting everything to character by doing the strsplit on "". (Why do you paste your f together only to split it back out again?)
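One way to sidestep the conversion entirely is to build the columns directly instead of going through a character matrix. This is only a sketch: the lines vector below is a hypothetical replacement for what fread() returns from ABC.txt, and I am assuming Track is always 1 as in your output.

```r
# Hypothetical stand-in for the lines read from ABC.txt
lines <- c("ACGGAATA", "AAATTCC")

# Split every line into single characters; no paste() step is needed
bases <- unlist(strsplit(lines, ""))

# Location is created as an integer vector, so it never becomes a factor
A <- data.frame(
  Track = 1,
  Location = seq_along(bases),
  Base = bases,
  stringsAsFactors = FALSE
)

str(A)
```

Here Location is numeric from the start, so no as.numeric() call is needed afterwards.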
Upvotes: 2