Reputation: 21
I have a data frame that contains 2 columns, filename and monitorid.
  filename monitorid
1  001.csv         1
2  002.csv         2
3  003.csv         3
4  004.csv         4
5  005.csv         5
6  006.csv         6
I am trying to subset in order to select the filename for a given monitorid:
filename <- files[files$monitorid==3,1]
I expected this to return "003.csv".
Instead it returns
[1] 003.csv
6 Levels: 001.csv 002.csv 003.csv 004.csv 005.csv 006.csv
However
filename <- files[files$monitorid==3,2] returns
[1] 3
as expected
I do not understand why choosing column 1 returns a factor with multiple levels while column 2 returns a single value.
Any ideas would be greatly appreciated.
@KenM This is the function I used to read the file names
getfileinfo <- function(directory) {
  ## Reads file names into the filenames variable
  filenames <- list.files(path = directory)
  ## Assigns a monitorid to each file name
  monitorid <- as.numeric(substr(filenames, 1, 3))
  ## Combines filenames and monitorid into a data frame, files
  files <- data.frame(filenames, monitorid)
  names(files) <- c("filename", "monitorid")
  return(files)
}
Solution
Here is the output from each line
filenames <- list.files (path = directory)
class(filenames)
[1] "character"
monitorid <- as.numeric(substr(filenames,1,3))
class(monitorid)
[1] "numeric"
files <- data.frame(filenames, monitorid)
sapply (files, class)
filenames monitorid
"factor" "numeric"
As noted by both KenM and BeginneR, when the vectors are combined into a data frame, the character vector filenames becomes a column of class factor.
Corrected code
files <- data.frame(filenames, monitorid, stringsAsFactors = FALSE)
sapply (files, class)
filenames monitorid
"character" "numeric"
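The difference can be reproduced without any files on disk. The following is a minimal sketch, not the original setup: the vector built with sprintf() is a hypothetical stand-in for the result of list.files(), and stringsAsFactors = TRUE is passed explicitly because R 4.0.0 and later default to FALSE, so the factor conversion no longer happens unless requested.

```r
# Hypothetical stand-in for list.files(path = directory)
filenames <- sprintf("%03d.csv", 1:6)
monitorid <- as.numeric(substr(filenames, 1, 3))

# Explicitly request factors to reproduce the behaviour from the question
as_factor <- data.frame(filename = filenames, monitorid = monitorid,
                        stringsAsFactors = TRUE)

# Keep the file names as plain character strings
as_char <- data.frame(filename = filenames, monitorid = monitorid,
                      stringsAsFactors = FALSE)

class(as_factor$filename)  # "factor"
class(as_char$filename)    # "character"

# Subsetting the character column gives a plain string
picked <- as_char[as_char$monitorid == 3, "filename"]
picked  # "003.csv"
```

Selecting the column by name ("filename") instead of by position is a small design choice that keeps the subset correct if the column order ever changes.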
Upvotes: 1
Views: 6806
Reputation: 2826
I do not understand why choosing column 1 returns a factor with multiple levels while column 2 returns a single value.
You get a factor because the "filename" column was loaded as a factor, while (I suppose) you want a character string for the value of the "filename" object.
Solutions are either: 1. When you load the csv file, read the values as character instead of factor; or 2. Convert the factor into character.
For solution 1, set colClasses = "character" in read.csv() (see ?read.csv).
For solution 2, do filename <- as.character(files[files$monitorid==3,1])
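To illustrate solution 2, here is a small sketch under the assumption that the data frame looks like the one in the question; the files object below is rebuilt by hand with stringsAsFactors = TRUE rather than read from a directory. It shows that a subset of a factor keeps all of the original levels, which is why the printed result lists six levels, and that as.character() returns just the string.

```r
# Hand-built copy of the data frame from the question (assumed structure)
files <- data.frame(filename = sprintf("%03d.csv", 1:6),
                    monitorid = 1:6,
                    stringsAsFactors = TRUE)

# Subsetting a factor column returns a factor that still carries every level
selected <- files[files$monitorid == 3, 1]
class(selected)    # "factor"
nlevels(selected)  # 6

# Converting to character drops the levels and leaves the plain string
filename <- as.character(selected)
filename  # "003.csv"
```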
(BTW, please include a reproducible example when asking a question)
Upvotes: 1