upendra

Reputation: 2189

Writing a function to read files from a directory in R

I am trying to write a function in R that reads a file from a directory and then prints either the head of the file or a summary of the head of the file. My code is below:

getmonitor <- function(id, directory, summarize = FALSE) {
    if(id>=1 && id<10) {
        te1 <- paste("00",id,sep="")
        #print(te1)
    } else if(id>=10 && id<=99) {
        # id>=10 (not id>10) so that id 10 is also padded to "010"
        te1 <- paste("0",id,sep="")
        #print(te1)
    } else {
        te1 <- id
        #print(te1)
    }
    filename = paste(directory, te1, sep = "/")
    filename1 = paste(filename, "csv", sep = ".")
    test <- read.csv(file = filename1)
    if(summarize==TRUE) {
        test1 <- summary(test)
    } else {
        test1 = test
    }
    return (test1)
}

When I use this function without the summarize option, it works fine, as shown below:

> data <- getmonitor(1, "specdata")
> head(data)
        Date sulfate nitrate ID
1 2003-01-01      NA      NA  1
2 2003-01-02      NA      NA  1
3 2003-01-03      NA      NA  1
4 2003-01-04      NA      NA  1
5 2003-01-05      NA      NA  1
6 2003-01-06      NA      NA  1

But when I use the summarize option, the output has quotes around every entry, as shown below:

> data <- getmonitor(101, "specdata", TRUE)
> head(data)
         Date          sulfate            nitrate                ID       
 "2005-01-01:  1  " "Min.   : 1.700  " "Min.   : 0.2490  " "Min.   :101  "
 "2005-01-02:  1  " "1st Qu.: 3.062  " "1st Qu.: 0.6182  " "1st Qu.:101  "
 "2005-01-03:  1  " "Median : 4.345  " "Median : 1.0500  " "Median :101  "
 "2005-01-04:  1  " "Mean   : 6.267  " "Mean   : 2.2679  " "Mean   :101  "
 "2005-01-05:  1  " "3rd Qu.: 7.435  " "3rd Qu.: 2.7825  " "3rd Qu.:101  "
 "2005-01-06:  1  " "Max.   :22.100  " "Max.   :10.8000  " "Max.   :101  "

I don't want any of these quotes in the output. I even tried converting the result into a data frame, but that doesn't work either. What am I doing wrong?

Upvotes: 0

Views: 18708

Answers (4)

upendra

Reputation: 2189

I finally got what I wanted by combining bits and pieces from the other answers. Here is the final code. Thanks a ton for the help; much appreciated.

getmonitor <- function(id, directory, summarize = FALSE) {
    # Zero-pad the id to three digits, e.g. 1 -> "001"
    te1 <- formatC(id, width=3, flag="0")
    filename = paste(directory, te1, sep = "/")
    filename1 = paste(filename, "csv", sep = ".")
    test <- read.table(file = filename1, header=TRUE, sep=",")
    # Print the summary here, but always return the full data frame
    if(summarize) {
        print(summary(test))
    }
    return (test)
}
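
A note on where the quotes came from, as a minimal sketch rather than a definitive explanation: summary() on a data frame returns a character matrix with class "table", and it is the table print method that hides the quotes. If something strips that class (head() may have done so in the question, but that is an assumption about the R version in use), the bare character matrix prints with quotes. The toy data frame df below is made up for illustration and is not one of the specdata files.

```r
# Hypothetical toy data, standing in for one of the specdata files
df <- data.frame(sulfate = c(1.7, 4.3, 22.1),
                 nitrate = c(0.25, 1.05, 10.8))

s <- summary(df)
class(s)          # "table"

# Dropping the class leaves a plain character matrix
m <- unclass(s)
print(m)          # entries are shown in quotes

# Two ways to display the same matrix without quotes
print(m, quote = FALSE)
noquote(m)
```

Printing the summary inside the function, as in the code above, sidesteps the issue because the object still has its "table" class when it is printed.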

Upvotes: 3

hvollmeier

Reputation: 2986

First of all, I would get rid of the ugly if-else construct by using formatC or sprintf (see SO question). If you want to print the head of the file or the summary of the head of the file, you have to put this in your function :-) .

getmonitor <- function(id, directory, summarize = FALSE) {
  # Zero-pad the id (the 001.csv-style names in the question would need width=3)
  te1 <- formatC(id, width=4, flag="0")

  filename = paste(directory, te1, sep = "/")
  filename1 = paste(filename, "csv", sep = ".")
  print(filename1)
  test <- read.csv(file = filename1)
  if(summarize==TRUE) {
    test1 <- summary(head(test))
  } else {
    test1 = head(test)
  }
  return (test1)
}

As an example I just use a randomly selected csv-file in my data directory.

getmonitor(22,"~/Data/R")

  term   vola    range
1   30 0.2129      max
2   30 0.1191 quartile
3   30 0.0944   median
4   30 0.0855 quartile
5   30 0.0714      min
6   60 0.1831      max

or if you want to get the summary of the head:

getmonitor(22,"~/Data/R",summarize=TRUE)

      term         vola              range  
 Min.   :30   Min.   :0.07140   max     :2  
 1st Qu.:30   1st Qu.:0.08772   median  :1  
 Median :30   Median :0.10675   min     :1  
 Mean   :35   Mean   :0.12773   quartile:2  
 3rd Qu.:30   3rd Qu.:0.16710               
 Max.   :60   Max.   :0.21290   

Hope this helps. Be aware that your function only returns the summary/head of the file, so you have to read the file in again when you really want to do some work with it (not very efficient, especially with large files).
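
To illustrate the formatC/sprintf suggestion on its own, here is a small sketch of zero-padding ids to three digits and building the file paths. The ids vector and the "specdata" directory name are only illustrative; file.path and paste0 are used here as one possible alternative to the nested paste calls.

```r
# Illustrative ids covering one, two and three digits
ids <- c(1, 10, 101)

# Both calls pad with leading zeros to a width of 3
padded_formatC <- formatC(ids, width = 3, flag = "0")
padded_sprintf <- sprintf("%03d", as.integer(ids))

# Build paths such as "specdata/001.csv"
paths <- file.path("specdata", paste0(padded_sprintf, ".csv"))
print(paths)
```

Both functions are vectorised, so the same line works for a single id or for a whole vector of ids.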

Upvotes: 0

Koushik Saha

Reputation: 683

# Usage: getmonitor(12,"specdata",TRUE)
getmonitor <- function(id, directory, summarize = FALSE) {
    # The number of digits in id decides how many zeros to prepend
    l <- nchar(id)
    if (l == 1) {
        op <- paste(directory, "/", "00", paste(id, ".csv", sep=""), sep="")
        data <- read.csv(op)
        #print(class(data))
        if (summarize == TRUE) {
            print(summary(data))
            return(data)
        } else {
            return(data)
        }
    }
    if (l == 2) {
        op <- paste(directory, "/", "0", paste(id, ".csv", sep=""), sep="")
        data <- read.csv(op)
        if (summarize == TRUE) {
            print(summary(data))
            return(data)
        } else {
            return(data)
        }
    }
    if (l == 3) {
        op <- paste(directory, "/", id, ".csv", sep="")
        data <- read.csv(op)
        if (summarize == TRUE) {
            print(summary(data))
            return(data)
        } else {
            return(data)
        }
    }
}

Save this in your working directory and run it. Your working directory must contain the specdata folder. Hope that helps!

Upvotes: 0

hd1

Reputation: 34677

Your read.csv line can have a colClasses parameter, representing:

A vector of classes to be assumed for the columns

So perhaps specify your column types explicitly in that vector. Leave a comment if that doesn't sort it out for you and I'll look into things further.
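
As a rough sketch of what that could look like, assuming the four columns shown in the question (Date, sulfate, nitrate, ID): the inline csv_text below is made-up sample data passed through the text argument so the example runs without any files, and the chosen classes are a guess at sensible types, not something the question specifies.

```r
# Made-up sample rows with the same column names as in the question
csv_text <- "Date,sulfate,nitrate,ID
2003-01-01,NA,NA,1
2003-01-02,3.5,0.8,1
"

# One class per column, in column order
dat <- read.csv(text = csv_text,
                colClasses = c("Date", "numeric", "numeric", "integer"))

str(dat)
```

With a real file, the same colClasses vector would be passed alongside file = filename1 instead of text.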

Upvotes: 0
