alexk

Reputation: 97

How to add rows to an empty column in R

I would like to create an empty data frame with column names, then run a function that counts the rows with no missing data in each of several files and stores the file number and the number of complete rows in that data frame. The function has two arguments: the name of the folder the files are stored in (directory) and the file number(s) I want to access (id).

I have the function working, but the final data frame comes out in the wrong format. Can anyone suggest where I might be going wrong and how to correct it so that it comes out in the proper format? My code is:

complete <- function(directory, id = 1:332) {
    data1 <- data.frame(id = numeric(), nobs = numeric())

    for (i in id) {
        file_name <- sprintf("%03d.csv", i)
        file_add <- paste0("C:/Users/Babbage/coursera/Computing for Data Analysis/assignments", "/", directory)
        file_to_read <- paste0(file_add, "/", file_name)
        filedata <- read.csv(file_to_read)
        x <- filedata[complete.cases(filedata), ]
        count1 <- nrow(x)
        newrow <- c(i, count1)
        data1 <- rbind(c(data1, newrow))
    }
    print(data1)
}

If I run:

complete("specdata",c(2,4,8,10,12))

I get this output:

[1] id   nobs
<0 rows> (or 0-length row.names)
[,1]      [,2]      [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,] Numeric,0 Numeric,0 2    1041 4    474  8    192  10   148   12    96   

But I want it to look like this:

> complete("specdata", c(2, 4, 8, 10, 12)) 
id nobs 
1 2 1041 
2 4 474 
3 8 192 
4 10 148 
5 12 96

Any other advice on how to improve my code is also appreciated.

Upvotes: 0

Views: 1756

Answers (1)

Steve Reno

Reputation: 1384

It looks to me like you should be creating a data frame here rather than binding the bare values. In your code above:

newrow <- c(i, count1)
data1 <- rbind(c(data1, newrow))

newrow is essentially just a numeric vector (i.e. a group of values of the same type), because that is what the c() function creates. When you go to add the next group to data1, you use c() again inside the rbind() call, so you are just lengthening the existing object rather than creating a new row (which your example output suggests is what you want). Because rbind() then receives only that single combined argument, it has nothing to bind a new row to and returns everything as one long row.

Example:

newrow <- c(1,10)
newrow2 <- c(2,20)
c(newrow, newrow2)
rbind(newrow, newrow2)

Notice the difference in how c() and rbind() work above: c() appends the newrow2 values to the end of newrow, while rbind() creates a second row of data. You could probably just remove the c() inside the rbind() call to get your desired result, but I'm more inclined to use a data frame, as in the example below:

newrow <- data.frame(id = i, nobs = count1)
data1 <- rbind(data1, newrow)

Now your output is a data frame with two columns, one named 'id' and one named 'nobs'.
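Putting that change back into the whole function, a minimal sketch might look like the following. It assumes that directory is itself a usable path to the folder of CSV files (the hard-coded base path from the question is left out), and it swaps in two small things that are my suggestions rather than part of the original code: file.path() to build the file name, and sum(complete.cases(...)) to count the complete rows without making a filtered copy.

```r
# Sketch only. Assumption: `directory` is a path (relative to the working
# directory, or absolute) to a folder holding 001.csv, 002.csv, ...
complete <- function(directory, id = 1:332) {
  # Empty data frame with the two column names we want to end up with
  data1 <- data.frame(id = numeric(), nobs = numeric())

  for (i in id) {
    # file.path() joins the pieces with "/" for us
    file_to_read <- file.path(directory, sprintf("%03d.csv", i))
    filedata <- read.csv(file_to_read)

    # complete.cases() returns TRUE/FALSE per row; sum() counts the TRUEs
    count1 <- sum(complete.cases(filedata))

    # Bind a one-row data frame (not a bare vector) so names are kept
    newrow <- data.frame(id = i, nobs = count1)
    data1 <- rbind(data1, newrow)
  }

  # Return the data frame; it prints automatically at the console
  data1
}
```

With the working directory set to the folder that contains specdata, a call such as complete("specdata", c(2, 4, 8, 10, 12)) should then give one row per file with the id and nobs columns, in the layout asked for in the question.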

Upvotes: 1
