kindofhungry

Reputation: 464

Looping through rows of a particular column by searching for a string

I imported some data from a txt file:

df <- read.table(file.choose(),
                 sep = "\t",
                 skip = 2,
                 fill = TRUE)
df <- df[, c(6, 11)]
colnames(df) <- c("area", "population")

A shortened version of my data looks like this in the environment in RStudio

     area                                      population
[1]  area0                                     0
[2]  area1:                                    59,859
[3]  area0:                                    56000
[4]  area0:                                    25
[5]  area0:                                    Unknown
[6]  area0:                                    1,400

This is a data.frame, and I want to loop through the area column to check whether each value contains the phrase area0. If it does, I want to take the population on that same row and add those populations up. My end result should look like this in a new data.frame: area0 57425

Upvotes: 0

Views: 266

Answers (1)

kindofhungry

Reputation: 464

As Ronak Shah stated in the comments, this problem can be solved using:

sum(as.numeric(as.character(df$population[grepl("area0", df$area)])), na.rm = TRUE)

This avoids the need for a for loop.

The commas in the population column also have to be removed before summing. Otherwise a value such as 1,400 becomes NA when converted to numeric and is silently dropped by na.rm = TRUE, so the total comes out too low. This can be done using gsub:

df$population <- gsub(",", "", df$population)
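Putting the two steps together, here is a minimal sketch of the whole thing. The toy data frame is only a stand-in rebuilt from the sample shown in the question, and the names is_area0, total and result are made up for illustration. It assumes population was read in as character strings.

```r
# Toy data rebuilt from the sample in the question (stand-in for the imported file)
df <- data.frame(
  area = c("area0", "area1:", "area0:", "area0:", "area0:", "area0:"),
  population = c("0", "59,859", "56000", "25", "Unknown", "1,400"),
  stringsAsFactors = FALSE
)

# Strip thousands separators first so that "1,400" parses as 1400
df$population <- gsub(",", "", df$population)

# Logical vector marking the rows whose area contains "area0"
is_area0 <- grepl("area0", df$area, fixed = TRUE)

# "Unknown" cannot be parsed and becomes NA; na.rm = TRUE drops it from the sum
total <- sum(suppressWarnings(as.numeric(df$population[is_area0])), na.rm = TRUE)

# One-row data.frame holding the result
result <- data.frame(area = "area0", population = total)
print(result)
```

If population was imported as a factor instead, wrap it in as.character() before calling as.numeric(), as in the one-liner above.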

Upvotes: 1
