kamome

Reputation: 858

How to create a new column using a function in R?

I have a data frame containing geographic positions stored as strings. This is my function to parse a string and return the position in decimal degrees.

Example position: 23º 30.0'N

library(stringr)

latitud.decimal <- function(y) {
  latregex <- str_match(y, "(\\d+)º\\s(\\d*.\\d*).(.)")
  latitud <- as.numeric(latregex[1, 2]) + as.numeric(latregex[1, 3]) / 60
  if (latregex[1, 4] == "S") {latitud <- -1 * latitud}
  return(latitud)
}

Result: 23.5

Then I would like to create a new column in my original data frame by applying the function to every item in the Latitude column. The same applies to the longitude, which needs another new column.

I know how to do this using Python and Pandas, but I am a newbie in R and cannot find the solution.

I am trying with

lapply(datos$Latitude, 2 , FUN= latitud.decimal(y)) 

but it does not read the y "argument", which should be each value in the column.

Upvotes: 2

Views: 621

Answers (2)

Oliver

Reputation: 8572

Note that str_match is vectorized, as stated in the function's help page, help("str_match").

For the sake of answering the question, I lack a reproducible example and data. This page describes how one can make questions that are more likely to be reproducible and thus obtain better answers. As I lack data and code, I cannot test whether I am actually hitting the spot, but I will give it a shot anyway.

Using the fact that str_match is vectorized, we can apply the function to the entire column without lapply, and thus create a new column simply. I'll slightly rewrite your function to make use of the vectorization. Note the missing 1's in latregex[., .]

latitud.decimal <- function(y) {
  latregex <- str_match(y, "(\\d+)º\\s(\\d*.\\d*).(.)")
  # Columns 2 and 3 hold degrees and minutes for every row
  latitud <- as.numeric(latregex[, 2]) + as.numeric(latregex[, 3]) / 60
  # Negate the southern latitudes
  which_south <- which(latregex[, 4] == "S")
  latitud[which_south] <- -latitud[which_south]
  latitud
}
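To see why the row index can be dropped, it may help to look at what str_match returns for several strings at once. A minimal sketch, assuming stringr is installed; the two input strings are made up for illustration, and the pattern is the one from the function with the degree sign written as an escape:

```r
library(stringr)

# Made-up inputs; "\u00ba" is the º sign written as an escape.
y <- c("23\u00ba 30.0'N", "10\u00ba 15.0'S")

# Same pattern as in the function above.
m <- str_match(y, "(\\d+)\u00ba\\s(\\d*.\\d*).(.)")

dim(m)    # one row per input string; column 1 is the full match
m[, 2]    # degrees for every row
m[, 3]    # minutes for every row
m[, 4]    # hemisphere letter for every row
```

Because each column already holds one value per input string, indexing with [, 2] instead of [1, 2] processes the whole vector in one step.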

Now that the function is ready, creating a column can be done using the $ operator. If the data is very large, it can be done more efficiently with the data.table package. See this Stack Overflow page for an example of how to assign via the data.table package.

In base R we would simply perform the action as

datos$new_column <- latitud.decimal(datos$Latitude)
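Since the question also asks about longitude, here is a rough end-to-end sketch covering both columns. It is not the function above: to_decimal is a hypothetical helper that uses base R's sub() in place of str_match so that it runs without extra packages, it assumes western longitudes should be negative just like southern latitudes, and the sample values are invented.

```r
# Hypothetical helper: converts strings like "23º 30.0'N" to decimal degrees.
# sub() is vectorized, so the whole column is handled in one call.
to_decimal <- function(x) {
  # Groups: degrees, decimal minutes, hemisphere letter.
  # \\D+ skips the degree sign and the space that follows it.
  pattern <- "^\\s*(\\d+)\\D+(\\d+\\.?\\d*)'\\s*([NSEW])\\s*$"
  deg   <- as.numeric(sub(pattern, "\\1", x, perl = TRUE))
  mins  <- as.numeric(sub(pattern, "\\2", x, perl = TRUE))
  hemi  <- sub(pattern, "\\3", x, perl = TRUE)
  value <- deg + mins / 60
  # Assumption: south and west are stored as negative values.
  ifelse(hemi %in% c("S", "W"), -value, value)
}

# Made-up sample data; "\u00ba" is the º sign written as an escape.
datos <- data.frame(
  Latitude  = c("23\u00ba 30.0'N", "10\u00ba 15.0'S"),
  Longitude = c("45\u00ba 45.0'W", "120\u00ba 6.0'E"),
  stringsAsFactors = FALSE
)

# One new column per coordinate, assigned with the $ operator.
datos$lat_decimal <- to_decimal(datos$Latitude)
datos$lon_decimal <- to_decimal(datos$Longitude)
```

The same assignment pattern works with any function that takes a vector and returns a vector of the same length.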

Upvotes: 2

Adam Waring

Reputation: 1268

datos$lat_decimal = sapply(datos$Latitude, latitud.decimal)
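sapply calls the function once per element, so a function that only handles a single string, like the one in the question, can be used without rewriting it. A small sketch of that behaviour, assuming a hypothetical scalar helper latitud_scalar written in base R as a stand-in for the question's function, with made-up inputs. It also shows vapply, a stricter alternative that checks each result is a single number:

```r
# Hypothetical scalar-only helper: the `if` works on one string at a time.
latitud_scalar <- function(y) {
  pattern <- "(\\d+)\\D+(\\d+\\.?\\d*)'\\s*([NS])"
  # m holds the full match followed by the three captured groups.
  m <- regmatches(y, regexec(pattern, y, perl = TRUE))[[1]]
  latitud <- as.numeric(m[2]) + as.numeric(m[3]) / 60
  if (m[4] == "S") latitud <- -latitud
  latitud
}

# Made-up inputs; "\u00ba" is the º sign written as an escape.
lat <- c("23\u00ba 30.0'N", "10\u00ba 15.0'S")

# sapply() on a character vector names the result after the inputs.
named <- sapply(lat, latitud_scalar)

# vapply() enforces one numeric value per element; names are switched off.
plain <- vapply(lat, latitud_scalar, numeric(1), USE.NAMES = FALSE)
```

Either result can be assigned to a data frame column; the vapply version simply avoids carrying the input strings along as names.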

Upvotes: 1
