Reputation: 858
I have got a data frame with geographic position inside. The positions are strings. This is my function to scrape the strings and get the positions by Degress.Decimal.
Example position 23º 30.0'N
latitud.decimal <- function(y) {
latregex <- str_match(y,"(\\d+)º\\s(\\d*.\\d*).(.)")
latitud <- (as.numeric(latregex[1,2])) +((as.numeric(latregex[1,3])) / 60)
if (latregex[1,4]=="S") {latitud <- -1*latitud}
return(latitud)
}
Results> 23.5
then I would like to create a new column in my original dataframe applying the function to every item in the Latitude column. Is the same issue for the longitude. Another new column
I know how to do this using Python and Pandas buy I am newbie y R and cannot find the solution.
I am triying with
lapply(datos$Latitude, 2 , FUN= latitud.decimal(y))
but do not read the y "argument" which is every column value.
Upvotes: 2
Views: 621
Reputation: 8572
Note that the str_match
is vectorized as stated in the help page of the function help("str_match")
.
For the sake of answering the question, I lack a reproducable example and data. This page describes how one can make questions that are more likely to be reproducable and thus obtain better answers. As i lack data, and code, i cannot test whether i am actually hitting the spot, but i will give it a shot anyway.
Using the fact the str_match
is vectorized, we can apply the entire function without using lapply, and thus create a new column simply. I'll slightly rewrite your function, to incorporate the vectorizations. Note the missing 1
's in latregex[., .]
latitud.decimal <- function(y) {
latregex <- str_match(y,"(\\d+)º\\s(\\d*.\\d*).(.)")
latitud <- as.numeric(latregex[, 2]) + as.numeric(latregex[, 3]) / 60)
which_south <- which(latregex[, 4] == "S")
latitud[which_south] <- -latitud[which_south]
latitud
}
Now that the function is ready, creating a column can be done using the $
operator. If the data is very large, it can be performed more efficiently using the data.table
. See this stackoverflow page for an example of how to assign via the data.table package.
In base R we would simply perform the action as
datos$new_column <- latitud.decimal(datos$Latitude)
Upvotes: 2