Reputation: 1094

R: Create a data frame row by row with named and typed columns

I would like to create a data frame with 4 columns: character, character, numeric, numeric,

and to fill in my data line by line, for maintainability, so that c("site1", "Site 1", 6.943890, -6.0557) is all the data needed for one row.

This works but is awful. How can I make it more "R beautiful"?

S <- data.frame( t(data.frame(
  "s1" = c("None", "None", 0, 0 ),
  "s2" = c("site1", "Site 1", 6.943890, -6.0557),
  "s3" = c("site2", "Site 2", 43.943890, -3.055796)
) ) , stringsAsFactors = F)
colnames(S) <- c("id", "name", "lat", "lng")
S$lat <- as.double(S$lat)
S$lng <- as.double(S$lng)

which results in a correct data frame with named columns of the right type.

Upvotes: 1

Answers (2)

David Mas

Reputation: 1209

I am not sure I understand your question. If you want to fill in a data.frame row by row, I would use a matrix.

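library(magrittr)
rnames <- c("s1", "s2", "s3")
# c() coerces everything to character, so the numbers are stored as strings
values <- c(c("None", "None", 0, 0),
            c("site1", "Site 1", 6.943890, -6.0557),
            c("site2", "Site 2", 43.943890, -3.055796))
# Fill the matrix row by row, then convert it to a data frame
S <- matrix(values, nrow = length(rnames),
            ncol = length(values) / length(rnames),
            byrow = TRUE) %>% as.data.frame(stringsAsFactors = FALSE)
colnames(S) <- c("id", "name", "lat", "lng")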

S
#>      id   name      lat       lng
#> 1  None   None        0         0
#> 2 site1 Site 1  6.94389   -6.0557
#> 3 site2 Site 2 43.94389 -3.055796
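
Note that with the matrix route the lat and lng columns are not numeric yet. A minimal sketch of why, using one row from the question (the names one_row, S1 and col_classes are made up for illustration):

```r
# c() can hold only one type, so mixing strings and numbers yields character
one_row <- c("site1", "Site 1", 6.943890, -6.0557)
row_class <- class(one_row)

# A matrix is also single-typed, so the data frame built from it
# has only character columns
m <- matrix(one_row, nrow = 1, byrow = TRUE)
S1 <- as.data.frame(m, stringsAsFactors = FALSE)
colnames(S1) <- c("id", "name", "lat", "lng")
col_classes <- sapply(S1, class)
print(col_classes)
```

So the numeric columns still have to be converted afterwards, for example with as.double as in the question.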

The other option is to build the data frame from a list. I think this would be closer to what you would need in a real-world scenario.

options(stringsAsFactors = FALSE)


df_list <- list(
  s1 = data.frame(id = "None", name = "None", lat = 0, lng = 0 ),
  s2 = data.frame(id = "site1", name =  "Site 1", lat = 6.943890,  lng = -6.0557),
  s3 = data.frame(id = "site2", name ="Site 2",lat = 43.943890,lng = -3.055796)
) 

S = dplyr::bind_rows(df_list)
colnames(S) <- c("id", "name", "lat", "lng")

# I don't think you need this: lat and lng are already numeric here
S$lat <- as.double(S$lat)
S$lng <- as.double(S$lng)

S
#>      id   name      lat       lng
#> 1  None   None  0.00000  0.000000
#> 2 site1 Site 1  6.94389 -6.055700
#> 3 site2 Site 2 43.94389 -3.055796

As others have said, building a data frame row by row is not very common in R. For large datasets the list option will likely be a bit slow. Building the data frame column by column (vectorized) is the better option most of the time.
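
For comparison, a minimal sketch of the column-wise construction with the same data as in the question (S_cols is just an illustrative name). Each column gets its type directly from its vector, so no conversion step is needed:

```r
# One vector per column; the types come from the vectors themselves
S_cols <- data.frame(
  id   = c("None", "site1", "site2"),
  name = c("None", "Site 1", "Site 2"),
  lat  = c(0, 6.943890, 43.943890),
  lng  = c(0, -6.0557, -3.055796),
  stringsAsFactors = FALSE  # keep id and name as character on R < 4.0
)
str(S_cols)
```

The trade-off is the one raised in the question: the values of a single site are spread over four lines instead of sitting together on one.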

Created on 2020-04-07 by the reprex package (v0.3.0)

Upvotes: 1

Ronak Shah

Reputation: 388982

You could use type.convert, which converts each column to its appropriate class.

str(S)
#'data.frame':  3 obs. of  4 variables:
# $ X1: chr  "None" "site1" "site2"
# $ X2: chr  "None" "Site 1" "Site 2"
# $ X3: chr  "0" "6.94389" "43.94389"
# $ X4: chr  "0" "-6.0557" "-3.055796"

S <- type.convert(S, as.is = TRUE)
str(S)

#'data.frame':  3 obs. of  4 variables:
# $ X1: chr  "None" "site1" "site2"
# $ X2: chr  "None" "Site 1" "Site 2"
# $ X3: num  0 6.94 43.94
# $ X4: num  0 -6.06 -3.06

There is also readr::type_convert(S), which does much the same thing for character columns.
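
Putting this together with the row-per-line layout from the question, a possible base R sketch (the names rows and S2 are made up for illustration, and it assumes an R version where type.convert has a data frame method, as used above):

```r
# One character vector per row, as in the question
rows <- list(
  c("None",  "None",   0,         0),
  c("site1", "Site 1", 6.943890,  -6.0557),
  c("site2", "Site 2", 43.943890, -3.055796)
)

# Stack the rows into a character matrix, then convert to a data frame
S2 <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
names(S2) <- c("id", "name", "lat", "lng")

# Let type.convert pick the column types; as.is = TRUE keeps strings as character
S2 <- type.convert(S2, as.is = TRUE)
str(S2)
```

The numbers pass through character on the way, which should be fine for coordinates like these, but it is a detour that the column-wise form avoids.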

Upvotes: 0
