Reputation: 55
I have a data.frame in R which is in long format.
Gender | Area | Value | Year | Column X | Column Y | And other columns... |
---|---|---|---|---|---|---|
male | urban | 31.45 | 2012 | xxx | yyy | zzz |
male | rural | 41 | 2012 | xxx | yyy | zzz |
male | rural | 35.6 | 2012 | xxx | yyy | zzz |
female | rural | 30 | 2012 | xxx | yyy | zzz |
female | urban | 21 | 2016 | xxx | yyy | zzz |
female | rural | 19 | 2016 | xxx | yyy | zzz |
The main aim is to add a new column, say seg to this data.frame whose rows are filled in the following format:
Gender | Area | Value | Year | Column X | Column Y | And other columns... | seg |
---|---|---|---|---|---|---|---|
male | urban | 31.45 | 2012 | xxx | yyy | zzz | {"Gender": "male", "Area": "urban", "Year": 2012} |
male | rural | 41 | 2012 | xxx | yyy | zzz | {"Gender": "male", "Area": "rural", "Year": 2012} |
male | rural | 35.6 | 2012 | xxx | yyy | zzz | {"Gender": "male", "Area": "rural", "Year": 2012} |
female | rural | 30 | 2012 | xxx | yyy | zzz | {"Gender": "female", "Area": "rural", "Year": 2012} |
female | urban | 21 | 2016 | xxx | yyy | zzz | {"Gender": "female", "Area": "urban", "Year": 2016} |
female | rural | 19 | 2016 | xxx | yyy | zzz | {"Gender": "female", "Area": "rural", "Year": 2016} |
The format of the contents of the seg column is important because it will be fed to a database.
Can anyone advise me how exactly to proceed with that in R? I have spent quite sometime looking around for solutions and couldn't find anything that takes column names, row values in a data.frame and converts them into json strings. The manipulation is not on all the columns but only for specific columns.
Appreciate the help in advance!
Upvotes: 2
Views: 446
Reputation: 2626
Here is my approach with purrr
:
library(purrr)
pmap_dfr(your_data, \(...) {
tibble::tibble(..., seg = imap(list(...), ~ sprintf("\"%s\": \"%s\"",
.y, .x)) %>%
paste(collapse = ", ") %>%
sprintf("{%s}", .))
})
Returns:
# A tibble: 6 x 8
Gender Area Value Year Column.X Column.Y And.other.column… seg
<chr> <chr> <dbl> <int> <chr> <chr> <chr> <chr>
1 male urban 31.4 2012 xxx yyy zzz "{\"Gender\": \"…
2 male rural 41 2012 xxx yyy zzz "{\"Gender\": \"…
3 male rural 35.6 2012 xxx yyy zzz "{\"Gender\": \"…
4 female rural 30 2012 xxx yyy zzz "{\"Gender\": \"…
5 female urban 21 2016 xxx yyy zzz "{\"Gender\": \"…
6 female rural 19 2016 xxx yyy zzz "{\"Gender\": \"…
(Data used:)
your_data <- structure(list(Gender = c("male", "male", "male", "female", "female", "female"), Area = c("urban", "rural", "rural", "rural", "urban", "rural"), Value = c(31.45, 41, 35.6, 30, 21, 19), Year = c(2012L, 2012L, 2012L, 2012L, 2016L, 2016L), Column.X = c("xxx", "xxx", "xxx", "xxx", "xxx", "xxx"), Column.Y = c("yyy", "yyy", "yyy", "yyy", "yyy", "yyy"), And.other.columns... = c("zzz", "zzz", "zzz", "zzz", "zzz", "zzz")), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
Upvotes: 1
Reputation: 887153
We can use toJSON
from jsonlite
library(purrr)
library(dplyr)
library(jsonlite)
out <- df1 %>%
mutate(seg = pmap_chr(select(cur_data(), Gender, Area, Year), ~
toJSON(list(...), auto_unbox = TRUE)))
-output
out
Gender Area Value Year ColumnX ColumnY seg
1 male urban 31.45 2012 xxx yyy {"Gender":"male","Area":"urban","Year":2012}
2 male rural 41.00 2012 xxx yyy {"Gender":"male","Area":"rural","Year":2012}
3 male rural 35.60 2012 xxx yyy {"Gender":"male","Area":"rural","Year":2012}
4 female rural 30.00 2012 xxx yyy {"Gender":"female","Area":"rural","Year":2012}
5 female urban 21.00 2016 xxx yyy {"Gender":"female","Area":"urban","Year":2016}
6 female rural 19.00 2016 xxx yyy {"Gender":"female","Area":"rural","Year":2016}
Or may use rowwise
df1 %>%
rowwise %>%
mutate(seg = toJSON(lst(Gender, Area, Year), auto_unbox = TRUE)) %>%
ungroup
# A tibble: 6 x 7
Gender Area Value Year ColumnX ColumnY seg
<chr> <chr> <dbl> <int> <chr> <chr> <json>
1 male urban 31.4 2012 xxx yyy {"Gender":"male","Area":"urban","Year":2012}
2 male rural 41 2012 xxx yyy {"Gender":"male","Area":"rural","Year":2012}
3 male rural 35.6 2012 xxx yyy {"Gender":"male","Area":"rural","Year":2012}
4 female rural 30 2012 xxx yyy {"Gender":"female","Area":"rural","Year":2012}
5 female urban 21 2016 xxx yyy {"Gender":"female","Area":"urban","Year":2016}
6 female rural 19 2016 xxx yyy {"Gender":"female","Area":"rural","Year":2016}
df1 <- structure(list(Gender = c("male", "male", "male", "female", "female",
"female"), Area = c("urban", "rural", "rural", "rural", "urban",
"rural"), Value = c(31.45, 41, 35.6, 30, 21, 19), Year = c(2012L,
2012L, 2012L, 2012L, 2016L, 2016L), ColumnX = c("xxx", "xxx",
"xxx", "xxx", "xxx", "xxx"), ColumnY = c("yyy", "yyy", "yyy",
"yyy", "yyy", "yyy")), class = "data.frame", row.names = c(NA,
-6L))
Upvotes: 2