Reputation: 1699
I'd like to convert chemical formulas to a data frame containing columns for 1) the mineral name, 2) the chemical formula and 3) a set of columns for each element that is extracted from the formula. I am given the first two columns and I can extract the number of elements from each formula using CHNOSZ::makeup(). However, I'm not familiar working with lists and not sure how to rbind() the lists back into a data frame that contains everything I'm looking for (i.e. see 1-3 above).
Here is what I have so far - appreciate any help (including a link to a good tutorial on how to convert data from nested lists into dataframes).
library(tidyverse)
library(CHNOSZ)
formulas <- structure(list(Mineral = c("Abelsonite", "Abernathyite", "Abhurite",
"Abswurmbachite", "Acanthite", "Acetamide"), Composition = c("C31H32N4Ni",
"K(UO2)(AsO4)4(H2O)", "Sn3O(OH)2Cl2", "CuMn6(SiO4)O8", "Ag2S",
"CH3CONH2")), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-6L))
test <- formulas %>%
select(Composition) %>%
map(CHNOSZ::makeup) %>%
flatten
test2 <- do.call(rbind,test)
> test2
As H K O U
[1,] 31 32 4 1 31
[2,] 4 2 1 19 1
[3,] 2 2 3 3 2
[4,] 1 6 12 1 1
[5,] 2 1 2 1 2
[6,] 2 5 1 1 2
which is not right.
Upvotes: 1
Views: 62
Reputation: 50668
You could do something like this
library(tidyverse)
library(CNOSZ)
test <- formulas %>%
mutate(res = map(Composition, ~stack(makeup(.x)))) %>%
unnest(cols = res) %>%
spread(ind, values)
## A tibble: 6 x 17
# Mineral Composition C H N Ni As K O U Cl
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 Abelso… C31H32N4Ni 31 32 4 1 NA NA NA NA NA
#2 Aberna… K(UO2)(AsO… NA 2 NA NA 4 1 19 1 NA
#3 Abhuri… Sn3O(OH)2C… NA 2 NA NA NA NA 3 NA 2
#4 Abswur… CuMn6(SiO4… NA NA NA NA NA NA 12 NA NA
#5 Acanth… Ag2S NA NA NA NA NA NA NA NA NA
#6 Acetam… CH3CONH2 2 5 1 NA NA NA 1 NA NA
## … with 6 more variables: Sn <dbl>, Cu <dbl>, Mn <dbl>, Si <dbl>, Ag <dbl>,
## S <dbl>
Upvotes: 2