hachiko

Reputation: 757

R Creating a data frame from a picture - is there a better way than what I'm doing here?

I have this photo here. It's a table that looks like it was created in Excel. I'm trying to create a data frame in R just copying this picture.

enter image description here

My first attempt was to create these vectors and then use the rownames() function to add the row names. But I discovered the row names weren't really useful: they disappeared when I tried tidyr::gather().

English_E <- c(10, 1, 3, 2, 51)
Currier_C <- c(15, 2, 1, 4, 102)
Primrose_P1 <- c(10, 2, 6, 2, 66)
Primrose_P2 <- c(10, 1, 6, 5, 66)
Bluetail_B <- c(20, 1, 3, 3, 89)
Resource_Availability <- c(130, 13, 45, 23, "")

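# Combine the vectors into a data frame before assigning row names
pottery_df <- data.frame(English_E, Currier_C, Primrose_P1, Primrose_P2, Bluetail_B, Resource_Availability)
rownames(pottery_df) <- c("Clay_lbs", "Enamel_lbs", "Dry_Room_hrs", "Kiln_hrs", "Contribution_to_Earning")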

So my second method was to create the same vectors plus another one, which I called "Considerations" for lack of a better idea. I turned each vector into a data frame, bound the Considerations column to it, combined everything with cbind, then removed the duplicate columns and rearranged the rest. I think this second way was a lot better because the row-name column was useful.

English_E <- c(10, 1, 3, 2, 51)
Currier_C <- c(15, 2, 1, 4, 102)
Primrose_P1 <- c(10, 2, 6, 2, 66)
Primrose_P2 <- c(10, 1, 6, 5, 66)
Bluetail_B <- c(20, 1, 3, 3, 89)
Resource_Availability <- c(130, 13, 45, 23, "")

Considerations <-  c("Clay_lbs", "Enamel_lbs", "Dry_Room_hrs", "Kiln_hrs", "Contribution_to_Earning")

English_E_df <- data_frame(English_E) %>% cbind(Considerations)
Currier_C_df <- data_frame(Currier_C) %>% cbind(Considerations)
Primrose_P1_df <- data_frame(Primrose_P1) %>% cbind(Considerations)
Primrose_P2_df <- data_frame(Primrose_P2) %>% cbind(Considerations)
Bluetail_B_df <- data_frame(Bluetail_B) %>% cbind(Considerations)
Resource_Availability_df <- data_frame(Resource_Availability) %>% cbind(Considerations)

pottery_df <- cbind(English_E_df, Currier_C_df, Primrose_P1_df, Primrose_P2_df, Bluetail_B_df, Resource_Availability_df) 

pottery_df <- pottery_df %>%
  select(1, 2, 3, 5, 7, 9, 11) 

pottery_df <- pottery_df[, c(2, 1, 3, 4, 5, 6, 7)]

My question: is there a better way to do all this? It feels like a lot of code for a pretty simple table, and creating so many tables, combining them, and then removing the duplicate columns seems like a hack job.

Upvotes: 1

Views: 276

Answers (2)

jared_mamrot

Reputation: 26690

One method is using the tribble() function:

library(tidyverse)
dataframe <- tribble(~"Rownames", ~"English_E", ~"Currier_C", ~"Primrose_P1", ~"Primrose_P2", ~"Bluetail_B", ~"Resource_Availability",
                     "Clay_lbs", 10, 15, 10, 10, 20, 130,
                     "Enamel_lbs", 1, 2, 2, 1, 1, 13,
                     "Dry_Room_hrs", 3, 1, 6, 6, 3, 45,
                     "Kiln_hrs", 2, 4, 2, 5, 3, 23,
                     "Contribution_to_Earning", 51, 102, 66, 66, 89, NA)

dataframe
# A tibble: 5 x 7
  Rownames English_E Currier_C Primrose_P1 Primrose_P2 Bluetail_B
  <chr>        <dbl>     <dbl>       <dbl>       <dbl>      <dbl>
1 Clay_lbs        10        15          10          10         20
2 Enamel_…         1         2           2           1          1
3 Dry_Roo…         3         1           6           6          3
4 Kiln_hrs         2         4           2           5          3
5 Contrib…        51       102          66          66         89
# … with 1 more variable: Resource_Availability <dbl>

And you can use column_to_rownames() to use the Rownames variable as rownames:

library(tidyverse)
dataframe <- tribble(~"Rownames", ~"English_E", ~"Currier_C", ~"Primrose_P1", ~"Primrose_P2", ~"Bluetail_B", ~"Resource_Availability",
                     "Clay_lbs", 10, 15, 10, 10, 20, 130,
                     "Enamel_lbs", 1, 2, 2, 1, 1, 13,
                     "Dry_Room_hrs", 3, 1, 6, 6, 3, 45,
                     "Kiln_hrs", 2, 4, 2, 5, 3, 23,
                     "Contribution_to_Earning", 51, 102, 66, 66, 89, NA) %>% 
  column_to_rownames(var = "Rownames")

dataframe
                        English_E Currier_C Primrose_P1 Primrose_P2 Bluetail_B Resource_Availability
Clay_lbs                       10        15          10          10         20                   130
Enamel_lbs                      1         2           2           1          1                    13
Dry_Room_hrs                    3         1           6           6          3                    45
Kiln_hrs                        2         4           2           5          3                    23
Contribution_to_Earning        51       102          66          66         89                    NA
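
If the row names later need to become an ordinary column again (for example, before reshaping), tibble::rownames_to_column() reverses that step. A small sketch on a cut-down version of the table; the object names small and restored are made up for illustration:

```r
library(tibble)

# Two-row stand-in for the table, with the labels stored as row names
small <- data.frame(
  English_E = c(10, 1),
  Currier_C = c(15, 2),
  row.names = c("Clay_lbs", "Enamel_lbs")
)

# Move the row names back into a regular column called "Rownames"
restored <- rownames_to_column(small, var = "Rownames")
```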

Upvotes: 1

Ronak Shah

Reputation: 389135

Tibbles don't support row names, so store the row labels in a separate column instead.

English_E <- c(10, 1, 3, 2, 51)
Currier_C <- c(15, 2, 1, 4, 102)
Primrose_P1 <- c(10, 2, 6, 2, 66)
Primrose_P2 <- c(10, 1, 6, 5, 66)
Bluetail_B <- c(20, 1, 3, 3, 89)
Resource_Availability <- c(130, 13, 45, 23, NA)  # NA rather than "" keeps the column numeric
name <- c("Clay_lbs", "Enamel_lbs", "Dry_Room_hrs", "Kiln_hrs", "Contribution_to_Earning")
pottery_df <- data.frame(name, English_E, Currier_C, Primrose_P1, Primrose_P2, Bluetail_B, Resource_Availability)
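
As a rough sketch of why the separate column helps, here is the same table reshaped to long format with tidyr::pivot_longer(), the newer replacement for gather(). It assumes tidyr is installed and that the blank cell is recorded as NA; the names long_df, variable and value are only illustrative. The label column survives the reshape, where row names would be dropped:

```r
library(tidyr)

# Same data as above; NA for the blank cell keeps Resource_Availability numeric
pottery_df <- data.frame(
  name = c("Clay_lbs", "Enamel_lbs", "Dry_Room_hrs", "Kiln_hrs", "Contribution_to_Earning"),
  English_E = c(10, 1, 3, 2, 51),
  Currier_C = c(15, 2, 1, 4, 102),
  Primrose_P1 = c(10, 2, 6, 2, 66),
  Primrose_P2 = c(10, 1, 6, 5, 66),
  Bluetail_B = c(20, 1, 3, 3, 89),
  Resource_Availability = c(130, 13, 45, 23, NA)
)

# One row per (name, variable) pair; `name` stays as an ordinary column
long_df <- pivot_longer(pottery_df, cols = -name,
                        names_to = "variable", values_to = "value")
```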

Upvotes: 3
