Kyla Celine Marcaida
Kyla Celine Marcaida

Reputation: 81

R - Output a text file from a dataframe with a specific format

I have the following dataframe:

 Name   Occupation    Country code   Remarks
 Mark   Engineer      1              Ok
 Jerry  Engineer      1              None
 Marie  Veterinarian  2              Ok
 Nolan  Veterinarian  2              Ok
 Max    Shepherd      2              Ok

I want to create one text file per country, say:

output1.txt:

 Engineer
 Mark - Ok
 Jerry - None

output2.txt:

 Veterinarian
 Marie - Ok
 Nolan - Ok

 Shepherd
 Max - Ok

Upvotes: 1

Views: 127

Answers (2)

akrun
akrun

Reputation: 886938

We can do a split in. base R into a list of vectors

df2 <- transform(df1, NameRemarks = paste(Name, Remarks,
        sep=" - "))[, c("NameRemarks", "Occupation", "Countrycode")]
lst1 <- lapply(split(df2[-3], df2$Countrycode), 
           function(x) split(x['NameRemarks'], x$Occupation))
#$`1`
#$`1`$Engineer
#   NameRemarks
#1    Mark - Ok
#2 Jerry - None


#$`2`
#$`2`$Shepherd
#  NameRemarks
#5    Max - Ok

#$`2`$Veterinarian
#  NameRemarks
#3  Marie - Ok
#4  Nolan - Ok

The format may not fit well in writing the file. One option is capture.output

Map(capture.output, lst1, file = paste0("output", seq_along(lst1), ".txt"))

-output

enter image description here

For storing purpose, it may be better to do a single split

lst1 <- split(df2[-3], df2$Countrycode)
lapply(names(lst1), function(x) write.csv(lst1[[x]], 
   file = paste0("output", x, ".csv"), row.names = FALSE, quote = FALSE))

Or another option is tidyverse

library(dplyr)
library(tidyr)
library(stringr)
df1 %>%        
    unite(NameRemarks, Name, Remarks, sep= " - ") %>%
    group_by(Countrycode) %>%
    mutate(rn = row_number()) %>%
    ungroup %>%
    pivot_wider(names_from = Occupation, values_from = NameRemarks)
# A tibble: 5 x 5
#  Countrycode    rn Engineer     Veterinarian Shepherd
#        <int> <int> <chr>        <chr>        <chr>   
#1           1     1 Mark - Ok    <NA>         <NA>    
#2           1     2 Jerry - None <NA>         <NA>    
#3           2     1 <NA>         Marie - Ok   <NA>    
#4           2     2 <NA>         Nolan - Ok   <NA>    
#5           2     3 <NA>         <NA>         Max - Ok

data

df1 <- structure(list(Name = c("Mark", "Jerry", "Marie", "Nolan", "Max"
), Occupation = c("Engineer", "Engineer", "Veterinarian", "Veterinarian", 
"Shepherd"), Countrycode = c(1L, 1L, 2L, 2L, 2L), Remarks = c("Ok", 
"None", "Ok", "Ok", "Ok")), class = "data.frame", row.names = c(NA, 
-5L))

Upvotes: 1

Adam B.
Adam B.

Reputation: 1180

Another option is iterate making named .csv files with purrr:


library(tidyverse)

# Example data
df <- tibble(name = c('Mark', 'Jerry', 'Marie', 'Nolan', 'Max'),
             occupation = rep(c('engineer', 'vetenarian', 'shepher'), c(2, 2, 1)),
             country_code = rep(c(1, 2), c(3, 2)),
             remarks = c('ok', 'none', 'ok', 'ok', 'ok'))

# Get a list of all occupations as a vector
occupations <- df$occupation %>% unique()

# Create a list of subdataframes split by occupation
dfs <- map(occupations, ~ df %>%
      filter(occupation == paste(.x)))

# Create .csv files from your dataframe list in your 
# working directory: "engineers.csv", "veterinarians.csv", ...
walk2(dfs, occupations, ~ write_csv(.x, paste0(.y, 's.csv')))


Upvotes: 1

Related Questions