SiH

Reputation: 1546

Extract information from a non-tabular file and store it as a tibble

I used a statistical library that exports its output to a text file. I want to extract some information from that file. I could copy and paste, but I would rather automate the extraction.

Below is a sample of the output text file.

How do I extract LL(0) and LL(final) with their corresponding values and store them in a tibble?

Thanks

Date: XX-XX-XXXX
Time: XX:XX PM

Model: Binary Logit Model

LL(0)                            : -376.3789
Number of Inputs                 :  500
LL(final)                        : -114.1382
Rho-sq                           :  0.512

Upvotes: 1

Views: 46

Answers (2)

akrun

Reputation: 887048

We can read the .txt file with readLines, use grep to keep only the lines that start with "LL(", and then extract the numeric component with sub, which removes everything up to the : and any spaces that follow it.

library(tibble)

v1 <- sub(".*:\\s*", "", grep("^LL\\(", txt, value = TRUE))
tibble(v1 = as.numeric(v1))
# A tibble: 2 x 1
#    v1
#  <dbl>
#1 -376.
#2 -114.
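
To see what each step does without needing the file on disk, here is a minimal sketch that uses the sample output from the question as an inline character vector. The names `sample_txt`, `ll_lines` and `ll_values` are made up for illustration; in practice `sample_txt` would be the result of readLines.

```r
# Sample lines from the question, standing in for readLines('file.txt')
sample_txt <- c(
  "Model: Binary Logit Model",
  "",
  "LL(0)                            : -376.3789",
  "Number of Inputs                 :  500",
  "LL(final)                        : -114.1382",
  "Rho-sq                           :  0.512"
)

# Step 1: keep only the lines that start with "LL("
ll_lines <- grep("^LL\\(", sample_txt, value = TRUE)

# Step 2: drop everything up to the colon and any spaces after it,
# then convert the remaining text to numbers
ll_values <- as.numeric(sub(".*:\\s*", "", ll_lines))

ll_values
```

The tibble printout above shows the values rounded to three significant digits; the underlying numbers keep their full precision.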

If we also want the labels as column names:

library(data.table)
library(magrittr)
read.table(text = grep("^LL\\(", txt, value = TRUE), 
        sep = ":", strip.white = TRUE) %>% 
     data.table::transpose(., make.names = "V1")
#     LL(0) LL(final)
#1 -376.3789 -114.1382

data

txt <- readLines('file.txt')

Upvotes: 2

Rui Barradas

Reputation: 76402

Something like the following will do what the question asks for. The parentheses must be escaped so that grep matches them literally instead of treating them as a regex group.

library(tibble)

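# Return the numeric value after the colon on the line(s) matching `pattern`
fun_extract <- function(text, pattern){
  i <- grep(pattern, text)
  x <- sub("^.*:(.*$)", "\\1", text[i])  # keep only what follows the last colon
  as.numeric(x)                          # surrounding whitespace is ignored
}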

lines <- readLines("data.txt")

ll0 <- "LL\\(0\\)"
llfinal <- "LL\\(final\\)"

tbl <- tibble(
  `LL(0)` = fun_extract(lines, ll0), 
  `LL(final)` = fun_extract(lines, llfinal)
)
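
If more fields are needed later, the same idea can be generalized. The following is only a sketch, assuming every field of interest sits on its own `label : value` line and has a numeric value. The helper `parse_fields` and the vector `sample_txt` are hypothetical names, not part of any package.

```r
# Hypothetical helper: collect every "label : number" line into a named vector
parse_fields <- function(text) {
  kv <- grep(":", text, value = TRUE, fixed = TRUE)  # lines containing a colon
  labels <- trimws(sub(":.*$", "", kv))              # text before the first colon
  values <- trimws(sub("^[^:]*:", "", kv))           # text after the first colon
  # Non-numeric values (dates, model names) become NA and are dropped
  values <- suppressWarnings(as.numeric(values))
  keep <- !is.na(values)
  setNames(values[keep], labels[keep])
}

# Sample lines from the question, standing in for readLines("data.txt")
sample_txt <- c(
  "Date: XX-XX-XXXX",
  "Time: XX:XX PM",
  "Model: Binary Logit Model",
  "LL(0)                            : -376.3789",
  "Number of Inputs                 :  500",
  "LL(final)                        : -114.1382",
  "Rho-sq                           :  0.512"
)

fields <- parse_fields(sample_txt)
fields[c("LL(0)", "LL(final)")]
```

With the tibble package loaded, `as_tibble(as.list(fields))` should turn the named vector into a one-row tibble with one column per label.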

Upvotes: 2
