Avi

Reputation: 2283

Extracting Matrix from txt file

I have the following model summary in a text file (T1.txt):

=== Summary ===

Correctly Classified Instances         423               88.6792 %
Incorrectly Classified Instances        54               11.3208 %
Kappa statistic                          0.6766
Mean absolute error                      0.0854
Root mean squared error                  0.2656
Relative absolute error                 38.4098 %
Root relative squared error             79.9279 %
Coverage of cases (0.95 level)          91.6143 %
Mean rel. region size (0.95 level)      36.1985 %
Total Number of Instances              477     

=== Confusion Matrix ===

   a   b   c   <-- classified as
 357  20   7 |   a = 1
  12  37  11 |   b = 2
   3   1  29 |   c = 3

I would like to extract the final matrix into a data frame (df1):

> df1
       a   b   c   
     357  20   7 
      12  37  11 
       3   1  29

Note that the model behind the text file no longer exists (I only have the text file). In addition, the matrix size can vary from one file to another, and the number of rows does not have to equal the number of columns.

Upvotes: 2

Views: 190

Answers (1)

akrun

Reputation: 886938

We can read the file with readLines, use grep to find the line containing 'Confusion Matrix', subset the lines that follow it, strip the trailing annotations ('<-- classified as' and everything from '|' onward) with gsub, and parse the result with read.table:

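# Read the whole summary; warn=FALSE silences the missing-final-newline warning
lines <- readLines('T1.txt', warn=FALSE)
# Locate the 'Confusion Matrix' header line
i1 <- grep('Confusion Matrix', lines)
# Skip the header and the blank line after it, then drop everything
# from '<-' or '|' to the end of each line before parsing
df1 <- read.table(text=gsub('(<-|\\|).*', '', 
        lines[(i1+2):length(lines)]), header=TRUE)
df1
#    a  b  c
#1 357 20  7
#2  12 37 11
#3   3  1 29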
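
Since the matrix size varies between files, the same steps could be wrapped in a small function. The following is only a sketch: extract_confusion is a hypothetical helper name, not something from the original answer, and it assumes the header row always contains '<--' and every matrix row contains '|', as in the sample above. Filtering on those markers means extra text after the matrix should not break the parse.

```r
# Hypothetical helper (the name is an assumption, not part of the answer above).
# Assumes the header row contains '<--' and each matrix row contains '|'.
extract_confusion <- function(lines) {
  start <- grep("Confusion Matrix", lines, fixed = TRUE)[1]
  if (is.na(start)) stop("no 'Confusion Matrix' header found")

  # Everything after the section header
  tail_lines <- lines[-seq_len(start)]

  # Keep only the column-label row and the matrix rows
  keep <- grepl("<--", tail_lines, fixed = TRUE) |
          grepl("|", tail_lines, fixed = TRUE)

  # Drop the trailing annotation on each kept line, then parse
  cleaned <- sub("(<--|\\|).*$", "", tail_lines[keep])
  read.table(text = cleaned, header = TRUE)
}

# Usage on an in-memory copy of the sample section
sample_lines <- c(
  "=== Confusion Matrix ===",
  "",
  "   a   b   c   <-- classified as",
  " 357  20   7 |   a = 1",
  "  12  37  11 |   b = 2",
  "   3   1  29 |   c = 3",
  ""
)
df1 <- extract_confusion(sample_lines)
```

For a file on disk, the call would be extract_confusion(readLines('T1.txt', warn=FALSE)).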

Upvotes: 3
