How to extract numerical data from a file and create a dataframe in R

Question

Below is a sample of my file (file name: mylog.log), from which I would like to extract the numerical data and create a data frame.

Contents of the file

*****************************************************************
 ******  option summary
 ******
 runlvl  = 3         bypass  = 2         
  Opening plot unit= 15
 file=new_run.pa0

 ******  
 

  ********  dc transfer curves tnom=  25.000 temp=  25.000 *****
x
        


     volt    current    
                    v0     
  1.0000e-01   -2.570e-11  
  2.0000e-01   -2.769e-11  
  3.0000e-01   -2.795e-11  
  4.0000e-01   -2.803e-11  
  5.0000e-01   -2.814e-11  
  6.0000e-01   -2.823e-11  
  7.0000e-01   -2.833e-11  
  8.0000e-01   -2.842e-11  
  9.0000e-01   -2.861e-11  
  1.0000e+00   -2.871e-11  
  1.1000e+00   -2.880e-11  
  1.2000e+00   -2.889e-11  
  1.3000e+00   -2.899e-11  
  1.4000e+00   -2.908e-11  
  1.5000e+00   -2.916e-11  
  1.6000e+00   -2.925e-11  
  1.7000e+00   -2.934e-11  
  1.8000e+00   -2.942e-11  
  1.9000e+00   -2.950e-11  
  2.0000e+00   -2.958e-11  
  2.1000e+00   -2.966e-11  
  2.2000e+00   -2.973e-11  
  2.3000e+00   -2.980e-11  
  2.4000e+00   -2.986e-11  
  2.5000e+00   -3.022e-11  
  2.6000e+00   -3.030e-11  
  2.7000e+00   -3.039e-11  
  2.8000e+00   -3.046e-11  
  2.9000e+00   -3.053e-11  
  3.0000e+00   -3.059e-11  
  3.1000e+00   -3.066e-11  
  3.2000e+00   -3.110e-11  
  3.3000e+00   -3.119e-11  
  3.4000e+00   -3.125e-11  
  3.5000e+00   -3.132e-11  
  3.6000e+00   -3.140e-11  
  3.7000e+00   -3.145e-11  
  3.8000e+00   -3.152e-11  
  3.9000e+00   -3.161e-11  
  4.0000e+00   -3.167e-11  
  4.1000e+00   -3.175e-11  
  4.2000e+00   -3.175e-11  
  4.3000e+00   -3.182e-11  
  4.4000e+00   -3.189e-11  
  4.5000e+00   -3.197e-11  
  4.6000e+00   -3.199e-11  
  4.7000e+00   -3.202e-11  
  4.8000e+00   -3.212e-11  
  4.9000e+00   -3.219e-11  
  5.0000e+00   -3.278e-11  
  5.1000e+00   -3.281e-11  
  5.2000e+00   -3.297e-11  
  5.3000e+00   -3.295e-11  
  5.4000e+00   -3.301e-11  
  5.5000e+00   -3.299e-11  
  5.6000e+00   -3.310e-11  
  5.7000e+00   -3.381e-11  
  5.8000e+00   -3.384e-11  
  5.9000e+00   -3.396e-11  
  6.0000e+00   -3.398e-11  
  6.1000e+00   -3.405e-11  
  6.2000e+00   -3.411e-11  
  6.3000e+00   -3.413e-11  
  6.4000e+00   -3.424e-11  
  6.5000e+00   -3.420e-11  
  6.6000e+00   -3.432e-11  
  6.7000e+00   -3.438e-11  
  6.8000e+00   -3.444e-11  
  6.9000e+00   -3.450e-11  
  7.0000e+00   -3.444e-11  
  7.1000e+00   -3.455e-11  
  3.9300e+01   -8.790e+00  
  3.9400e+01   -8.824e+00  
  3.9500e+01   -8.859e+00  
  3.9600e+01   -8.894e+00  
  3.9700e+01   -8.928e+00  
  3.9800e+01   -8.963e+00  
  3.9900e+01   -8.998e+00  
  4.0000e+01   -9.033e+00  
y

          ***** job concluded
 ****** HSPICE -- H-2013.03-SP2 32-BIT (Aug 26 2013) RHEL32 ******              
 ******  
 

  ******  job statistics summary tnom=  25.000 temp=  25.000 *****
  
  
 ******  Machine Information  ******

I would like to extract the values below the v0 line and create a two-column data frame:

Volt             Current
-100.00000m      406.5220f
"........................"

and so on for every row of numerical data. The number of lines before the volt/current header varies, the number of lines after the numerical data ends varies, and the number of data rows varies as well. The one thing that can be relied on is that the numerical data always starts right after the following two lines of the file:

volt      current    
                        v0

Below is the code I have tried, which specifies the line numbers of the data explicitly:

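library(tidyr)  # provides separate()

DATA <- readLines(myfile)

# Hard-coded positions of the first and last numeric rows
DataStartPos <- 314
DataEndPos <- 1062

# Separate numeric data from metadata and bind to a data frame.
# "\\s+" is a regex for one or more whitespace characters.
tmp <- as.data.frame(DATA[DataStartPos:DataEndPos])
tmp <- separate(tmp, col = 1, c("S.No", "Volts", "Amps"), sep = "\\s+")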

I can now create a data frame, but only by hard-coding the line numbers of the file. Is there a way to identify the numerical data automatically under the conditions described above?

Answers (1)
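
A minimal sketch of one way to avoid line numbers altogether, assuming every data row consists of exactly two numbers (plain or scientific notation) and that no other line in the log has that shape: keep only the lines matching such a pattern and pass them to read.table(). The `lines` vector below is a hypothetical stand-in for readLines("mylog.log"), and the names `num`, `row_pattern` and `df` are illustrative only.

```r
# Hypothetical stand-in for: lines <- readLines("mylog.log")
lines <- c(
  " ******  option summary",
  " runlvl  = 3         bypass  = 2         ",
  "  ********  dc transfer curves tnom=  25.000 temp=  25.000 *****",
  "x",
  "     volt    current    ",
  "                    v0     ",
  "  1.0000e-01   -2.570e-11  ",
  "  2.0000e-01   -2.769e-11  ",
  "  4.0000e+01   -9.033e+00  ",
  "y",
  "          ***** job concluded"
)

# One numeric field, optionally signed, optionally with an exponent
num <- "[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?"

# A data row: two numeric fields separated by whitespace and nothing else
row_pattern <- paste0("^\\s*", num, "\\s+", num, "\\s*$")

# Keep only the matching lines and parse them as a two-column table
data_lines <- lines[grepl(row_pattern, lines)]
df <- read.table(text = data_lines, col.names = c("Volt", "Current"))
```

Because the rows are selected by their shape, the number of lines before and after the data does not matter. The trade-off is that any unrelated line made of exactly two numbers would be picked up too.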

EDIT 1 - skipping lines
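
A sketch of the skipping idea, assuming the header line contains only the words volt and current, that the v0 line sits directly below it (as in the sample), and that the data ends at the first following line that does not start with a number. The `lines` vector is again a hypothetical stand-in for readLines("mylog.log"); `header`, `first`, `last` and `looks_numeric` are illustrative names.

```r
# Hypothetical stand-in for: lines <- readLines("mylog.log")
lines <- c(
  " ******  option summary",
  "",
  "x",
  "     volt    current    ",
  "                    v0     ",
  "  1.0000e-01   -2.570e-11  ",
  "  2.0000e-01   -2.769e-11  ",
  "  4.0000e+01   -9.033e+00  ",
  "y",
  "",
  "          ***** job concluded"
)

# Locate the "volt current" header and check that "v0" follows it
header <- grep("^\\s*volt\\s+current\\s*$", lines)[1]
stopifnot(!is.na(header), grepl("v0", lines[header + 1]))
first <- header + 2  # first data row comes right after the v0 line

# Lines starting with an optionally signed digit are treated as data rows
looks_numeric <- grepl("^\\s*[-+]?[0-9]", lines)
stop_at <- which(!looks_numeric & seq_along(lines) >= first)
last <- if (length(stop_at) > 0) stop_at[1] - 1 else length(lines)

# Skip everything above the data and read only the data rows
df <- read.table(text = lines, skip = first - 1, nrows = last - first + 1,
                 col.names = c("Volt", "Current"))
```

The same `skip` and `nrows` values can be passed to read.table() with the file name instead of `text = lines`.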

EDIT 2 - lines between 2 patterns
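
A sketch for the two-pattern case, assuming the data block is always bracketed by marker lines such as the lone x and y lines in the sample. Whether those markers appear in every log is an assumption read off the sample, so the patterns may need adjusting. `lines` is a hypothetical stand-in for readLines("mylog.log"); `from`, `to` and `block` are illustrative names.

```r
# Hypothetical stand-in for: lines <- readLines("mylog.log")
lines <- c(
  " ******  option summary",
  "  ********  dc transfer curves tnom=  25.000 temp=  25.000 *****",
  "x",
  "",
  "     volt    current    ",
  "                    v0     ",
  "  1.0000e-01   -2.570e-11  ",
  "  2.0000e-01   -2.769e-11  ",
  "  4.0000e+01   -9.033e+00  ",
  "y",
  "",
  "          ***** job concluded"
)

# Opening marker: a line consisting only of "x"
from <- grep("^x\\s*$", lines)[1]
stopifnot(!is.na(from))

# Closing marker: the first line consisting only of "y" after the opening one
to <- grep("^y\\s*$", lines)
to <- to[to > from][1]
stopifnot(!is.na(to))

# Take the lines strictly between the markers, then drop the
# header and blank lines by keeping rows that start with a number
block <- lines[(from + 1):(to - 1)]
block <- block[grepl("^\\s*[-+]?[0-9]", block)]

df <- read.table(text = block, col.names = c("Volt", "Current"))
```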
