Nyxynyx

Reputation: 63647

Remove Metadata when Reading Data Frame from File in R

Problem: How can we read a data file in R while ignoring the metadata at the start of the file?

In the example file below, we want to read from the following line through to the end of the file:

1446.60     35785.0 

Example Excerpt

Axis    Energy  Elements=   226

...

    Etch Time (EtchTime)\s  0.000000    
    Etch Level (EtchLevel)\ 0.000000    
Energy (E)  
eV  
1446.60     35785.0 
1446.80     34955.9 
1447.00     34448.0 
1447.20     33632.6 
1447.40     32905.1 
1447.60     31976.5 

...

Additionally, there is a trailing space after the values in both columns. How can we get rid of it? Using strip.white=T does not seem to help:

read.table('myFile', sep = '\t', header = F, strip.white = T)

gives

        V1 V2          V3 V4
1   1446.6 NA  35785.0000 NA
2   1446.8 NA  34955.9000 NA
3   1447.0 NA  34448.0000 NA
4   1447.2 NA  33632.6000 NA
5   1447.4 NA  32905.1000 NA

Upvotes: 1

Views: 887

Answers (1)

akrun

Reputation: 887231

On Linux, you could pipe the file through awk (or sed) so that read.table only sees the lines that consist entirely of numbers:

 read.table(pipe("awk '/^\\s*(-?[0-9]+(\\.[0-9]*)?\\s*)+$/ {print $0}' Nyxynyx.txt"),
         header=FALSE)
 #     V1      V2
 #1 1446.6 35785.0
 #2 1446.8 34955.9
 #3 1447.0 34448.0
 #4 1447.2 33632.6
 #5 1447.4 32905.1
 #6 1447.6 31976.5

NOTE: Nyxynyx.txt is the name of the input file.
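
If awk is not available (for example on Windows), the same filtering can probably be done in base R alone. The sketch below is only an illustration: the `lines` vector is a made-up sample that mimics the excerpt in the question (the exact separators in the real file are an assumption), and in practice it would come from `readLines("Nyxynyx.txt")`. It keeps the lines that consist entirely of numbers, using the same idea as the awk pattern above, and then parses them with the default `sep = ""`, which treats any run of spaces or tabs as a single delimiter.

```r
# Hypothetical sample mimicking the question's excerpt; the separators are
# assumed. For the real file use: lines <- readLines("Nyxynyx.txt")
lines <- c(
  "Axis\tEnergy\tElements=\t226",
  "Energy (E)\t",
  "eV\t",
  "1446.60 \t35785.0 ",
  "1446.80 \t34955.9 ",
  "1447.00 \t34448.0 "
)

# Keep only lines made up entirely of whitespace-separated numbers
# (the same idea as the awk pattern, written with POSIX character classes).
is_data <- grepl("^[[:space:]]*(-?[0-9]+(\\.[0-9]*)?[[:space:]]*)+$", lines)

# With the default sep = "", any run of spaces or tabs counts as one
# delimiter, so trailing whitespace does not create extra NA columns.
df <- read.table(text = lines[is_data], header = FALSE)
```

Leaving `sep` at its default, rather than setting `sep = '\t'`, is what avoids the empty `NA` columns here, since stray spaces and tabs around the values are absorbed into the delimiter.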

Upvotes: 1
