Skip first row and every nth row when loading txt file (Python)

Question

So I have some data in a .txt file that looks like this:

"CLUSTER" "observed" "metric" "structure" "patID"
"1" 1 5.56802742675389 "V50GY" "Wall" "Generic-420-Wall-70"
"2" 1 3.04813846733667 "V70GY" "Wall" "Generic-420-Wall-70"
"3" 2 5.67825143127034 "V50GY" "Wall" "Generic-420-Wall-72"
"4" 2 3.05994158400609 "V70GY" "Wall" "Generic-420-Wall-72"
"5" 3 5.89519521321811 "V50GY" "Wall" "Generic-420-Wall-74"
"6" 3 3.12327777559325 "V70GY" "Wall" "Generic-420-Wall-74"
"7" 4 5.95329849423797 "V50GY" "Wall" "Generic-420-Wall-76"
"8" 4 3.23398452311885 "V70GY" "Wall" "Generic-420-Wall-76"
"9" 5 5.98067106255001 "V50GY" "Wall" "Generic-420-Wall-78"
"10" 5 3.36621440490947 "V70GY" "Wall" "Generic-420-Wall-78"

I also have some data where the metric column does not alternate, i.e. the same metric appears in every row. To extract the decimal values from that file I use the following code:

import numpy as np
y = np.loadtxt('file.txt', skiprows=1, usecols=(2,))

And then use that data for plotting.

But for the data shown here I only need every 2nd row after the skipped header row, i.e. rows (1,3,5,7,9), for the first plot with five x-values, and every 2nd row starting one row later, i.e. rows (2,4,6,8,10), for the second plot, which also has five x-values.

But I'm really not sure how to do this. Can it even be done with the np.loadtxt function?

hpaulj · Accepted Answer

In [296]: y = np.loadtxt(txt.splitlines(), skiprows=1, usecols=(2,))
In [297]: y
Out[297]: 
array([ 5.56802743,  3.04813847,  5.67825143,  3.05994158,  5.89519521,
        3.12327778,  5.95329849,  3.23398452,  5.98067106,  3.3662144 ])

With your load, I get a 1d array of values (here txt is a string holding your sample data).

Normal slicing then gives me every other value:

In [298]: y[::2]
Out[298]: array([ 5.56802743,  5.67825143,  5.89519521,  5.95329849,  5.98067106])
In [299]: y[1::2]
Out[299]: array([ 3.04813847,  3.05994158,  3.12327778,  3.23398452,  3.3662144 ])
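The same slicing idea extends to every nth row. Here is a minimal sketch, assuming the rows interleave n series in a fixed, repeating order. The helper name split_interleaved and the rounded toy array are my own illustration, not something from the question:

```python
import numpy as np

def split_interleaved(values, n):
    """Return n arrays; the k-th one holds rows k, k+n, k+2n, ... of values."""
    values = np.asarray(values)
    return [values[k::n] for k in range(n)]

# Toy data standing in for the loaded metric column (rounded, hypothetical).
y = np.array([5.57, 3.05, 5.68, 3.06, 5.90, 3.12])

# With n=2 this is the same as y[::2] and y[1::2].
first, second = split_interleaved(y, 2)
```

Basic slices like these are views into y, so nothing is copied until you modify or explicitly copy them.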

You could also skip rows in the text by using the kind of filter described in this answer, https://stackoverflow.com/a/13893642/901925. But that method is more useful when some of the lines can't/shouldn't be loaded. Here all lines can be loaded, so it's easier to select them after loading.
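For completeness, one way to filter the lines before loading might look like the sketch below. It relies on np.loadtxt accepting a list of strings and uses itertools.islice to pick the lines; it is not necessarily the approach in the linked answer. The helper load_every_nth and its parameters are hypothetical names chosen for illustration:

```python
from itertools import islice

import numpy as np

def load_every_nth(lines, n, offset=0, header_rows=1, col=2):
    """Load column `col` from every n-th data line, starting `offset` lines
    after the header. `lines` is any iterable of text lines (open file or list)."""
    # Skip the header rows plus `offset` data rows, then step by n.
    picked = list(islice(lines, header_rows + offset, None, n))
    # ndmin=1 keeps the result 1-d even if only one line was picked.
    return np.loadtxt(picked, usecols=(col,), ndmin=1)

# The first four data rows from the question, as a list of lines.
sample = [
    '"CLUSTER" "observed" "metric" "structure" "patID"',
    '"1" 1 5.56802742675389 "V50GY" "Wall" "Generic-420-Wall-70"',
    '"2" 1 3.04813846733667 "V70GY" "Wall" "Generic-420-Wall-70"',
    '"3" 2 5.67825143127034 "V50GY" "Wall" "Generic-420-Wall-72"',
    '"4" 2 3.05994158400609 "V70GY" "Wall" "Generic-420-Wall-72"',
]

first = load_every_nth(sample, 2, offset=0)   # data rows 1, 3
second = load_every_nth(sample, 2, offset=1)  # data rows 2, 4
```

With a real file you would pass the open file object instead of the list, e.g. inside `with open('file.txt') as f:`. For a file this small, loading everything and slicing afterwards is still the simpler route.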
