Rose

Reputation: 289

Locate and slice off overlap in data array

I want to get rid of the overlap in my longitude data at the end of the array (0 to 20.4 degrees). So in the end I want the values to be 0-360.

I am going to be doing this for many arrays with a variable number of values that overlap, so I cannot just slice off the last three values. Also, the beginning and end points won't always be 0 & 360 or 20.4. I also want to preserve the order of the values so that I can slice off the corresponding values in the latitude array.

Most of the information on the internet is about getting rid of duplicate values, but none of my values are exact duplicates because of the digits trailing the decimal point.

lon = np.array([0.9783,20.1276,40.3784,60.0987,80.3748,100.9999,120.4567,140.3543,160.2342,180.3453,200.8874,220.2346,240.5554,260.5676,280.4345,300.4454,320.5654,340.6432,360.3343,0.0124,10.3213,20.4355]) 

I've tried brainstorming ways to do it with <, >, =, np.where, or if/else without success so far.

Any help or suggestions are appreciated.

Upvotes: 1

Views: 120

Answers (5)

lxop

Reputation: 8605

@Joe Iddon's answer will work, but if you want to avoid loops, you can do something like this:

diff = np.diff(lon)
drops = np.flatnonzero(diff < 0)
if len(drops) > 0:
    # Only do this if there is a wrap around
    end_index = drops[0] + 1
    lon = lon[:end_index]

And you can then use end_index to slice other matching arrays too (e.g. latitude).
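As a rough sketch of that reuse, the snippet below wraps the same np.diff logic in a helper and applies one end_index to both arrays. The function name trim_wraparound and the latitude values are made up for illustration; only the slicing idea comes from the code above.

```python
import numpy as np

def trim_wraparound(lon, lat):
    """Cut both arrays at the first point where lon decreases."""
    drops = np.flatnonzero(np.diff(lon) < 0)
    # No decrease means no wrap-around, so keep everything
    end_index = drops[0] + 1 if len(drops) > 0 else len(lon)
    return lon[:end_index], lat[:end_index]

lon = np.array([0.9783, 120.4567, 240.5554, 360.3343, 0.0124, 10.3213])
lat = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])  # made-up latitudes

lon_trimmed, lat_trimmed = trim_wraparound(lon, lat)
print(lon_trimmed)
print(lat_trimmed)
```

Because both slices use the same index, the longitude/latitude pairs stay aligned.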

Note that this doesn't do any fixing for values outside of [0..360] - you'll have to do that separately, depending on how you want to deal with them.
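If wrapping is how you want to handle them (an assumption on my part, clipping or dropping may suit your data better), np.mod is one possible way to fold values into [0, 360). The sample values below are invented for illustration.

```python
import numpy as np

# Invented sample values, some of them outside [0, 360)
lon_raw = np.array([-10.0, 0.0, 180.5, 360.0, 370.25])

# np.mod with a positive divisor always returns a non-negative result
lon_wrapped = np.mod(lon_raw, 360.0)
print(lon_wrapped)
```

It is probably safer to do this after trimming: a value such as 360.3343 becomes 0.3343 once wrapped, which would move the detected drop one element earlier.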


Update for new requirement:

assert len(lon) > 0
above_first = (lon >= lon[0]).astype(int)
diffs = np.diff(above_first)
overlap_indices = np.flatnonzero(diffs > 0)
if len(overlap_indices) > 0:
    end_index = overlap_indices[0] + 1
    lon = lon[:end_index]

This will work even if the overlap wraps around multiple times.
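To illustrate the multiple-wrap case, here is a small sketch that puts the same steps in a function and runs it on made-up data that wraps twice. The name trim_overlap and the test array are hypothetical.

```python
import numpy as np

def trim_overlap(lon):
    """Return (trimmed_lon, end_index), cutting where values re-cross lon[0]."""
    assert len(lon) > 0
    # 1 where the value is at or above the starting value, 0 otherwise
    above_first = (lon >= lon[0]).astype(int)
    # A 0 -> 1 step means the data has climbed back past the start
    overlap_indices = np.flatnonzero(np.diff(above_first) > 0)
    end_index = overlap_indices[0] + 1 if len(overlap_indices) > 0 else len(lon)
    return lon[:end_index], end_index

# Made-up data that wraps around twice
lon_multi = np.array([50, 110, 200, 340, 1, 10, 25, 80, 340, 5, 60])
trimmed, end_index = trim_overlap(lon_multi)
print(trimmed, end_index)
```

Only the first re-crossing is used, so later wraps do not affect where the cut lands.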

Upvotes: 1

Jun Saito

Reputation: 97

How about

tmp = lon - lon[0]
tmp[tmp<0] += 360
sliced = lon[:np.where(np.diff(tmp) < 0)[0][0]+1]

Upvotes: 0

pyano

Reputation: 1978

New solution: start from the end of lon2 and compare with the first element of lon2

lon2 = np.array([50,110,200,340,1,10,25,80,90,130])
#lon2 = lon

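# Number of trailing values (counted from the end) before the first one below lon2[0]
ix = np.argmax(lon2[::-1] < lon2[0])
# len(lon2) - ix keeps the whole array when ix == 0; lon2[0:-ix] would return an empty array
L2 = lon2[:len(lon2) - ix]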

gives

with lon2 =  [ 50 110 200 340   1  10  25]

and

with lon =  [  9.78300000e-01   2.01276000e+01   4.03784000e+01   6.00987000e+01
   8.03748000e+01   1.00999900e+02   1.20456700e+02   1.40354300e+02
   1.60234200e+02   1.80345300e+02   2.00887400e+02   2.20234600e+02
   2.40555400e+02   2.60567600e+02   2.80434500e+02   3.00445400e+02
   3.20565400e+02   3.40643200e+02   3.60334300e+02   1.24000000e-02]

Upvotes: 1

Joe Iddon

Reputation: 20434

If you want to get rid of all the elements after the data drops back down and then climbs past the first value again (so in your case everything after the 0.0124, since 10.3213 is the first value after the drop that exceeds lon[0]), the following for-loop should do the job.

stop = False  # becomes True once the values have dropped (wrapped around)
for i in range(len(lon)-1):
    # after the wrap, cut at the first value that exceeds the starting value
    if stop and lon[i] > lon[0]:
        lon = lon[:i]
        break
    # a value larger than its successor marks the wrap-around
    if lon[i] > lon[i+1]:
        stop = True

which with the data you gave for lon in the question:

lon = np.array([0.9783,20.1276,40.3784,60.0987,80.3748,100.9999,120.4567,140.3543,160.2342,180.3453,200.8874,220.2346,240.5554,260.5676,280.4345,300.4454,320.5654,340.6432,360.3343,0.0124,10.3213,20.4355])

modifies lon to:

array([   0.9783,   20.1276,   40.3784,   60.0987,   80.3748,  100.9999, 120.4567,  140.3543,  160.2342,  180.3453,  200.8874,  220.2346, 240.5554,  260.5676,  280.4345,  300.4454,  320.5654,  340.6432, 360.3343,    0.0124])

To demonstrate this updated solution with different data:

lon = np.array([50 ,110, 200, 340, 1, 10, 25, 80, 90, 130]) 

we get:

array([ 50, 110, 200, 340,   1,  10,  25])

Hopefully this finally does what you want!

Upvotes: 1

pyano

Reputation: 1978

Based on @lxop's idea:

dL = np.diff(lon)
ix = np.argmax(dL<0)+1
L = lon[0:ix]

You can write the same in 1 line:

L = lon[0:np.argmax(np.diff(lon)<0)+1]
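One caveat: np.argmax returns 0 when no element is True, so for an array with no drop at all the slice above keeps only the first element. A guarded variant could look like the sketch below; the function name trim_at_first_drop is just for illustration.

```python
import numpy as np

def trim_at_first_drop(lon):
    """Keep values up to (and including) the one just before the first decrease."""
    drops = np.diff(lon) < 0
    if not drops.any():
        # No decrease anywhere: nothing to cut
        return lon
    return lon[:np.argmax(drops) + 1]

print(trim_at_first_drop(np.array([50, 110, 200, 340, 1, 10])))
print(trim_at_first_drop(np.array([1, 2, 3])))
```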

Upvotes: 1
