Reputation: 917
I am trying to create a new two-column array: the first column should hold the computed values that are <= 20000, and the second column their associated IDs. Concretely, I read data from a text file and compute the distance from the last point to all the points.
# ID M1 M2 M3 M4 R4 M5 R5 x y z
10217 11.467 11.502 13.428 13.599 432.17 13.266 281.06 34972.8 42985.9 14906
7991 11.529 11.559 13.438 13.520 435.23 13.224 272.23 8538.05 33219.8 43375.1
2100 11.526 11.573 13.478 13.490 448.97 13.356 301.27 9371.75 13734.1 43398.6
9467 11.557 11.621 13.481 13.537 449.99 13.367 303.67 33200.3 36008.9 12735.8
4002 11.454 11.530 13.502 13.583 457.34 13.327 294.53 44607.2 10410.9 9090
2971 11.475 11.563 13.506 13.558 458.77 13.391 309.43 29818.3 98.65 11718.6
1243 11.538 11.581 13.509 13.513 459.62 13.377 306.09 16238.4 11067.9 25048
9953 11.523 11.544 13.559 13.913 477.72 13.440 321.20 34589.6 42869 14878.6
7411 11.547 11.576 13.610 13.658 496.81 13.479 330.96 31436 42092.8 12307.8
1820 11.606 11.619 13.652 12.543 513.11 13.571 355.21 1758.75 15809.8 40473.6
2792 11.647 11.679 13.744 13.877 550.82 13.643 375.38 24393 6774.8 8346.35
510 11.687 11.717 13.771 13.810 562.27 13.642 375.14 22340.3 9316.4 13209.9
1721 11.602 11.646 13.821 14.139 584.37 13.770 413.84 2144.95 15769.1 40470.1
After I get the distances, I only want to keep the distances <= 20000, together with their associated IDs.
So far I wrote this code to return calculated distances and IDs:
# Find nearest neighbors
import numpy as np
import matplotlib.pyplot as plt

halo = 'nntest.txt'
ID, m, r, x, y, z = np.loadtxt(halo, usecols=(0, 6, 7, 8, 9, 10), unpack=True)

# select the last point
m_mass = m[-1:]
ID_mass = ID[-1:]
r_mass = r[-1:]
x_mass = x[-1:]
y_mass = y[-1:]
z_mass = z[-1:]

#######################################
# Find distance to all points from our targeted point
nearest_neighbors = []

def neighbors(ID_mass, cx, cy, cz, ID, x, y, z):
    dist = np.sqrt((cx-x)**2 + (cy-y)**2 + (cz-z)**2)
    return dist, ID

for i in range(len(ID_mass)):
    hist = neighbors(ID_mass[i], x_mass[i], y_mass[i], z_mass[i], ID, x, y, z)
    print(hist)
    # print all the IDs which are associated with dist<=20000
    if (hist[0]<=20000):
        print(ID)
        nearest_neighbors.append(hist)
print(nearest_neighbors)
But I am having trouble returning the new array, which should contain only the distances <= 20000 and their associated IDs. I apologize in advance if this is not a good working example, but I would very much appreciate any suggestions for getting the desired output.
Upvotes: 2
Views: 84
Reputation: 26
Between the question you asked and the code you have provided, I am still somewhat unclear on what you want to accomplish. But I can at least show you where there are errors in the code, and perhaps give you the tools you need.
As your code is now, x, y, z are all vectors. So the result of the neighbors distance calculation,
dist = np.sqrt((cx-x)**2 + (cy-y)**2 + (cz-z)**2)
will be a vector. I think this is what you intended since the other values are indexed. But this means you run into trouble with
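if (hist[0]<=20000):
    print(ID)
NumPy evaluates the inequality element-wise, so hist[0]<=20000 is a boolean array that looks something like [True, False, False, ...]. An if statement cannot use such an array as its condition (NumPy raises a ValueError because the truth value of an array with more than one element is ambiguous). Used properly as a mask, though, I think that boolean arrays are perfect for what you want.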
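To make the mask behaviour concrete, here is a minimal sketch. The distance and ID values are made up for illustration and the variable names (dist, ids, mask) are my own, not taken from your script.

```python
import numpy as np

# Made-up distances and IDs, for illustration only.
dist = np.array([25000.0, 1500.0, 19999.0, 20000.0, 43000.0])
ids = np.array([10217, 7991, 2100, 9467, 4002])

# The comparison is applied element by element and returns a boolean array.
mask = dist <= 20000
print(mask)

# Using that array directly as an `if` condition is ambiguous, so NumPy raises.
try:
    if mask:
        pass
except ValueError as err:
    print("ValueError:", err)

# Indexing with the mask keeps only the entries where it is True.
near_dist = dist[mask]
near_ids = ids[mask]
print(near_dist)
print(near_ids)
```

Because the same mask indexes both arrays, each kept distance stays lined up with its ID.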
For example, you could try
for i in range(len(ID_mass)):
    hist = neighbors(ID_mass[i], x_mass[i], y_mass[i], z_mass[i], ID, x, y, z)
    print(hist)
    # print all the IDs which are associated with dist<=20000
    print(ID[hist[0]<=20000])
    nearest_neighbors.extend(list(zip(hist[0][hist[0]<=20000], ID[hist[0]<=20000])))
print(nearest_neighbors)
The line that extends the nearest_neighbors list is a bit of a mess, and I may not have fully understood what you want the output to look like. But it builds a list of tuples, where each tuple contains the distance and the ID for every point whose distance is at most 20000.
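If what you are after is an actual two-column NumPy array rather than a list of tuples, one possible sketch is below. The helper name neighbors_within and its target and max_dist parameters are hypothetical, and the five sample rows are copied from the table in the question so the snippet runs without nntest.txt.

```python
import numpy as np

def neighbors_within(ids, x, y, z, target=-1, max_dist=20000.0):
    """Return an (n, 2) array of [distance, ID] rows within max_dist of the target point."""
    # Distance from the target point to every point (including itself).
    dist = np.sqrt((x[target] - x)**2 + (y[target] - y)**2 + (z[target] - z)**2)
    mask = dist <= max_dist
    # Stack the two filtered 1-D arrays side by side as columns.
    return np.column_stack((dist[mask], ids[mask]))

# A few rows (ID, x, y, z) taken from the sample data in the question.
ids = np.array([10217, 7991, 2100, 1820, 1721])
x = np.array([34972.8, 8538.05, 9371.75, 1758.75, 2144.95])
y = np.array([42985.9, 33219.8, 13734.1, 15809.8, 15769.1])
z = np.array([14906.0, 43375.1, 43398.6, 40473.6, 40470.1])

result = neighbors_within(ids, x, y, z)
print(result)
```

Two things to be aware of: the target point itself shows up with a distance of 0, and since a NumPy array has a single dtype, the IDs are stored as floats alongside the distances.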
Upvotes: 1