I have a large cell grid. Each cell has an ID (p1), a cell value (p3) and coordinates in actual measures (X, Y). This is what the first 10 rows/cells look like:
   p1   p2    p3    X  Y
0   0  0.0   0.0    0  0
1   1  0.0   0.0  100  0
2   2  0.0  12.0  200  0
3   3  0.0   0.0  300  0
4   4  0.0  70.0  400  0
5   5  0.0  40.0  500  0
6   6  0.0  20.0  600  0
7   7  0.0   0.0  700  0
8   8  0.0   0.0  800  0
9   9  0.0   0.0  900  0
The neighbouring cells of cell i in p1 can be determined as (i-500+1, i-500-1, i-1, i+1, i+500+1, i+500-1).
For example, the cell with p1 of 5 has neighbours 4, 6, 504, 505, 506 (these are the IDs of rows in the table above, p1).
What I am trying to do is:
For a chosen value/row i in p1, I would like to find all neighbours within a chosen distance from i and sum all their p3 values.
I tried to apply this solution (link), but I don't know how to incorporate the distance parameter. The cell value can be retrieved with df.iloc, but the steps before this are a bit tricky for me.
Can you give me any advice?
EDIT:
Using the solution from Thomas and having a df called CO:
p3
0 45
1 580
2 12000
3 12531
4 22456
I'd like to add another column and use the values from the p3 column:
CO['new'] = format(sum_neighbors(data, CO['p3']))
But it doesn't work. If I pass a number instead of a reference to the column CO['p3'], it works like a charm. But how can I use the values from the p3 column automatically in the format function?
SOLVED: It worked with:
CO['new'] = CO.apply(lambda row: sum_neighbors(data, row.p3), axis=1)
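A minimal sketch of why the per-row call is needed: sum_neighbors expects a single integer ID, so passing the whole column CO['p3'] at once does not fit. The version of sum_neighbors below is a compact stand-in for the helper from the answer, and the toy data and IDs are hypothetical.

```python
import numpy as np
import pandas as pd

def sum_neighbors(data, i, col="p3"):
    # Compact stand-in for the answer's helper: same six candidate offsets,
    # keeping only IDs that fall inside the table.
    candidates = [i - 501, i - 499, i - 1, i + 1, i + 499, i + 501]
    valid = [n for n in candidates if 0 <= n < len(data)]
    return data.iloc[valid][col].sum()

# Toy table of 1000 cells whose p3 value equals the row ID (hypothetical data)
data = pd.DataFrame({"p3": np.arange(1000, dtype=float)})

# IDs to look up, stored in a column as in the question
CO = pd.DataFrame({"p3": [45, 580, 999]})

# Series.map calls the function once per element, so each call
# receives a single integer ID rather than the whole column.
CO["new"] = CO["p3"].map(lambda i: sum_neighbors(data, i))
print(CO)
```

Series.map is used here instead of apply with axis=1 because only one column is needed; both call the function once per row.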
Upvotes: 1
Views: 797
Solution:
import numpy as np
import pandas
# Generating toy data
N = 10
data = pandas.DataFrame({'p3': np.random.randn(N)})
print(data)
# Finding neighbours
get_candidates = lambda i: [i-500+1, i-500-1, i-1, i+1, i+500+1, i+500-1]
# Named keep_valid rather than filter to avoid shadowing the built-in
keep_valid = lambda neighbors, N: [n for n in neighbors if 0 <= n < N]
get_neighbors = lambda i, N: keep_valid(get_candidates(i), N)
print("Neighbors of 5: {}".format(get_neighbors(5, len(data))))
# Summing p3 on neighbors
def sum_neighbors(data, i, col='p3'):
    return data.iloc[get_neighbors(i, len(data))][col].sum()
print("p3 sum on neighbors of 5: {}".format(sum_neighbors(data, 5)))
Output:
p3
0 -1.106541
1 -0.760620
2 1.282252
3 0.204436
4 -1.147042
5 1.363007
6 -0.030772
7 -0.461756
8 -1.110459
9 -0.491368
Neighbors of 5: [4, 6]
p3 sum on neighbors of 5: -1.1778133703169344
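For the distance part of the question, one possible approach is to use the X and Y coordinates directly rather than ID offsets. The sketch below is not part of the original solution: sum_within_distance is a hypothetical helper, and it assumes straight-line (Euclidean) distance in the same units as X and Y, a default integer index equal to p1, and a small toy grid standing in for the real one.

```python
import numpy as np
import pandas as pd

def sum_within_distance(df, i, radius, col="p3"):
    """Sum `col` over all cells within `radius` of cell i, excluding cell i."""
    x0, y0 = df.loc[i, "X"], df.loc[i, "Y"]
    # Squared Euclidean distance from cell i to every cell (avoids a sqrt)
    dist_sq = (df["X"] - x0) ** 2 + (df["Y"] - y0) ** 2
    mask = (dist_sq <= radius ** 2) & (df.index != i)
    return df.loc[mask, col].sum()

# Toy 5 x 4 grid with 100-unit spacing (assumption: IDs run row by row)
width, height, step = 5, 4, 100
ids = np.arange(width * height)
grid = pd.DataFrame({
    "p1": ids,
    "p3": ids.astype(float),        # p3 equals the ID, to make sums easy to check
    "X": (ids % width) * step,
    "Y": (ids // width) * step,
})

print(sum_within_distance(grid, 6, 100))   # 4 orthogonal neighbours
print(sum_within_distance(grid, 6, 150))   # also reaches the 4 diagonal neighbours
```

Because the mask is computed on coordinates, cells at the edge of the grid simply have fewer neighbours, and no wrap-around onto the previous or next row can occur.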
Notes:
p1 was range(N), as seemed to be implied (so we don't need it at all).
505 is not a neighbour of 5 given the list of neighbours of i defined by the OP.
Upvotes: 3