Reputation: 4608
I am analysing the distances of users to userx over 6 weeks in a social network.
Note: 'No path' means the two users are not connected yet (not even through friends of friends).
        week1    week2    week3    week4    week5  week6
user1   No path  No path  No path  No path  3      1
user2   No path  No path  No path  5        3      1
user3   5        4        4        4        4      3
userN   ...
I want to see how well the users connect with userx.
For that I initially thought of using the regression slope for the interpretation (i.e. the lower the regression slope, the better).
For example, consider user1 and user2. Their regression slopes are calculated as follows.
user1:
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
X = [[5], [6]] #distance available only for week5 and week6
y = [3, 1]
regressor.fit(X, y)
print(regressor.coef_)
Output is -2.
user2:
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
X = [[4], [5], [6]] #distance available only for week4, week5 and week6
y = [5, 3, 1]
regressor.fit(X, y)
print(regressor.coef_)
Output is -2.
As you can see, both users get the same slope value. However, user2 has been connected with userx a week earlier than user1. Hence, user2 should be rewarded for that in some way.
Therefore, I am wondering if there is a better way of measuring this.
I am happy to provide more details if needed.
Upvotes: 0
Views: 1230
Reputation: 1328
Well, if you want to reward the duration of the connection, you probably need to take time into account in the calculation. The easiest and most straightforward way is to multiply the coefficient by the number of weeks with a known distance:
outcome_measure = regressor.coef_[0] * len(y)
And if you divide that by 2, it is conceptually similar to the area under the curve (AUC):
outcome_measure = (regressor.coef_[0] * len(y)) / 2
So you would get -4 and -6 with the first method or -2 and -3 with the second.
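As a rough sketch of both variants, the snippet below recomputes the slope by hand instead of calling scikit-learn, so it runs without extra packages. The helper names `ols_slope` and `duration_weighted_slope` are made up for this illustration; the data are the week/distance pairs for user1 and user2 from the question.

```python
def ols_slope(weeks, distances):
    """Ordinary least-squares slope of distance against week number."""
    n = len(weeks)
    mean_x = sum(weeks) / n
    mean_y = sum(distances) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, distances))
    sxx = sum((x - mean_x) ** 2 for x in weeks)
    return sxy / sxx


def duration_weighted_slope(weeks, distances, halve=False):
    """Slope multiplied by the number of weeks with a known distance.

    With halve=True the result is divided by 2 (the AUC-like variant).
    """
    score = ols_slope(weeks, distances) * len(distances)
    return score / 2 if halve else score


# Only the weeks in which a path to userx exists are included.
users = {
    "user1": ([5, 6], [3, 1]),
    "user2": ([4, 5, 6], [5, 3, 1]),
}

scores = {name: duration_weighted_slope(w, d) for name, (w, d) in users.items()}
halved_scores = {
    name: duration_weighted_slope(w, d, halve=True) for name, (w, d) in users.items()
}

print(scores)         # {'user1': -4.0, 'user2': -6.0}
print(halved_scores)  # {'user1': -2.0, 'user2': -3.0}
```

Since a lower value is better here, user2 now ranks ahead of user1, which reflects the extra week of connection.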
Slightly off-topic, but if you use linear regression for statistical analysis (not just to get the coefficient), I would probably add some kind of check to confirm that its assumptions hold.
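One very rough way to start such a check, sketched here on user3's six weeks from the question, is to look at the residuals and R² of the fit. The function name `fit_line` is hypothetical, and this is only a first look, not a full test of the regression assumptions. Note that with only two points (as for user1) the line always fits perfectly, so a check like this tells you nothing there.

```python
def fit_line(weeks, distances):
    """Return (slope, intercept) of the least-squares line."""
    n = len(weeks)
    mean_x = sum(weeks) / n
    mean_y = sum(distances) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, distances))
    sxx = sum((x - mean_x) ** 2 for x in weeks)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


weeks = [1, 2, 3, 4, 5, 6]
distances = [5, 4, 4, 4, 4, 3]  # user3 from the question

slope, intercept = fit_line(weeks, distances)

# Residual = observed distance minus the distance predicted by the line.
residuals = [y - (intercept + slope * x) for x, y in zip(weeks, distances)]

# R^2 = 1 - (residual sum of squares / total sum of squares).
mean_y = sum(distances) / len(distances)
ss_res = sum(r ** 2 for r in residuals)
ss_tot = sum((y - mean_y) ** 2 for y in distances)
r_squared = 1 - ss_res / ss_tot

print(round(slope, 3), round(r_squared, 3))
print([round(r, 3) for r in residuals])
```

A visible pattern in the residuals (for example, a long run of one sign) would suggest that a straight line is not a good description of how the distance changes over time.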
Upvotes: 1