Reputation: 11
For my thesis, I am trying to build a multilayer social network from Twitter data I received from my professor, in order to analyze the reciprocity of emotions and examine any spikes and how they relate to world events. It is important to note that I am focusing only on dyads, i.e. interactions between no more than two people.
So far, I have properly cleaned my data and downsampled it using stratified sampling. Considering that my dataset covers a whole month and includes specific dates, stratified sampling based on the 'created_at' column (date) could potentially provide a more representative sample than random sampling. By stratifying the data based on dates, I can ensure that my sample captures the temporal patterns and variations within the analyzed month. I had to downsample since I had roughly 16.5 million rows, which my laptop just could not handle. Furthermore, I have removed "useless" columns.
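To make the sampling step concrete, here is a minimal sketch of date-stratified downsampling using only the standard library. It is an illustration under assumptions, not the exact code used on the real data: the function name `stratified_sample_by_date`, the `frac` and `seed` parameters, the ISO date format, and the toy rows are all hypothetical; only the `created_at` and `author_id` column names come from the data described above.

```python
import random
from collections import defaultdict


def stratified_sample_by_date(rows, frac, seed=0):
    """Sample the same fraction of rows from every calendar date (stratum)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        # Assumption: 'created_at' is an ISO string such as '2020-03-01T12:00:00',
        # so the first 10 characters identify the date.
        strata[row["created_at"][:10]].append(row)

    sample = []
    for date in sorted(strata):
        group = strata[date]
        # Keep at least one row per date so that no day vanishes from the sample.
        k = max(1, round(len(group) * frac))
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical toy data: 3 days with 10 rows each.
rows = [
    {"created_at": f"2020-03-0{day}T12:00:00", "author_id": i}
    for day in (1, 2, 3)
    for i in range(10)
]
sample = stratified_sample_by_date(rows, frac=0.2)
print(len(sample))
```

Sampling individual rows this way preserves the share of tweets per day, but it does not guarantee that both tweets of an exchange between two users end up in the sample.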
Here is a small sample of what my data now looks like, so my code can be understood better: [screenshot of downsampled data]
Now to my problem: I have already created a network with the following code:
import networkx as nx
# Create an empty dictionary to store the networks for each emotion
emotion_networks = {}
# Create a set to keep track of dyads
dyads = set()
# Iterate over each row in sub_df_multi
for index, row in sub_df_multi.iterrows():
    # Extract the relevant information for the edge
    timestamp = row['created_at']
    source_user = row['author_id']
    target_user = row['referenced_id']
    # Ensure the interaction is a dyad (skip self-interactions)
    if source_user != target_user:
        # Add the dyad to the set
        dyads.add((source_user, target_user))
        # Iterate over each emotion column
        for emotion in ['posemo', 'negemo', 'anx', 'anger', 'sad']:
            emotion_value = row[emotion]
            # Check if the emotion value is non-zero
            if emotion_value != 0:
                # Create a new network for the emotion if it doesn't exist
                if emotion not in emotion_networks:
                    emotion_networks[emotion] = nx.MultiDiGraph()
                # Add an edge to the network for the emotion
                emotion_networks[emotion].add_edge(source_user, target_user, timestamp=timestamp, emotion=emotion_value)
Initially I thought that it was working as intended, but in the analysis the reciprocity of every emotion turns out to be 0.00, which seems unusual to me, and I suspect that there is a mistake in the way I built my network. I have tried several things, such as omitting conversations in which no emotions were exchanged, but this mostly just skews the results and still leads to a reciprocity of 0. I have also tried calculating the reciprocity with various code snippets. Here are some examples:
Example 1 (pretty simple):
for emotion, network in emotion_networks.items():
    reciprocity = nx.reciprocity(network)
    print(f"Emotion: {emotion}")
    print(f"Reciprocity: {reciprocity}")
    print()
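For reference, reciprocity of a directed graph is commonly defined as the share of directed edges whose reverse edge also exists. Below is a minimal standard-library sketch of that definition on a hypothetical edge list. The function name `pair_reciprocity` and the toy IDs are assumptions; parallel edges are collapsed into unique ordered pairs, which is a simplification and may not match what `nx.reciprocity` reports on a `MultiDiGraph`.

```python
def pair_reciprocity(edges):
    """Fraction of unique directed pairs (u, v) whose reverse (v, u) also exists."""
    # Collapse duplicates and drop self-loops.
    pairs = {(u, v) for u, v in edges if u != v}
    if not pairs:
        return 0.0
    reciprocated = sum(1 for u, v in pairs if (v, u) in pairs)
    return reciprocated / len(pairs)


# Hypothetical toy edges: a <-> b is mutual, a -> c is one-way.
edges = [("a", "b"), ("b", "a"), ("a", "c"), ("a", "b")]
print(pair_reciprocity(edges))
```

Running something like this on a hand-made edge list with a known answer can help tell apart a problem in the reciprocity calculation from a problem in how the edges were built.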
Example 2 (to calculate reciprocity more accurately, here I consider all dyadic interactions and count both reciprocal and non-reciprocal edges):
reciprocal_edges = 0
total_edges = 0
# Iterate over each emotion network
for emotion, network in emotion_networks.items():
    # Iterate over all edges in the network
    for u, v, d in network.edges(data=True):
        total_edges += 1
        if network.has_edge(v, u):
            reciprocal_edges += 1
reciprocity = reciprocal_edges / total_edges if total_edges > 0 else 0
print("Reciprocity:", reciprocity)
# Analyze the emotions for each dyad
emotion_dyads = {}
# Iterate over each emotion network
for emotion, network in emotion_networks.items():
    # Iterate over each edge in the network
    for u, v, d in network.edges(data=True):
        # Check if the edge has the emotion attribute
        if 'emotion' in d:
            edge_emotion = d['emotion']
            # Add the emotion to the emotion dyads dictionary
            if (u, v) in emotion_dyads:
                emotion_dyads[(u, v)].add(edge_emotion)
            else:
                emotion_dyads[(u, v)] = {edge_emotion}
# Print the emotions for each dyad
for dyad, emotions in emotion_dyads.items():
    print(f"Dyad: {dyad}, Emotions: {emotions}")
Technically, Example 2 should be more accurate, since it considers all dyadic interactions and counts both reciprocal and non-reciprocal edges.
Nonetheless, the results are 0.00.
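A necessary condition for any reciprocity above zero is that at least some IDs occur both as a source and as a target. Below is a minimal sketch of that check, assuming the (`author_id`, `referenced_id`) pairs have been extracted into a list of tuples. The function name `id_overlap` and the toy values are hypothetical and only illustrate the idea.

```python
def id_overlap(pairs):
    """Return the IDs that appear both as a source and as a target."""
    sources = {source for source, _ in pairs}
    targets = {target for _, target in pairs}
    return sources & targets


# Hypothetical pairs in which no target ID ever appears as a source ID.
pairs = [(101, 9001), (102, 9002), (103, 9001)]
overlap = id_overlap(pairs)
# An empty overlap means that no edge in this list can be reciprocated.
print(len(overlap))
```

With the real data, the pairs could presumably be built with something like `list(zip(sub_df_multi['author_id'], sub_df_multi['referenced_id']))`. If the overlap comes out empty or very small, a reciprocity of 0.00 would follow from the data itself rather than from the reciprocity code.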
If anyone could help me out, I'd be super grateful. I've been going crazy trying to troubleshoot and find possible mistakes, and I would really love to finish my bachelor thesis at some point.
Upvotes: 1
Views: 144