Enie

Reputation: 11

Multi-layer social network (multiplex) using networkx

For my thesis, I am trying to build a multilayer social network from Twitter data I received from my professor, in order to analyze the reciprocity of emotions and to examine occurring spikes and their relation to world events. It is important to note that I am only focusing on dyads, i.e. interactions between no more than two people.

So far, I have properly cleaned my data and downsampled it using stratified sampling. Considering that my dataset covers a whole month and includes specific dates, stratified sampling based on the 'created_at' column (date) could potentially provide a more representative sample than random sampling. By stratifying the data based on dates, I can ensure that my sample captures the temporal patterns and variations within the analyzed month. I had to downsample since I had roughly 16.5 million rows, which my laptop just could not handle. Furthermore, I have removed "useless" columns.
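A minimal sketch of the date-stratified downsampling described above, assuming `created_at` is a datetime column. The toy frame, the `row_id` column and the 50% fraction are placeholders for illustration, not the values used on the real data.

```python
import pandas as pd

# Toy stand-in for the real data; only 'created_at' matches the question.
df = pd.DataFrame({
    "created_at": pd.to_datetime(["2020-03-01"] * 10 + ["2020-03-02"] * 20),
    "row_id": range(30),  # hypothetical column, for illustration only
})

# Group rows by calendar day and draw the same fraction from every day,
# so each day keeps its share of the month.
day = df["created_at"].dt.date
sample = df.groupby(day).sample(frac=0.5, random_state=42)

# Rows kept per day
print(sample.groupby(sample["created_at"].dt.date).size())
```

Sampling individual rows this way can keep one direction of an exchange and drop the other, which is worth keeping in mind for any measure that depends on both directions being present.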

Here is a small sample of what my data now looks like, so my code can be understood better: [screenshot of downsampled data]

Now to my problem: I have already created a network. The code looks as follows:

import networkx as nx

# Create an empty dictionary to store the networks for each emotion
emotion_networks = {}

# Create a set to keep track of dyads
dyads = set()

# Iterate over each row in the sub_df
for index, row in sub_df_multi.iterrows():
    # Extract the relevant information for the edge
    timestamp = row['created_at']
    source_user = row['author_id']
    target_user = row['referenced_id']
    
    # Ensure the interaction is a dyad
    if source_user != target_user:
        # Add the dyad to the set
        dyads.add((source_user, target_user))
        
        # Iterate over each emotion column
        for emotion in ['posemo', 'negemo', 'anx', 'anger', 'sad']:
            emotion_value = row[emotion]
            
            # Check if the emotion value is non-zero
            if emotion_value != 0:
                # Create a new network for the emotion if it doesn't exist
                if emotion not in emotion_networks:
                    emotion_networks[emotion] = nx.MultiDiGraph()
                
                # Add an edge to the network for the emotion
                emotion_networks[emotion].add_edge(source_user, target_user, timestamp=timestamp, emotion=emotion_value)

Initially I thought it was working as intended, but when I moved on to the analysis, the reciprocity of every emotion turned out to be 0.00, which seems unusual to me, and I suspect there is a mistake in the way I built my network. I have tried several things, such as omitting conversations where no emotions were exchanged, but this mostly just skews the results and still leads to a reciprocity of 0. I have also tried calculating the reciprocity with several different code snippets. Here are some examples:

Example 1 (pretty simple):

for emotion, network in emotion_networks.items():
    reciprocity = nx.reciprocity(network)
    print(f"Emotion: {emotion}")
    print(f"Reciprocity: {reciprocity}")
    print()

Example 2: (To calculate reciprocity more accurately, here I consider all dyadic interactions and count both reciprocal and non-reciprocal edges)

# Iterate over each emotion network
for emotion, network in emotion_networks.items():
    # Reset the counters so each emotion gets its own ratio
    reciprocal_edges = 0
    total_edges = 0

    # Iterate over all edges in the network
    for u, v, d in network.edges(data=True):
        total_edges += 1
        # An edge counts as reciprocal if the reverse direction also exists
        if network.has_edge(v, u):
            reciprocal_edges += 1

    reciprocity = reciprocal_edges / total_edges if total_edges > 0 else 0
    print(f"Emotion: {emotion}, Reciprocity: {reciprocity}")

# Analyze the emotions for each dyad
emotion_dyads = {}

# Iterate over each emotion network
for emotion, network in emotion_networks.items():
    # Iterate over each edge in the network
    for u, v, d in network.edges(data=True):
        # Check if the edge has the emotion attribute
        if 'emotion' in d:
            edge_emotion = d['emotion']
            # Add the emotion to the emotion dyads dictionary
            if (u, v) in emotion_dyads:
                emotion_dyads[(u, v)].add(edge_emotion)
            else:
                emotion_dyads[(u, v)] = {edge_emotion}

# Print the emotions for each dyad
for dyad, emotions in emotion_dyads.items():
    print(f"Dyad: {dyad}, Emotions: {emotions}")

Technically, example 2 should be more accurate for two reasons:

  1. Separate calculation for each emotion network: The code iterates over each emotion network separately and counts the reciprocal edges within that network. This ensures that the reciprocity calculation is performed independently for each emotion, providing a more accurate measure of reciprocity specific to each emotion.
  2. Consideration of total edges: The code counts the total number of edges in each emotion network. This is important because reciprocity is calculated as the ratio of reciprocal edges to total edges. By considering the total edges, the calculation accounts for both reciprocal and non-reciprocal edges, providing a more comprehensive view of the network dynamics.
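To make the ratio of reciprocal edges to total edges concrete, here is a small sketch on a toy layer. The node names and emotion values are made up, and collapsing parallel edges into a plain `nx.DiGraph` before counting is one possible choice, not necessarily the right one for the thesis.

```python
import networkx as nx

# Toy multigraph standing in for one emotion layer; all values are made up.
layer = nx.MultiDiGraph()
layer.add_edge("a", "b", emotion=1.5)
layer.add_edge("a", "b", emotion=0.7)  # parallel edge, same direction
layer.add_edge("b", "a", emotion=2.0)  # reply in the opposite direction
layer.add_edge("a", "c", emotion=0.3)  # never answered

# Collapse parallel edges so each directed pair is counted once.
simple = nx.DiGraph()
simple.add_edges_from((u, v) for u, v in layer.edges())

# Directed pairs whose reverse also exists, divided by all directed pairs.
reciprocal = sum(1 for u, v in simple.edges() if simple.has_edge(v, u))
manual = reciprocal / simple.number_of_edges()

# The manual ratio and networkx's value should agree on a simple DiGraph.
print(manual, nx.reciprocity(simple))
```

Here two of the three distinct directed pairs (a→b and b→a) have a counterpart, so both numbers come out as 2/3. A value of exactly 0 would mean that no pair in the layer has an edge in the reverse direction.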

Nonetheless, the results are 0.00.
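Since every approach gives 0, one thing that could be checked directly on the DataFrame, before any graph is built, is whether the data contains reverse pairs at all. The sketch below uses the column names from the code above with made-up values. It assumes `referenced_id` is meant to hold the same kind of identifier as `author_id`; if it holds something else (a tweet ID, for instance, which is only a guess and not confirmed by anything shown here), the overlap would be empty and no edge could ever be reciprocated.

```python
import pandas as pd

# Toy data using the question's column names; the values are made up.
sub_df_multi = pd.DataFrame({
    "author_id":     [1, 2, 1, 3, 4],
    "referenced_id": [2, 1, 3, 9, 4],
})

# Drop self-references, then collect the distinct directed pairs.
pairs = sub_df_multi[sub_df_multi["author_id"] != sub_df_multi["referenced_id"]]
directed = set(zip(pairs["author_id"], pairs["referenced_id"]))

# A pair is reciprocated if the reverse direction is also in the data.
reciprocated = {(u, v) for (u, v) in directed if (v, u) in directed}

# IDs that occur in both roles; an empty set rules out any reciprocity.
shared_ids = set(pairs["author_id"]) & set(pairs["referenced_id"])

print(len(directed), len(reciprocated), len(shared_ids))
```

If `reciprocated` is empty on the real sample, the graph code is probably fine and the zero comes from the data itself, either from the ID columns or from the downsampling having removed one side of each exchange.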

If anyone could help me out, I'd be super grateful. I've been going crazy trying to troubleshoot and find possible mistakes, and I would really love to finish my bachelor thesis at some point.

Upvotes: 1

Views: 144

Answers (0)
