Reputation: 423
I'm trying to implement Top2Vec on Colab. The following code works fine with the dataset available at "https://raw.githubusercontent.com/wjbmattingly/bap_sent_embedding/main/data/vol7.json".
But when I use the dataset "abcnews-date-text.csv", Colab runs endlessly. Any idea how to resolve this?
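import pandas as pd
from top2vec import Top2Vec

# Load the dataset (the loading step was not shown; pd.read_csv is assumed here)
data = pd.read_csv("abcnews-date-text.csv")
# Extract the text data from the dataset
documents = data['headline_text'].tolist()
# Initialize Top2Vec model
top2vec_model = Top2Vec(documents, embedding_model="distiluse-base-multilingual-cased")
# Number of topics to display
num_topics = 5  # You can adjust this number according to your preference
# Get the top topics; get_topics returns (topic_words, word_scores, topic_nums)
topic_words, word_scores, topic_nums = top2vec_model.get_topics(num_topics)
# Print the top words of each topic
for i, words in enumerate(topic_words):
    print(f"Topic {i+1}: {', '.join(words)}")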
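One way to check whether the size of the CSV is what keeps Colab busy is to train on a random sample of the headlines first. The following is only a sketch under that assumption: the helper names `sample_headlines` and `train_on_sample` are hypothetical, the column name `headline_text` is taken from the code above, and the sample size is an arbitrary choice.

```python
import csv
import random


def sample_headlines(csv_path, n, column="headline_text", seed=0):
    """Return up to n values of `column`, picked uniformly at random.

    Uses reservoir sampling, so the file is streamed once and only
    n headlines are held in memory at a time.
    """
    rng = random.Random(seed)
    sample = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.DictReader(f)):
            if i < n:
                # Fill the reservoir with the first n rows.
                sample.append(row[column])
            else:
                # Replace an existing entry with probability n / (i + 1).
                j = rng.randint(0, i)
                if j < n:
                    sample[j] = row[column]
    return sample


def train_on_sample(csv_path, n):
    """Hypothetical helper: fit Top2Vec on a sample instead of the full file."""
    from top2vec import Top2Vec  # imported lazily; requires top2vec to be installed

    documents = sample_headlines(csv_path, n)
    return Top2Vec(documents, embedding_model="distiluse-base-multilingual-cased")
```

Calling `train_on_sample("abcnews-date-text.csv", 20000)` and then increasing the sample size step by step would show whether the run time grows with the number of documents or whether something else is stalling.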
Upvotes: 2
Views: 39