Reputation: 101
I have a CSV file of tweets with the following columns: File, User, Date 1, month, day, Tweet, Permalink, Retweet count, Likes count, Tweet value, Language, Location.
I want to create a new data frame containing only the tweets from certain cities. My code runs, but the result only contains rows for the last city on my list (Girona); the rows for the other cities are not included. Here is my code:
import pandas as pd
import os
path_to_file = "populismo_merge.csv"
df = pd.read_csv(path_to_file, encoding='utf-8', sep=',')
values = df[df['Location'].str.contains("A Coruña",na=False)]
values = df[df['Location'].str.contains("Alava",na=False)]
values = df[df['Location'].str.contains("Albacete",na=False)]
values = df[df['Location'].str.contains("Alicante",na=False)]
values = df[df['Location'].str.contains("Almería",na=False)]
values = df[df['Location'].str.contains("Asturias",na=False)]
values = df[df['Location'].str.contains("Avila",na=False)]
values = df[df['Location'].str.contains("Badajoz",na=False)]
values = df[df['Location'].str.contains("Barcelona",na=False)]
values = df[df['Location'].str.contains("Burgos",na=False)]
values = df[df['Location'].str.contains("Cáceres",na=False)]
values = df[df['Location'].str.contains("Cádiz",na=False)]
values = df[df['Location'].str.contains("Cantabria",na=False)]
values = df[df['Location'].str.contains("Castellón",na=False)]
values = df[df['Location'].str.contains("Ceuta",na=False)]
values = df[df['Location'].str.contains("Ciudad Real",na=False)]
values = df[df['Location'].str.contains("Córdoba",na=False)]
values = df[df['Location'].str.contains("Cuenca",na=False)]
values = df[df['Location'].str.contains("Formentera",na=False)]
values = df[df['Location'].str.contains("Girona",na=False)]
values.to_csv(r'populismo_ciudad.csv', index = False)
Many thanks!!!
Upvotes: 0
Views: 45
Reputation: 2924
You are overwriting the values variable on every line, so only the result of the last filter (Girona) is kept. A more concise approach would be along the lines of:
values = df[df['Location'].isin(["A Coruña", "Alava", ......])]
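If the substring matching of str.contains from the question needs to be kept, one option is to accumulate the matches instead of reassigning them. The following is only a minimal sketch: the small data frame is made up for illustration and stands in for populismo_merge.csv, and the city list is shortened.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real CSV.
df = pd.DataFrame({
    "Tweet": ["a", "b", "c", "d"],
    "Location": ["Barcelona, Spain", "Girona", None, "Madrid"],
})

cities = ["Barcelona", "Girona"]  # shortened list, for illustration only

# Start from an all-False mask and OR in the matches for each city,
# rather than overwriting the result on every line.
mask = pd.Series(False, index=df.index)
for city in cities:
    # regex=False treats the city name as a literal substring.
    mask |= df["Location"].str.contains(city, na=False, regex=False)

values = df[mask]
print(values)
```

Each pass through the loop adds rows to the selection, so the final frame holds every row that matched at least one city.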
Upvotes: 1
Reputation: 120559
Use isin:
import pandas as pd
import os
path_to_file = "populismo_merge.csv"
df = pd.read_csv(path_to_file, encoding='utf-8', sep=',')
cities = ['A Coruña', 'Alava', 'Albacete', 'Alicante', 'Almería',
          'Asturias', 'Avila', 'Badajoz', 'Barcelona', 'Burgos',
          'Cáceres', 'Cádiz', 'Cantabria', 'Castellón', 'Ceuta',
          'Ciudad Real', 'Córdoba', 'Cuenca', 'Formentera', 'Girona']
values = df[df['Location'].isin(cities)]
values.to_csv(r'populismo_ciudad.csv', index = False)
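One caveat: isin compares the whole cell against each city name, whereas the code in the question used str.contains, which matches substrings. If the Location column holds free text such as "Barcelona, Spain", the two approaches may select different rows. The sketch below illustrates the difference on made-up sample rows (not taken from the real file) and shows one possible way to keep substring matching by joining the escaped city names into a single pattern.

```python
import re

import pandas as pd

# Hypothetical sample rows; the real Location values may look different.
df = pd.DataFrame({
    "Tweet": ["a", "b", "c", "d"],
    "Location": ["Barcelona, Spain", "Girona", None, "Madrid"],
})

cities = ["Barcelona", "Girona"]  # shortened list, for illustration only

# Exact match: only cells that are exactly equal to a city name.
exact = df[df["Location"].isin(cities)]

# Substring match: build "Barcelona|Girona" and search inside each cell.
# re.escape guards against regex metacharacters in the names.
pattern = "|".join(re.escape(city) for city in cities)
substring = df[df["Location"].str.contains(pattern, na=False)]

print(exact)
print(substring)
```

Which variant is appropriate depends on how the Location values were entered, so it is worth inspecting a few rows of the column first.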
Upvotes: 1