Reputation: 63
I have a dataset that looks something like this:
Locations Lat Long
1 El Ay 36.086 4.777
2 Burbank, California 34.181 -118.309
3 Nashville, TN 36.163 -86.782
4 On the lam 42.920 -80.285
5 San Dog, CA 32.734 -117.193
6 New York City 40.713 -74.006
7 Dreamland 33.642 -97.315
8 LA 34.052 -118.244
9 Los Angeles 34.052 -118.244
10 United States 37.090 -95.713
Basically, the first column contains location names entered by users, and columns 2 and 3 are the latitudes and longitudes of these locations.
I want to summarize this dataset with ddply() so that it tabulates the frequency of each city by Lat and Long. I tried ddply(data, .(Lat, Long), summarize, count = length(Lat))
and it gave me the table below (without city names):
Lat Long count
1 32.734 -117.193 1
2 33.642 -97.315 1
3 34.052 -118.244 2
4 34.181 -118.309 1
5 36.086 4.777 1
6 36.163 -86.782 1
7 37.090 -95.713 1
8 40.713 -74.006 1
9 42.920 -80.285 1
I also tried ddply(data, .(Locations, Lat, Long), summarize, count = length(Lat))
and got:
Locations Lat Long count
1 Burbank, California 34.181 -118.309 1
2 Dreamland 33.642 -97.315 1
3 El Ay 36.086 4.777 1
4 LA 34.052 -118.244 1
5 Los Angeles 34.052 -118.244 1
6 Nashville, TN 36.163 -86.782 1
7 New York City 40.713 -74.006 1
8 On the lam 42.920 -80.285 1
9 San Dog, CA 32.734 -117.193 1
10 United States 37.090 -95.713 1
I want to keep the location names in the output, but I also want LA and Los Angeles to be tabulated together (the name shown can be either LA or Los Angeles). What should I do?
Thanks
Upvotes: 2
Views: 156
Reputation: 4216
Using dplyr, this groups together locations by common latitude and longitude and gives the count. If there are multiple names for the same lat/long, it will just keep the first name.
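library(dplyr)
# Group rows that share the same coordinates
data2 <- data %>%
  group_by(Lat, Long) %>%
  summarize(
    Locations = first(Locations),  # keep the first name in each group
    Count = n())                   # number of rows in each group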
The result:
> data2
Source: local data frame [9 x 4]
Groups: Lat [?]
Lat Long Locations Count
(dbl) (dbl) (fctr) (int)
1 32.734 -117.193 SanDog,CA 1
2 33.642 -97.315 Dreamland 1
3 34.052 -118.244 LA 2
4 34.181 -118.309 Burbank,California 1
5 36.086 4.777 ElAy 1
6 36.163 -86.782 Nashville,TN 1
7 37.090 -95.713 UnitedStates 1
8 40.713 -74.006 NewYorkCity 1
9 42.920 -80.285 Onthelam 1
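Since the question uses ddply(), the same idea can probably be expressed in plyr as well: group by Lat and Long only, and pick one name per group inside summarize. The following is a minimal sketch, not a tested drop-in; the data frame is retyped from the question, and data3 is just an illustrative name. It assumes only plyr is attached, so that summarize refers to plyr's version rather than dplyr's.

```r
library(plyr)

# Sample data retyped from the question; stringsAsFactors = FALSE keeps
# the location names as plain character strings.
data <- data.frame(
  Locations = c("El Ay", "Burbank, California", "Nashville, TN", "On the lam",
                "San Dog, CA", "New York City", "Dreamland", "LA",
                "Los Angeles", "United States"),
  Lat  = c(36.086, 34.181, 36.163, 42.920, 32.734,
           40.713, 33.642, 34.052, 34.052, 37.090),
  Long = c(4.777, -118.309, -86.782, -80.285, -117.193,
           -74.006, -97.315, -118.244, -118.244, -95.713),
  stringsAsFactors = FALSE
)

# Group by coordinates only, keep the first name seen in each group,
# and count how many rows share that Lat/Long pair.
# data3 is a hypothetical name chosen for this example.
data3 <- ddply(data, .(Lat, Long), summarize,
               Locations = Locations[1],
               count = length(Lat))
```

With this data, the LA and Los Angeles rows fall into one group because they share the same coordinates, and the name kept is whichever appears first in the input.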
Upvotes: 3