Reputation: 321
I have two dataframes that I want to combine but I'm not sure how I would as one of them has a different number of columns. The first dataframe is this one:
Species Name Country Lat Lon
Corynosoma hannae New Zealand -46.5000 170.3000
Polymorphus brevis Mexico 19.4206 -102.2060
Acanthocephala terminalis United States 38.1806 -83.4505
Polymorphus brevis Mexico 30.5603 -115.9420
Polymorphus brevis Mexico 19.6728 -99.7078
Polymorphus brevis Mexico 19.6833 -101.8830
Polymorphus brevis Mexico 30.5603 -115.9420
Polymorphus brevis Mexico 30.5603 -115.9420
The second dataframe:
Species Name Country Number of Records
Corynosoma hannae New Zealand 3
Polymorphus brevis Mexico 41
Acanthocephala terminalis United States 1
The second dataframe tallies how many species samples were obtained for each country. I want to be able to add Number of Records to the first dataframe mainly be grouping such that the dataframe appears as so:
Species Name Country Lat Lon Number of Records
Corynosoma hannae New Zealand -46.5000 170.3000 3
Acanthocephala terminalis United States 38.1806 -83.4505 1
Polymorphus brevis Mexico 30.5603 -115.9420 41
Polymorphus brevis Mexico 19.6728 -99.7078
Polymorphus brevis Mexico 19.6833 -101.8830
Polymorphus brevis Mexico 30.5603 -115.9420
Polymorphus brevis Mexico 30.5603 -115.9420
Acanthocephala confraterna United States 35.6859 -83.4986 2
So I don't want to repeat, for example 41, for every row of Polymorphus brevis. I want it to just be one row that groups all of the Polymorphus brevis samples found in Mexico into the column Number of Records. Any help with this would be appreciated. I'm trying to create bubblemap with this dataframe using the rworldmap package.
Upvotes: 0
Views: 84
Reputation: 14764
Something like:
library(dplyr)
left_join(df1, df2, by = c("Species Name", "Country")) %>%
group_by(`Species Name`, Country) %>%
mutate(
`Number of Records` = as.numeric(as.character(`Number of Records`)),
`Number of Records` = ifelse(row_number() == 1, `Number of Records`, NA)
)
Upvotes: 2
Reputation: 21400
I agree with the two previous answers suggesting to simply add a new column to one data frame with info from the other data frame. One way of doing this is by using the function match()
:
# define 1st df:
df1 <- data.frame(
Observations = c("obs1", "obs2", "obs3"),
Data = c(sample(1:20, 3))
)
# define 2nd df:
df2 <- data.frame(
OtherObservations = c("obs1", "obs2", "obs3"),
OtherData = c(1, 2, NA)
)
# now add to df1 the relevant column in df2 based on matching data in either data frame:
df1$NewColumn <- df2$OtherData[match(df1$Observations, df2$OtherObservations)]
df1
Upvotes: 1
Reputation: 11140
Based on our comments a simpler solution might be to add a new column to dataframe 1 rather that joining two dataframes -
library(dplyr)
df1 %>%
group_by(`Species Name`, Country) %>%
mutate(
nRecords = ifelse(row_number() == 1, n(), NA_integer_)
) %>%
ungroup()
Upvotes: 0