stalecrackers

Reputation: 13

How to order data frame by a partial string (or first word)

I'm very new to R and I'm self-learning basic operations.

I would like to sort my data frame so that rows are grouped by state (the part after the comma), yielding the following:

County   Population
ACounty, Alabama   106242
BCounty, Alabama   362845
ACounty, Texas   242342
BCounty, Texas   293729

I've tried:

df<-df %>% arrange(County)
view(df)

Which ends up as:

County   Population
ACounty, Alabama   106242
ACounty, Texas   242342
BCounty, Alabama   362845
BCounty, Texas   293729

Upvotes: 1

Views: 33

Answers (2)

akrun

Reputation: 887128

We can do this without splitting or uniting the column: arrange on County with everything from the comma onward removed.

library(dplyr)
library(stringr)
df1 %>% 
    arrange(str_remove(County, ",.*"))
#            County Population
#1 ACounty, Alabama     106242
#2   ACounty, Texas     242342
#3 BCounty, Alabama     362845
#4   BCounty, Texas     293729

data

df1 <- structure(list(County = c("ACounty, Alabama", "BCounty, Alabama", 
"ACounty, Texas", "BCounty, Texas"), Population = c(106242L, 
362845L, 242342L, 293729L)), class = "data.frame", row.names = c(NA, 
-4L))
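
If the goal is the ordering requested in the question (state first, then county), a similar approach might sort on both parts of the string. This is a minimal sketch, assuming every County value has the form "Name, State"; the shuffled input rows and the name sorted are illustrative and not part of the original answer.

```r
library(dplyr)
library(stringr)

# Same rows as in the question, deliberately shuffled (illustrative input)
df1 <- data.frame(
  County = c("ACounty, Texas", "BCounty, Alabama",
             "BCounty, Texas", "ACounty, Alabama"),
  Population = c(242342L, 362845L, 293729L, 106242L)
)

sorted <- df1 %>%
  arrange(
    str_remove(County, "^.*,\\s*"),  # state: drop everything up to the comma
    str_remove(County, ",.*$")       # county: drop the comma and what follows
  )

print(sorted)
```

The County column itself is never modified; the two helper strings exist only inside arrange().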

Upvotes: 0

Ronak Shah

Reputation: 388982

You can split County into separate County and State columns, arrange the data by State, and then unite the two columns again.

library(dplyr)
library(tidyr)

df %>%
  separate(County, c('County', 'State'), sep = ",\\s*") %>%
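  arrange(State, County) %>%                 # state first, then county within each state
  unite(County, County, State, sep = ", ")   # restore the original "County, State" format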

In base R, you can keep only the state by removing everything up to the comma, then use order to arrange the data by state.

df[order(sub('.*,', '', df$County)), ]
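
The base R one-liner sorts on the state only. To break ties by county name as well, order can take a second key. This is a rough sketch under the assumption that each value looks like "Name, State"; the helper variables state and county, and the shuffled example data, are introduced here for illustration.

```r
# Same rows as in the question, deliberately shuffled (illustrative input)
df <- data.frame(
  County = c("BCounty, Texas", "ACounty, Alabama",
             "ACounty, Texas", "BCounty, Alabama"),
  Population = c(293729L, 106242L, 242342L, 362845L),
  stringsAsFactors = FALSE
)

state  <- sub("^.*,\\s*", "", df$County)  # text after the comma
county <- sub(",.*$", "", df$County)      # text before the comma

# order() sorts by the first key and uses later keys to break ties
sorted <- df[order(state, county), ]
rownames(sorted) <- NULL                  # reset row names after reordering

print(sorted)
```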

Upvotes: 1
