french_fries

Reputation: 1

Create a type column in dataframe

I have a dataframe:

x     y
A1  ''
A2  '123,0'
A3  '4557777'
A4  '8756784321675'
A5  ''
A6  ''
A7  
A8
A9  '1533,10'
A10
A11 '51'

I want to add a column "type" with three possible values: 1 if the value in y is a number without a comma, 2 if it is a number with a comma, and 3 if it is the empty value '' (two apostrophes). The desired output is:

x     y               type
A1  ''                3
A2  '123,0'           2
A3  '4557777'         1
A4  '8756784321675'   1
A5  ''                3
A6  ''                3
A7  
A8
A9  '1533,10'         2
A10
A11 '51'              1

How can I do this? The part I find most unclear is how to identify each type in column y.

Upvotes: 0

Views: 33

Answers (2)

dvd280

Reputation: 962

Assuming the rows with no value contain NA, I would divide the values into three groups:

  • Those which are empty strings (type 3)
  • Those which can be converted to numeric without producing NA (type 1)
  • Those which are NA (no value)

Everything outside these three groups belongs to type 2, so:

THREE <- which(df$y == "")
# as.numeric() gives NA (with a warning) for strings such as "123,0"
ONE <- which(!is.na(suppressWarnings(as.numeric(df$y))))
EMPTY <- which(is.na(df$y))

# start with 2 everywhere, then overwrite the other groups
type <- rep(2, nrow(df))

type[THREE] = 3
type[ONE] = 1
type[EMPTY] = NA

Finally, you have a vector that you can attach to your data frame as a column with:

df2 = cbind(df,type)
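
To check the idea end to end, here is a minimal sketch on a small sample. The data is an assumption: it takes the empty values to be empty strings and the rows with no value to be NA, which may not match how your data frame was actually read in. It uses logical masks instead of which(), which is only a stylistic choice.

```r
# Assumed sample data: NA stands in for the rows with no value at all
df <- data.frame(
  x = paste0("A", 1:6),
  y = c("", "123,0", "4557777", NA, "1533,10", "51"),
  stringsAsFactors = FALSE
)

is_missing <- is.na(df$y)
is_empty   <- !is_missing & df$y == ""
# as.numeric() returns NA (with a warning) for strings such as "123,0"
is_plain   <- !is.na(suppressWarnings(as.numeric(df$y)))

type <- rep(2, nrow(df))   # default: number with a comma
type[is_empty]   <- 3
type[is_plain]   <- 1
type[is_missing] <- NA

df2 <- cbind(df, type)
```

Note that anything that is neither empty, missing, nor a plain number ends up as type 2 here, so a value like "abc" would also be labelled 2.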

Upvotes: 0

Chris Ruehlemann

Reputation: 21400

Here's a solution via ifelse and regex:

Data:

df <- data.frame(
  y = c("", "", "1,234", "5678", "001,2", "", "455"), stringsAsFactors = FALSE)

Solution:

df$type <- ifelse(grepl(",", df$y), 2,
                  ifelse(grepl("[^,]", df$y), 1, 3))

Result:

df
      y type
1          3
2          3
3 1,234    2
4  5678    1
5 001,2    2
6          3
7   455    1

Update:

df <- data.frame(
  y = c("''", "", "1,234", "5678", "001,2", "", "''", "455"), stringsAsFactors = FALSE)

df$type <- ifelse(grepl(",", df$y), 2,
                  ifelse(grepl("[^,']", df$y), 1,
                         ifelse(df$y=="", "", 3)))

df
      y type
1    ''    3
2           
3 1,234    2
4  5678    1
5 001,2    2
6           
7    ''    3
8   455    1

Is this what you had in mind?
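
If you would rather keep type numeric and get NA (instead of "") for rows without a value, a variation along these lines might work. This is only a sketch: the helper name classify_y is made up for illustration, and it assumes the apostrophes may or may not be literally present around the numbers. The patterns are anchored, so anything that is not purely digits (optionally with one comma) stays NA.

```r
# Hypothetical helper; the name classify_y is an assumption for illustration
classify_y <- function(y) {
  type <- rep(NA_integer_, length(y))          # NA for anything unmatched
  type[grepl("^'?[0-9]+'?$", y)]        <- 1L  # digits only
  type[grepl("^'?[0-9]+,[0-9]+'?$", y)] <- 2L  # digits, one comma, digits
  type[!is.na(y) & y == "''"]           <- 3L  # literal two apostrophes
  type
}

# Assumed sample values, with and without literal apostrophes
y <- c("''", "'123,0'", "4557777", "", NA, "'1533,10'", "51")
res <- classify_y(y)
```

You could then assign it with df$type <- classify_y(df$y).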

Upvotes: 2
