Reputation: 996
I have a variable actor, which is a string and contains values like "military forces of guinea-bissau (1989-1992)" and a large range of other, fairly complex values. I have been using grep() to find character patterns that match different types of actors. For example, I would like to code a new variable actor_type as 1 when actor contains "military forces of", doesn't contain "mutiny of", and the string variable country is also contained in the variable actor.
I am at a loss as to how to conditionally create this new variable without resorting to some type of horrible for loop. Help me!
Data looks roughly like this:
| | actor | country |
|---+----------------------------------------------------+-----------------|
| 1 | "military forces of guinea-bissau" | "guinea-bissau" |
| 2 | "mutiny of military forces of guinea-bissau" | "guinea-bissau" |
| 3 | "unidentified armed group (guinea-bissau)" | "guinea-bissau" |
| 4 | "mfdc: movement of democratic forces of casamance" | "guinea-bissau" |
Upvotes: 4
Views: 4062
Reputation: 43255
If your data is in a data.frame df:
> ifelse(!grepl('mutiny of' , df$actor) & grepl('military forces of',df$actor) & apply(df,1,function(x) grepl(x[2],x[1])),1,0)
[1] 1 0 0 0
grepl returns a logical vector, which ifelse turns into 1/0 here. The result can be assigned to whatever you like, e.g. df$actor_type.
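As a minimal sketch, assuming the sample rows from the question are typed in by hand as a data.frame named df with exactly the two columns actor and country (in that order), the assignment could look like this:

```r
# Sample data copied from the question; the construction itself is an assumption
df <- data.frame(
  actor = c("military forces of guinea-bissau",
            "mutiny of military forces of guinea-bissau",
            "unidentified armed group (guinea-bissau)",
            "mfdc: movement of democratic forces of casamance"),
  country = "guinea-bissau",
  stringsAsFactors = FALSE
)

# 1 when actor has "military forces of", lacks "mutiny of",
# and the row's country appears inside the row's actor; 0 otherwise
df$actor_type <- ifelse(
  grepl("military forces of", df$actor) &
    !grepl("mutiny of", df$actor) &
    apply(df, 1, function(x) grepl(x[2], x[1])),
  1, 0
)
```

Only the first row meets all three conditions, so df$actor_type comes out as 1 0 0 0, matching the output above.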
Breaking that apart: !grepl('mutiny of', df$actor) and grepl('military forces of', df$actor) satisfy your first two requirements. The last piece, apply(df,1,function(x) grepl(x[2],x[1])), goes row by row and greps for country in actor. Note that x[1] and x[2] are positional, so this relies on actor being the first column and country the second.
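If a country name could contain regex metacharacters such as parentheses or dots, the row-wise check might be safer as a literal match. One possible variant, not part of the expression above, uses mapply() with fixed = TRUE and refers to the columns by name rather than position. The helper name country_in_actor is made up for illustration, and df is again the sample from the question typed in by hand:

```r
# Sample data copied from the question; the construction itself is an assumption
df <- data.frame(
  actor = c("military forces of guinea-bissau",
            "mutiny of military forces of guinea-bissau",
            "unidentified armed group (guinea-bissau)",
            "mfdc: movement of democratic forces of casamance"),
  country = "guinea-bissau",
  stringsAsFactors = FALSE
)

# Pair each country with the actor in the same row and test for a
# literal (non-regex) substring match
country_in_actor <- mapply(
  function(pattern, text) grepl(pattern, text, fixed = TRUE),
  df$country, df$actor,
  USE.NAMES = FALSE
)

# Combine the three conditions; as.integer() maps TRUE/FALSE to 1/0
df$actor_type <- as.integer(
  grepl("military forces of", df$actor, fixed = TRUE) &
    !grepl("mutiny of", df$actor, fixed = TRUE) &
    country_in_actor
)
```

On the four sample rows this gives the same 1 0 0 0 result, and it keeps working if more columns are added to df.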
Upvotes: 5