Reputation: 172
I'm trying to form arguments for use in the reshape() function. I have a vector of column names, some of which should be merged by reshape() because they share the same letter at the end:
> v <- c("x","da","db","ea","eb","ec","fb")
Most of these column names are composed of a pre character followed by a post character. pre will be the timevar argument and post will be the v.names argument in reshape(). They are defined as:
> pre <- c("d","e","f")
> post <- c("a","b","c")
I have organized the problem this way since there are a variable number of columns I will have to perform this on for different files. By parsing the column names like this, I'm sure I can do this with an algorithm rather than a manual hack.
My desired output is a list of vectors that only include elements of v that share the same post letter. The intention is to use these as the varying parameter in reshape():
> desired_lov
$a
[1] "da" "ea"
$b
[1] "db" "eb" "fb"
In addition, I would like to keep track of which elements of the original v vector are missing from desired_lov. The intention is to use these as the idvar parameter in reshape():
> desired_idh
[1] "x" "ec"
With all that given, someone helped me build a list of vectors of possible column names from those prefixes and postfixes. Each vector in this list is named after an element in post, and I believe this is important for this to work with reshape(), since it will merge the columns in each vector under a common name:
> lov <- Map(function(x) paste0(pre,x),post)
> lov
$a
[1] "da" "ea" "fa"
$b
[1] "db" "eb" "fb"
$c
[1] "dc" "ec" "fc"
Except this builds more names from those combinations than actually exist in v. So I would like to keep track of which names in v do not exist in lov, for which I've tried:
> idh <- NULL
> Map(function(x) idh <- paste(idh,lov[[x]][lov[[x]] %in% v]),1:length(lov))
[[1]]
[1] " da" " ea"
[[2]]
[1] " db" " eb" " fb"
[[3]]
[1] " ec"
> idh
NULL
Except apparently I'm not succeeding in modifying the idh variable using Map().
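One likely explanation, sketched here: the <- inside the anonymous function binds a local idh that is discarded when the function returns, so the global idh is never touched. A minimal sketch that collects the return values instead is shown below; the name matched is introduced only for illustration and is not part of the original attempt.

```r
v <- c("x","da","db","ea","eb","ec","fb")
pre <- c("d","e","f")
post <- c("a","b","c")
lov <- Map(function(x) paste0(pre, x), post)

# Use the value returned by lapply() rather than assigning inside the function:
# an assignment with <- in the function body only creates a local variable.
matched <- unlist(lapply(lov, function(candidates) candidates[candidates %in% v]),
                  use.names = FALSE)

# Names in v that none of the pre/post combinations produced
idh <- setdiff(v, matched)
```

At this stage idh only holds "x"; "ec" is still matched because the single-column groups have not been dropped yet.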
For the next step (after I figure out the bit immediately above), in order to strip out the elements of lov that don't match v, I've tried:
> Map(function(x) lov[[x]] <- lov[[x]][lov[[x]] %in% v],1:length(lov))
[[1]]
[1] "da" "ea"
[[2]]
[1] "db" "eb" "fb"
[[3]]
[1] "ec"
> lov
$a
[1] "da" "ea" "fa"
$b
[1] "db" "eb" "fb"
$c
[1] "dc" "ec" "fc"
This gives me promising output (I would still need to remove all vectors of length < 2 from that list, since I'm only looking for columns that share a second character), but once again it failed to actually modify lov by removing the elements I was trying to remove.
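A minimal sketch of one way to get a modified list while keeping its names, assuming it is acceptable to assign the result of lapply() to a new variable instead of modifying lov in place. The names lov_present, lov_kept and leftover are hypothetical and chosen only for this example.

```r
v <- c("x","da","db","ea","eb","ec","fb")
pre <- c("d","e","f")
post <- c("a","b","c")
lov <- Map(function(x) paste0(pre, x), post)

# lapply() returns a new list and carries the names a, b, c over from lov
lov_present <- lapply(lov, function(candidates) candidates[candidates %in% v])

# Keep only the groups that still contain at least two columns
lov_kept <- Filter(function(cols) length(cols) >= 2, lov_present)

# Everything in v that is not part of a kept group
leftover <- setdiff(v, unlist(lov_kept))
```

With the example data, lov_kept has the elements a and b, and leftover contains "x" and "ec", which matches the desired output described earlier.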
I've tried searching, but all I keep finding are ways to remove elements of vectors. This seems to be a much different problem since I'm trying to remove elements from multiple vectors embedded in a list while trying to preserve the vector names in that list.
Edit: I do know about x ahead of time, so I can manually exclude it where needed. But I don't know ahead of time that c is a unique postfix (in this particular example), so it needs to be determined within the script.
Upvotes: 0
Views: 88
Reputation: 28441
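# Count how many names in v contain each post letter (grep matches anywhere in the name)
freq <- lapply(Map(function(x) grep(x, v), post), length)
# Positions in v for the post letters that occur more than once
index <- Map(function(x) grep(x, v), names(freq)[freq>1])
lapply(index, function(x) v[x])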
$a
[1] "da" "ea"
$b
[1] "db" "eb" "fb"
and
v[-unlist(index)]
[1] "x" "ec"
Data used above:
v <- c("x","da","db","ea","eb","ec","fb")
pre <- c("d","e","f")
post <- c("a","b","c")
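A hedged variant of the same idea, as a sketch rather than a replacement for the code above: grep(x, v) matches the letter anywhere in a name, and v[-unlist(index)] returns an empty vector when index is empty. Assuming R 3.3.0 or later for endsWith(), the version below matches only at the end of each name and uses a logical mask for the leftovers. The names varying and idvar are chosen here to mirror the reshape() arguments.

```r
v <- c("x","da","db","ea","eb","ec","fb")
post <- c("a","b","c")

# Positions of the names that end with each post letter
index <- Map(function(p) which(endsWith(v, p)), post)

# Keep only the post letters shared by more than one column
index <- index[lengths(index) > 1]

# Column names grouped by shared post letter
varying <- lapply(index, function(i) v[i])

# Remaining columns; the logical mask also works when index is empty
idvar <- v[!seq_along(v) %in% unlist(index)]
```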
Upvotes: 1