Reputation: 45
I have a list of data frames, and I want to loop through the columns of each data frame in the list to create new variables using C++ code (I'm learning Rcpp).
The input will look like:
$df1
a  b  c
5 30  2
4  2 15
3  2 17

$df2
a  b  c
5 30  2
4  2 15
3  2 17
Ideally, the output would be:
$df1
   a     b  c
5.02 30.02  2
4.15  2.15 15
3.17  2.17 17

$df2
   a     b  c
5.02 30.02  2
4.15  2.15 15
3.17  2.17 17
I would like to drop column c afterwards, but for now I'm just trying to figure out the C++ code for this step.
NOTE: I want the 2 in row 1 of column c to be appended as 02, not 20, so that every appended value has the same width and the result is accurate.
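One way to read the requirement above: appending c as a two-digit decimal part is the same as adding c / 100, so a 2 lands as .02 rather than .20 without any string padding. A minimal sketch in plain C++ (no Rcpp, so it compiles standalone), assuming c is always an integer between 0 and 99. The names withHundredths and formatTwoDecimals are made up for illustration:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: treat `frac` (assumed to be 0..99) as hundredths
// and add it to `value`, e.g. 5 and 2 -> 5.02.
double withHundredths(double value, int frac) {
    assert(frac >= 0 && frac <= 99);  // assumption from the lead-in
    return value + frac / 100.0;
}

// Hypothetical helper: format with exactly two decimals so every
// result shows the same number of digits after the point.
std::string formatTwoDecimals(double x) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.2f", x);
    return std::string(buf);
}
```

With the first row of the example input, formatTwoDecimals(withHundredths(5, 2)) yields "5.02" and formatTwoDecimals(withHundredths(30, 2)) yields "30.02".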
Upvotes: 1
Views: 975
Reputation: 45
@Ralf Stubner, I figured I'd give you a concrete example.
df1 <- data.frame(a = sample(1:100, 3), b = sample(1:100, 3), c = sample(0:99, 3))
which gives (I didn't call set.seed):
df1
 a  b  c
28 70 70
14 63  5
 8 12 20
# replicate df1 10 times and store the copies as a list
dsets <- do.call("list", replicate(10, df1, simplify = FALSE))
Then I run:
listDf(dsets)
And output is as follows:
[[9]]
   a    b  c
35.0 77.0 70
14.5 63.5  5
10.0 14.0 20

[[10]]
   a    b  c
35.0 77.0 70
14.5 63.5  5
10.0 14.0 20
Probably something simple I'm missing?
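A possible explanation, offered as a hypothesis rather than a confirmed diagnosis: the numbers match the addition being applied ten times to one shared object (28 + 10 × 0.70 = 35.0, 14 + 10 × 0.05 = 14.5, 8 + 10 × 0.20 = 10.0). That would happen if the ten list elements produced by replicate all refer to the same data frame and listDf modifies it in place. The plain C++ sketch below imitates that situation, with std::shared_ptr standing in for the shared data frame. It is an analogy, not Rcpp or R internals, and applyToAll, sharedCase, and copiedCase are made-up names:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using Column = std::vector<double>;

// Hypothetical stand-in for listDf: add frac/100, in place, to the
// column behind every list element.
void applyToAll(std::vector<std::shared_ptr<Column>>& list, const Column& frac) {
    for (auto& col : list) {
        assert(col->size() == frac.size());
        for (std::size_t r = 0; r < col->size(); ++r) {
            (*col)[r] += frac[r] / 100.0;
        }
    }
}

// Ten aliases of one column: every pass of the loop hits the same data.
Column sharedCase() {
    auto a = std::make_shared<Column>(Column{28.0, 14.0, 8.0});
    std::vector<std::shared_ptr<Column>> list(10, a);
    applyToAll(list, Column{70.0, 5.0, 20.0});
    return *a;
}

// Ten independent copies: each copy is updated exactly once.
Column copiedCase() {
    std::vector<std::shared_ptr<Column>> list;
    for (int i = 0; i < 10; ++i) {
        list.push_back(std::make_shared<Column>(Column{28.0, 14.0, 8.0}));
    }
    applyToAll(list, Column{70.0, 5.0, 20.0});
    return *list[9];
}
```

In this sketch sharedCase() ends at roughly 35.0, 14.5, 10.0, the same values as column a in the output above, while copiedCase() gives roughly 28.70, 14.05, 8.20.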
Upvotes: 0
Reputation: 26823
I am not sure exactly what you are trying to do, but here is some quick-and-dirty code that loops over the columns in a list of data frames:
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::List listDf(Rcpp::List l) {
    // Visit every data frame in the list.
    for (int i = 0; i < l.length(); ++i) {
        Rcpp::DataFrame df = Rcpp::as<Rcpp::DataFrame>(l[i]);
        // Visit every column and replace it with a scaled copy.
        for (int j = 0; j < df.cols(); ++j) {
            Rcpp::NumericVector col = df[j];
            df[j] = 1.23 * col;
        }
    }
    return l;
}
/*** R
set.seed(42)
df1 <- data.frame(a = sample(1:100, 3),
                  b = sample(1:100, 3),
                  c = sample(1:100, 3))
df2 <- data.frame(a = sample(1:100, 3),
                  b = sample(1:100, 3),
                  c = sample(1:100, 3))
l <- list(df1 = df1, df2 = df2)
listDf(l)
*/
And if you actually want to add 1/100 of the last column to the other columns, you can use:
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::List listDf(Rcpp::List l) {
    for (int i = 0; i < l.length(); ++i) {
        Rcpp::DataFrame df = Rcpp::as<Rcpp::DataFrame>(l[i]);
        // The last column holds the values to append as hundredths.
        Rcpp::NumericVector last = df[df.cols() - 1];
        // Update every column except the last one.
        for (int j = 0; j < df.cols() - 1; ++j) {
            Rcpp::NumericVector col = df[j];
            df[j] = col + last / 100.0;
        }
    }
    return l;
}
/*** R
set.seed(42)
df1 <- data.frame(a = sample(1:100, 3),
                  b = sample(1:100, 3),
                  c = sample(0:99, 3))
df2 <- data.frame(a = sample(1:100, 3),
                  b = sample(1:100, 3),
                  c = sample(0:99, 3))
l <- list(df1 = df1, df2 = df2)
listDf(l)
*/
Output:
> listDf(l)
$df1
      a     b  c
1 92.73 84.73 73
2 93.13 64.13 13
3 29.64 51.64 64

$df2
       a     b  c
1  71.94 94.94 94
2  46.96 26.96 96
3 100.11 46.11 11
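The question also mentions dropping column c afterwards. The same per-column arithmetic, followed by that drop, can be sketched in plain C++ without Rcpp, assuming each data frame is modelled as a vector of equally long numeric columns. The Frame alias and mergeLastAsHundredths are illustrative names, not part of Rcpp:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Column = std::vector<double>;
using Frame = std::vector<Column>;  // illustrative model: one inner vector per column

// Add 1/100 of the last column to every other column, then drop the
// last column. Works on a copy, so the caller's frames stay untouched.
std::vector<Frame> mergeLastAsHundredths(std::vector<Frame> frames) {
    for (Frame& df : frames) {
        assert(!df.empty());
        const Column last = df.back();  // copy of column c
        for (std::size_t j = 0; j + 1 < df.size(); ++j) {
            assert(df[j].size() == last.size());
            for (std::size_t r = 0; r < last.size(); ++r) {
                df[j][r] += last[r] / 100.0;
            }
        }
        df.pop_back();  // drop column c
    }
    return frames;
}
```

Taking the list by value is a deliberate choice in this sketch: the input is copied once, which keeps the original data unchanged at the cost of extra memory.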
Upvotes: 4