zeekster26

Reputation: 45

R to C++ code to loop through list of data frames (Rcpp)

I have a list of data frames, and I want to loop through the columns of each data frame in the list to create new variables using C++ code (I'm learning Rcpp).

The input will look like:

    $`df1`
    a  b  c
    5 30  2
    4  2 15
    3  2 17

    $df2
    a  b  c
    5 30  2
    4  2 15
    3  2 17

Ideally, the output would be:

    $`df1`
    a     b     c
    5.02 30.02  2
    4.15 2.15   15
    3.17 2.17   17

    $df2
    a     b      c
    5.02  30.02  2
    4.15  2.15   15
    3.17  2.17   17

I would like to drop column c afterwards, but right now I'm just trying to figure out the C++ code for the step above.

NOTE: I want the 2 in column c, row 1 to be appended as 02 and not 20, so that every appended value has the same width and the result is accurate.
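The arithmetic behind the note can be sketched in plain C++ (no Rcpp): dividing the value from column c by 100 before adding it places a one-digit value such as 2 in the hundredths position, giving 5.02 rather than 5.20. The helper names `attach_fraction` and `format_two_decimals` are made up for this illustration, and the sketch assumes column c only holds integers from 0 to 99.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: treats `frac` (assumed 0..99) as two decimal digits
// and appends them to `whole`, so (5, 2) gives 5.02 rather than 5.2.
double attach_fraction(int whole, int frac) {
    assert(frac >= 0 && frac < 100);  // only two digits fit after the point
    return whole + frac / 100.0;
}

// Hypothetical helper: formats with exactly two decimals so the leading
// zero of a one-digit fraction stays visible.
std::string format_two_decimals(double value) {
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%.2f", value);
    return std::string(buffer);
}
```

For example, `format_two_decimals(attach_fraction(5, 2))` yields `"5.02"`, and `format_two_decimals(attach_fraction(4, 15))` yields `"4.15"`, matching the first two rows of the desired output.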

Upvotes: 1

Views: 975

Answers (2)

zeekster26

Reputation: 45

@Ralf Stubner, I figured I'd give you a visual:

df1 <- data.frame(a = sample(1:100, 3), b = sample(1:100, 3), c = sample(0:99, 3))

gives (didn't set.seed):

  df1
  a  b  c
  28 70 70
  14 63  5
   8 12 20

dsets <- do.call("list", replicate(10, df1, simplify = FALSE))  # replicate df1 10 times and store as a list

Run this

       listDf(dsets)

And output is as follows:

[[9]]
  a    b  c
35.0 77.0 70
14.5 63.5  5
10.0 14.0 20

[[10]]
  a    b  c
35.0 77.0 70
14.5 63.5  5
10.0 14.0 20

Probably something simple I'm missing?
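One possible explanation, offered as an assumption rather than a confirmed diagnosis: the numbers above are exactly what you get if c/100 is added ten times instead of once (28 + 10 × 0.70 = 35.0, 14 + 10 × 0.05 = 14.5). That would happen if all ten list entries refer to the same underlying data frame and listDf modifies it in place. The plain C++ sketch below (no Rcpp) mimics that situation with a hypothetical `Table` struct and `std::shared_ptr`; all names in it are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a data frame with just columns 'a' and 'c'.
struct Table {
    std::vector<double> a;
    std::vector<double> c;
};

// Adds c/100 to a, modifying the table in place.
void add_fraction_in_place(Table& t) {
    assert(t.a.size() == t.c.size());
    for (std::size_t r = 0; r < t.a.size(); ++r) {
        t.a[r] += t.c[r] / 100.0;
    }
}

// Processes every list entry. With copy_first == true each entry is
// replaced by its own copy before it is modified.
std::vector<std::shared_ptr<Table>> process(
        std::vector<std::shared_ptr<Table>> list, bool copy_first) {
    for (auto& entry : list) {
        if (copy_first) {
            entry = std::make_shared<Table>(*entry);
        }
        add_fraction_in_place(*entry);
    }
    return list;
}
```

With ten entries that all point at one `Table` holding a = 28 and c = 70, `process(list, false)` leaves a at about 35.0 in every entry, while `process(list, true)` gives about 28.7 and leaves the original untouched. In Rcpp the analogous step would be to deep-copy each data frame (for instance with `Rcpp::clone`) before assigning to its columns.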

Upvotes: 0

Ralf Stubner

Reputation: 26823

I am not sure what exactly you are trying to do, but here is some quick and dirty code to loop over the columns in a list of data frames:

#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::List listDf(Rcpp::List l) {
  // Visit every data frame in the list.
  for (int i = 0; i < l.length(); ++i) {
    Rcpp::DataFrame df = Rcpp::as<Rcpp::DataFrame>(l[i]);
    // Visit every column and replace it with a scaled version.
    for (int j = 0; j < df.cols(); ++j) {
      Rcpp::NumericVector col = df[j];
      df[j] = 1.23 * col;
    }
  }
  return l;
}

/*** R
set.seed(42)
df1 <- data.frame(a = sample(1:100, 3),
                  b = sample(1:100, 3),
                  c = sample(1:100, 3))

df2 <- data.frame(a = sample(1:100, 3),
                  b = sample(1:100, 3),
                  c = sample(1:100, 3))

l <- list(df1 = df1, df2 = df2)

listDf(l)

*/

And if you actually want to add 1/100 of the last column to the other columns, you can use:

#include <Rcpp.h>

// [[Rcpp::export]]
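Rcpp::List listDf(Rcpp::List l) {
  // Visit every data frame in the list.
  for (int i = 0; i < l.length(); ++i) {
    Rcpp::DataFrame df = Rcpp::as<Rcpp::DataFrame>(l[i]);
    // The last column supplies the two decimal digits.
    Rcpp::NumericVector last = df[df.cols() - 1];
    // Add last / 100 to every column except the last one.
    for (int j = 0; j < df.cols() - 1; ++j) {
      Rcpp::NumericVector col = df[j];
      df[j] = col + last / 100.0;
    }
  }
  return l;
}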

/*** R
set.seed(42)
df1 <- data.frame(a = sample(1:100, 3),
                  b = sample(1:100, 3),
                  c = sample(0:99, 3))

df2 <- data.frame(a = sample(1:100, 3),
                  b = sample(1:100, 3),
                  c = sample(0:99, 3))

l <- list(df1 = df1, df2 = df2)

listDf(l)

*/

Output:

> listDf(l)
$df1
      a     b  c
1 92.73 84.73 73
2 93.13 64.13 13
3 29.64 51.64 64

$df2
       a     b  c
1  71.94 94.94 94
2  46.96 26.96 96
3 100.11 46.11 11

Upvotes: 4
