noirritchandra

Reputation: 115

How to use mlpack in my Rcpp code in macOS

I am trying to build an R package that uses mlpack. As suggested in this link, I am using the following C++ function:

#include <Rcpp/Rcpp>
#include <mlpack.h>

// Two include directories adjusted for my use of mlpack 3.4.2 on Ubuntu
#include <mlpack/core.hpp>
#include <mlpack/methods/kmeans/kmeans.hpp>
#include <mlpack/methods/kmeans/random_partition.hpp>
#include <mlpack/methods/neighbor_search/neighbor_search.hpp>

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::depends(mlpack)]]

// This is 'borrowed' from mlpack's own src/mlpack/tests/kmeans_test.cpp.
// We borrow the data set and the code from the first test function.
// Passing data from R is easy thanks to RcppArmadillo, 'and left as an
// exercise'.

// Generate dataset; written transposed because it's easier to read.
arma::mat kMeansData("  0.0   0.0;" // Class 1.
                     "  0.3   0.4;"
                     "  0.1   0.0;"
                     "  0.1   0.3;"
                     " -0.2  -0.2;"
                     " -0.1   0.3;"
                     " -0.4   0.1;"
                     "  0.2  -0.1;"
                     "  0.3   0.0;"
                     " -0.3  -0.3;"
                     "  0.1  -0.1;"
                     "  0.2  -0.3;"
                     " -0.3   0.2;"
                     " 10.0  10.0;" // Class 2.
                     " 10.1   9.9;"
                     "  9.9  10.0;"
                     " 10.2   9.7;"
                     " 10.2   9.8;"
                     "  9.7  10.3;"
                     "  9.9  10.1;"
                     "-10.0   5.0;" // Class 3.
                     " -9.8   5.1;"
                     " -9.9   4.9;"
                     "-10.0   4.9;"
                     "-10.2   5.2;"
                     "-10.1   5.1;"
                     "-10.3   5.3;"
                     "-10.0   4.8;"
                     " -9.6   5.0;"
                     " -9.8   5.1;");


// [[Rcpp::export]]
arma::Row<size_t> kmeansDemo() {

    mlpack::kmeans::KMeans<mlpack::metric::EuclideanDistance, 
                           mlpack::kmeans::RandomPartition> kmeans;

    arma::Row<size_t> assignments;
    kmeans.Cluster((arma::mat) trans(kMeansData), 3, assignments);

    return assignments;
}

If I sourceCpp the above on Ubuntu Linux after setting Sys.setenv("PKG_LIBS"="-lmlpack"), it compiles successfully. However, I cannot get it to work on macOS on an Apple M2 machine, where I get the following error:

/Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library/mlpack/include/mlpack.h:52:10: fatal error: mlpack/core.hpp: No such file or directory
   52 | #include <mlpack/core.hpp>
      |          ^~~~~~~~~~~~~~~~~
compilation terminated. 

I have installed the mlpack R package as well as the system mlpack via brew. It seems to me that R cannot find the mlpack headers, which are located in /opt/homebrew/include/ on my system. Is there a way to point R at them? I have tried brew link mlpack, which reports that linking succeeded, but I still get the same compilation error. I also tried the following in R before calling sourceCpp, without success:

Sys.setenv("LDFLAGS"="-L/opt/homebrew/lib")
Sys.setenv("CPPFLAGS"="-I/opt/homebrew/include")
Sys.setenv("PKG_LIBS"="-lmlpack")

Please let me know if there is a way to make this work on macOS.

P.S. Both R and RStudio are installed on my system via brew.

Upvotes: 1

Views: 193

Answers (1)

Dirk is no longer here

Reputation: 368489

mlpack 4.2.0 is now on CRAN and ships exported headers we can use! A minimally modified version of your example follows.

Code

#include <Rcpp/Rcpp>
#include <mlpack.h>

#include <mlpack/methods/kmeans.hpp>

// -- use C++17
// [[Rcpp::plugins(cpp17)]]
// -- use Armadillo, Ensmallen and mlpack headers
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::depends(RcppEnsmallen)]]
// [[Rcpp::depends(mlpack)]]

// This is 'borrowed' from mlpack's own src/mlpack/tests/kmeans_test.cpp

// Generate dataset; written transposed because it's easier to read.
arma::mat kMeansData("  0.0   0.0;" // Class 1.
                     "  0.3   0.4;"
                     "  0.1   0.0;"
                     "  0.1   0.3;"
                     " -0.2  -0.2;"
                     " -0.1   0.3;"
                     " -0.4   0.1;"
                     "  0.2  -0.1;"
                     "  0.3   0.0;"
                     " -0.3  -0.3;"
                     "  0.1  -0.1;"
                     "  0.2  -0.3;"
                     " -0.3   0.2;"
                     " 10.0  10.0;" // Class 2.
                     " 10.1   9.9;"
                     "  9.9  10.0;"
                     " 10.2   9.7;"
                     " 10.2   9.8;"
                     "  9.7  10.3;"
                     "  9.9  10.1;"
                     "-10.0   5.0;" // Class 3.
                     " -9.8   5.1;"
                     " -9.9   4.9;"
                     "-10.0   4.9;"
                     "-10.2   5.2;"
                     "-10.1   5.1;"
                     "-10.3   5.3;"
                     "-10.0   4.8;"
                     " -9.6   5.0;"
                     " -9.8   5.1;");


// [[Rcpp::export]]
arma::Row<size_t> kmeansDemo() {

    // Originally written to use RandomPartition, and is left that
    // way because RandomPartition gives better initializations here.
    mlpack::KMeans<mlpack::EuclideanDistance, mlpack::RandomPartition> kmeans;

    // mlpack::KMeans<> kmeans;    // default arguments as an alternative

    arma::Row<size_t> assignments;
    kmeans.Cluster((arma::mat) trans(kMeansData), 3, assignments);

    return assignments;
}

/*** R
kmeansDemo()
*/

Output

> Rcpp::sourceCpp("~/git/stackoverflow/76336745/answer.cpp")

> kmeansDemo()
[INFO ] KMeans::Cluster(): iteration 1, residual 13.7285.
[INFO ] KMeans::Cluster(): iteration 2, residual 2.51215e-15.
[INFO ] KMeans::Cluster(): converged after 2 iterations.
[INFO ] 186 distance calculations.
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17]
[1,]    2    2    2    2    2    2    2    2    2     2     2     2     2     0     0     0     0
     [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30]
[1,]     0     0     0     1     1     1     1     1     1     1     1     1     1
> 

Packages

> sapply(c("RcppArmadillo", "RcppEnsmallen", "mlpack"), \(x) format(packageVersion(x)))
RcppArmadillo RcppEnsmallen        mlpack 
 "0.12.4.1.0"  "0.2.19.0.1"       "4.2.0" 
> 
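If a header is still reported as missing, it may help to check what the compiler itself can see. The following is only a minimal sketch and not part of mlpack or Rcpp: it relies on the standard C++17 __has_include preprocessor check, and the names PROBE_HAVE_MLPACK_CORE, mlpackCoreVisible and mlpackCoreStatus are made up for illustration.

```cpp
// probe_mlpack.cpp: hypothetical file name, not shipped by mlpack or Rcpp.
// Reports whether the compiler's current include path can see mlpack/core.hpp.
#include <string>

#if defined(__has_include)                 // standard since C++17
#  if __has_include(<mlpack/core.hpp>)
#    define PROBE_HAVE_MLPACK_CORE 1       // header was found
#  endif
#endif
#ifndef PROBE_HAVE_MLPACK_CORE
#  define PROBE_HAVE_MLPACK_CORE 0         // header missing, or no __has_include
#endif

// True when <mlpack/core.hpp> was found at compile time.
constexpr bool mlpackCoreVisible() {
    return PROBE_HAVE_MLPACK_CORE == 1;
}

// Human-readable summary of the probe result.
std::string mlpackCoreStatus() {
    return mlpackCoreVisible()
        ? "mlpack/core.hpp: found on the include path"
        : "mlpack/core.hpp: NOT found on the include path";
}
```

Printing mlpackCoreStatus() from a small main, compiled once with and once without -I/opt/homebrew/include (the directory mentioned in the question), should show whether that include directory is the missing piece.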

Upvotes: 1
