Reputation: 1
Apologies for the re-post; the first time I posted I did not have all the details.
My colleague, a C# programmer who has since left the firm, was forced to write Java code that involved (large, dense) matrix multiplication.
He coded his own DataTable class in Java in order to be able to:
a) create indexes to sort and join with other DataTables
b) do matrix multiplication.
The code in its current form is NOT maintainable or extensible. I want to clean it up, and thought that using something like R from within Java would let me focus on business logic rather than on sorting, joining, matrix multiplication, etc.
Plus, I'm very new to the concept of a DataTable; I just want to replace the DataTable with 2D arrays and let R handle the rest.
(I currently do not know how to join 2 large datasets in Java efficiently.)
Please let me know what you think. Also, are there any simple examples that I can take a look at?
Upvotes: 0
Views: 542
Reputation: 44118
Here are some options: Parallel Colt is a numerics library for Java, and Incanter is an R-like system that runs on the JVM.
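For a sense of scale before picking a library, dense multiplication over plain 2D arrays is only a few lines of Java. The following is a minimal sketch, not the API of Parallel Colt or Incanter; the class name `DenseMatrix` and the method `multiply` are made up for illustration, and it assumes rectangular, non-empty `double[][]` inputs. A tuned library will generally be faster on large matrices.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of any library named in this thread.
final class DenseMatrix {
    private DenseMatrix() {}

    /** Returns a * b, where a is n x m and b is m x p (both rectangular and non-empty). */
    static double[][] multiply(double[][] a, double[][] b) {
        if (a.length == 0 || b.length == 0) {
            throw new IllegalArgumentException("matrices must be non-empty");
        }
        int n = a.length, m = b.length, p = b[0].length;
        if (a[0].length != m) {
            throw new IllegalArgumentException("inner dimensions do not match");
        }
        double[][] c = new double[n][p];
        // i-k-j loop order walks rows of b and c sequentially, which is
        // friendlier to the CPU cache than the textbook i-j-k order.
        for (int i = 0; i < n; i++) {
            double[] ci = c[i];
            for (int k = 0; k < m; k++) {
                double aik = a[i][k];
                double[] bk = b[k];
                for (int j = 0; j < p; j++) {
                    ci[j] += aik * bk[j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        // prints [[19.0, 22.0], [43.0, 50.0]]
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

If this turns out to be fast enough for your matrix sizes, the 2D-array representation you mentioned may be all you need for the multiplication part.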
Upvotes: 0
Reputation: 78316
Don't take this too harshly, but you seem to be preparing to replace one chunk of unmaintainable code with another chunk of unmaintainable code. How do I reach this remarkable conclusion? By your own admission your Java expertise is not quite up to the task you face, and you propose to replace a pure Java solution with Java+R.
I suggest that you identify your core skills and use whatever toolset you are most comfortable with to refactor the code. If you don't, I foresee a post on SO in a year or so from your replacement complaining about the unmaintainable code you left behind!
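If plain Java ends up being that toolset, the join part does not require a custom DataTable either. Below is a minimal sketch of an in-memory hash join using only `java.util` collections. The class name `HashJoin`, the method `innerJoin`, and the choice of `Object[]` as a row type are assumptions made for illustration, not anything taken from the existing code; it also assumes the smaller dataset fits in memory and that key values implement `equals`/`hashCode` sensibly.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; rows are plain Object[] arrays.
final class HashJoin {
    private HashJoin() {}

    /**
     * Inner-joins two row lists on left[leftKey] == right[rightKey].
     * Each output row is the left row followed by the matching right row.
     * Note: null keys are treated as equal to each other, unlike SQL.
     */
    static List<Object[]> innerJoin(List<Object[]> left, int leftKey,
                                    List<Object[]> right, int rightKey) {
        // Build phase: index the right side by key (ideally the smaller input).
        Map<Object, List<Object[]>> index = new HashMap<>();
        for (Object[] row : right) {
            index.computeIfAbsent(row[rightKey], k -> new ArrayList<>()).add(row);
        }
        // Probe phase: one hash lookup per left row instead of a nested scan.
        List<Object[]> out = new ArrayList<>();
        for (Object[] l : left) {
            List<Object[]> matches = index.get(l[leftKey]);
            if (matches == null) {
                continue; // no partner on the right side
            }
            for (Object[] r : matches) {
                Object[] joined = new Object[l.length + r.length];
                System.arraycopy(l, 0, joined, 0, l.length);
                System.arraycopy(r, 0, joined, l.length, r.length);
                out.add(joined);
            }
        }
        return out;
    }
}
```

The cost is roughly one pass over each input plus the size of the output, compared with the product of the two sizes for a naive nested loop.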
Upvotes: 1
Reputation: 66886
Mahout implements matrix and vector operations of this type. It also supports distributed, large-scale matrix operations, though you may want to ask around on the mailing list for guidance on how to use this fairly new code.
Upvotes: 0