Tristan

Reputation: 6906

Infix operators in Scala and Jython

I'm evaluating languages for a computationally oriented app that needs an easy embedded scripting language for end users. I have been thinking of using Scala as the main underlying language and Jython for the scripting interface. An appeal of Scala is that I can define methods such as :* for elementwise multiplication of a matrix object and use them with infix syntax, as in a :* b. But :* is not a valid method name in Python. How does Jython deal with this?
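For concreteness, here is a minimal sketch of the kind of Scala definition meant above. The Matrix class is a hypothetical stand-in (a thin wrapper around nested Vectors), not a real library type; only the :* method name comes from the text.

```scala
// Hypothetical minimal matrix type, only here to illustrate the infix syntax.
final case class Matrix(rows: Vector[Vector[Double]]) {
  // Elementwise product; requires both matrices to have the same shape.
  def :*(that: Matrix): Matrix = {
    require(rows.map(_.length) == that.rows.map(_.length), "shape mismatch")
    Matrix(rows.zip(that.rows).map { case (r, s) =>
      r.zip(s).map { case (x, y) => x * y }
    })
  }
}

val a = Matrix(Vector(Vector(1.0, 2.0), Vector(3.0, 4.0)))
val b = Matrix(Vector(Vector(10.0, 20.0), Vector(30.0, 40.0)))

// Infix call; the compiler treats it as a.:*(b).
val c = a :* b
```

Because :* does not end in a colon, it is left-associative and is looked up on the left operand, so a :* b is an ordinary method call on a.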

I would consider using Scala as the scripting language, due to its flexibility. But even with type inference, all the val and var declarations and required type definitions are too much for lay users accustomed to a dynamic language like MATLAB. By comparison, Boo has the -ducky option, which might work, but I'd like to stay on the JVM rather than .NET. I assume there is no -ducky for Scala.

More generally, consider the following DSL (from http://www.cs.utah.edu/~hal/HBC/) to model a Latent Dirichlet Allocation:

model {
      alpha     ~ Gam(0.1,1)
      eta       ~ Gam(0.1,1)
      beta_{k}  ~ DirSym(eta, V)           , k \in [1,K]
      theta_{d} ~ DirSym(alpha, K)         , d \in [1,D]
      z_{d,n}   ~ Mult(theta_{d})          , d \in [1,D] , n \in [1,N_{d}]
      w_{d,n}   ~ Mult(beta_{z_{d,n}})     , d \in [1,D] , n \in [1,N_{d}]
}

result = model.simulate(1000)

This syntax is terrific (compared to PyMCMC, for instance) for users familiar with hierarchical Bayesian modeling. Is there any language on the JVM that would make it easy to define such a syntax, while also giving access to a basic scripting language like Python?

Thoughts appreciated.

Upvotes: 4

Views: 683

Answers (3)

Flaviu Cipcigan

Reputation: 7243

EDIT:

After reading all the discussion, probably the best way to go is to define the grammar of your DSL and then parse it with the built-in parsing utilities of Scala.

I'm not sure, though, what you are trying to achieve. Will your scripting language be more of a "what" or a "how" type? The example you have given is a "what" type DSL: you describe what you are trying to achieve and do not care about the implementation. Such languages are best used to describe a problem, and given the domain you are building the app for, I think it's the best way to go. The user just describes the problem in a syntax very familiar to the problem domain, and the application parses this description and uses it as input to run the simulation. For this, building a grammar and parsing it with the Scala parsing utilities will probably be the best way to go (you only want to expose a small subset of features to the users).
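As a rough sketch of what that could look like, here is a parser for a single statement of the model using RegexParsers from Scala's parser combinator library (bundled with the standard library in older releases, shipped as the separate scala-parser-combinators module since Scala 2.11). The Stmt and IndexRange classes and all rule names are made up for illustration, and the grammar is deliberately incomplete: it only accepts plain identifiers and numbers as arguments, so nested subscripts such as beta_{z_{d,n}} are not handled.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical AST for one statement of the model; all names are illustrative.
case class IndexRange(index: String, lo: String, hi: String)
case class Stmt(name: String, indices: List[String],
                dist: String, args: List[String],
                ranges: List[IndexRange])

object ModelParser extends RegexParsers {
  def ident: Parser[String]  = """[A-Za-z][A-Za-z0-9]*""".r
  def number: Parser[String] = """\d+(\.\d+)?""".r
  def atom: Parser[String]   = ident | number

  // Optional subscript list such as _{d,n}
  def subscripts: Parser[List[String]] =
    opt("_{" ~> rep1sep(ident, ",") <~ "}") ^^ (_.getOrElse(Nil))

  // Range clause such as: d \in [1,D]
  def indexRange: Parser[IndexRange] =
    ident ~ ("\\in" ~> "[" ~> atom) ~ ("," ~> atom <~ "]") ^^ {
      case i ~ lo ~ hi => IndexRange(i, lo, hi)
    }

  // Full statement such as: theta_{d} ~ DirSym(alpha, K) , d \in [1,D]
  def stmt: Parser[Stmt] =
    ident ~ subscripts ~ ("~" ~> ident) ~
      ("(" ~> repsep(atom, ",") <~ ")") ~ rep("," ~> indexRange) ^^ {
        case n ~ idx ~ d ~ args ~ rs => Stmt(n, idx, d, args, rs)
      }

  // Returns None when the line does not match the grammar.
  def parseStmt(line: String): Option[Stmt] = {
    val r = parseAll(stmt, line)
    if (r.successful) Some(r.get) else None
  }
}
```

Calling ModelParser.parseStmt on a line of the model yields a small data structure that the application can then interpret, which keeps the user-facing language restricted to exactly what the grammar allows.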

If you need a "how" script, then using an already established scripting language is the way to go (unless you want to implement loops, basic data structures, etc yourself).

In designing a system, there will always be trade-offs to be made. Here it is between the number of features you want to expose to the user and the terseness of your script. Myself, I'd go with exposing as few features as possible to get the job done, and get it done in a "what" way: the user doesn't need to know how you are going to simulate their problem as long as the simulation gives correct results and runs in reasonable time.

If you expose a full scripting language to the user, your DSL will just be a small API in that scripting language, and the user will have to learn a full language to be able to use its full power. And you may not want a user to use its full power (it may wreak havoc on your app!). Why would you expose, for example, TCP socket support when your application doesn't need to connect to the internet? That could be a security hole.

-- The following section discusses possible scripting languages. My above answer advises against using them, but I have left the discussion for completeness.

I have no experience with it, but have a look at Groovy. It is a dynamically typed scripting language for the JVM (with JVM support probably going to get better in JDK 7 due to invokedynamic). It also has good support for operator overloading and writing DSLs. Unfortunately, it doesn't support user-defined operators, at least not to my knowledge.

I would still go with Scala though (partially because I like static typing and I find its type inference good :). Its scripting support is quite good, and you can make almost anything look like native language support (for example, have a look at its actors library!). It also has very good support for functional programming, which can make scripts very short and concise. And as a benefit, you'll have all the power of the Java libraries at your disposal.

In order to use Scala as a scripting language, just put your script in a file ending in .scala and then run scala filename.scala. See Scala as a scripting Language for a discussion comparing Scala with JRuby.

Upvotes: 0

Alexey Romanov

Reputation: 170713

None of the obvious suspects among JVM scripting languages -- JavaScript Rhino, JRuby, Jython, and Groovy -- have support for user-defined operators (which you'll probably need). Neither does Fan.

You might try using JRuby with the superators gem.

Upvotes: 0

Daniel C. Sobral

Reputation: 297155

Personally, I think you overstate the overhead of Scala. For instance, this:

alpha     ~ Gam(10,10)
mu_{k}    ~ NorMV(vec(0.0,1,dim), 1, dim)     , k \in [1,K]
si2       ~ IG(10,10)
pi        ~ DirSym(alpha, K)
z_{n}     ~ Mult(pi)                          , n \in [1,N]
x_{n}     ~ NorMV(mu_{z_{n}}, si2, dim)       , n \in [1,N]

could be written as

def alpha =                   Gam(10, 10)
def mu    = 1 to 'K map (k => NorMV(Vec(0.0, 1, dim), 1, dim))
def si2   =                   IG(10, 10)
def pi    =                   DirSym(alpha, 'K)
def z     = 1 to 'N map (n => Mult(pi))
def x     = 1 to 'N map (n => NorMV(mu(z(n)), si2, dim))

In this particular case, almost nothing was done, except define Gam, Vec, NorMV, etc, and create an implicit definition from Symbol to Int or Double, reading from a table where you'll store such definitions later on (such as with a loadM equivalent). Such implicit definitions would go like this:

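import scala.reflect.Manifest
// Table of externally supplied values, each stored together with the Manifest of its type
val unknowns = scala.collection.mutable.HashMap[Symbol,(Manifest[_], Any)]()
// Implicit view from Symbol to Int: succeeds only if the symbol was stored as an Int
implicit def getInt(s: Symbol)(implicit m: Manifest[Int]): Int = unknowns.get(s) match {
  case Some((`m`, x)) => x.asInstanceOf[Int]
  case _ => sys.error("Undefined unknown "+s)
}
// similarly to getInt for any other desired type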
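To show how such a table would be filled and consulted, here is a simplified, self-contained variant. It is a sketch rather than the code above: it drops the Manifest and checks the runtime type of the stored value instead, the name symbolToInt is made up, and the assignment to K stands in for whatever data loading step the application would really have.

```scala
import scala.collection.mutable
import scala.language.implicitConversions

// Simplified lookup table: symbol -> value (no Manifest; the runtime type is checked).
val unknowns = mutable.HashMap[Symbol, Any]()

// Implicit view from Symbol to Int, failing loudly for missing or non-Int entries.
implicit def symbolToInt(s: Symbol): Int = unknowns.get(s) match {
  case Some(n: Int) => n
  case _            => sys.error("Undefined unknown " + s)
}

// Hypothetical stand-in for loading the data set.
unknowns(Symbol("K")) = 3

// The compiler inserts symbolToInt because an Int is expected here.
val k: Int = Symbol("K")
val ks = (1 to k).toList
```

Symbol("K") is the long form of the 'K literal used in the model definitions.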

It could be written as such, too:

Model (
  'alpha    -> Gam(10, 10),
  'mu -> 'k -> NorMV(Vec(0.0, 1, dim), 1, dim)      With ('k in (1 to 'K)),
  'si2      -> IG(10, 10),
  'pi       -> DirSym('alpha, 'K),
  'z -> 'n  -> Mult('pi)                            With ('n in (1 to 'N)),
  'x -> 'n  -> NorMV('mu of ('z of 'n), 'si2, dim)  With ('n in (1 to 'N))
)

In which case Gam, Mult, etc. would need to be defined a bit differently, to handle the symbols being passed to them. The excess of "'" is definitely annoying, though.
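One possible way to define them differently, sketched under assumptions: the distributions take a parameter type that is either a literal or a reference to another model variable, with implicit conversions supplying the wrapping. Param, Const, Ref and the field names a, b, concentration and dim are all hypothetical, and the Map is only a stand-in for the Model(...) call.

```scala
import scala.language.implicitConversions

// Hypothetical parameter type: a literal value or a reference to another variable.
sealed trait Param
case class Const(value: Double) extends Param
case class Ref(name: Symbol) extends Param

implicit def doubleToParam(d: Double): Param = Const(d)
implicit def symbolToParam(s: Symbol): Param = Ref(s)

// Distributions keep their parameters unevaluated, so references can be
// resolved later, once the whole model is known.
case class Gam(a: Param, b: Param)
case class DirSym(concentration: Param, dim: Param)

// Stand-in for Model(...): variable name -> distribution.
val model = Map(
  Symbol("alpha") -> Gam(10.0, 10.0),
  Symbol("pi")    -> DirSym(Symbol("alpha"), Symbol("K"))
)
```

With this shape, DirSym('alpha, 'K) no longer needs the values of alpha and K at construction time; it just records which variables it depends on.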

It's not like HBC doesn't have its own idiosyncrasies, such as the occasional need for type declarations, underscores before indices, the occasional need to replace "~" with "\in", or even the backslash that needs to precede the latter. As long as there is a real benefit from using it instead of HBC, MATLAB, or whatever else the person is used to, they'll trouble themselves a bit.

Upvotes: 2
