Reputation: 70
I am using the Amazon Deequ Scala library for data quality checks. The format for calling methods in the Deequ library is shown below:
checks = hasDistinctness(Seq("column1","column2"), _ >= 0.70)
I was planning to pass the condition check (>= 0.70) from a config file.
code:
val chk_val = config.getString("chk_val")
println(chk_val) // ">= 0.70"
checks = hasDistinctness(Seq("column1","column2"),_ chk_val)
Method definition in Deequ library:
def hasDistinctness(
    columns: Seq[String],
    assertion: Double => Boolean,
    hint: Option[String] = None)
  : CheckWithLastConstraintFilterable = {
  addFilterableConstraint { filter => distinctnessConstraint(columns, assertion, filter, hint) }
}
Error:
Error: value chk_val is not a member of Double
How can I solve this issue?
Upvotes: 0
Views: 139
Reputation: 26921
_ >= 0.70 is a function that compares its argument with 0.70. Desugaring the _, it would look like value => value >= 0.70
The compiler understands _ chk_val as a postfix-notation call to chk_val on whatever is passed to the function. Desugared, it would look like value => value.chk_val
Obviously, there is no chk_val member on Double, and that's exactly what the compiler tells you.
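To make the desugaring concrete, here is a minimal sketch in plain Scala with no Deequ dependency. The object and value names are made up for illustration; the point is only that the placeholder form and the explicit lambda are the same function.

```scala
object PlaceholderDemo {
  // Placeholder syntax: the underscore stands for the single argument.
  val shorthand: Double => Boolean = _ >= 0.70

  // The same function written out explicitly.
  val explicit: Double => Boolean = value => value >= 0.70

  def main(args: Array[String]): Unit = {
    println(shorthand(0.75)) // true
    println(explicit(0.75))  // true
    println(shorthand(0.50)) // false
  }
}
```

A String such as ">= 0.70" is just data, so it cannot take the place of the `>= 0.70` part of that expression at compile time.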
So, _ is not black magic: it doesn't magically parse the string and turn it into executable code :) In order to get the condition from the config file, you'd need to parse it into a function. Probably the most straightforward way (and a risky one, since the compiler cannot type check the contents of the string) is to use some sort of eval functionality (e.g. see this answer, or this question).
A faster and easier (but probably less flexible and less scalable) approach is to define a parser yourself. Something along the lines of:
def parseCondition(input: String): Double => Boolean = {
  val splitInput = input.split(" ")
  // You might want to add some validation - e.g. ">=0.7" will just throw here
  // The operand is read as a String, so convert it before comparing
  val (operator, operand) = (splitInput.head, splitInput.last.toDouble)
  operator match {
    case ">=" => _ >= operand
    case ">" => _ > operand
    case "<" => _ < operand
    // ...
  }
}
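If you want the validation mentioned in the comment above, one possible variant is sketched below. It is only an illustration: the object name ConditionParser, the regular expression and the set of supported operators are my own assumptions, not anything provided by Deequ. It returns an Either, so a bad config value produces an error message instead of an exception.

```scala
import scala.util.Try

// Hypothetical helper, not part of Deequ.
object ConditionParser {
  // An operator followed by an operand, with optional whitespace,
  // so both ">= 0.70" and ">=0.7" are accepted.
  private val Condition = """\s*(>=|<=|==|>|<)\s*(\S+)\s*""".r

  def parse(input: String): Either[String, Double => Boolean] =
    input match {
      case Condition(op, rawOperand) =>
        Try(rawOperand.toDouble).toOption match {
          case None =>
            Left(s"Not a number: '$rawOperand'")
          case Some(operand) =>
            op match {
              case ">=" => Right((v: Double) => v >= operand)
              case "<=" => Right((v: Double) => v <= operand)
              case "==" => Right((v: Double) => v == operand)
              case ">"  => Right((v: Double) => v > operand)
              case "<"  => Right((v: Double) => v < operand)
            }
        }
      case _ =>
        Left(s"Cannot parse condition: '$input'")
    }
}
```

The Right side is a plain Double => Boolean, which is the type that the assertion parameter of hasDistinctness expects, so after handling the Left case you can pass the function straight through.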
Or maybe use a parser library such as Atto.
Upvotes: 2