Reputation: 227
I have a function in developed in clojure using flambo spark api functions
(:require [flambo.api :as f]
[clojure.string :as s])
(defn get-distinct-column-val
"input = {:col val}"
[ xctx input ]
(let [{:keys [ col ]} input
column-values []
result (f/map (:rdd xctx) (f/fn [row]
(if (s/blank? (get row col))
nil
(assoc column-values (get row col)))))]
(println "Col values: " column-values)
(distinct column-values)))
And I try to print out the values of column-values and I'm getting
Col values: []
Is there any reason as to why this is so?
I tried replacing the println in the above function with this:
(println "Result: " result)
and got the following:
#<JavaRDD MapPartitionsRDD[16] at map at NativeMethodAccessorImpl.java:-2>
Thanks!
Upvotes: 2
Views: 84
Reputation: 20194
Nothing in your code alters the column-values
binding. I'm not sure how flambo works in the specifics here, but you should be looking at result
, not column-values
.
assoc
takes two arguments - an associative collection and a position. I suspect that here you actually want conj
. Neither assoc
or conj
alters the collection it is provided - we are using immutable data types here.
I expect that accessing result
won't yet have the answer you expect, because you expect assoc
to build up a value across each call (differently from the result from f/map
). In this case, you probably want reduce
.
Upvotes: 1