Leothorn

Reputation: 1345

Scala/Spark: iterating over an array and evaluating an expression using an element of the array

I am coding in Scala with Spark and trying to separate a DataFrame's columns by data type: string columns, double columns, and everything else. Evaluating columns(2)_2 on its own gives output, albeit with a warning, but when I use the same expression in an if statement I get an error. Any idea why? Update: that part was solved by writing columns(2)._2, as suggested by David Griffin.

var df = some dataframe
var columns = df.dtypes
var colnames = df.columns.size

var stringColumns:Array[(String,String)] = null;
var doubleColumns:Array[(String,String)] = null;
var otherColumns:Array [(String,String)] = null;

columns(2)._2
columns(2)._1

for (x<-1 to colnames)
{ 
    if (columns(x)._2 == "StringType")
     {stringColumns = stringColumns ++  Seq((columns(x)))}

    if (columns(x)._2 == "DoubleType")
     {doubleColumns = doubleColumns ++  Seq((columns(x)))}

    else
     {otherColumns = otherColumns ++ Seq((columns(x)))}
}

Previous Output:

stringColumns: Array[(String, String)] = null
doubleColumns: Array[(String, String)] = null
otherColumns: Array[(String, String)] = null
res158: String = DoubleType
<console>:127: error: type mismatch;
 found   : (String, String)
 required: scala.collection.GenTraversableOnce[?]
                  {stringColumns = stringColumns ++ columns(x)}

Current Output:

stringColumns: Array[(String, String)] = null
doubleColumns: Array[(String, String)] = null
otherColumns: Array[(String, String)] = null
res382: String = DoubleType
res383: String = CVB
java.lang.NullPointerException

Upvotes: 0

Views: 1165

Answers (2)

Leothorn

Reputation: 1345

This answer is adapted from David Griffin's answer, so please upvote his too. I just changed ++ to +:= (and the arrays now start out empty instead of null).

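// df.dtypes returns an Array of (columnName, dataTypeName) pairs.
val columns = df.dtypes

// Start from empty arrays rather than null, so that adding to them
// cannot throw a NullPointerException.
var stringColumns = Array[(String, String)]()
var doubleColumns = Array[(String, String)]()
var otherColumns = Array[(String, String)]()

// Array indices run from 0 to length - 1.
for (x <- columns.indices) {
    if (columns(x)._2 == "StringType") {
        stringColumns +:= columns(x)
    } else if (columns(x)._2 == "DoubleType") {
        doubleColumns +:= columns(x)
    } else {
        otherColumns +:= columns(x)
    }
}

// mkString prints the contents; println on an Array prints only its
// JVM identity (something like [Lscala.Tuple2;@...).
println(stringColumns.mkString(", "))
println(doubleColumns.mkString(", "))
println(otherColumns.mkString(", "))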
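
For what it's worth, the same split can probably be written without mutable vars or an index loop. The following is only a sketch: the columns array here is a made-up stand-in for df.dtypes (the column names and types are hypothetical), so it runs without a Spark session.

```scala
// Hypothetical stand-in for df.dtypes, which yields
// (column name, data type name) pairs.
val columns: Array[(String, String)] = Array(
  ("id", "IntegerType"),
  ("name", "StringType"),
  ("price", "DoubleType"),
  ("city", "StringType")
)

// partition returns (elements matching the predicate, all the rest).
// First peel off the string columns, then split what is left.
val (stringColumns, nonString) = columns.partition(_._2 == "StringType")
val (doubleColumns, otherColumns) = nonString.partition(_._2 == "DoubleType")

println(stringColumns.mkString(", "))  // (name,StringType), (city,StringType)
println(doubleColumns.mkString(", "))  // (price,DoubleType)
println(otherColumns.mkString(", "))   // (id,IntegerType)
```

Unlike the loop with +:=, which prepends each element, partition keeps the columns in their original order.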

Upvotes: 0

David Griffin

Reputation: 13927

I believe you are missing a dot (.) before _2. Change this:

columns(2)_2

to

columns(2)._2

If nothing else, it will get rid of the warning.

Then, because ++ expects a collection rather than a single element, you need to wrap the tuple in a Seq:

++ Seq(columns(x))

Here's a cleaner example:

scala> val arr = Array[(String,String)]()
arr: Array[(String, String)] = Array()

scala> arr ++ (("foo", "bar"))
<console>:9: error: type mismatch;
 found   : (String, String)
 required: scala.collection.GenTraversableOnce[?]
          arr ++ (("foo", "bar"))

scala> arr ++  Seq(("foo", "bar"))
res2: Array[(String, String)] = Array((foo,bar))
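
As a side note, if the goal is just to add one element, :+ may read more naturally than wrapping the element in a Seq. A minimal sketch, with viaConcat and viaAppend as made-up names for illustration:

```scala
val arr = Array[(String, String)]()

// ++ concatenates two collections, so a single tuple has to be wrapped.
val viaConcat = arr ++ Seq(("foo", "bar"))

// :+ appends one element; the extra parentheses make it unambiguous
// that a single tuple is being passed, not two separate arguments.
val viaAppend = arr :+ (("foo", "bar"))

println(viaConcat.mkString(", "))  // (foo,bar)
println(viaAppend.mkString(", "))  // (foo,bar)
```

Both calls return a new array and leave arr itself empty, so the result has to be assigned back to a var (or bound to a new val) to be kept.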

Upvotes: 2
