prosseek

Reputation: 191039

Detecting the index of a non-printable character in a string with Scala

I have a method that detects the indices of the non-printable characters in a string, as follows.

def isPrintable(v:Char) = v >= 0x20 && v <= 0x7E
val ba = List[Byte](33,33,0,0,0)
ba.zipWithIndex.filter { v => !isPrintable(v._1.toChar) } map {v => v._2}
> res115: List[Int] = List(2, 3, 4)

The first element of the result list is the index I am after, but I wonder whether there is a simpler way to do this.

Upvotes: 0

Views: 581

Answers (5)

Azzie

Reputation: 366

I am not sure whether a list of indices or a list of tuples is needed, and I am not sure whether 'ba' needs to be a list of bytes or starts off as a string.

 for {  i <- 0 until ba.length if !isPrintable(ba(i).toChar) } yield i

Here is a tail-recursive version, because people need performance :)

 def getNonPrintable(ba: List[Byte]): List[Int] = {
  import scala.annotation.tailrec  // required for the @tailrec annotation
  import scala.collection.mutable.ListBuffer
  val buffer = ListBuffer[Int]()  // collects the indices of non-printable bytes
  @tailrec
  def go(xs: List[Byte], cur: Int): ListBuffer[Int] = {
    xs match {
      case Nil => buffer
      case y :: ys => {
        if (!isPrintable(y.toChar)) buffer += cur
        go(ys, cur + 1)
      }
    }
  }
  go(ba, 0)
  buffer.toList
}

Upvotes: 0

elm

Reputation: 20415

If only the first occurrence of a non-printable char is desired

The span method applied to a List returns two sublists: the first is the longest prefix whose elements all satisfy the condition, and the second starts with the first element that fails it. In this case, consider:

val (l,r) = ba.span(b => isPrintable(b.toChar))
l: List(33, 33)
r: List(0, 0, 0)

To get the index of the first non-printable char, take the size of the first sublist:

l.size
res: Int = 2
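
One caveat with the span approach: when every element is printable, l.size equals the length of the list rather than a "not found" marker. A minimal sketch that handles this case is below; the helper name firstNonPrintableViaSpan is hypothetical, and isPrintable is repeated from the question so the snippet stands alone.

```scala
// Same predicate as in the question.
def isPrintable(v: Char): Boolean = v >= 0x20 && v <= 0x7E

// Hypothetical helper: index of the first non-printable byte, if any.
def firstNonPrintableViaSpan(ba: List[Byte]): Option[Int] = {
  val (printablePrefix, rest) = ba.span(b => isPrintable(b.toChar))
  // An empty `rest` means span consumed the whole list, so nothing was found.
  if (rest.isEmpty) None else Some(printablePrefix.size)
}
```

With the list from the question this should give Some(2), and None for a list that contains only printable bytes.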

If all the occurrences of non-printable chars are desired

Consider partitioning the List by a criterion. For instance, for

val ba2 = List[Byte](33,33,0,33,33)  

val (l,r) = ba2.zipWithIndex.partition(b => isPrintable(b._1.toChar))
l: List((33,0), (33,1), (33,3), (33,4))
r: List((0,2))

where r contains tuples of the non-printable chars and their positions in the original List.
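
If only the positions are needed, the second component of each tuple can be extracted after partitioning. A small sketch, assuming the isPrintable predicate from the question; the name splitIndices is made up for illustration.

```scala
// Same predicate as in the question.
def isPrintable(v: Char): Boolean = v >= 0x20 && v <= 0x7E

// Hypothetical helper: (indices of printable bytes, indices of non-printable bytes).
def splitIndices(ba: List[Byte]): (List[Int], List[Int]) = {
  val (printable, nonPrintable) =
    ba.zipWithIndex.partition { case (b, _) => isPrintable(b.toChar) }
  // Drop the byte values and keep only the positions.
  (printable.map(_._2), nonPrintable.map(_._2))
}
```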

Upvotes: 0

Mariusz Nosiński

Reputation: 1288

You can use a regexp directly to find unprintable characters by their Unicode general category.

Resource: Regexp page

That way you can filter your string directly with such a pattern, for instance:

val text = "this is \n sparta\t\r\n!!!"
text.zipWithIndex.filter(_._1.toString.matches("\\p{C}")).map(_._2)
> res3: Vector(8, 16, 17, 18)

As a result you'll get a Vector with the indices of all unprintable characters in the String.
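
Note that \p{C} (the Unicode "Other" category) is not the same test as the isPrintable range check from the question: an accented letter such as é lies outside 0x20..0x7E but is not in category C. The sketch below is one possible variant that compiles the pattern once instead of on every character; the names controlLike and unprintableIndices are hypothetical.

```scala
import java.util.regex.Pattern

// Compile the pattern once rather than for every character.
val controlLike: Pattern = Pattern.compile("\\p{C}")

// Hypothetical helper: indices of chars that fall in Unicode category C.
def unprintableIndices(text: String): IndexedSeq[Int] =
  text.indices.filter { i =>
    // Test each char on its own as a one-character string.
    controlLike.matcher(text.substring(i, i + 1)).matches()
  }
```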

Upvotes: 0

Ashalynd

Reputation: 12573

For getting only the first index that meets the given condition:

ba.indexWhere(v => !isPrintable(v.toChar))

(it returns -1 if nothing is found)
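
If the -1 sentinel is inconvenient, the result can be turned into an Option. A minimal sketch, assuming the isPrintable predicate from the question; firstNonPrintable is a hypothetical name.

```scala
// Same predicate as in the question.
def isPrintable(v: Char): Boolean = v >= 0x20 && v <= 0x7E

// Hypothetical helper: Some(index) of the first non-printable byte, or None.
def firstNonPrintable(ba: Seq[Byte]): Option[Int] =
  // indexWhere returns -1 when no element matches; filter turns that into None.
  Some(ba.indexWhere(b => !isPrintable(b.toChar))).filter(_ >= 0)
```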

Upvotes: 3

Dan Simon

Reputation: 13137

If you want an Option[Int] holding the index of the first non-printable character (if one exists), you can do:

ba.zipWithIndex.collectFirst{
  case (char, index) if (!isPrintable(char.toChar)) => index
}
> res4: Option[Int] = Some(2)

If you want all the indices like in your example, just use collect instead of collectFirst and you'll get back a List.
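
Spelled out, the collect variant might look like the sketch below. It assumes the isPrintable predicate from the question; the name nonPrintableIndices is made up for illustration.

```scala
// Same predicate as in the question.
def isPrintable(v: Char): Boolean = v >= 0x20 && v <= 0x7E

// Hypothetical helper: indices of all non-printable bytes.
def nonPrintableIndices(ba: List[Byte]): List[Int] =
  ba.zipWithIndex.collect {
    // Keep the index only when the byte is not printable.
    case (byte, index) if !isPrintable(byte.toChar) => index
  }
```

For the list in the question this should return List(2, 3, 4), matching the original filter/map version.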

Upvotes: 3
