Marcus

Reputation: 115

How to change Map keys to Int when parsing CSV?

val source = scala.io.Source.fromFile("src/main/Mapping.csv");
val data_map: Map[String, String] = source.getLines().map(csv=> (csv.split(",")(0),csv.split(",")(1))).toMap

When I get the values from my CSV file, it creates a Map[String, String]. How do I make it a Map[Int, String]? I've seen people say that something like this is supposed to work:

val new_map = data_map.map { case (k, v) => (k.toInt,v )}

But it doesn't seem to work because it throws a NumberFormatException saying the first key is a String when it needs to be an Int. My map is still a Map[String, String]. How do I make the keys integer values when I need to parse a CSV file?

My CSV file is simply something like:

1,2,
3,4,

And so on.

Edit: Updated the code to better show what the issue is, and changed it based on feedback. When I make the new map, I get a NumberFormatException saying that it cannot create the map because the keys are Strings and I want them to be Ints.

Upvotes: 0

Views: 103

Answers (1)

stefanobaghino

Reputation: 12814

The point was already made in a comment, but given how terse comments have to be, it might have got lost, so it's worth expanding into a longer explanation.

Scala has a strong, static type system. As such, while it's technically possible, mutating the type of the value in a collection would make your code relatively awkward. On top of that, by default in Scala you use immutable collections, including Maps, which would mean that doing so is just impossible (without resorting to reflection, but you definitely do not want to go there).

The approach one would take to go from a Map[String, String] to a Map[Int, String] in a language like Scala is to build a new map altogether and use that going forward:

val data_map: Map[String, String] =
  source.getLines().map(csv => (csv.split(",")(0), csv.split(",")(1))).toMap

val data_map_with_int_keys: Map[Int, String] =
  data_map.map { case (k, v) => (k.toInt, v) }

In the snippet above, data_map is unchanged (as it is immutable), while data_map_with_int_keys is the map that you want.
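
To make that concrete, here is a minimal sketch that can be pasted into a REPL or Scastie. It assumes a small in-memory list (the hypothetical `lines`) standing in for the contents of Mapping.csv, and uses camelCase names that are not in the original code:

```scala
// Hypothetical sample input standing in for the lines of Mapping.csv.
val lines: List[String] = List("1,2,", "3,4,")

// Build the String-keyed map first, as in the question.
val dataMap: Map[String, String] =
  lines.map { csv =>
    val fields = csv.split(",")
    (fields(0), fields(1))
  }.toMap

// map returns a new Map[Int, String]; dataMap itself is left untouched.
val dataMapWithIntKeys: Map[Int, String] =
  dataMap.map { case (k, v) => (k.toInt, v) }
```

After this runs, dataMap still has String keys and dataMapWithIntKeys holds the same values under Int keys.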

A few additional notes:

  • while at first glance immutable data structures seem wasteful, the reality is that immutable data can be shared, meaning that, for example, under the hood the values of your map are not changed and can therefore be accessed both from the original map and the new one, reducing unnecessary copying
  • you can read more about mutable and immutable collections in Scala in the official documentation
  • you can play around with the code above here on Scastie
  • in Scala, the tendency is normally to use camelCase over snake_case, but of course YMMV; stick to the style of the project you're in
  • while it's not technically incorrect, there's no need to call csv.split multiple times; you can do the following (code on Scastie):
val data_map: Map[String, String] =
  source.getLines().map(csv => {
    val fields = csv.split(",")
    (fields(0), fields(1))
  }).toMap
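
As for the NumberFormatException itself: the question doesn't show which key fails, so what follows is only a guess. Two common culprits are a header row and stray whitespace around the key. Under that assumption, a sketch that splits each line once and builds the Map[Int, String] in a single pass, skipping rows whose key is not an integer, could look like this (`rawLines` and `intKeyed` are hypothetical names, and `toIntOption` needs Scala 2.13 or later):

```scala
// Hypothetical input: a header row and a padded key, two assumed (not confirmed)
// causes of the NumberFormatException mentioned in the question.
val rawLines: List[String] = List("id,value", " 1 ,2,", "3,4,")

// Split each line once, trim the key, and keep only rows whose key parses as an Int.
// String#toIntOption returns None instead of throwing on a non-numeric key.
val intKeyed: Map[Int, String] =
  rawLines.flatMap { line =>
    val fields = line.split(",")
    if (fields.length >= 2) fields(0).trim.toIntOption.map(key => (key, fields(1)))
    else None
  }.toMap
```

Silently dropping bad rows may not be what you want; if a malformed key should be an error, keep toInt and let the exception tell you which line is the problem.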

Upvotes: 2
