arjunsv3691

Reputation: 829

How to split string in clojure on number and convert it to map

I have a string school_name_1_class_2_city_name_3 and want to split it into {school_name: 1, class: 2, city_name: 3} in Clojure. I tried this code, which didn't work:

(def s "key_name_1_key_name_2")
(->> s
     (re-seq #"(\w+)_(\d+)_")
     (map (fn [[_ k v]] [(keyword k) (Integer/parseInt v)]))
     (into {}))

Upvotes: 0

Views: 308

Answers (3)

Gwang-Jin Kim

Reputation: 9865

Given

(require '[clojure.string :as str])

(def s "school_name_1_class_2_city_name_3")

Following the accepted answer:

(->> s (re-seq #"(.*?)_(\d+)_?") 
       (map rest) ;; take only the rest of each element 
       (map (fn [[k v]] [k (Integer. v)])) ;; transform second as integer
       (into {})) ;; make a hash-map out of all this

Or:

(apply hash-map ;; the entire thing as a hash-map
       (interleave (str/split s #"_(\d+)(_|$)") ;; capture the names 
                   (map #(Integer. (second %))  ;; convert to int
                         (re-seq #"(?<=_)(\d+)(?=(_|$))" s)))) ;; capture the integers

Or:

(zipmap
  (str/split s #"_(\d+)(_|$)")   ;; extract names
  (->> (re-seq #"_(\d+)(_|$)" s) ;; extract values
       (map second)              ;; by taking only second matched groups
       (map #(Integer. %))))     ;; and converting them to integers
  • str/split leaves out the matched parts
  • re-seq returns only the matched parts
  • (_|$) ensures that the number is followed by a _ or is at an end position
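To make the notes above concrete, here is a small sketch of what each piece returns on its own for the sample string. The expected values in the comments are what I would expect at a REPL for this input:

```clojure
(require '[clojure.string :as str])

(def s "school_name_1_class_2_city_name_3")

;; str/split drops the matched separators and keeps the text in between
(str/split s #"_(\d+)(_|$)")
;; => ["school_name" "class" "city_name"]

;; re-seq returns one vector per match: [whole-match group-1 group-2]
(re-seq #"_(\d+)(_|$)" s)
;; => (["_1_" "1" "_"] ["_2_" "2" "_"] ["_3" "3" ""])
```

This is why (map second) is enough to pull the numbers out of the re-seq result.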

The least verbose version (where (_|$) could be replaced by _?):

(->> (re-seq #"(.*?)_(\d+)(_|$)" s)        ;; capture key vals 
     (map (fn [[_ k v]] [k (Integer. v)])) ;; reorder coercing values to int
     (into {}))                            ;; to hash-map

Upvotes: 0

Alan Thompson

Reputation: 29958

When faced with a problem, just break it down and solve one small step at a time. Using let-spy-pretty from the Tupelo library allows us to see each step of the transformation:

(ns tst.demo.core
  (:use tupelo.core tupelo.test)
  (:require [clojure.string :as str]))

(defn str->keymap
  [s]
  (let-spy-pretty
    [str1 (re-seq #"([a-zA-Z_]+|[0-9]+)" s)
     seq1 (mapv first str1)
     seq2 (mapv #(str/replace % #"^_+" "") seq1)
     seq3 (mapv #(str/replace % #"_+$" "") seq2)
     map1 (apply hash-map seq3)
     map2 (tupelo.core/map-keys map1 #(keyword %) )
     map3 (tupelo.core/map-vals map2 #(Long/parseLong %) )]
    map3))

(dotest
  (is= (str->keymap "school_name_1_class_2_city_name_3")
    {:city_name 3, :class 2, :school_name 1}))

with result

------------------------------------
   Clojure 1.10.3    Java 11.0.11
------------------------------------

Testing tst.demo.core
str1 => 
(["school_name_" "school_name_"]
 ["1" "1"]
 ["_class_" "_class_"]
 ["2" "2"]
 ["_city_name_" "_city_name_"]
 ["3" "3"])
seq1 => 
["school_name_" "1" "_class_" "2" "_city_name_" "3"]
seq2 => 
["school_name_" "1" "class_" "2" "city_name_" "3"]
seq3 => 
["school_name" "1" "class" "2" "city_name" "3"]
map1 => 
{"city_name" "3", "class" "2", "school_name" "1"}
map2 => 
{:city_name "3", :class "2", :school_name "1"}
map3 => 
{:city_name 3, :class 2, :school_name 1}

Ran 2 tests containing 1 assertions.
0 failures, 0 errors.

Passed all tests

Once you understand the steps and everything is working, just replace let-spy-pretty with let and continue on!
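If you would rather not depend on Tupelo at all, the same steps can be sketched with clojure.core and clojure.string only. The name str->keymap-plain is hypothetical, and merging the two underscore-trimming steps into one regex is my own shortcut, not part of the code above:

```clojure
(require '[clojure.string :as str])

(defn str->keymap-plain
  "Hypothetical Tupelo-free variant of str->keymap."
  [s]
  (let [;; alternating runs of name characters and digits
        tokens  (re-seq #"[a-zA-Z_]+|[0-9]+" s)
        ;; strip leading and trailing underscores from every token
        trimmed (map #(str/replace % #"^_+|_+$" "") tokens)]
    ;; pair up name/number tokens and convert each pair to [keyword long]
    (into {}
          (map (fn [[k v]] [(keyword k) (Long/parseLong v)]))
          (partition 2 trimmed))))

(str->keymap-plain "school_name_1_class_2_city_name_3")
;; => {:school_name 1, :class 2, :city_name 3}
```

Like the original, this assumes the input strictly alternates names and numbers.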

This was built using my favorite template project.

Upvotes: 2

danieltan95

Reputation: 860

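You are looking for the non-greedy (lazy) version of the quantifier. In your pattern, \w+ is greedy and \w also matches digits and underscores, so the key group can swallow earlier key/number pairs; the mandatory trailing _ also keeps the last pair from matching.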

Try using #"(\w+?)_(\d+)_?" instead.

user=> (->> s (re-seq #"(\w+?)_(\d+)_?"))
(["key_name_1_" "key_name" "1"] ["key_name_2" "key_name" "2"])
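As a sketch, here is that pattern dropped into the pipeline from the question and run against the string from the question text. The name result is only there for illustration:

```clojure
(def s "school_name_1_class_2_city_name_3")

(def result
  (->> (re-seq #"(\w+?)_(\d+)_?" s)                             ; lazy key, digits, optional trailing _
       (map (fn [[_ k v]] [(keyword k) (Integer/parseInt v)]))  ; drop whole match, coerce types
       (into {})))

result
;; => {:school_name 1, :class 2, :city_name 3}
```

Note that a repeated key, as in key_name_1_key_name_2, ends up with only the last value, because map keys are unique.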

Upvotes: 4
