Reputation: 13
I want to convert a string to a hash table, whose keys are the characters in the string, and values are are the list of index of this character in the string. Is there some elegant way to do this?
For example:
String: "abcaad"
Hash table: { a: [0 3 4] , b: [1] , c: [2] , d: [5] }
I tried to write one with hash-set!
and iterate the string from the end to avoid append
which is slower than insert at begin. But I hope there will be some more elegant way (better if I can get a vector instead of a list for each hash value):
(define (s->hash s)
(define h (make-hash))
(for ([c (in-string s (sub1 (string-length s)) -1 -1)]
[i (in-range (sub1 (string-length s)) -1 -1)])
(hash-set! h c (cons i (hash-ref h c empty))))
h)
Upvotes: 1
Views: 144
Reputation: 52409
Oscar mentioned the issue with generating vectors - because they're fixed size, and you don't know how many times a letter is going to be present, it's hard to figure out what size to use. But there's ways around that. One approach is to generate lists of indexes using one of the approaches Oscar gave, and then afterwards convert them to vectors using list->vector
(or SRFI-43's reverse-list->vector
to get them in order). hash-map/copy
makes this easy:
Welcome to Racket v8.7 [cs].
> (hash-map/copy '#hash((#\a . (4 3 0)) (#\b . (1)) (#\c . (2)) (#\d . (5)))
(lambda (k v) (values k (list->vector v))))
'#hash((#\a . #(4 3 0)) (#\b . #(1)) (#\c . #(2)) (#\d . #(5)))
> (require srfi/43)
> (hash-map/copy '#hash((#\a . (4 3 0)) (#\b . (1)) (#\c . (2)) (#\d . (5)))
(lambda (k v) (values k (reverse-list->vector v))))
'#hash((#\a . #(0 3 4)) (#\b . #(1)) (#\c . #(2)) (#\d . #(5)))
>
Blatant self-promotion ahead:
If you don't mind not using native vectors, my implementation of SRFI-214 Flexvectors (Like C++ std::vector
or java ArrayList
in that they can resize as needed) from my extra-srfi-libs
package will work, or the data/gvector
package that they're based on (Less functionality, but distributed with Racket instead of an add on). Example:
#lang racket/base
(require srfi/214)
(define (s->hash s)
(for/fold ([table (hasheqv)])
([ch (in-string s)]
[idx (in-naturals)])
(hash-update table ch (lambda (fv) (flexvector-add-back! fv idx)) flexvector)))
(define table (s->hash "abcaad"))
; '#hasheqv((#\a . #<gvector: 0 3 4>) (#\b . #<gvector: 1>) (#\c . #<gvector: 2>) (#\d . #<gvector: 5>))
(println table)
(println (flexvector-ref (hash-ref table #\a) 1)) ; 3
Upvotes: 1
Reputation: 236004
We can simplify the solution a bit by using in-indexed
:
; procedural programming
(define (s->hash s)
(define ht (make-hash))
(for (((c i) (in-indexed s)))
(hash-set! ht c (cons i (hash-ref ht c empty))))
ht)
Or we can write a functional programming style solution by composing existing built-in procedures and using hash-update
to get rid of the hash-set!
mutation operation, which is frowned upon when programming in Scheme:
; functional programming
(define (s->hash s)
(for/fold ((ht #hash()))
(((key index) (in-indexed s)))
(hash-update ht
key
(lambda (acc) (cons index acc))
'())))
Either solution returns a list of indexes in reverse order, if this is a problem you can reverse the lists later as shown below, or use append
instead of cons
(discouraged for performance reasons.)
; procedural programming
(define (reverse-index-lists ht)
(hash-for-each
ht
(lambda (key value) (hash-set! ht key (reverse value))))
ht)
; functional programming
(define (reverse-index-lists ht)
(foldl (lambda (key acc) (hash-update acc key reverse))
ht
(hash-keys ht)))
Using a standard vector
is not a good fit for this problem, because we don't know beforehand the size of each vector (in Scheme vectors are more like arrays in other programming languages, in the sense that they cannot be resized). Let's see the solutions in action:
(s->hash "abcaad")
'#hash((#\a . (4 3 0)) (#\b . (1)) (#\c . (2)) (#\d . (5)))
(reverse-index-lists (s->hash "abcaad"))
'#hash((#\a . (0 3 4)) (#\b . (1)) (#\c . (2)) (#\d . (5)))
Upvotes: 1