Bill'o
Bill'o

Reputation: 514

akka stream design pattern for files consumption

I have a problem where I am asked to use akka streams to design a search API to look for data into several related .tsv files.
For ex you have 2 files:
movies.tsv (id, title)
actors.tsv (name, movieIds)
Say you want to create an endpoint listing all the actors that played in one movie just specifying the name
def principalsForMovieName(name: String): Source[Actor, _]
you would have to read the first file to get all the movie ids containing the input name and then process the second file to list the related actors.
I thought I could to that by piping 2 Sources (first movies then actors) together but that does not appear like something common with akka reactive streams.
I might have missed something in the whole stream concept I guess. Could you point me in the right direction please?

Upvotes: 0

Views: 144

Answers (1)

Levi Ramsey
Levi Ramsey

Reputation: 20561

This is workable, albeit inefficient if multiple movies happen to share a title:

  • read a stream of lines from movies.tsv
  • filter the stream for titles matching the name of the movie and map to the movie IDs
  • for each movie ID, emit a stream of lines from actors.tsv (flatMapConcat is probably the stream operator of interest here)
  • filter the stream for records matching that movie ID
  • map each record to the actor name

The inefficiency arises from repeatedly re-reading and scanning actors.tsv.

Upvotes: 1

Related Questions