Andrzej Zabost
Andrzej Zabost

Reputation: 1531

Asynchronously iterate over all Room records only once

I would like to perform an asynchonous operation on each record in a large Room table.

I thought I could add a method returning Flow in my DAO like this:

    @Query("SELECT * FROM events")
    fun getEvents(): Flow<EventEntity>

But according to this blog post and this documentation page returning a Flow is making an "observable read" so the Flow never completes and it watches for database changes.

My goal is to iterate over all the entities only once. I don't want the "observability" behavior. Also, since the table is very large, I don't want to load all the records into a List at once in order to avoid consuming too much memory.

Could you recommend some solution, please?

Upvotes: 1

Views: 1051

Answers (1)

Froyo
Froyo

Reputation: 18477

Create a new method that does not use Flow.

@Query("SELECT id FROM events")
fun getAllIds(): List<Int> // If your primary key is Integer.

@Query("SELECT * FROM events WHERE id = :id")
fun getById(id: Int): EventEntity?

Use Kotlin coroutines to call this method on IO thread.

There could be several strategies to load one row at a time. This is the simplest - get all ids and load each item one at a time.

suspend fun getEvents() {
    withContext(Dispatchers.IO) {
        // Get entities from database on IO thread.
        val ids = dao.getAllIds()
        ids.forEach { id ->
             val event = dao.getById(id)
        }
    }
}

Pagination based approach

This approach assumes that you have a column that stores timestamp (eg. created_at).

@Query("SELECT * from events WHERE created_at > :timestamp ORDER BY created_at LIMIT 10")
fun getAfter(timestamp: Long): List<EventEntity>

You can use this method to paginate.

suspend fun getEvents() {
    withContext(Dispatchers.IO) {
        var timestamp: Long = 0
        while (true) {
          // Get entities from database on IO thread.
          val events = dao.getAfter(timestamp)

          // Process this batch of events

          // Get timestamp for pagination offset.
          timestamp = events.maxBy { it.createAt }?.createAt ?: -1

          if (timestamp == -1) {
             // break the loop. This will be -1 only if the list of events are empty.
          }

        }
    }
}

Upvotes: 2

Related Questions