Haisam Tarek Elkewidy
Haisam Tarek Elkewidy

Reputation: 107

MongoDB Scala - query document for a specific field value

So I know that in Mongo Shell, you use dot notation to get the field you want in any document.

How is dot notation achieved in MongoDB Scala. I'm confused as to how it works. Here is the code that fetches a document from a collection:

val record = collection.find().projection(fields(include("offset"), excludeId())).limit(1)

EDIT:

I'm trying to work on a mechanism to basically re-consume Kafka records at a point where the consumer was shutdown. To do this, I store my kafka records in an external database, and then try to fetch the most recent offset from there and start consuming from that point. Here is my Scala method that should do that:

def getLatestCommitOffsetFromDB(collectionName: String): Long = {

import com.mongodb.Block
import org.bson.Document

val printBlock = new Block[Document]() {
  override def apply(document: Document): Unit = {
    println(document.toJson)
  }
}

import com.mongodb.async.SingleResultCallback
val callbackWhenFinished = new SingleResultCallback[Void]() {
  override def onResult(result: Void, t: Throwable): Unit = {
    System.out.println("Latest offset fetched from database.")
  }
}

var obj: String = " "

try {

  val record = collection.find().projection(fields(include("offset"), excludeId())).limit(1)
  //TODO FIND A WAY TO GET THE VALUE AND STORE IT IN A VARIABLE

} catch {
  case e: RuntimeException =>
    logger.error(s"MongoDB Server Error : Unable to fetch data from collection : $collection")
    logger.error(e.printStackTrace().toString())
}

obj.toLong

}

The problem isn't that I can fetch documents from Mongo, more-so that I'm trying to access a particular field in Mongo. The Document has four fields in it: topic, partition, message, and offset. I want to get the "offset" field and store that in a variable, so I can use it as a restarting point to re-consume Kafka records.

where do I go from there?

POM.xml

<?xml version="1.0" encoding="UTF-8"?>

http://maven.apache.org/xsd/maven-4.0.0.xsd"> 4.0.0

<groupId>OffsetManagementPoC</groupId>
<artifactId>OffsetManagementPoC</artifactId>
<version>1.0-SNAPSHOT</version>

<dependencies>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-clients</artifactId>
        <version>1.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.12</artifactId>
        <version>1.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-compiler</artifactId>
        <version>2.11.8</version>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>1.7.25</version>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-streams</artifactId>
        <version>0.10.0.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>2.2.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
        <version>2.2.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka-0-10 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
        <version>2.2.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-core -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-core</artifactId>
        <version>2.6.5</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.6.5</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-annotations -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-annotations</artifactId>
        <version>2.6.5</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>casbah_2.12</artifactId>
        <version>3.1.1</version>
        <type>pom</type>
    </dependency>
    <dependency>
        <groupId>com.typesafe</groupId>
        <artifactId>config</artifactId>
        <version>1.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb.scala</groupId>
        <artifactId>mongo-scala-driver_2.12</artifactId>
        <version>2.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-compiler</artifactId>
        <version>2.11.8</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>mongo-java-driver</artifactId>
        <version>3.4.2</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb.scala</groupId>
        <artifactId>mongo-scala-driver_2.11</artifactId>
        <version>2.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>bson</artifactId>
        <version>3.3.0</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>mongodb-driver-async</artifactId>
        <version>3.4.3</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb.scala</groupId>
        <artifactId>mongo-scala-bson_2.11</artifactId>
        <version>2.1.0</version>
    </dependency>
</dependencies>

Upvotes: 2

Views: 5833

Answers (2)

Xavier Guihot
Xavier Guihot

Reputation: 61666

You can modify your query this way:

import com.mongodb.MongoClient
import com.mongodb.client.MongoCollection
import com.mongodb.client.model.Projections

def getLatestCommitOffsetFromDB(
  databaseName: String,
  collectionName: String
): Long = {

  val mongoClient = new MongoClient("localhost", 27017);

  val collection =
    mongoClient.getDatabase(databaseName).getCollection(collectionName)

  val record = collection
    .find()
    .projection(
      Projections
        .fields(Projections.include("offset"), Projections.excludeId()))
    .first

  record.get("offset").asInstanceOf[Double].toLong
}

I think you were missing the com.mongodb.client.model.Projections imports in order to use fields, include and excludeId

I used first instead of limit(1) to make it easier to extract the result.

first returns a Document object on which you can call get to retrieve the value of the requested field.

But in fact, since you just want one record and one field, you can remove the projection!:

val record = collection.find().first

Upvotes: 3

JoseM
JoseM

Reputation: 4302

According to the documentation, collection.find() accepts a com.mongodb.DBObject

One of the implementations of that interface that you can use is BasicDBObject which is basically like a mutable.Map[String, Object]. You can use the constructor which accepts a map like:

val query = new com.mongodb.BasicDBObject(Map(
  "foo.bar" -> "value1"
  "bar.foo" -> "value2"
))
val record = collection.find(query)....

Upvotes: 0

Related Questions