Does Apache Spark DataFrame loads data from DB for every processing or does it use the same data unless told otherwise?

Question

We have a use case where we need to search for specific records that fulfill certain conditions. There are multiple of these conditions for which we need to identify the records. We plan to use apache Spark Dataframes. Does Apache Spark dataframes load the table data from db for every search that we plan to execute or does it load & distribute the table data among the spark cluster nodes once and then run the search conditions on these till it is explicitly told to load the data from db?

Does Apache Spark DataFrame loads data from DB for every processing or does it use the same data unless told otherwise?

Answers (1)

Related Questions