Reputation: 740
I am working through the correct syntax and structure for the following problem.
I have two datasets with two separate schemas--call them ClientEvent
and ServerEvent
--stored on disk. The codebase I am working on has defined a class, Reader[T <: Asset]
where ClientEvent
and ServerEvent
are subtypes of Asset
. Asset
is a trait.
I am writing a function:
def getPathAndReader(config): (String, Reader[Asset]) = {
    if (config.readClient) {
        return getClientPathAndReader(config)
    } else {
        return getServerPathAndReader(config)
    }
}
This does not compile in my Scala code. From my understanding, T
must be a subtype of Asset
, which both ServerEvent
and ClientEvent
are, therefore Reader[ServerEvent] <: Reader[Asset]
. But since functions are covariant in their inputs, the function I wrote cannot just return this lower type, I'd have to cast it to a supertype? Does that lose too much information?
load
is a function on the trait Reader:
trait Reader[T <: Asset] {
    def load(raw: DataFrame): Dataset[T]
}
What would be an alternative way to structure this code?
The code's intent is to take the file path returned, and call Reader::load(filePath: String)
to get data back. The subtyped readers have some internal logic to clean the data they retrieve from disk before it's returned as a DataFrame
. This means it relies on the type that it passes in. I come from a C++/C# background so my thinking is that if you have a generic Reader[Asset]
but call Reader::load(path: String)
it will know what to do based on the type it actually is, similar to Base* ptr
and calling a derived method.
Upvotes: 1
Views: 249
Reputation: 23788
Your claim that
"From my understanding, T
must be a subtype of Asset
, which both ServerEvent
and ClientEvent
are, therefore Reader[ServerEvent] <: Reader[Asset]
." is not correct. In general, if A
and B
are ordinary types such that A <: B
and G[T]
is a generic type, then any of three cases is possible:
G[A] <: G[B]
- typical example is some read-only collection like Iterator
G[A] >: G[B]
- typical example is some kind of consumer, such as a function T => Unit
G[A]
and G[B]
are not related (G is invariant). This is the typical case when some uses of T
are covariant and some are contravariant. For example, a simple mapping function T => T
is invariant. Most mutable collections are invariant as well, because they both "produce" and "consume" objects.
Unfortunately for you, Dataset[T]
is invariant (rather than covariant Dataset[+T]
or contravariant Dataset[-T]
). This effectively makes your Reader
also invariant. As for how to work around this, it is hard to advise without understanding the larger context. For example, why do your getClientPathAndReader
and getServerPathAndReader
not return Dataset[Asset]
? If you then really use the specific ServerEvent
and ClientEvent
, then your design is not type-safe anyway. If you use only Asset
, then changing your readers to return Dataset[Asset]
seems the easiest solution.
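To make the three variance cases and the last suggestion concrete, here is a minimal sketch. It is not your real code: FakeDataset is an invariant stand-in for Spark's Dataset so that it runs without Spark, and Config, ClientReader, ServerReader, the "/data/..." paths and the cleaning logic are all hypothetical names invented for illustration.

```scala
object VarianceSketch {
  // Stand-ins for the question's types (assumed shapes, not the real codebase).
  trait Asset { def id: String }
  final case class ClientEvent(id: String) extends Asset
  final case class ServerEvent(id: String) extends Asset

  // Invariant stand-in for Spark's Dataset[T], so the sketch needs no Spark.
  final case class FakeDataset[T](rows: List[T])

  // Covariant case: Iterator[+A], so an Iterator[ServerEvent] is an Iterator[Asset].
  def events: Iterator[Asset] = Iterator[ServerEvent](ServerEvent("s1"))

  // Contravariant case: a consumer of Asset can be used as a consumer of ServerEvent.
  val describeAsset: Asset => String = a => a.id
  val describeServer: ServerEvent => String = describeAsset

  // Invariant case: the next line is rejected by the compiler if uncommented.
  // val bad: FakeDataset[Asset] = FakeDataset[ServerEvent](Nil)

  // Same shape as the question's Reader, with a List[String] as hypothetical raw input.
  trait Reader[T <: Asset] {
    def load(raw: List[String]): FakeDataset[T]
  }

  // Both readers are declared as Reader[Asset]; each still runs its own cleaning logic.
  object ClientReader extends Reader[Asset] {
    def load(raw: List[String]): FakeDataset[Asset] =
      FakeDataset[Asset](raw.map(s => ClientEvent(s.trim)))
  }
  object ServerReader extends Reader[Asset] {
    def load(raw: List[String]): FakeDataset[Asset] =
      FakeDataset[Asset](raw.map(s => ServerEvent(s.toLowerCase)))
  }

  // Hypothetical config type; the question does not show the real one.
  final case class Config(readClient: Boolean)

  // Compiles because both branches already have the type (String, Reader[Asset]).
  def getPathAndReader(config: Config): (String, Reader[Asset]) =
    if (config.readClient) ("/data/client", ClientReader)
    else ("/data/server", ServerReader)
}
```

Calling load on the returned reader still dispatches to the concrete reader at runtime, much like a virtual call through Base* in C++. What you give up is the static knowledge of which event type comes back: the caller only sees Asset.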
Upvotes: 2