pickypg

Reputation: 22332

Web Service Contributing ID Disambiguation

I work with a Web Service API that passes along a generic family of Results, all of which expose certain basic information, most notably a unique ID. That ID tends to be (but is not required to be) a UUID defined by the sender, and the sender is not always the same party; IDs are nevertheless unique across the system.

Fundamentally, the API results in something along the lines of this (written in Java, but the language should be irrelevant), where only the base interface represents common details:

interface Result
{
    String getId();
}

class Result1 implements Result
{
    public String getId() { return uniqueValueForInstance; }
    public OtherType1 getField1() { /* ... */ }
    public OtherType2 getField2() { /* ... */ }
}

class Result2 implements Result
{
    public String getId() { return uniqueValueForInstance; }
    public OtherType3 getField3() { /* ... */ }
}

It's important to note that each Result type may represent a completely different kind of information. Some of it cannot be correlated with other Results, and some of it can, whether or not they have identical types (e.g., Result1 may be able to be correlated with Result2, and therefore vice versa, but some ResultX may exist that cannot be correlated because it represents different information).

We are currently implementing a system that receives some of those Results and correlates them where possible, which generates a different Result object that is a container of what it correlated together:

class ContainerResult implements Result
{
    public String getId() { return uniqueValueForInstance; }
    public Collection<Result> getResults() { return containedResultsList; }
    public OtherType4 getField4() { /* ... */ }
}

class IdContainerResult implements Result
{
    public String getId() { return uniqueValueForInstance; }
    public Collection<String> getIds() { return containedIdsList; }
    public OtherType4 getField4() { /* ... */ }
}

These are two containers, which present different use cases. The first, ContainerResult, allows someone to receive the correlated details as well as the actual complete, correlated data. The second, IdContainerResult, sacrifices the complete listing in favor of bandwidth by only sending the associated IDs. The system doing the correlating is not necessarily the same as the client, and the client can receive Results that those IDs would represent, which is intended to allow them to show correlations on their system by simply receiving the IDs.

Now, my problem may be non-obvious to some, and it may be obvious to others: if I send only the ID as part of the IdContainerResult, then how does the client know how to match the Result on their end if they do not have a single ID-store? The types of data that are actually represented by each Result implementation lend themselves to being segregated when they cannot be correlated, which means that a single ID-store is unlikely in most situations without forcing a memory or storage burden.

The current solution that we have come up with entails creating a new type of ID (call it TypedId) that combines the XML Namespace and XML Name from each Result with the Result's ID.
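As a rough illustration only, such a TypedId could be a small immutable value type. The class layout, accessor names, and the `urn:example:results` namespace below are assumptions made for the sketch; they are not taken from the actual API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class TypedIdDemo {

    /** Hypothetical value type: XML namespace + XML name + the Result's own ID. */
    static final class TypedId {
        private final String namespace;
        private final String name;
        private final String id;

        TypedId(String namespace, String name, String id) {
            this.namespace = Objects.requireNonNull(namespace);
            this.name = Objects.requireNonNull(name);
            this.id = Objects.requireNonNull(id);
        }

        String getNamespace() { return namespace; }
        String getName() { return name; }
        String getId() { return id; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof TypedId)) return false;
            TypedId other = (TypedId) o;
            return namespace.equals(other.namespace)
                    && name.equals(other.name)
                    && id.equals(other.id);
        }

        @Override
        public int hashCode() { return Objects.hash(namespace, name, id); }

        @Override
        public String toString() { return "{" + namespace + "}" + name + "#" + id; }
    }

    public static void main(String[] args) {
        // Assumed example values; a real namespace and name would come from the Result's schema.
        TypedId fromContainer = new TypedId("urn:example:results", "Result1", "abc-123");
        TypedId fromLocalStore = new TypedId("urn:example:results", "Result1", "abc-123");

        Map<TypedId, String> index = new HashMap<>();
        index.put(fromLocalStore, "locally held Result1");

        // Value equality lets an ID received in a container find the local entry.
        System.out.println(index.get(fromContainer));
        // The type part tells the client which segregated store to consult.
        System.out.println(fromContainer.getName());
    }
}
```

The point of the type part is routing: the plain ID is already unique system-wide, but the namespace and name say which of the client's segregated stores should be asked for it.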

My main problem with that solution is that it requires either maintaining a mutable collection of types that is updated as they are discovered, or prior knowledge of all types so that the ID can be properly associated on any client's system. Unfortunately, I cannot come up with a better solution, but the current solution feels wrong.
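To make the "mutable collection of types" concrete, here is a minimal sketch of a client-side registry that creates one ID store per Result type the first time that type is seen. `TypedStoreRegistry`, its methods, and the example namespace are hypothetical names chosen for the illustration, and results are held as plain `Object` to keep the sketch self-contained.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

class TypeRegistryDemo {

    /** Hypothetical client-side registry: one ID store per Result type, created on first sight. */
    static final class TypedStoreRegistry {
        // Key: "{namespace}name" of the Result type; value: that type's own ID -> result store.
        private final Map<String, Map<String, Object>> storesByType = new ConcurrentHashMap<>();

        private static String typeKey(String namespace, String name) {
            return "{" + namespace + "}" + name;
        }

        /** Records a received Result, creating the store for its type if the type is new. */
        void put(String namespace, String name, String id, Object result) {
            storesByType
                    .computeIfAbsent(typeKey(namespace, name), k -> new ConcurrentHashMap<>())
                    .put(id, result);
        }

        /** Looks up an ID in the single store named by the type; unknown types yield empty. */
        Optional<Object> find(String namespace, String name, String id) {
            Map<String, Object> store = storesByType.get(typeKey(namespace, name));
            return store == null ? Optional.empty() : Optional.ofNullable(store.get(id));
        }

        int knownTypeCount() { return storesByType.size(); }
    }

    public static void main(String[] args) {
        TypedStoreRegistry registry = new TypedStoreRegistry();
        registry.put("urn:example:results", "Result1", "abc-123", "a Result1 instance");
        registry.put("urn:example:results", "Result2", "def-456", "a Result2 instance");

        System.out.println(registry.knownTypeCount());
        System.out.println(registry.find("urn:example:results", "Result1", "abc-123").isPresent());
        // A type that was never received has no store, so the lookup comes back empty.
        System.out.println(registry.find("urn:example:results", "ResultX", "abc-123").isPresent());
    }
}
```

This shows the trade-off described above: no prior knowledge of the types is needed, but the set of stores grows at runtime and every lookup depends on the type having been discovered already.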

Has anyone faced a similar situation where they wanted to associate generic Results with their original types, particularly with the limitations of WSDLs in mind, and solved it in a cleaner way?

Upvotes: 1

Views: 131

Answers (1)

Glen Best

Reputation: 23105

Here's my suggestion:

  1. You want to have "the client know how to match the Result on their end". So include in your response an extra discriminator field called "RequestType", a String.

  2. You want to avoid "maintaining a mutable collection of types that is updated as they are discovered, or prior knowledge of all types so that the ID can be properly associated on any client's system". Obviously, each client request call DOES know what area of processing the Result will relate to. So you can have the client pass the "RequestType" string in as part of the request. As long as the RequestType is a unique string for each different type of client request, your service can process and correlate it without hard-coding any knowledge.

  3. Here's one possible example of Java classes for request and response messages (i.e. not the actual service endpoint):

    interface Request {
        String getId();
        String getRequestType();
        // anything else ...
    }
    
    interface Result {
        String getId();
        String getRequestType();
    }
    
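    class Result1 implements Result {
        public String getId() { return uniqueValueForInstance; }
        public String getRequestType() { return requestType; } // echoed back from the Request
        public OtherType1 getField1() { /* ... */ }
        public OtherType2 getField2() { /* ... */ }
    }
    
    class Result2 implements Result {
        public String getId() { return uniqueValueForInstance; }
        public String getRequestType() { return requestType; } // echoed back from the Request
        public OtherType3 getField3() { /* ... */ }
    }
    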
    
  4. Here's the gotcha. (2) and (3) above do not give a completely dynamic solution. You want your service to be able to return a flexible record structure relating to each different request. You have the following options:

    4A) In XSD, declare Result as a singular strongly-typed variant record type, and in WSDL return Result from a single service endpoint and single operation. The XSD will still need to hardcode the values for the discriminator element when declaring variant record structure.

    4B) In XSD, declare multiple strongly-typed unique types Result1, Result2, etc., one for each possible client request. In WSDL, have multiple uniquely named operations, each returning one of these types. These operations can be spread across one or many service endpoints, or even across multiple WSDLs. While this avoids hard-coding the request type as a specific field per se, it is not actually a generic client-independent solution, because you are still explicitly hard-coding the discrimination between request types by creating a unique name for each result type and each operation. So any apparent dynamism is a mirage.

    4C) In XSD, define a flexible generic data structure that is not variant, but has plenty of generically named fields that could handle all possible results required. Example fields could be "stringField1", "stringField2", "integerField1", "dateField1058", etc. That is, use extremely weak typing and put the burden on the client to magically know what data is in each field. This option may be very generic, but it is usually considered terrible practice. It is inelegant, pretty unreadable, error-prone and has limitations/assumptions built in anyway: how do you know you have included enough generic fields? In your case, (4A) is probably the best option.

    4D) Use flexible XSD schema design tactics - type substitutability and use of "any" element. See http://www.xfront.com/ExtensibleContentModels.html.

    4E) Use the @Produces @SomeQualifier annotations on your own factory class method which creates a high-level type. This tells CDI to always use this method to construct the specified bean type & qualifier. Your factory method can have fancy logic to decide which specific low-level type to construct upon each call. @SomeQualifier can have additional parameters to give guidance towards selecting the type. This potentially reduces the number of qualifiers to just one.

    If you use (4D) you will have a flexible service endpoint design that can deal with changing requirements quite effectively. BUT your service implementation still needs to implement the flexible behaviour to decide which results fields to return for each request. Fact is, if you have a logical requirement for varying data structures, your code must know how to process these data structures for each separate request, so must depend on some form of RequestType / unique operation names to discriminate. Any goal of completely dynamic processing (without adapting to each client's needs for results data) is over-ambitious.
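To illustrate the RequestType discriminator from points (1) and (2), here is a minimal client-side sketch that routes each incoming Result to a handler registered under its RequestType string. `ResultDispatcher`, `SimpleResult`, and the example type strings are hypothetical names for the illustration, not part of any WSDL-generated code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class RequestTypeDispatchDemo {

    interface Result {
        String getId();
        String getRequestType();
    }

    /** Hypothetical minimal Result carrying only the ID and the discriminator. */
    static final class SimpleResult implements Result {
        private final String id;
        private final String requestType;

        SimpleResult(String id, String requestType) {
            this.id = id;
            this.requestType = requestType;
        }

        public String getId() { return id; }
        public String getRequestType() { return requestType; }
    }

    /** Routes each Result to the handler the client registered for its RequestType. */
    static final class ResultDispatcher {
        private final Map<String, Consumer<Result>> handlers = new HashMap<>();

        void register(String requestType, Consumer<Result> handler) {
            handlers.put(requestType, handler);
        }

        /** Returns true if a handler was registered for the result's RequestType. */
        boolean dispatch(Result result) {
            Consumer<Result> handler = handlers.get(result.getRequestType());
            if (handler == null) {
                return false;
            }
            handler.accept(result);
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> trackIds = new ArrayList<>();
        ResultDispatcher dispatcher = new ResultDispatcher();
        // The client chooses the RequestType strings; "tracks" is an assumed example.
        dispatcher.register("tracks", r -> trackIds.add(r.getId()));

        System.out.println(dispatcher.dispatch(new SimpleResult("abc-123", "tracks")));
        System.out.println(dispatcher.dispatch(new SimpleResult("def-456", "weather")));
        System.out.println(trackIds);
    }
}
```

The service only echoes the string back, so it needs no hard-coded list of types; the knowledge of what each RequestType means stays with the client that issued the request.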

Upvotes: 1
