Reputation: 3569
I need to provide an HTTP API for clients to push large volumes of data in the form of a set of records. My first idea was to provide a set of three calls, like:
The first call initializes a temporary data structure and gives the user an identifier, so that subsequent calls can refer to it and data from different users doesn't get mixed up. The second call is invoked as many times as needed, until all data has been sent to the server. Finally, by invoking the last call, the client confirms that all data has been pushed, so the server can process the temporary data it has stored.
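To make the three-call flow concrete, here is a minimal in-memory sketch. The class and method names (UploadStore, begin_upload, push_some_data, commit_upload) are placeholders chosen for illustration, and a real service would expose them over HTTP and persist the temporary data somewhere durable.

```python
import uuid


class UploadStore:
    """In-memory stand-in for the server side of the three calls (hypothetical names)."""

    def __init__(self):
        self._pending = {}   # upload id -> records received so far
        self.processed = []  # records accepted by a final commit

    def begin_upload(self):
        # Call 1: create the temporary structure and hand back its identifier.
        upload_id = uuid.uuid4().hex
        self._pending[upload_id] = []
        return upload_id

    def push_some_data(self, upload_id, records):
        # Call 2: may be invoked any number of times for the same identifier.
        self._pending[upload_id].extend(records)

    def commit_upload(self, upload_id):
        # Call 3: the client confirms everything was sent; process and discard.
        records = self._pending.pop(upload_id)
        self.processed.extend(records)
        return len(records)
```

Two clients that each call begin_upload get distinct identifiers, so their records stay separate until each one commits.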
In general, it's considered a good practice to conform to REST principles, but this strategy of uploading large data clearly violates the REST principle of being stateless. For this reason, I'm looking for some better alternative way of doing the job. References to well-known patterns would be appreciated!
Upvotes: 0
Views: 1196
Reputation: 59174
First, note that your current idea has a fatal flaw: If the client is disconnected during PushSomeData, then it has no way to know whether or not the push succeeded, and can't reliably resume the operation. The final solution has to fix that.
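One common way to close that gap, shown here only as a sketch and not necessarily the design described in this answer, is to have the client number its parts so that re-sending a part is harmless and the client can ask what has arrived. The names ResumableUpload, put_part, received_parts and assemble are hypothetical.

```python
class ResumableUpload:
    """Parts are keyed by a client-chosen index, so a retry overwrites instead of duplicating."""

    def __init__(self):
        self._parts = {}  # part index -> list of records

    def put_part(self, index, records):
        # Idempotent: sending the same index twice leaves a single copy.
        self._parts[index] = list(records)

    def received_parts(self):
        # After a disconnect the client can query this to decide what to resend.
        return sorted(self._parts)

    def assemble(self, expected_parts):
        # Refuse to assemble while any part in 0..expected_parts-1 is missing.
        missing = [i for i in range(expected_parts) if i not in self._parts]
        if missing:
            raise ValueError(f"missing parts: {missing}")
        return [r for i in range(expected_parts) for r in self._parts[i]]
```

With this shape, a client that loses its connection mid-push simply repeats the same put_part call, or checks received_parts first.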
With that out of the way...
If you want to be able to resume an interrupted transfer, then there has to be some state somewhere, but:
The most REST-like implementation of this capability would be like this:
PushData API. In one form, you just accept the data. In the second form, you instead accept a list of URLs from which parts of data can be retrieved. So in the normal upload process, the client just uploads the parts to its private space, and then sends the URLs for all the parts to the PushData API. The server doesn't need to know anything about the actual state of the client's upload.
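A rough sketch of the two forms of PushData might look like the following. The function name push_data, its parameters, and the injected fetch callable are assumptions made for illustration; in a real service fetch would perform an HTTP GET against the staging area.

```python
def push_data(target, records=None, part_urls=None, fetch=None):
    """Append data to target, given either inline records or URLs of staged parts.

    target    -- list standing in for the resource being pushed to
    records   -- form 1: the data itself
    part_urls -- form 2: URLs from which the parts can be retrieved, in order
    fetch     -- callable mapping a URL to a list of records (hypothetical hook)
    """
    if (records is None) == (part_urls is None):
        raise ValueError("provide exactly one of records or part_urls")
    if records is not None:
        target.extend(records)
    else:
        for url in part_urls:
            target.extend(fetch(url))
    return len(target)
```

For example, with a dictionary standing in for the client's private space:

```python
staging = {
    "https://staging.example/alice/part-0": [1, 2],
    "https://staging.example/alice/part-1": [3],
}
resource = []
push_data(resource, part_urls=list(staging), fetch=staging.__getitem__)
```

The PushData call itself is a single request, so the target endpoint never has to track a half-finished upload.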
By separating the management of multi-part uploads from the specific endpoint you're pushing data to, you allow the same procedure to be used for many target resources without having many implementations.
Note that in (1), you can restrict the URLs you accept in any way you like. Initially, you will probably require that they point into the client's private area following the normal process. The API is very future-proof, however, and it allows you to support different or multiple kinds of staging areas in the future. Maybe you want to let clients upload to Amazon S3 and you can get the data from there. In that case you don't even need to do (2)!
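As an illustration of restricting the accepted URLs, a check along these lines could be used. The host name staging.example, the /<client_id>/ path layout, and the function name are assumptions, and a production check would also need to handle percent-encoded paths.

```python
import posixpath
from urllib.parse import urlsplit


def is_allowed_part_url(url, client_id, allowed_hosts=("staging.example",)):
    """Accept only HTTPS URLs on an allowed host inside the client's own area."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    if parts.hostname not in allowed_hosts:
        return False
    # Collapse "." and ".." segments so "/alice/../bob/x" is judged as "/bob/x".
    path = posixpath.normpath(parts.path)
    return path.startswith(f"/{client_id}/")
```

Supporting another staging area later, such as an S3 bucket, would then mostly be a matter of widening allowed_hosts and the path rule.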
There is also a lot of flexibility in the kind of API you provide for (2). You can make it very specifically an upload-staging API. See Amazon S3 for an example. Or you could provide a more file-system-like view.
Upvotes: 2
Reputation: 17066
The design sounds perfectly reasonable to me, and I think it conforms to the ReSTful principle of Statelessness as well.
Each request from the client to server must contain all of the information necessary to understand the request, and cannot take advantage of any stored context on the server.
That requirement is satisfied by the id returned from the initial push. The id is maintained and reused by the application; so there is no session state stored on the server, only resource state.
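To illustrate the distinction, here is a small sketch in which the handler is a plain function of the request and a shared resource store, with nothing remembered per client. The paths /uploads and /uploads/<id>/records and the function name handle are hypothetical.

```python
import uuid


def handle(store, method, path, body=None):
    """Handle one request using only the request itself and the resource store."""
    parts = path.strip("/").split("/")
    if method == "POST" and parts == ["uploads"]:
        # Creating the upload creates a resource; its id goes back to the client.
        upload_id = uuid.uuid4().hex
        store[upload_id] = []
        return 201, upload_id
    if (method == "POST" and len(parts) == 3
            and parts[0] == "uploads" and parts[2] == "records"):
        # The id travels in every request, so no session lookup is needed.
        if parts[1] not in store:
            return 404, None
        store[parts[1]].extend(body or [])
        return 204, None
    return 404, None
```

Because each request names the resource it acts on, any server instance with access to the same store can serve it.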
Upvotes: 3