-
-
Notifications
You must be signed in to change notification settings - Fork 19
[API design] Streaming upload API: push-style or pull-style? #199
Description
[This is partially discussed in #124, but that's a huge omnibus issue so I'm pulling it out into a more focused one. This is also capturing some notes from a voice chat @sethmlarson and I had last week.]
Say you're uploading some data as a request body, and you want to generate it on the fly. There are two basic API approaches that hip could offer. In a "push" style API, the user pushes us data as it becomes available:
request_object = await hip.start_request(...)
await request_object.send_all(data1)
await request_object.send_all(data2)
response = await request_object.send_eof()In a "pull" style API, the user provides an (async) iterable, and hip pulls data out of it:
async def generate_body_parts():
yield data1
yield data2
await hip.post(..., body=generate_body_parts())These are kind of duals of each other. Generally, Trio's style prefers the "push" approach, for two reasons: First, keeping the loop in user code makes the control flow is more explicit and transparent, so I think it's easier to reason about. And second, if you have a push-style API, then it's easy to implement a pull-style API, but going the other direction is much more awkward:
# Pull on top of push is just a for loop:
async for data in generate_body_parts():
await request_object.send_all(data)
# Push on top of pull needs more complicated machinery:
send_side, receive_side = trio.open_memory_channel()
async with trio.open_nursery() as nursery:
nursery.start_soon(partial(hip.post, ..., body=receive_side))
await send_side.send(data1)
await send_side.send(data2)
# FIXME: also have to retrieve the return value from hip.post somehowPush-style APIs are also much more natural if we want to support bidirectional HTTP, i.e. letting people send a bit of the request, then read a bit of the response, then send a bit more of the request, etc. (I'm not sure we do want to do this, but still, it's a trade-off to consider.)
However, for hip's purposes, the push-style approach has some significant downsides:
-
For HTTP uploads specifically, it's very common to already have some kind of object that represents the body (a byte-string, some JSON, an open file, an async generator, etc.), and it's natural to want to just pass this object as a simple kwarg like
body=. We probably don't want to make people explicitly write aforloop every time. So, if we supported a push-style API, then we'd also end up adding a pull-style option as sugar on top of it, and then we have two APIs. -
Push-style APIs expose a lot more states in the API. With a classic
requests.getAPI, you call the function, and it doesn't return until at least the whole request body has been delivered and the response headers have been received. With a push-style API, I guess we'd need to... first return an object that represents the request-in-progress, and then the user would push data through it for a while, and then when they mark the request body complete then that would need to return a second object that has the response headers etc.? And what happens if the server starts responding early? This makes the API much more complex to design and use.- In particular, the "early response from server" case is important for bidi HTTP: for normal standard-conformant HTTP, if the server does an early response then you want to hard-abort sending the request body. The whole point of bidi HTTP though is that you don't do this. So even if we did have a push-style API, that wouldn't be enough on its own to support bidi HTTP; we'd still need some extra API surface area beyond that.
-
Push-style APIs are even more awkward if you want to handle redirects and retries, where you might have to rewind the body and re-send it. So now we need some way for hip to tell the user that they've been redirected and need to start over on pushing the body, etc.
Putting this all together, I think the conclusion is that we should follow requests instead of Trio, and expose a pull-style API instead of a push-style API.
If we do decide that bidi HTTP is an important use case (e.g. so grpc clients can use hip), then another possibility would be to make the API look similar to an Upgrade/CONNECT. For Upgrade/CONNECT, I imagine that our API will involve sending a request, getting a response object, and then if it succeeded calling some method on the response object to get a bidi byte stream for the new protocol to use. For bidi HTTP, we could use a similar API pattern, where during the request you set some special flag that says "I want to do bidi HTTP", and then we use some special internal processing that doesn't send any request body immediately, but then lets you use the pull-out-a-byte-stream method to get a byte stream referring to the request and response bodies. I think that will end up being much more natural.