Resumable uploads over HTTP. Protocol specification

Valery Kholodkov <valery@grid.net.ru>, 2010

1. Introduction

This document describes the application protocol used by the nginx upload module to implement resumable file uploads. The first version of the module that supports this protocol is 2.2.0.

2. Purpose

HTTP implements file uploads according to RFC 1867. When a request is very large, the probability that the connection will be interrupted is high, and HTTP provides no resumption mechanism. The goal of the protocol described here is to allow an interrupted file transfer to be resumed and to allow an upload to be suspended at the user's request.

2.1. Splitting file into segments

When a TCP connection is interrupted abnormally, there is no way to determine which part of the data stream was successfully delivered and which was not. A client therefore cannot determine the position to resume from without communicating with the server. To eliminate this additional communication, the file is represented as an array of segments of reasonable length. When a TCP connection is interrupted while a segment is being transmitted, the client retransmits the whole segment until a positive response is received from the server or the maximum number of tries is reached. In this protocol the client is responsible for choosing the optimal segment length.
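The retransmission rule above can be sketched as follows. This is a minimal illustration, not the module's own client: send_segment is a hypothetical callable that performs the POST and returns the HTTP status code, and the default of three tries is an arbitrary assumption.

```python
def send_with_retries(send_segment, first, last, max_tries=3):
    """Retransmit one whole segment until a positive response or max_tries.

    send_segment is a hypothetical callable taking (first, last) and returning
    an HTTP status code; it may raise OSError when the connection drops.
    """
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for _attempt in range(max_tries):
        try:
            status = send_segment(first, last)
        except OSError:
            # The connection was interrupted. The client cannot know how much
            # of the segment arrived, so the whole segment is sent again.
            continue
        if 200 <= status < 400:
            # 2xx and 3xx are the positive responses described in section 2.4.
            return status
    raise RuntimeError(
        f"segment {first}-{last} not acknowledged after {max_tries} tries"
    )
```

Passing the transport in as a callable keeps the retry policy separate from the HTTP library, so the same loop can be exercised with a fake sender.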

To track the progress of a file upload, the client and server use an identical numbering scheme for the bytes of the file. The first byte of the file has number 0 and the last byte has number n-1, where n is the length of the file in bytes.
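As an illustration of this numbering, the sketch below splits a file of a given length into inclusive byte ranges. The function name and the fixed segment length are assumptions made for the example; the protocol leaves the choice of segment length to the client.

```python
def segment_ranges(file_length, segment_length):
    """Yield inclusive (first, last) byte ranges covering bytes 0..file_length-1."""
    if file_length < 0:
        raise ValueError("file_length must not be negative")
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    for first in range(0, file_length, segment_length):
        # The last segment may be shorter than segment_length.
        last = min(first + segment_length, file_length) - 1
        yield first, last
```

With the figures used in Appendix A (a 511920-byte file and 51201-byte segments), this yields ten ranges, from 0-51200 through 460809-511919.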

The order in which segments are transmitted is not defined; the client may choose an arbitrary order. However, it is recommended to send segments in ascending order of byte numbers. Moreover, a user agent may decide to send multiple segments simultaneously over multiple independent connections. If a client exceeds the maximum number of simultaneous connections allowed, the server may return a 503 "Service Unavailable" response.

During simultaneous transmission it is prohibited to send two or more requests with overlapping ranges within one session. Whenever the server detects simultaneous requests with overlapping ranges, it must return an error response.
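The overlap condition can be checked with a simple interval test. The sketch below is one possible way a server or a cautious client might do it; the function names and the idea of keeping a list of in-flight ranges per session are assumptions, not part of the protocol.

```python
def ranges_overlap(a, b):
    """Return True if two inclusive (first, last) byte ranges share any byte."""
    return a[0] <= b[1] and b[0] <= a[1]


def conflicts_with_in_flight(new_range, in_flight):
    """Return True if new_range overlaps any range currently being received.

    in_flight is a hypothetical per-session list of inclusive (first, last)
    ranges for requests that have started but not yet completed.
    """
    return any(ranges_overlap(new_range, other) for other in in_flight)
```

Because ranges are inclusive, 0-51200 and 51200-60000 overlap on byte 51200, whereas 0-51200 and 51201-102401 do not.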

2.2. Encapsulation

Each segment of a file is encapsulated in a separate HTTP request. The request method is POST. Each request contains the following specific headers:

Header name                         Function
Content-Disposition                 attachment; filename="name of the file being uploaded"
Content-Type                        MIME type of the file being uploaded (must not be multipart/form-data)
X-Content-Range or Content-Range    byte range of the segment being uploaded
X-Session-ID or Session-ID          identifier of the session of the file being uploaded (see 2.3)

The body of the request must contain the segment of the file corresponding to the range specified in the X-Content-Range or Content-Range header.

Whenever a user agent is not able to determine the MIME type of a file, it may use application/octet-stream.
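The headers described in this section could be assembled as in the sketch below. The function name and its parameters are illustrative assumptions; the header values follow the form shown in Appendix A. The file name is inserted without escaping, so names containing double quotes would need extra handling that is not shown here.

```python
def build_segment_headers(filename, first, last, total, session_id,
                          content_type="application/octet-stream"):
    """Build the request headers for one segment (inclusive range first..last)."""
    if content_type.lower().startswith("multipart/form-data"):
        raise ValueError("Content-Type must not be multipart/form-data")
    if not 0 <= first <= last < total:
        raise ValueError("range must satisfy 0 <= first <= last < total")
    return {
        "Content-Type": content_type,
        # Assumption: filename contains no double quotes.
        "Content-Disposition": f'attachment; filename="{filename}"',
        "X-Content-Range": f"bytes {first}-{last}/{total}",
        "Session-ID": session_id,
        # The body carries exactly the bytes of the inclusive range.
        "Content-Length": str(last - first + 1),
    }
```

Calling build_segment_headers("big.TXT", 0, 51200, 511920, "1111215056") reproduces the protocol-specific headers of Example 1 in Appendix A.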

2.3. Session management

In order to identify requests containing segments of the same file, a user agent sends a unique session identifier in the X-Session-ID or Session-ID header. The user agent is responsible for making session identifiers unique. The server must be ready to process requests from different IP addresses that belong to a single session.
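Since uniqueness is the client's responsibility, one straightforward option is a random identifier. The sketch below uses a random UUID rendered as 32 hexadecimal characters; this format is an assumption, because the document does not state which characters or lengths a server accepts (Appendix A shows a purely numeric identifier).

```python
import uuid


def new_session_id():
    """Return a random identifier for one file upload session.

    Assumption: the server accepts 32 hexadecimal characters as a session
    identifier. The same value must be sent with every segment of the file.
    """
    return uuid.uuid4().hex
```

The identifier is generated once per file and reused for every segment, including retransmissions and segments sent over parallel connections.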

2.4. Acknowledgment

The server acknowledges reception of each segment with a positive response. Positive responses are: 201 "Created" when, at the moment the response is generated, not all segments of the file have been received; or any other 2xx or 3xx response when, at the moment the response is generated, all segments of the file have been received. The server must return a positive response only when all bytes of the segment have been successfully saved and the record of which byte ranges have been received has been successfully updated.

Upon reception of a 201 "Created" response the client must proceed with transmission of the next segment. Upon reception of any other positive response code the client must proceed according to its standard interpretation (see RFC 2616).
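The client-side decision described above can be sketched as a small classifier. The function name and the three labels are assumptions chosen for the example; treating every status outside 2xx and 3xx as a failure follows from the definition of positive responses in this section.

```python
def classify_response(status):
    """Map an HTTP status code to the client's next step.

    Returns "continue" when more segments are expected (201), "complete" for
    any other 2xx or 3xx response, and "error" for everything else.
    """
    if status == 201:
        return "continue"
    if 200 <= status < 400:
        return "complete"
    return "error"
```

A response classified as "complete" still has to be handled according to its usual HTTP meaning, for example by following a redirect for a 3xx status.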

In each 201 "Created" response the server returns a Range header listing all byte ranges of the file that had been received at the moment the response was generated. The server returns the identical list of ranges in the response body.
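A client can use the returned list to work out which bytes are still missing. The sketch below parses entries of the form first-last/total, as shown in Example 2 of Appendix A. How multiple ranges are separated is not shown in this document; a comma is assumed here, and both function names are illustrative.

```python
def parse_received_ranges(value):
    """Parse a Range value such as "0-51200/511920" into (first, last, total).

    Assumption: multiple entries, if present, are separated by commas.
    """
    ranges = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        span, _, total = item.partition("/")
        first, _, last = span.partition("-")
        ranges.append((int(first), int(last), int(total)))
    return ranges


def missing_ranges(received, total):
    """Return the inclusive (first, last) gaps not covered by received ranges."""
    gaps = []
    next_needed = 0
    for first, last, _total in sorted(received):
        if first > next_needed:
            gaps.append((next_needed, first - 1))
        next_needed = max(next_needed, last + 1)
    if next_needed < total:
        gaps.append((next_needed, total - 1))
    return gaps
```

For the response in Example 2, parsing "0-51200/511920" and asking for the gaps in a 511920-byte file leaves the single range 51201-511919 still to be sent.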

Appendix A: Session examples

Example 1: Request from client containing the first segment of the file

POST /upload HTTP/1.1
Host: example.com
Content-Length: 51201
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="big.TXT"
X-Content-Range: bytes 0-51200/511920
Session-ID: 1111215056

<bytes 0-51200>

Example 2: Response to the request containing the first segment of the file

HTTP/1.1 201 Created
Date: Thu, 02 Sep 2010 12:54:40 GMT
Content-Length: 14
Connection: close
Range: 0-51200/511920

0-51200/511920

Example 3: Request from client containing the last segment of the file

POST /upload HTTP/1.1
Host: example.com
Content-Length: 51111
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="big.TXT"
X-Content-Range: bytes 460809-511919/511920
Session-ID: 1111215056

<bytes 460809-511919>

Example 4: Response to the request containing the last segment of the file

HTTP/1.1 200 OK
Date: Thu, 02 Sep 2010 12:54:43 GMT
Content-Type: text/html
Connection: close
Content-Length: 2270

<response body>