For an SNS such as Facebook, most content must be generated from personalized data. As a result, there is an inherent limit to the performance gains that can be achieved by server-side speedups alone, such as caching and changes to the system structure. Facebook found its answer to this limit in a new transfer format combined with parallel processing between the server and the client. The solution focuses in particular on reducing the delay before the requested page is displayed.
Problems related to the existing dynamic Web service system
While the Web server is busy generating a page, the browser in most cases sits idle and does nothing. When the server finishes generating the page and sends it to the browser all at once, it causes a bottleneck in both traffic and performance. From that point on, the server has completed its transfer and can do nothing more to help while the browser renders the page. By overlapping the time the server takes to generate the page with the time the browser takes to render it, you can reduce not only the end-to-end latency but also the initial response time of the Web page, and consequently the overall loading time. The problem of a single slow query holding back the entire loading process can be addressed as well.
As mentioned earlier, BigPipe is a technology that combines existing Web technologies so that page generation on the server and page rendering in the browser proceed in parallel (much as AJAX did). It is so named because its mechanism resembles pipelining, which is common in microprocessors. For this reason, Facebook describes BigPipe as a Web page pipeline for better performance.
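The benefit of overlapping generation and rendering can be illustrated with a toy timing model. This is only a sketch under stated assumptions: the page is split into independent parts, network transfer time is ignored, and the function names and millisecond figures are hypothetical rather than measurements from Facebook.

```python
def sequential_time(gen_ms, render_ms):
    """Server generates every part first; only then does the browser render."""
    return sum(gen_ms) + sum(render_ms)


def pipelined_time(gen_ms, render_ms):
    """Two-stage pipeline: the browser renders part i while the server
    is already generating part i+1."""
    server_done = 0   # time at which the server finishes the current part
    browser_done = 0  # time at which the browser finishes the current part
    for gen, render in zip(gen_ms, render_ms):
        server_done += gen
        # The browser can start a part only when it has arrived and the
        # previous part has finished rendering.
        browser_done = max(browser_done, server_done) + render
    return browser_done


# Hypothetical per-part costs in milliseconds (assumed values).
gen_ms = [100, 300, 50]     # the second part stands in for a slow query
render_ms = [40, 60, 30]

print(sequential_time(gen_ms, render_ms))  # 580
print(pipelined_time(gen_ms, render_ms))   # 490
```

In this model the first part is visible after 140 ms in the pipelined case, whereas in the sequential case nothing can be rendered until all 450 ms of generation have finished.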
Something you need to know first
To understand BigPipe, you first need to understand Chunked Encoding, which is supported in version 1.1 of the HTTP protocol, and the Script Flush method, which builds on Chunked Encoding.
Chunked Encoding is a transfer encoding method supported in HTTP 1.1. It allows the server to omit the Content-Length header value, divide the HTTP response body into a series of chunks, and transfer the chunks one at a time. The size of each chunk is transferred along with its content. With this feature, the server can send part of a response to the Web browser while it is still processing the request (Flush transfer), so that the user can see at least partial content.
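The framing described above can be sketched in a few lines. This is a minimal illustration, not a server implementation; `encode_chunked` is a hypothetical helper name, and a real server would write each frame to the socket as soon as it is produced.

```python
def encode_chunked(parts):
    """Yield HTTP/1.1 chunked-encoding frames for an iterable of byte strings.

    Each frame is: <size in hex> CRLF <data> CRLF.
    A final zero-size chunk marks the end of the body.
    """
    for part in parts:
        if not part:
            continue  # an empty part would be read as the terminating chunk
        yield b"%x\r\n" % len(part) + part + b"\r\n"
    yield b"0\r\n\r\n"  # last chunk (size 0) followed by an empty trailer


# Each part can be flushed as soon as it is ready.
for frame in encode_chunked([b"Hello, ", b"BigPipe!"]):
    print(frame)
```

Because every frame carries its own size, the server does not need to know the total length of the response before it starts sending.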
The figure below shows an example of an HTTP response in Chunked Encoding format, in which the Transfer-Encoding header value is set to chunked. Instead of a Content-Length header value, the size of each chunk is expressed in hexadecimal, and the chunks are transferred sequentially. You can also see that the response ends with a chunk of size 0.
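The same wire format can also be read from the receiving side. The sketch below parses a complete chunked body; `decode_chunked` and the sample bytes are hypothetical, and the parser assumes a well-formed body with no trailer fields.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble the payload from a complete chunked-encoded body."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size line is hexadecimal; ignore any ";extension" suffix.
        size = int(body[pos:line_end].split(b";", 1)[0], 16)
        pos = line_end + 2
        if size == 0:
            break  # a zero-size chunk ends the body
        out += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(out)


# Hypothetical body: a 26-byte chunk (0x1a), a 5-byte chunk, then the terminator.
raw = b"1a\r\nabcdefghijklmnopqrstuvwxyz\r\n5\r\nhello\r\n0\r\n\r\n"
print(decode_chunked(raw))  # b'abcdefghijklmnopqrstuvwxyzhello'
```

Note that the first chunk size is written as `1a`, not `26`, because sizes are hexadecimal.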