Nginx in depth: a detailed explanation of the filter module

One, a brief introduction to the module
        A filter module processes the response header and body, and can modify either one. It runs after the response has been generated and before it is sent to the user. Processing is divided into two stages, filtering the HTTP response header and filtering the body, and each stage can modify its part of the response. The two functions below are the entry points of the header and body filter chains; every module that returns response content to the client must go through these two interfaces:
        ngx_http_top_header_filter(r);
        ngx_http_top_body_filter(r, in);

Two, the order of execution
        Filter modules run in a fixed order that is determined at compile time by the auto/modules script. After compiling Nginx, you can find an ngx_modules.c file in the objs directory; open it to see the following data structure:
        ngx_module_t *ngx_modules[] = {
            ...
            &ngx_http_write_filter_module,
            &ngx_http_header_filter_module,
            &ngx_http_chunked_filter_module,
            &ngx_http_range_header_filter_module,
            &ngx_http_gzip_filter_module,
            &ngx_http_postpone_filter_module,
            &ngx_http_ssi_filter_module,
            &ngx_http_charset_filter_module,
            &ngx_http_userid_filter_module,
            &ngx_http_headers_filter_module,
            &ngx_http_copy_filter_module,
            &ngx_http_range_body_filter_module,
            &ngx_http_not_modified_filter_module,
            NULL
        };
        From write_filter to not_modified_filter, the modules execute in the reverse of the order listed: not_modified_filter runs first, and the remaining modules run in turn back up to write_filter. Third-party modules are always inserted between the copy_filter and headers_filter modules. During initialization, each filter module saves the previous filter's handler in its own module-local variable ngx_http_next_header_filter and then assigns its own handler to the global variable ngx_http_top_header_filter (the body filter chain uses ngx_http_top_body_filter and ngx_http_next_body_filter in the same way). The execution order of the response header and body filter functions is shown below:

Fig. 1 header_filter and body_filter execution sequence diagram
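
        The top/next registration pattern described above can be sketched in plain C. This is a toy model, not Nginx code: request_t, filter_pt, top_filter, the three filters a, b, and c, and run_chain are all hypothetical names, and the only point is to show why the filter registered last runs first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for ngx_http_request_t; it only records call order. */
typedef struct {
    char trace[32];
} request_t;

typedef int (*filter_pt)(request_t *r);

/* Plays the role of the global ngx_http_top_header_filter. */
static filter_pt top_filter;

/* Each "module" keeps its own static next pointer. */
static filter_pt b_next;
static filter_pt c_next;

/* Registered first, so it ends up at the tail of the chain. */
static int filter_a(request_t *r)
{
    strcat(r->trace, "a;");
    return 0;                       /* nothing left to call */
}

static int filter_b(request_t *r)
{
    strcat(r->trace, "b;");
    return b_next(r);
}

static int filter_c(request_t *r)
{
    strcat(r->trace, "c;");
    return c_next(r);
}

/* Init functions: save the current head as "next", then become the head. */
static void filter_a_init(void) { top_filter = filter_a; }
static void filter_b_init(void) { b_next = top_filter; top_filter = filter_b; }
static void filter_c_init(void) { c_next = top_filter; top_filter = filter_c; }

/* Register in array order (a, b, c), run the chain once, return the trace. */
static const char *run_chain(request_t *r)
{
    r->trace[0] = '\0';
    filter_a_init();
    filter_b_init();
    filter_c_init();
    top_filter(r);
    return r->trace;
}
```

        Calling run_chain on a request_t leaves the trace "c;b;a;": registration runs front to back, execution runs back to front.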
        
Three, compiling the module
        Nginx makes it easy to add a third-party filter module. Add a config file to the filter module's directory, with the following content:
        ngx_addon_name=ngx_http_example_filter_module
        HTTP_AUX_FILTER_MODULES="$HTTP_AUX_FILTER_MODULES ngx_http_example_filter_module"
        NGX_ADDON_SRCS="$NGX_ADDON_SRCS $ngx_addon_dir/ngx_http_example_filter_module.c"
        Here, ngx_http_example_filter_module is the name of the filter module and ngx_http_example_filter_module.c is its source file.

Four, the output content
        Nginx uses a streaming output model. Inside the filter modules, all output content is passed along as a singly linked list: each time, Nginx reads part of the content into the list and then outputs it. The singly linked list is defined as follows:
        typedef struct ngx_chain_s ngx_chain_t;
        struct ngx_chain_s {
            ngx_buf_t *buf;
            ngx_chain_t *next;
        };
        The general buffer structure (ngx_buf_t) describes a block of memory. The start and end pointers mark the beginning and end of the block, while pos and last delimit the actual content. Once content has been processed, pos moves forward; when new content is read in, last moves forward. A buffer can therefore be reused many times within one call. If last equals end, the memory block is full. If pos equals last, all the content in the buffer has been processed. The following simple diagram shows how the pointers in a buffer are used:

Figure 2 buffer memory structures
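
        The pointer rules above can be tried out with a small stand-alone model. The buf_t type and the helpers buf_init, buf_fill, buf_consume, buf_is_full, and buf_is_drained are hypothetical; the real ngx_buf_t has many more fields, and Nginx itself moves these pointers directly rather than through helpers like these.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for ngx_buf_t: only the four pointers discussed. */
typedef struct {
    unsigned char *start;   /* beginning of the memory block */
    unsigned char *end;     /* one past the end of the memory block */
    unsigned char *pos;     /* first byte not yet processed */
    unsigned char *last;    /* one past the last byte of content */
} buf_t;

static void buf_init(buf_t *b, unsigned char *mem, size_t size)
{
    b->start = b->pos = b->last = mem;
    b->end = mem + size;
}

/* Read new content in: last moves forward. Returns bytes actually stored. */
static size_t buf_fill(buf_t *b, const void *src, size_t n)
{
    size_t room = (size_t) (b->end - b->last);

    if (n > room) {
        n = room;
    }
    memcpy(b->last, src, n);
    b->last += n;
    return n;
}

/* Mark content as processed: pos moves forward. Returns bytes consumed. */
static size_t buf_consume(buf_t *b, size_t n)
{
    size_t avail = (size_t) (b->last - b->pos);

    if (n > avail) {
        n = avail;
    }
    b->pos += n;
    return n;
}

/* last == end: the memory block is used up. */
static int buf_is_full(const buf_t *b)    { return b->last == b->end; }

/* pos == last: every byte of content has been processed. */
static int buf_is_drained(const buf_t *b) { return b->pos == b->last; }
```

        With an 8-byte block, filling "hello" and then "world" stores 5 and then only 3 bytes, at which point buf_is_full is true; consuming all 8 bytes makes buf_is_drained true.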

Five, the filter function
1, The response header filter function
        The response header filter function is mainly used to process the HTTP response header: it can modify, add, or delete header fields as the situation requires. It runs before the response body filter function and is called only once, so it is also a good place for a filter module's initialization work. The entry point for response header filtering is as follows:
        ngx_int_t ngx_http_send_header(ngx_http_request_t *r)
        {
            ...
            return ngx_http_top_header_filter(r);
        }
        This function is called when the response is sent to the client. Its return value is generally NGX_OK, NGX_ERROR, or NGX_AGAIN, meaning that processing succeeded, failed, or is not yet finished, respectively.
        The ngx_http_header_filter_module filter module assembles all the HTTP header fields into one complete buffer, which is finally output by the ngx_http_write_filter_module filter module.
2, The response body filter function
        The response body filter function filters the response body. For each request, the ngx_http_top_body_filter function may be executed multiple times. Its entry point is as follows:
        ngx_int_t ngx_http_output_filter(ngx_http_request_t *r, ngx_chain_t *in)
        {
            ngx_int_t rc;
            ngx_connection_t *c;
            
            c = r->connection;
            
            rc = ngx_http_top_body_filter(r, in);

            if (rc == NGX_ERROR) {
                /* NGX_ERROR may be returned by any filter */
                c->error = 1;
            }

            return rc;
        }
        During request processing, the ngx_http_output_filter function passes the response body through the filters and then sends it to the client. A module's response body filter function generally looks like this:
        static ngx_int_t ngx_http_example_body_filter(ngx_http_request_t *r, ngx_chain_t *in)
        {
            ...
            return ngx_http_next_body_filter(r, in);
        }
        The return value of the function is generally NGX_OK, NGX_ERROR, or NGX_AGAIN, meaning that processing succeeded, failed, or is not yet finished, respectively.
        The response content is stored in the singly linked list in. The list is generally not very long, and the in parameter may sometimes be NULL. Each node of in holds a buf structure: for static files the buf size is 32K, and for reverse proxy applications the buf may be 4K or 8K. To keep memory consumption low, Nginx generally does not allocate much memory; its approach is to send data out as soon as a certain amount has been received.
        In a response body filter module, pay particular attention to the buf flags, which are illustrated in Figure 2. If a buf carries the last_buf flag, it is the last buf of the response, so it can be output directly and the request can end. If a buf carries the flush flag, it must be output immediately and not held in a cache. If the whole buffer has been processed and no data is left, you can set the sync flag on it to indicate that it is used only for synchronization. After all the filter modules have finished, the final module, write_filter, copies the in chain to the end of the r->out output chain and then calls the sendfile or writev interface to output it. Because Nginx sockets are non-blocking, a write does not necessarily send everything, and part of the data may remain in r->out. On the next call, Nginx tries to send it again, until it succeeds.
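
        As an illustration of walking the in chain and checking the buf flags, here is a minimal sketch that compiles without the Nginx headers. The types buf_t, chain_t, and filter_ctx_t, and the functions next_body_filter and example_body_filter, are simplified stand-ins invented for this example; a real filter takes an ngx_http_request_t and ends by calling ngx_http_next_body_filter. The uppercase transformation is arbitrary and only stands in for "modify the content".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for ngx_buf_t and ngx_chain_t. */
typedef struct {
    unsigned char *pos;        /* start of unprocessed content */
    unsigned char *last;       /* one past the end of content */
    unsigned       last_buf:1; /* final buf of the response */
    unsigned       flush:1;    /* must be output, not cached */
    unsigned       sync:1;     /* carries no data, synchronization only */
} buf_t;

typedef struct chain_s chain_t;
struct chain_s {
    buf_t   *buf;
    chain_t *next;
};

/* Hypothetical per-request state of the example filter. */
typedef struct {
    int    done;        /* set once the last buf has been seen */
    size_t bytes_seen;  /* total body bytes that passed through */
} filter_ctx_t;

/* Stand-in for ngx_http_next_body_filter: accepts whatever it is given. */
static int next_body_filter(filter_ctx_t *ctx, chain_t *in)
{
    (void) ctx;
    (void) in;
    return 0;
}

/* Uppercase the body in place, then pass the same chain on. */
static int example_body_filter(filter_ctx_t *ctx, chain_t *in)
{
    chain_t       *cl;
    unsigned char *p;

    /* in may be NULL; the loop then does nothing. */
    for (cl = in; cl != NULL; cl = cl->next) {
        buf_t *b = cl->buf;

        if (!b->sync) {                     /* sync bufs hold no data */
            for (p = b->pos; p < b->last; p++) {
                *p = (unsigned char) toupper(*p);
            }
            ctx->bytes_seen += (size_t) (b->last - b->pos);
        }

        if (b->last_buf) {
            ctx->done = 1;                  /* response body is complete */
        }
    }

    return next_body_filter(ctx, in);
}
```

        Because the body arrives in pieces, the filter may be called several times per request; keeping done and bytes_seen in a context structure rather than in local variables lets the state survive between calls.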

Six, the subrequest
        A major feature of Nginx filter modules is the subrequest: while filtering the response content, a module can issue new requests, and Nginx splices the responses together in the order in which you issued them, so that the combined output forms a normal response body. For a simple example, see the addition module. To issue a subrequest, the module calls the ngx_http_subrequest function, which places the subrequest in the parent request's r->postponed list. Subrequests are executed in sequence within the main request. A subrequest goes through the same lifecycle and processing as an ordinary request, and its output also passes through the filter modules.
        The key is the postpone_filter module, which splices together the response content of the main request and its subrequests. r->postponed is a linked list that stores the output of the parent request and its subrequests in order. If an earlier request has not finished, the requests after it cannot output. Once the earlier request has finished and its output has been sent, the next request can output. When all the subrequests have finished, the entire response content has been output.
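
        The ordering rule can be modeled with a few lines of C. This is only a sketch of the idea, not the logic of the real postpone filter: postponed_t and flush_postponed are hypothetical, an array stands in for the r->postponed linked list, and each entry simply holds its finished output as a string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one entry in the postponed list. */
typedef struct {
    const char *data;   /* output produced by this entry */
    int         done;   /* has the request behind it finished? */
    int         sent;   /* has its output already been emitted? */
} postponed_t;

/*
 * Emit entries strictly in list order and stop at the first unfinished
 * one, so later output never overtakes earlier output.
 * out must hold a NUL-terminated string; returns bytes appended.
 */
static size_t flush_postponed(postponed_t *list, size_t n,
                              char *out, size_t cap)
{
    size_t i, len, written = 0;
    size_t used = strlen(out);

    for (i = 0; i < n; i++) {
        if (list[i].sent) {
            continue;               /* already output earlier */
        }
        if (!list[i].done) {
            break;                  /* everything after this must wait */
        }
        len = strlen(list[i].data);
        if (used + len + 1 > cap) {
            break;                  /* no room left in the demo buffer */
        }
        memcpy(out + used, list[i].data, len + 1);
        used += len;
        written += len;
        list[i].sent = 1;
    }

    return written;
}
```

        With the entries "[main-1]" (done), "[sub]" (not done), and "[main-2]" (done), the first flush emits only "[main-1]", even though "[main-2]" is ready. Once "[sub]" is marked done, a second flush emits "[sub][main-2]".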

Posted by Brady at December 04, 2013 - 6:45 PM