Apache and Nginx network model


Nginx achieves its high concurrency with the epoll model which, unlike the traditional server architecture, is available only in Linux kernel 2.6 and later. The following compares the working principles of Apache and Nginx.


Traditional Apache works in a multi-process or multi-threaded manner. In multi-process (prefork) mode, Apache pre-creates several processes, much like a process pool, except that this pool grows as the number of requests grows. Each connection is handled entirely within one process: recv() the request, look up the file on disk according to the URI (disk I/O), and send() the response — and every one of these calls blocks. In other words, on every socket I/O, read or write, the process is suspended in a sleep state. So as the number of connections rises, Apache must spawn more processes to respond, and with many processes the CPU switches between them frequently, which costs both resources and time. Apache's performance limit therefore lies not so much in the request handling itself as in process management. Think about it: if the process serving each request never blocked, efficiency would clearly improve a great deal.
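The blocking, process-per-connection style described above can be sketched as follows. This is a hypothetical minimal server, not Apache's actual code: `handle_connection` and `prefork_server` are illustrative names, and the point is simply that recv() and send() put the whole worker to sleep.

```python
import os
import socket

def handle_connection(conn):
    """Serve one client; every call here can block the whole worker."""
    data = conn.recv(4096)          # blocks: worker sleeps until bytes arrive
    response = b"HTTP/1.0 200 OK\r\n\r\n" + data
    conn.sendall(response)          # blocks if the socket send buffer is full
    conn.close()

def prefork_server(listen_sock, workers=4):
    """Fork a fixed pool of workers, each blocking in accept()."""
    for _ in range(workers):
        if os.fork() == 0:          # child process = one prefork-style worker
            while True:
                conn, _ = listen_sock.accept()   # blocks until a client connects
                handle_connection(conn)
```

Because each worker can only be inside one of these blocking calls at a time, serving more simultaneous clients means forking more workers — exactly the growth that leads to the context-switch cost described above.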


Nginx uses the epoll model and is asynchronous and non-blocking. For Nginx, the complete processing of a connection request is divided into events, one event at a time: for example accept(), recv(), disk I/O, and send() each have a corresponding module to handle them, and a complete request may pass through hundreds of modules. The true core is the event collection and distribution module, which manages and schedules all the others; only when the core schedules a module does that module get the CPU to process its part of the request. Take an HTTP request: first the collection and distribution module registers interest in the listen event, and after registering it returns immediately instead of blocking. There is no need to babysit the socket: when a connection arrives, the kernel tells you (epoll notifies the process), and meanwhile the CPU can go do other things. Once a request arrives, it is given a context (actually pre-allocated), and a new event of interest is registered (readability). Likewise, when client data arrives, the kernel automatically notifies the process that it can read; after reading, the data is parsed; then the resource is looked up on disk (I/O); when that I/O completes, the process is notified again and begins sending data to the client with send(), which also does not block — after the call, the kernel sends a notification when the data has gone out. A whole request is thus divided into many stages, with modules registered for each stage and everything asynchronous and non-blocking. "Asynchronous" here means starting an operation without waiting for its result: when the result is ready, you are notified automatically.
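The event loop described above can be sketched with Python's standard `selectors` module, which selects epoll as its backend on Linux 2.6+. This is a hypothetical miniature, not Nginx's architecture: `accept_client`, `read_client`, and `event_loop` are illustrative names, and real Nginx handles partial reads/writes, timers, and far more events.

```python
import selectors
import socket

sel = selectors.DefaultSelector()   # epoll under the hood on Linux

def accept_client(listen_sock):
    """Handler for the listen event: take the new connection, register read interest."""
    conn, _ = listen_sock.accept()
    conn.setblocking(False)          # never let recv()/send() put the process to sleep
    sel.register(conn, selectors.EVENT_READ, read_client)

def read_client(conn):
    """Handler fired only when the kernel says data is ready to read."""
    data = conn.recv(4096)           # safe: readiness was already reported
    if data:
        conn.sendall(b"HTTP/1.0 200 OK\r\n\r\n" + data)
    sel.unregister(conn)
    conn.close()

def event_loop(listen_sock, max_events=None):
    """One process juggles every connection by dispatching readiness events."""
    listen_sock.setblocking(False)
    sel.register(listen_sock, selectors.EVENT_READ, accept_client)
    handled = 0
    while max_events is None or handled < max_events:
        for key, _ in sel.select():  # the only place the process waits (epoll_wait)
            key.data(key.fileobj)    # dispatch to the handler registered above
            handled += 1
```

The single blocking point is `sel.select()`, the epoll wait itself; every handler runs only when the kernel has already reported readiness, so one process serves many connections without sleeping inside any of them.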


To illustrate Apache's workflow with a simple example, think of going to a restaurant. In this restaurant, one waiter gives each customer full service. The process goes like this: the waiter waits for guests at the door (listen), shows a guest to a table (accept), waits for the order (request URI), takes the order to the kitchen (disk I/O), waits for the kitchen to finish (read), and then serves the dish (send). Throughout, the waiter (process) is blocked at many points. So as more guests arrive (more HTTP requests reach the restaurant), they can only be served by calling in more waiters (forking processes); but because the restaurant's resources (CPU) are limited, once there are too many waiters the management cost becomes very high (CPU context switching), and the restaurant hits a bottleneck.
Now have a look at how Nginx does it. A bell hangs at the door (the listen event is registered with epoll). When a guest arrives (an HTTP request), a waiter is sent to greet them (accept), after which the waiter goes off to do other things (such as greeting more guests). When the guest has chosen a meal, they ring for the waiter (data ready to read()); the waiter takes the menu order to the kitchen (disk I/O) and then goes off to do other things again. When the kitchen has finished a dish, it too calls the waiter (disk I/O ends), and the waiter serves the guest (send()); each dish the kitchen finishes goes out to a guest, and in between the staff can do other things. The whole process is divided into many stages, each with its own service module. Thinking it through, as the guests multiply, this restaurant can serve many more people.

Whether it is Nginx or Squid, the reverse proxy's network model is event-driven. Event-driven programming is actually a very old technique: the early mechanisms were select and poll. Later, more advanced kernel event-notification mechanisms such as epoll appeared (wrapped by libraries like libevent), and event-driven performance improved. The essence of event-driven programming is I/O events: the application switches rapidly among multiple I/O handles, implementing asynchronous I/O. An event-driven server is best suited to I/O-intensive work — a reverse proxy, for instance, plays a pure data-transfer role between the client and the web server, all I/O and no heavy computation. A reverse proxy is clearly better built event-driven: one running process does all the work, there is no process or thread management overhead, and CPU and memory consumption stay small.
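The older select() mechanism mentioned above can be sketched directly; this is a hypothetical helper (`wait_readable` is an illustrative name). One relevant difference from epoll: select() makes the kernel scan the entire fd list on every call, while epoll maintains a ready list, which is why epoll scales better to very many connections.

```python
import select

def wait_readable(socks, timeout=1.0):
    """Return the subset of socks the kernel reports as readable.

    select() is level-triggered and rescans the whole list each call;
    with thousands of fds this per-call scan becomes the bottleneck
    that epoll's kernel-side ready list avoids.
    """
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable
```

A proxy built on this would loop: wait for readable sockets, then shuttle whatever bytes are available between the client side and the server side, never sleeping inside a read on any single connection.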

  So that is what Nginx and Squid do. Of course, Nginx can also combine multiple processes with the event-driven model — several processes each running an event loop (e.g. via libevent) — without needing the hundreds of processes that Apache does. Nginx also serves static documents very well, because a static file is itself just disk I/O and fits the same processing. As for boasting about tens of thousands of connections, that by itself is meaningless: I could write a network program that handles tens of thousands of concurrent connections, but if most of the clients are blocked somewhere, it has no real value.

  Now have a look at Apache, or something like Resin. These are called application servers because they really run specific business applications — scientific computing, graphics, database reads and writes — which are likely to be CPU-intensive, and for CPU-intensive work event-driven is not appropriate. For example, if a computation takes 2 seconds, then for those 2 seconds the process is completely blocked, and no event mechanism helps. Consider what would happen if MySQL were switched to an event-driven model: one large join or sort would block all clients. This is where multiple threads or processes show their advantage: each does its own work without blocking or interfering with the others. Of course, modern CPUs are getting faster and faster, so the blocking time of a single computation may be small, but as long as there is any blocking at all, event-driven programming has no advantage. So processes and threads as a technique will not disappear; they complement the event mechanism, and both will be with us for a long time.
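The point about CPU-bound work can be sketched as follows: a long computation would monopolize a single event loop for its whole duration, so it is handed to independent worker processes instead, each doing its own work without blocking the others. This is a hypothetical illustration; `cpu_bound` and `run_jobs` are made-up names standing in for real business computation.

```python
from concurrent.futures import ProcessPoolExecutor

def cpu_bound(n):
    """Stand-in for the '2 seconds of computation' in the text:
    an event loop running this would serve nobody until it returns."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def run_jobs(jobs, workers=2):
    """Spread CPU-bound jobs across worker processes, the multi-process
    strength described above: each process computes independently."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(cpu_bound, jobs))
```

In an event-driven server the loop would submit such jobs to the pool and keep dispatching I/O events, which is exactly the "complement each other" coexistence the paragraph describes.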

  In conclusion, event-driven models suit I/O-intensive services, while multiple threads or processes suit CPU-intensive services; each has its own strengths, and neither shows any sign of replacing the other.


Posted by Matilda at February 01, 2014 - 2:36 PM