Large Internet website architecture experience, part one:

We know that scalability is very important for a large site. To scale well both horizontally and vertically, the architecture design needs to follow one principle, which I think comes down to how to divide the system along several dimensions:

The first is horizontal division:
1. Break one big website into several small websites: when a site has multiple functions, consider dividing it into modules, each of which can be its own website. These sites can then be deployed very flexibly to different servers.
2. Separate static and dynamic content: static files and dynamic pages are best split into two sites. Static and dynamic sites stress the server differently, the former mostly IO and the latter mostly CPU, so hardware can be chosen with a focus for each, and the caching strategies for static and dynamic content also differ. A typical application is a separate file or image server. Moreover, serving static content from a different domain also improves the browser's ability to load resources in parallel.
3. Divide by function: for example, if a module is responsible for uploads, which are very time consuming, mixing it with other modules means even a small amount of upload traffic can paralyze the server. Such special modules should be separated out. Secure and non-secure content should also be separated, keeping in mind a future SSL certificate purchase.
4. We don't have to use our own servers for everything. Search and reporting can rely on other providers' services, such as Google's search and reporting services. What we build ourselves is not necessarily better than theirs, and we save both servers and bandwidth.
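The horizontal division above can be sketched as a small routing rule. This is only an illustration: the pool names, hostnames, the `/upload/` path prefix, and the list of static file suffixes are all assumptions, not something the original setup specifies.

```python
from typing import List
from urllib.parse import urlsplit

# Hypothetical server pools; the hostnames are placeholders.
POOLS = {
    "static": ["static1.example.com", "static2.example.com"],
    "upload": ["upload1.example.com"],
    "dynamic": ["app1.example.com", "app2.example.com"],
}

# Assumed set of file types that count as static content.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def choose_pool(url: str) -> str:
    """Return the name of the pool that should serve this URL."""
    path = urlsplit(url).path.lower()
    if path.startswith("/upload/"):
        return "upload"   # slow, long-running requests stay isolated
    if path.endswith(STATIC_SUFFIXES):
        return "static"   # IO-bound, cacheable content
    return "dynamic"      # CPU-bound page generation


def hosts_for(url: str) -> List[str]:
    """Return the candidate hosts for this URL."""
    return POOLS[choose_pool(url)]
```

In practice this decision is usually made by DNS (separate domains) or by a reverse proxy rather than in application code; the function only makes the rule explicit.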


The second is vertical division:
1. Files are comparable to a database, and their IO traffic can be even larger than the database's. File access is also a vertical tier, so uploaded files and images must be separated from the web server. Of course, few sites put the database and the website on the same server; that separation is the most basic one.
2. For dynamic programs that access the database, we can use an intermediate layer (also called the application layer or logic layer), deployed on a separate server, to access the database. Its biggest advantages are caching and flexibility. A cache uses a relatively large amount of memory, so it should be separated from the website process. With this layer in place it is also very convenient to change data access strategies; even if the database later becomes distributed, that work can be done here, which gives a great deal of flexibility. Another benefit is that the intermediate layer can act as a bridge between the China Telecom and China Netcom networks: going through a dual-line intermediate layer may be faster than a Netcom server accessing a Telecom server directly.
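A minimal sketch of such an intermediate layer, assuming a simple cache-aside strategy with a time-to-live. The class name `DataAccessLayer`, the `load_from_db` callback, and the default TTL are hypothetical; in a real deployment the cache would live in a separate process or server rather than in a local dictionary.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class DataAccessLayer:
    """Middle tier between the website and the database (illustrative only)."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]],
                 ttl_seconds: float = 60.0) -> None:
        self._load_from_db = load_from_db   # the only place that touches the DB
        self._ttl = ttl_seconds
        # key -> (expiry time, cached value)
        self._cache: Dict[str, Tuple[float, Optional[str]]] = {}

    def get(self, key: str) -> Optional[str]:
        """Return the cached value if still fresh, otherwise load and cache it."""
        now = time.monotonic()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = self._load_from_db(key)
        self._cache[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry, for example after the row is updated."""
        self._cache.pop(key, None)
```

Because the website only calls `get`, the loading strategy behind it (a single database, a distributed one, a different cache) can change without touching the site code.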


Some people say: I don't divide anything, I just do load balancing. Yes, that works, but with division the same 10 machines can certainly withstand more traffic than 10 undivided machines, and the hardware demands may not be very high either, because you know which hardware needs to be particularly good for each role. The goal is that no service sits idle and none is too busy; with sensible combination, adjustment, and expansion, scalability is high. Being able to adjust according to traffic presupposes that the division was considered beforehand. The advantages of division are flexibility, scalability, isolation, and security.

The servers need long-term observation, because any of the following can become the bottleneck:
1. CPU: parsing dynamic files needs more CPU. When the CPU is the bottleneck, check whether some function occupies a thread for too long; if so, split it out. If instead each request is short but traffic is very high, add servers. The CPU is a valuable resource; don't let it sit waiting and doing nothing.
2. Memory: move the cache out of the IIS process; after that, a web server is rarely short of memory. Memory is faster than disk and should be used sensibly.
3. Disk IO: use Performance Monitor to find which files have especially heavy IO; once found, move them to a separate group of file servers or directly to a CDN. Disks are slow, so applications that read data at scale should rely on caching, and applications that write data at scale can rely on queues to reduce bursts of concurrency.
4. Network: network communication is slow, and can be slower than disk. When doing distributed caching or distributed computation, take into account the communication time between physical servers. Of course, once traffic is large, distribution can raise the system's capacity by a grade. Static content can be partly offloaded to a CDN. When deploying servers, also consider China-specific conditions such as the Telecom/Netcom split and firewalls.
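The remark about using queues to reduce bursts of concurrent writes could look roughly like the sketch below. The class name `WriteBuffer`, the `flush` callback, and the batch size are assumptions for illustration; a production system would more likely use a dedicated message queue.

```python
import queue
import threading
from typing import Callable, List

_STOP = object()  # sentinel that tells the worker to finish


class WriteBuffer:
    """Accept writes immediately and flush them to storage in batches."""

    def __init__(self, flush: Callable[[List[str]], None],
                 batch_size: int = 100) -> None:
        self._queue: "queue.Queue[object]" = queue.Queue()
        self._flush = flush            # hypothetical function that writes a batch
        self._batch_size = batch_size
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, record: str) -> None:
        """Called by request threads; returns without waiting for the disk."""
        self._queue.put(record)

    def close(self) -> None:
        """Flush whatever is left and stop the worker."""
        self._queue.put(_STOP)
        self._worker.join()

    def _run(self) -> None:
        batch: List[str] = []
        while True:
            item = self._queue.get()
            if item is _STOP:
                break
            batch.append(item)  # type: ignore[arg-type]
            # Flush when the batch is full or no more work is waiting.
            if len(batch) >= self._batch_size or self._queue.empty():
                self._flush(batch)
                batch = []
        if batch:
            self._flush(batch)
```

A single worker turns many concurrent small writes into a few sequential batch writes, which is the effect the queue is meant to have on disk IO.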

The SQL Server database server [UPDATE]:
In fact, this again comes down to horizontal and vertical partitioning. Picture a two-dimensional table: horizontal partitioning is a horizontal cut across it (splitting rows), and vertical partitioning is a vertical cut (splitting columns):
1. Vertical partitioning: different applications can be placed in different databases or different instances, or a table with many fields can be split into several smaller tables.
2. Horizontal partitioning: some applications may not carry much load, such as user registration, yet the user table is very large, and the big table can be split apart. You can use table partitioning so the data is stored in different files, then deploy those files to separate physical servers to increase IO throughput and improve read and write performance. A cruder approach is to archive old data regularly. Another advantage of table partitioning is faster queries, because the page index has multiple levels: it is like not keeping too many documents in one folder and adding a few levels of folders instead.
3. Reads and writes can also be separated onto different physical databases through database mirroring, replication (publish/subscribe), or transaction log shipping. This is generally enough; if not, hardware-based database load balancing can be used. Of course, for BI we may also have a data warehouse.
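As a rough illustration of horizontal partitioning, the sketch below routes a user ID to a partition by range. The boundary values and the partition and server names are hypothetical; with SQL Server table partitioning the database performs this mapping itself, so application code like this is only needed when the rows are spread across separate databases.

```python
import bisect

# Hypothetical range boundaries: each value is the first ID of the next partition.
BOUNDARIES = [1_000_000, 2_000_000, 3_000_000]

# Hypothetical partition names, one more than the number of boundaries.
PARTITIONS = ["users_p0@db1", "users_p1@db2", "users_p2@db3", "users_p3@db4"]


def partition_for(user_id: int) -> str:
    """Return the partition that holds the row for this user ID."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    # bisect_right counts how many boundaries are <= user_id,
    # which is exactly the index of the partition to use.
    return PARTITIONS[bisect.bisect_right(BOUNDARIES, user_id)]
```

Range partitioning keeps old and new rows in different places, which also makes the periodic archiving of old data straightforward.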

With the structure laid out this way, when traffic grows you can adjust on this basis, or add load balancing for the web servers or application servers. Much of the time we are repeating the same cycle: find the problem, find the bottleneck, solve it.

A typical architecture is as follows:


Dynamic web servers should get better CPUs; static web servers and file servers should get better disks.
Application servers and cache servers should get more memory; database servers, of course, need both memory and CPU.

Posted by Julian at December 10, 2013 - 2:16 PM