Memcached study notes


Memcached is a high-performance distributed memory cache server. Its general purpose is to reduce database load by caching query results, thereby speeding up dynamic applications and improving scalability.

Memcached stores key/value pairs in an in-memory hash table; when the table is full, new data replaces old data according to an LRU mechanism. The daemon is written in C, but clients can be written in any language and communicate with the daemon via the memcached protocol.

One, distributed

  The most important requirement in a distributed cache is that the same key must always hit the same server. Below is a brief introduction to two distribution schemes: simple hash distribution and consistent hash distribution.

  1, Simple hash distribution

    For example, on each access we can compute the hash value with the following algorithm:

      h = Hash(key) % N

    where Hash is a function mapping strings to non-negative integers and N is the number of servers. So if we have three servers, numbered 0, 1, 2, we can compute the server number h from the key with this formula and then access that server.
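A minimal sketch of the formula above, using Java's built-in String.hashCode as the Hash function (purely illustrative; a real client would use a better-distributed hash):

```java
public class SimpleHashDemo {
    // h = Hash(key) % N: mask the sign bit so the hash is non-negative,
    // then take the remainder modulo the server count.
    static int serverIndex(String key, int numServers) {
        return (key.hashCode() & 0x7fffffff) % numServers;
    }

    public static void main(String[] args) {
        // With three servers numbered 0, 1, 2, every key maps to exactly one of them.
        System.out.println("\"user:42\" -> server " + serverIndex("user:42", 3));
    }
}
```

The same key always yields the same index as long as the server count N stays fixed.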

  2, Consistent hash distribution

    The simple hash algorithm above has a fatal weakness: poor scalability and fault tolerance. In simple terms, fault tolerance means that when one or several servers become unavailable, the whole system can still operate correctly and efficiently; scalability means the system keeps operating correctly and efficiently when servers are added or removed.

    Suppose a server goes down. To fill the vacancy, we remove the failed server from the numbering list and shift the numbers of the servers behind it down by one, so every key must be recomputed as h = Hash(key) % (N-1). Likewise, if a new server is added, the original server numbers do not change, but the hash must be recomputed as h = Hash(key) % (N+1). Either way, once the set of servers changes, a large number of keys are relocated to different servers, causing a large number of cache misses. In a distributed system this is very bad.
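The scale of the problem is easy to demonstrate. The sketch below (again using String.hashCode purely for illustration) counts how many of 10,000 keys map to a different server when the server count drops from 3 to 2; typically a large majority of keys are relocated:

```java
public class RemapDemo {
    static int index(String key, int n) {
        return (key.hashCode() & 0x7fffffff) % n;
    }

    // Count how many of `total` synthetic keys land on a different server
    // when the server count drops from n to n-1.
    static int movedKeys(int total, int n) {
        int moved = 0;
        for (int i = 0; i < total; i++) {
            String key = "key-" + i;
            if (index(key, n) != index(key, n - 1)) moved++;
        }
        return moved;
    }

    public static void main(String[] args) {
        System.out.println(movedKeys(10000, 3) + " of 10000 keys relocated");
    }
}
```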

    A well-designed distributed hash scheme should have good monotonicity, meaning that adding or removing a service node does not cause large numbers of hash relocations. The consistent hashing algorithm is such a scheme.

    2.1 The algorithm

      In simple terms, consistent hashing organizes the hash value space into a virtual ring. Suppose, for example, that the value space of the hash function H is 0 to 2^32 - 1 (i.e., hash values are 32-bit unsigned integers); then the hash space forms a ring.


    The ring is organized in a clockwise direction, with the positions 0 and 2^32 - 1 coinciding at the zero point.

    The next step is to hash the servers onto the ring using H; specifically, we can use each server's IP address or host name as the key, so that each machine gets a fixed position on the ring. Suppose the three servers above, hashed by IP address, end up at three separate positions spread around the ring.


    Data is then mapped to servers with the following algorithm: compute the key's hash value h using the same function H, locate the position h on the ring, and walk clockwise from there; the first server encountered is the server that stores the data.

    For example, suppose we have four data objects A, B, C, and D which, after hashing, land at four positions around the ring.


    According to the consistent hashing algorithm, data A is assigned to Server 1, D to Server 3, and B and C to Server 2.
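The locate step can be sketched with a sorted map. This is an illustrative toy, not a real client: TreeMap.ceilingEntry finds the first server position at or after the key's hash (the clockwise walk), and firstEntry handles wrapping past the end of the ring; String.hashCode stands in for a proper ring hash.

```java
import java.util.*;

public class RingDemo {
    // Ring positions are ints; TreeMap keeps them sorted so we can walk clockwise.
    static final TreeMap<Integer, String> ring = new TreeMap<>();

    static int hash(String s) {
        return s.hashCode() & 0x7fffffff;  // placeholder hash, forced non-negative
    }

    static void addServer(String name) {
        ring.put(hash(name), name);
    }

    static {
        addServer("Memcached Server 1");
        addServer("Memcached Server 2");
        addServer("Memcached Server 3");
    }

    // Walk clockwise from the key's position; the first server met owns the key.
    static String serverFor(String key) {
        Map.Entry<Integer, String> e = ring.ceilingEntry(hash(key));
        if (e == null) e = ring.firstEntry();  // wrap around the ring
        return e.getValue();
    }

    public static void main(String[] args) {
        for (String data : new String[] { "A", "B", "C", "D" })
            System.out.println(data + " -> " + serverFor(data));
    }
}
```

Note that with this toy hash the A/B/C/D assignments will not match the figure's; only the mechanism is the same.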

    2.2 Analysis of fault tolerance and scalability

      Now let us analyze the fault tolerance and scalability of consistent hashing. Suppose Server 3 goes down.


    You can see that A, B, and C are unaffected; only D is relocated to Server 2. In general, with consistent hashing, if a server becomes unavailable, the only data affected is the data between that server and the previous server on the ring (the first server encountered walking counterclockwise); everything else is unaffected.

    Now consider the other situation: suppose we add a new server, Memcached Server 4.


    A, C, and D are unaffected; only B needs to be relocated to the new Server 4. In general, with consistent hashing, if a server is added, the only data affected is the data between the new server and the previous server on the ring (the first server encountered walking counterclockwise); everything else is unaffected.

    In summary, consistent hashing relocates only a small portion of the data in the ring space when nodes are added or removed, so it has good fault tolerance and scalability.


    2.3 Virtual nodes

      When there are too few service nodes, consistent hashing easily suffers from data skew caused by uneven node distribution. For example, suppose our system has only two servers, whose positions on the ring happen to be close together.


    This would cause most of the data to land on Server 1, while only a little maps to Server 2. To solve this data skew problem, consistent hashing introduces virtual nodes: for each service node we compute multiple hash values, and each resulting position on the ring holds a copy of that service node, called a virtual node. In practice this can be done by appending a number to the server's IP address or host name. In the case above, suppose we compute three virtual nodes for each server; then we hash "Memcached Server 1#1", "Memcached Server 1#2", "Memcached Server 1#3", "Memcached Server 2#1", "Memcached Server 2#2", and "Memcached Server 2#3", forming six virtual nodes.


    The data location algorithm stays the same; there is just one extra step mapping the virtual node back to the actual node. For example, data located at any of the virtual nodes "Memcached Server 1#1", "Memcached Server 1#2", or "Memcached Server 1#3" is stored on Server 1. This solves the data skew problem when there are few service nodes. In practice the number of virtual nodes per server is usually set to 32 or more, so that even a handful of service nodes can achieve a relatively uniform data distribution.
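Extending the earlier ring sketch with virtual nodes is a small change: each server is inserted many times under names like "Memcached Server 1#1" (the naming scheme from the text; 32 replicas, as the text suggests, is illustrative), and every virtual node maps back to the real server name.

```java
import java.util.*;

public class VirtualNodeRing {
    static final TreeMap<Integer, String> ring = new TreeMap<>();
    static final int REPLICAS = 32;  // virtual nodes per server; the text suggests 32 or more

    static int hash(String s) {
        return s.hashCode() & 0x7fffffff;  // placeholder hash, as in the earlier sketch
    }

    // Place REPLICAS virtual nodes per server ("...#1" through "...#32"),
    // each ring position mapping back to the real server name.
    static void addServer(String name) {
        for (int i = 1; i <= REPLICAS; i++)
            ring.put(hash(name + "#" + i), name);
    }

    static {
        addServer("Memcached Server 1");
        addServer("Memcached Server 2");
    }

    // Same clockwise walk; the extra virtual-to-real mapping is the stored value.
    static String serverFor(String key) {
        Map.Entry<Integer, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        // Rough balance check: count how many of 10,000 keys land on each server.
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < 10000; i++)
            counts.merge(serverFor("key-" + i), 1, Integer::sum);
        System.out.println(counts);
    }
}
```

With many virtual nodes per server, the key counts per real server come out roughly even, even though there are only two physical servers.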

     PS: the consistent hashing section above is reproduced from another source.

Two, the LRU mechanism


    What is the LRU algorithm? LRU is short for Least Recently Used, a page replacement algorithm that evicts the least recently used page, originally developed for virtual page storage management.

    In operating system memory management, how to serve the most processes with a small amount of memory has long been an important research direction. Virtual memory management is now the most common and most successful approach: when memory is limited, part of external storage is used as an extension of memory, and memory holds only the information needed by currently running processes. This greatly extends the effective capacity of memory and improves the computer's concurrency. Virtual page storage management divides the space a process needs into multiple pages; memory stores only the pages currently needed, while the remaining pages are left on external storage.

    There is a trade-off, however: virtual page storage management reduces the memory a process needs, but it also lengthens running time, since during execution some information inevitably has to be swapped between memory and the slow external storage, and the time spent on this cannot be ignored. Choosing an algorithm that minimizes the number of swaps is therefore very worthwhile.

    With virtual memory, information is swapped between internal and external storage on a per-page basis: when a page on external storage is needed, it is brought into memory, and at the same time, to keep the resident set the same size, a page in memory must be written out to disk. The fewer such transfers, the higher the efficiency. So which page should we evict to minimize future transfers? We need an algorithm.

    Naturally, the ideal algorithm for this situation is one that, on each replacement, evicts the page in memory that will not be used for the longest time; this maximally defers future page replacements, and it is called the ideal page replacement algorithm. Unfortunately, this algorithm cannot be implemented, since it requires knowing the future.


    To narrow the gap with the ideal algorithm, a variety of clever algorithms have been devised, and the least-recently-used page replacement algorithm is one of them. The LRU algorithm is based on this fact: instructions and pages used frequently in the recent past are likely to be used frequently in the near future; conversely, pages that have not been used for a long time are likely to stay unused for a long time to come. This is the famous principle of locality; caches, which are faster than main memory, operate on the same principle. Therefore, after each replacement we only need to evict the page in memory that was used least recently. That is the whole content of the LRU algorithm.
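The LRU policy described above is easy to sketch in Java, because LinkedHashMap has a built-in access-order mode; this is a minimal illustration of the eviction rule, not memcached's actual slab-based implementation:

```java
import java.util.*;

// A minimal LRU cache: in access-order mode, every get() moves the entry to the
// back of the internal list, so the eldest entry is always the least recently used.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true);  // accessOrder = true enables LRU ordering
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once we exceed capacity.
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");       // touch "a", so "b" becomes least recently used
        cache.put("c", "3");  // exceeds capacity: "b" is evicted
        System.out.println(cache.keySet());
    }
}
```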

Three, memcached communication protocol

    Here we use the get and set operations to briefly describe the internal flow.

    First the set operation. We usually use a string as the storage key (though other types can be chosen). Based on the key, the client picks an available connection to a server and begins processing the data: checking the character encoding, deciding whether the data goes out in character or object-serialized form, and whether it is compressed. Once the data formatting is done, the client sends the data to the server. The protocol format is very simple, just spaces and line breaks. The first line carries the data's metadata, similar to an HTTP header: the operation name (set/add/update/delete/get..., plus the key), flags (a binary mask marking, among other things, whether the data is in character or object-serialized form and whether it is compressed), the expiration time, and the data length. The second line is the data content as a byte array. The third line is a newline, \r\n. Finally, the client waits for the server to report success or failure of the storage operation.

    Get is basically the inverse of the set operation: the client sends the get command directly (containing only the key) and then processes the returned data in reverse.


// Build the command header: "<cmd> <key> <flags> <expiry-seconds> <length>\r\n"
String cmd = String.format("%s %s %d %d %d\r\n", cmdname, key, flags, (expiry.getTime() / 1000), val.length);
sock.write(cmd.getBytes());    // send the header line
sock.write(val);               // send the data block itself
sock.write("\r\n".getBytes()); // terminate the data block
sock.flush();                  // push everything to the server
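On the get side, the server's text-protocol reply for a hit has the form `VALUE <key> <flags> <bytes>\r\n`, followed by the data block and a final `END\r\n`. The sketch below parses just the header line; it is illustrative, not the actual client's code, and the sample response is made up:

```java
public class GetResponseDemo {
    // Parse the header line of a memcached text-protocol get response,
    // e.g. "VALUE mykey 0 5", and return the declared data length in bytes.
    static int dataLength(String headerLine) {
        String[] parts = headerLine.split(" ");
        if (!parts[0].equals("VALUE"))
            throw new IllegalArgumentException("unexpected response: " + headerLine);
        return Integer.parseInt(parts[3]);
    }

    public static void main(String[] args) {
        // A server answering `get mykey` might return:
        //   VALUE mykey 0 5\r\nhello\r\nEND\r\n
        System.out.println(dataLength("VALUE mykey 0 5"));
    }
}
```

Knowing the declared length up front lets the client read exactly that many bytes of data before expecting the trailing \r\n and END line.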

Four, connection pool management

  For each server, the memcached client creates N initial connections (the default is 10). It then starts a SocketPool thread to manage and coordinate the socket connections. For each server it:

    1, Closes connections that have been idle too long

    2, If the number of available connections A is below the configured minimum B, creates B - A new connections

    3, If the number of available connections A exceeds the configured maximum B, closes A - B connections

    4, Closes connections that have been working for too long

    5, Closes connections that have been marked dead
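The five rules above amount to one periodic maintenance pass over each server's connection list. The sketch below is an assumption-laden toy, not the real SocketPool: Conn is a stand-in for a pooled socket, and all thresholds are illustrative defaults.

```java
import java.util.*;

public class PoolMaintenance {
    static class Conn {
        long idleSince;   // when the connection last finished work
        long busySince;   // when the current operation started, or 0 if idle
        boolean dead;     // marked unusable, e.g. by a failed health check
    }

    static final long MAX_IDLE_MS = 30_000;  // illustrative threshold for rule 1
    static final long MAX_BUSY_MS = 60_000;  // illustrative threshold for rule 4
    static final int MIN_CONNS = 5;          // "B" in rule 2 (illustrative)
    static final int MAX_CONNS = 20;         // "B" in rule 3 (illustrative)

    // One maintenance pass over a server's connection list; returns the new list.
    static List<Conn> maintain(List<Conn> conns, long now) {
        List<Conn> kept = new ArrayList<>();
        for (Conn c : conns) {
            boolean tooIdle = c.busySince == 0 && now - c.idleSince > MAX_IDLE_MS;
            boolean stuck = c.busySince != 0 && now - c.busySince > MAX_BUSY_MS;
            // Rules 1, 4, 5: drop idle-too-long, working-too-long, and dead connections.
            if (!c.dead && !tooIdle && !stuck) kept.add(c);
        }
        while (kept.size() < MIN_CONNS) {     // rule 2: top up to the minimum
            Conn c = new Conn();
            c.idleSince = now;
            kept.add(c);
        }
        while (kept.size() > MAX_CONNS)       // rule 3: trim down to the maximum
            kept.remove(kept.size() - 1);
        return kept;
    }
}
```

In the real client this pass runs on the SocketPool thread; closing and opening sockets would of course involve actual I/O rather than list operations.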


Posted by Beatrice at November 18, 2013 - 2:12 AM