Personal views on performance optimization

                                                                                                       by ahuner

One, network IO

1 For highly concurrent programs, replace the select model with the epoll model.
2 Adjust the buffer size appropriately according to the volume of network data.
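
As a rough illustration of the epoll model, here is a minimal Linux-only sketch. The function name epoll_pipe_demo is hypothetical, and a pipe stands in for a network socket so the example is self-contained; a real server would register many sockets with one long-lived epoll instance and loop on epoll_wait.

```cpp
#include <sys/epoll.h>
#include <unistd.h>

// Hypothetical demo: register the read end of a pipe with epoll and
// check that it is reported as readable only after data is written.
bool epoll_pipe_demo()
{
    int fds[2];
    if (pipe(fds) != 0) return false;

    int epfd = epoll_create1(0);            // one epoll instance, reused below
    if (epfd < 0) { close(fds[0]); close(fds[1]); return false; }

    epoll_event ev{};
    ev.events = EPOLLIN;                    // interested in "readable"
    ev.data.fd = fds[0];
    bool ok = (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) == 0);

    epoll_event ready[8];
    // Nothing has been written yet, so a zero-timeout wait reports no events.
    ok = ok && (epoll_wait(epfd, ready, 8, 0) == 0);

    char byte = 'x';
    ok = ok && (write(fds[1], &byte, 1) == 1);

    // Now exactly one descriptor is ready, and it is the pipe's read end.
    ok = ok && (epoll_wait(epfd, ready, 8, 100) == 1)
            && (ready[0].data.fd == fds[0]);

    close(epfd);
    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

Unlike select, the interest set is registered once with epoll_ctl, and epoll_wait returns only the descriptors that are ready, so the cost per call does not grow with the number of idle connections.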

Two, locks and atomic operations

1 Avoid locks where possible and keep their use to a minimum. But if a lock is needed to keep data synchronized between threads, use it without hesitation.
2 Object reference counting can use atomic operations. GCC provides the __sync_* family of built-in functions, which perform addition, subtraction, and logical operations atomically.
The return value is the value before the update:
type __sync_fetch_and_add (type *ptr, type value, ...)
type __sync_fetch_and_sub (type *ptr, type value, ...)
type __sync_fetch_and_or (type *ptr, type value, ...)
type __sync_fetch_and_and (type *ptr, type value, ...)
type __sync_fetch_and_xor (type *ptr, type value, ...)
type __sync_fetch_and_nand (type *ptr, type value, ...)
The return value is the updated value:
type __sync_add_and_fetch (type *ptr, type value, ...)
type __sync_sub_and_fetch (type *ptr, type value, ...)
type __sync_or_and_fetch (type *ptr, type value, ...)
type __sync_and_and_fetch (type *ptr, type value, ...)
type __sync_xor_and_fetch (type *ptr, type value, ...)
type __sync_nand_and_fetch (type *ptr, type value, ...)
Compare oldval with *ptr; if they are equal, copy newval to *ptr.
Returns true if oldval and *ptr matched, otherwise false:
bool __sync_bool_compare_and_swap (type *ptr, type oldval, type newval, ...)
Returns the value of *ptr before the operation:
type __sync_val_compare_and_swap (type *ptr, type oldval, type newval, ...)
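
The builtins above could be applied to reference counting roughly as follows. This is a minimal sketch assuming GCC or Clang; RefCounted, retain, release, and try_claim are hypothetical names used only for illustration.

```cpp
// Hypothetical reference-counted object; the count starts at 1 for the creator.
struct RefCounted {
    int refs;
    RefCounted() : refs(1) {}
};

// Atomically adds a reference and returns the updated count.
inline int retain(RefCounted* obj)
{
    return __sync_add_and_fetch(&obj->refs, 1);
}

// Atomically drops a reference. Returns true when the caller released
// the last one and is therefore responsible for destroying the object.
inline bool release(RefCounted* obj)
{
    return __sync_sub_and_fetch(&obj->refs, 1) == 0;
}

// Compare-and-swap: sets *flag from 0 to 1 and returns true for exactly
// one caller, even if several threads race on the same flag.
inline bool try_claim(int* flag)
{
    return __sync_bool_compare_and_swap(flag, 0, 1);
}
```

Because __sync_sub_and_fetch returns the updated value, exactly one thread observes 0, so the object is destroyed once without taking a lock.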

Three, the language level

1 Class constructor initializer lists
A constructor can set class members in two ways: through the initializer list, or by assignment in the constructor body. If the member is a class type, the initializer list has a performance advantage; for built-in types such as int or float there is no difference.
#include <stdio.h>

class Base
{
   int m_basemember;
public:
   Base()
       : m_basemember(1) {
       printf("Base Construct Function\n");
   }
   ~Base() {
       printf("Base Destruct Function\n");
   }
   Base(const Base& b)
       : m_basemember(b.m_basemember) {
       printf("Base Copy constructor Function\n");
   }
   Base& operator = (const Base& b) {
       printf("Base operator = Function\n");
       this->m_basemember = b.m_basemember;
       return *this;
   }
};
class Derive
{
   Base m_base;
public:
   Derive (const Base &base)
       /*:m_base(base)*/      // initializer-list variant
   {
       this->m_base = base;   // assignment variant
   }
};
int main(int argc, char* argv[])
{
   Base base;
   Derive D(base);
   return 0;
}
In the example above:
If the Derive constructor initializes the member by assignment (this->m_base = base;), running the code outputs:
Base Construct Function
Base Construct Function
Base operator = Function
Base Destruct Function
Base Destruct Function
If the Derive constructor uses the initializer list instead, namely : m_base(base), running the code outputs:
Base Construct Function
Base Copy constructor Function
Base Destruct Function
Base Destruct Function
The initializer list saves one call: a default construction followed by an assignment becomes a single copy construction. For a class with many class-type members, the initializer list therefore performs better.
The initializer list must be used in the following cases:
·Const members: a constant can only be initialized and cannot be assigned, so it must be placed in the initializer list.
·Reference members: a reference must be initialized where it is defined and cannot be reassigned, so it must also be written in the initializer list.
·Members of a class type that has no default constructor: with the initializer list, no default constructor needs to be invoked; the member is initialized directly by calling the copy constructor.
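
The three cases can be sketched in one class. Endpoint and Session are hypothetical types invented for this illustration; removing any entry from the initializer list below makes the constructor fail to compile.

```cpp
#include <string>

// Hypothetical class type with no default constructor.
class Endpoint {
    std::string m_host;
public:
    explicit Endpoint(const std::string& host) : m_host(host) {}
    const std::string& host() const { return m_host; }
};

class Session {
    const int m_id;        // const member: can only be initialized
    int&      m_counter;   // reference member: must be bound at initialization
    Endpoint  m_endpoint;  // no default constructor: must be constructed explicitly
public:
    Session(int id, int& counter, const Endpoint& ep)
        : m_id(id), m_counter(counter), m_endpoint(ep)
    {
        ++m_counter;  // the reference is already bound, so this updates the caller's variable
    }
    int id() const { return m_id; }
    const std::string& host() const { return m_endpoint.host(); }
};
```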
Note:
Class members are initialized in the order in which they are declared in the class, not in the order in which they appear in the initializer list.
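
A small sketch of this ordering rule, using hypothetical Tracer and Pair types. The initializer list deliberately names the members in the opposite order; GCC and Clang typically warn about this (-Wreorder), which is a good reason to keep both orders the same.

```cpp
#include <string>

// Hypothetical helper: appends its tag to a log when it is constructed.
struct Tracer {
    Tracer(std::string& log, char tag) { log += tag; }
};

class Pair {
    Tracer m_first;    // declared first, so constructed first
    Tracer m_second;   // declared second, so constructed second
public:
    // The list names m_second first, but declaration order still decides.
    explicit Pair(std::string& log)
        : m_second(log, '2'), m_first(log, '1') {}
};

// Returns the order in which the two members were constructed.
std::string construction_order()
{
    std::string log;
    Pair p(log);
    return log;   // "12": m_first was constructed before m_second
}
```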
2 strlen and the list container's size() function
strlen is implemented by traversing the string until it reaches '\0', and that is how it computes the string's length. If the length is computed frequently, or the string is very long (in the HLS protocol, the index string of a time-shifted m3u8 file grows with the time-shift duration), calling strlen(str) every time is unsuitable; record the length in a variable instead.
Replace while (m_list.size() > 0) with while (!m_list.empty()). Some implementations of size() (older standard library implementations in particular) obtain the length by traversing the list rather than returning a recorded length value.
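
Both points can be sketched as follows. Playlist, append, and drain are hypothetical names, and the fixed 4096-byte buffer is an assumption chosen only to keep the example short.

```cpp
#include <cstddef>
#include <cstring>
#include <list>

// Hypothetical growing index text that records its own length,
// so appending never re-scans the whole buffer with strlen.
struct Playlist {
    char        text[4096];
    std::size_t len;          // recorded length, kept in sync by append()
    Playlist() : len(0) { text[0] = '\0'; }
};

// Appends one line; returns false if it would not fit.
bool append(Playlist& p, const char* line)
{
    std::size_t n = std::strlen(line);           // scans only the new line
    if (p.len + n >= sizeof(p.text)) return false;
    std::memcpy(p.text + p.len, line, n + 1);    // n + 1 copies the '\0' too
    p.len += n;
    return true;
}

// Drains a list using empty(), which is constant time on every implementation.
int drain(std::list<int>& items)
{
    int sum = 0;
    while (!items.empty()) {
        sum += items.front();
        items.pop_front();
    }
    return sum;
}
```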
3 Temporary objects
Reduce the creation of temporary objects.
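
One common source of temporaries is passing objects by value. The sketch below uses a hypothetical Frame type that counts its own copy constructions, so the difference between the two parameter styles can be observed directly.

```cpp
// Hypothetical type that counts how often it is copy-constructed.
struct Frame {
    static int copies;
    int size;
    explicit Frame(int s) : size(s) {}
    Frame(const Frame& other) : size(other.size) { ++copies; }
};
int Frame::copies = 0;

// Pass by value: the argument is copied into a new object on every call.
int size_by_value(Frame f) { return f.size; }

// Pass by const reference: no new object is created.
int size_by_ref(const Frame& f) { return f.size; }
```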

Four, memory usage

1 Byte alignment
If #pragma pack is not used to set the alignment, the default alignment applies (4-byte in this example). Method 1 has an advantage over method 2 both in the memory it occupies and in memory addressing when variables are accessed.
Method 1
#pragma pack(push, 4)
struct test
{
   int   a;
   short b;
   char  c;
   double d;
};
#pragma pack(pop)
Method 2
#pragma pack(push, 4)
struct test
{
   char  c;
   double d;
   short b;
   int   a;
};
#pragma pack(pop)
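
The difference between the two layouts can be checked with sizeof and offsetof. This sketch renames the structs to Ordered and Interleaved (hypothetical names) so both fit in one file, and assumes the usual sizes of 4-byte int, 2-byte short, and 8-byte double.

```cpp
#include <cstddef>

#pragma pack(push, 4)
// Method 1: a at 0, b at 4, c at 6, one padding byte, d at 8. Total 16 bytes.
struct Ordered {
    int    a;
    short  b;
    char   c;
    double d;
};

// Method 2: c at 0, three padding bytes, d at 4, b at 12,
// two padding bytes, a at 16. Total 20 bytes.
struct Interleaved {
    char   c;
    double d;
    short  b;
    int    a;
};
#pragma pack(pop)

// Bytes saved per object by choosing method 1 over method 2.
const std::size_t kBytesSaved = sizeof(Interleaved) - sizeof(Ordered);
```

Under these assumptions kBytesSaved is 4, and every member of Ordered also sits at a multiple of its own size, which is not true of d in Interleaved.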
2 Memory pool
A memory pool allocates a large chunk of memory in advance and builds an index over it for external use, which avoids the inefficiency caused by frequently allocating and releasing memory.
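
A minimal sketch of such a pool, assuming fixed-size blocks and single-threaded use. BlockPool is a hypothetical class, not taken from any library, and it assumes a nonzero block size and count.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fixed-size block pool. One allocation is made up front;
// acquire() and release() only move pointers in the free index.
// Not thread-safe. Blocks are only suitably aligned for a type if
// block_size is a multiple of that type's alignment.
class BlockPool {
    std::vector<unsigned char> m_storage;  // the pre-allocated chunk
    std::vector<void*>         m_free;     // index of blocks available for use
public:
    BlockPool(std::size_t block_size, std::size_t count)
        : m_storage(block_size * count)
    {
        m_free.reserve(count);
        for (std::size_t i = 0; i < count; ++i)
            m_free.push_back(&m_storage[i * block_size]);
    }
    // Copying would leave the free index pointing into the old chunk.
    BlockPool(const BlockPool&) = delete;
    BlockPool& operator=(const BlockPool&) = delete;

    // Returns a free block, or nullptr when the pool is exhausted.
    void* acquire()
    {
        if (m_free.empty()) return nullptr;
        void* block = m_free.back();
        m_free.pop_back();
        return block;
    }
    // Returns a block previously obtained from acquire() to the pool.
    void release(void* block) { m_free.push_back(block); }

    std::size_t available() const { return m_free.size(); }
};
```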
3 Memory copy
In a streaming media server, audio and video data take up a lot of memory, and program modules copy that memory a great deal while processing it. In my experience, it is better to pass memory pointers between modules and let them modify the same block of memory, so that memcpy does not hurt performance.
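
One way to pass a pointer instead of copying is to share the buffer between stages. The sketch below uses std::shared_ptr from C++11, which is a choice made for this illustration rather than something the text prescribes; Packet, make_packet, stamp_header, and SendQueue are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

typedef std::vector<std::uint8_t> Packet;
typedef std::shared_ptr<Packet>   PacketPtr;

// Stage 1 (hypothetical): allocate the payload once.
PacketPtr make_packet(std::size_t size)
{
    return std::make_shared<Packet>(size, 0);
}

// Stage 2 (hypothetical): modify the same block in place, no copy.
void stamp_header(const PacketPtr& pkt, std::uint8_t tag)
{
    if (!pkt->empty()) (*pkt)[0] = tag;
}

// Stage 3 (hypothetical): the queue stores the pointer, not the bytes.
struct SendQueue {
    std::vector<PacketPtr> pending;
    void push(const PacketPtr& pkt) { pending.push_back(pkt); }
};
```

Every stage sees the same bytes, and the buffer is freed automatically when the last stage drops its pointer.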

Five, summary

By the 80/20 rule, the code that affects a program's performance is likely to be only 20% of the code. There is no need to optimize performance at an early stage, and no need to spend much time on the other 80% of the code.
Consider performance bottlenecks at the framework design stage. Once the overall functionality is complete, use performance tools to find the bottlenecks and optimize them specifically.
These points are all commonplace, but specific problems still call for specific analysis and specific solutions.

Posted by Elmer at November 14, 2013 - 7:00 PM