Database design and related information

The role of database integrity

Database integrity is critical to a database application system. Its role is mainly reflected in the following aspects:

1. Database integrity constraints prevent legitimate users of the database from adding semantically invalid data to it.

2. Implementing business rules through the DBMS's integrity control mechanisms is easy to define and easy to understand, reduces the complexity of application programs, and improves their efficiency. Because DBMS-based integrity control is centrally managed, database integrity is also easier to achieve this way than in application programs.

3. A well-designed integrity scheme balances database integrity against system performance. For example, when loading a large amount of data, the DBMS-based integrity constraints can be temporarily disabled before the load and re-enabled afterwards, so that loading efficiency is not affected and the integrity of the database is still guaranteed.

4. During functional testing of the application software, sound database integrity helps to find software errors as early as possible.

Database integrity constraints can be divided into six categories: column-level static constraints, tuple-level static constraints, relation-level static constraints, column-level dynamic constraints, tuple-level dynamic constraints, and relation-level dynamic constraints. Dynamic constraints are usually implemented by the application software. Support for database integrity is basically the same across different DBMSs.

Integrity in the relational model

Integrity ensures the correctness of the data in the database. On every update, insert, or delete operation the system checks the data against the constraint conditions, that is, the integrity rules of the relational model. The relational model has four types of integrity constraints: entity integrity, domain integrity, referential integrity, and user-defined integrity. Entity integrity and referential integrity are known as the two invariants of the relational model.

1 Entity integrity

Integrity rules are an important part of relational database design. Most relational database management systems (RDBMSs) support the integrity rules automatically: as long as the user specifies the primary key, the foreign keys, and the referenced tables when defining the table structure, the RDBMS enforces the integrity constraints.

(1) Entity integrity. Entity integrity refers to the integrity of the rows in a table. It guarantees that each data row (record) being operated on is unique, non-null, and not repeated. Entity integrity requires that each relation (table) has exactly one primary key and that every primary key value is unique; null (NULL) or duplicate values are not allowed.

(2) The entity integrity rule. If attribute A is a primary attribute of base relation R, then A cannot take a null value; that is, a primary attribute cannot be null. A null value (NULL) is not 0, a space, or an empty string; it means that no value exists. In practice, a null value stands for a value that is temporarily "not stored", "unknown", or "meaningless". Because the primary key is the unique identifier of an entity's data (record), a null primary attribute would leave some entity in the relation unidentifiable, which contradicts the definition of an entity. Non-primary attributes may be null. This rule is therefore called the entity integrity rule. For example, the primary attribute "student number" (column) of the student relation (table) cannot be null; otherwise the data (records) in the student table could not be identified and operated on.
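The entity integrity rule can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for a full RDBMS; the table and column names (student, student_no, name) are hypothetical and chosen only to mirror the student example above.

```python
import sqlite3


def make_student_table():
    # isolation_level=None puts the connection in autocommit mode,
    # so every statement below stands on its own.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    # NOT NULL is spelled out because SQLite, unlike most RDBMSs,
    # would otherwise accept NULL in a non-integer primary key.
    conn.execute(
        "CREATE TABLE student ("
        " student_no TEXT PRIMARY KEY NOT NULL,"
        " name TEXT)"
    )
    return conn


def try_insert(conn, student_no, name):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (student_no, name))
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_student_table()
print(try_insert(conn, "S001", "Li"))    # accepted
print(try_insert(conn, "S001", "Wang"))  # rejected: duplicate primary key
print(try_insert(conn, None, "Zhao"))    # rejected: NULL primary key
print(try_insert(conn, "S002", None))    # accepted: a non-primary attribute may be NULL
```

The point of the sketch is that the application code contains no uniqueness or null checks of its own; the table definition alone carries the rule.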

2 Domain integrity

Domain integrity means that a column in a database table must satisfy a specific data type and specific constraints, including restrictions on value range and precision. The CHECK, FOREIGN KEY, DEFAULT, and NOT NULL definitions on a table all belong to the category of domain integrity.

3 Referential integrity

Referential integrity rules govern the relationships between tables. When related tables have a permanent relationship, changing only one of them during an update, insert, or delete affects the integrity of the data. For example, if a record in the parent table is deleted but the corresponding records in the child table are not, those records become orphans. The integrity of data across tables during updates, inserts, and deletes is collectively called referential integrity. Entities in the real world are usually related to one another; in the relational model both the entities and the links between them are described by relations, so relations naturally reference one another.

In a relational database, relationships are implemented through common attributes. A common attribute is usually the primary key of one table and at the same time a foreign key of another table. Referential integrity means that, for related tables, a foreign key value must either be a valid primary key value of the other table or be null.

Referential integrity rule: if attribute group F is the primary key of relation schema R1 and also a foreign key of relation schema R2, then in relation R2 the value of F allows only two possibilities: a null value, or a primary key value that exists in relation R1.

R1 is called the "referenced" schema, and R2 is called the "referencing" schema.

Note: in practice, a foreign key does not necessarily have the same name as the corresponding primary key. Foreign keys are commonly marked with a wavy underline.
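The referential integrity rule can be sketched as follows, again assuming Python's sqlite3 module rather than any particular RDBMS. The department table plays the role of R1 and the employee table the role of R2; both tables and their columns are hypothetical. The foreign key column is deliberately named dept while the referenced primary key is dept_id, to show that the two names need not match.

```python
import sqlite3


def make_company():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE employee ("
        " emp_id INTEGER PRIMARY KEY,"
        " name TEXT,"
        " dept INTEGER REFERENCES department(dept_id))"
    )
    conn.execute("INSERT INTO department VALUES (1, 'Sales')")
    return conn


def accepted(conn, sql, params=()):
    """Run one statement; report whether the integrity rules allowed it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_company()
insert = "INSERT INTO employee (name, dept) VALUES (?, ?)"
print(accepted(conn, insert, ("Li", 1)))       # allowed: 1 is a primary key value in R1
print(accepted(conn, insert, ("Wang", None)))  # allowed: a null foreign key
print(accepted(conn, insert, ("Zhao", 99)))    # refused: no department 99
# Deleting the parent row would orphan the employee "Li", so it is refused.
print(accepted(conn, "DELETE FROM department WHERE dept_id = 1"))
```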

4 User-defined integrity

User-defined integrity constrains the data held in the attributes (fields) of a table. User-defined integrity rules are also known as domain integrity rules. They cover the type and valid values of a field, including constraints such as field width and number of decimal digits, and are determined by the attribute definitions made when the structure of the relation is defined. For example, a score on a hundred-point scale ranges from 0 to 100.
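The score rule above maps naturally onto a CHECK constraint. The following is a minimal sketch assuming Python's sqlite3 module; the result table and its columns are hypothetical. It also shows NOT NULL and DEFAULT, which the domain integrity section lists alongside CHECK.

```python
import sqlite3


def make_result_table():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE result ("
        " student_no TEXT NOT NULL,"
        " score INTEGER NOT NULL DEFAULT 0"
        "   CHECK (score BETWEEN 0 AND 100))"
    )
    return conn


def score_accepted(conn, student_no, score):
    """Return True if the score passes the CHECK constraint."""
    try:
        conn.execute("INSERT INTO result VALUES (?, ?)", (student_no, score))
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_result_table()
print(score_accepted(conn, "S001", 85))   # inside 0..100, accepted
print(score_accepted(conn, "S002", 101))  # above the range, rejected
print(score_accepted(conn, "S003", -1))   # below the range, rejected

# Omitting the column lets the DEFAULT supply the value.
conn.execute("INSERT INTO result (student_no) VALUES ('S004')")
print(conn.execute("SELECT score FROM result WHERE student_no = 'S004'").fetchone()[0])
```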

Stages of database integrity design

A good database integrity design first identifies, in the requirements analysis phase, the business rules to be implemented through database integrity constraints. Then, based on a full understanding of the integrity control mechanisms provided by the specific DBMS, and according to the architecture and performance requirements of the whole system, it selects a reasonable implementation for each business rule, following the database design method and the application software design method. Finally, it tests carefully to eliminate hidden constraint conflicts and performance problems. DBMS-based database integrity design can be divided into the following stages:

1. Requirements analysis phase

Through the joint efforts of the systems analyst, the database analyst, and the users, determine the objects that the system model should contain, such as the departments, employees, and managers in a personnel and payroll management system, together with the various business rules.

Once the business rules have been collected, determine which of them are to be implemented as database integrity and classify them. The integrity design that forms part of the database schema proceeds according to the stages below, while the integrity design that is implemented by the application software follows software engineering methods.

2. Conceptual design phase

The conceptual design phase converts the results of requirements analysis into a conceptual model that is independent of any specific DBMS, namely the entity-relationship diagram (ERD). Substantive database integrity design begins in this phase, because the entity relationships identified here will be transformed into entity integrity and referential integrity constraints in the logical design phase, where the main design work is completed.

3. Logical design phase

This phase converts the conceptual structure into a data model supported by a particular DBMS and optimizes it, including normalization of the relational schema. At this point, based on the integrity constraint mechanisms provided by the DBMS, list the integrity constraints that have not yet been added to the logical structure and choose a suitable implementation for each one.

By the end of the logical design phase, the integrity design that forms part of the database schema is essentially complete. A business rule may be implementable in several ways; choose the one with the least impact on database performance, which sometimes has to be determined by actual testing.

Database integrity design principles

When designing the implementation of database integrity, some basic principles should be followed:

1. Determine the system level and the manner of implementation according to the type of each integrity constraint, and consider the impact on system performance in advance. In general, static constraints should be included in the database schema as far as possible, while dynamic constraints are implemented by the application program.

2. Entity integrity and referential integrity constraints are the most important integrity constraints of a relational database and should be applied as far as possible, provided they do not affect critical system performance. Trading a certain amount of time and space for ease of use of the system is worthwhile.

3. Use the trigger facilities supported by current mainstream DBMSs with caution: on the one hand triggers carry a relatively large performance overhead, and on the other hand multi-level trigger cascades are hard to control and error-prone. When triggers cannot be avoided, it is best to use BEFORE statement-level triggers.

4. Naming conventions for integrity constraints must be established in the requirements analysis phase. Use combinations of meaningful English words, abbreviations, table names, column names, and underscores so that names are easy to recognize and remember, for example CKC_EMP_REAL_INCOME_EMPLOYEE, PK_EMPLOYEE, and CKT_EMPLOYEE. If a CASE tool is used, it generally has default rules, which can be modified and then used.

5. Test database integrity thoroughly against the business rules, so that conflicts between implicit integrity constraints and their impact on performance are eliminated as early as possible.

6. There should be a dedicated database design team responsible for the analysis, design, testing, implementation, and early maintenance of the database from start to finish. Database designers are responsible not only for designing and implementing the DBMS-based integrity constraints but also for reviewing the integrity constraints implemented by the application software.

7. Use appropriate CASE tools to reduce the workload in each phase of database design. A good CASE tool supports the entire database life cycle, which greatly improves the efficiency of database design work and makes it easier to communicate with users.

What is a database ER diagram

An ER diagram shows the tables and the relationships between them and is drawn with a modeling tool. The diagram can express 1:N, N:1, and N:N relationships and so establishes how the tables relate to one another. An example of a 1:N relationship:

Students and courses: the course table references the student number in the student table, so the student table has a one-to-many relationship with the course table.


Tools used to establish the database:

      world,UML, power;

The relationships between tables are:

     Containment, association, generalization, extension

1 Transaction concepts

    To understand the .NET support for transactions, it is important to have an overall understanding of what a transaction is. A transaction guarantees that updates to data resources are made persistent only if all operations complete successfully. A transaction is defined as a set of operations that either all succeed or all fail. That is, if all operations within the transaction complete successfully, the transaction is committed and the updated data is written persistently. If any operation fails, the transaction is rolled back and the data returns to the state it was in before the transaction started. For example, suppose 100 yuan must be transferred from account A to account B. The operation consists of two steps: (1) deduct 100 yuan from account A; (2) add 100 yuan to account B. Suppose step 1 completes successfully but step 2 fails for some reason. If the deduction in step 1 is not undone, the operation as a whole leaves the data in error. A transaction avoids this situation: the operations in a transaction modify the database only if all steps execute successfully. In this case, if step 2 fails, the change made by step 1 is not committed to the database.

    Transactions usually follow specific rules known as the ACID properties. The ACID properties ensure that even complex transactions are self-contained and reliable. A brief overview of these properties follows.

    Transactions must be atomic (Atomicity), consistent (Consistency), isolated (Isolation), and durable (Durability). The first letters of these four properties form the acronym ACID. Although the acronym is easy to remember, the meaning of each word is not obvious. A brief description follows:

    Atomicity: atomicity ensures that either all updates are performed or none are. Because transactions guarantee atomicity, developers do not need to write code to handle the case where one update succeeds and another does not.

    Consistency: consistency means that the result of a transaction leaves the system in a consistent state. The data is in a valid state before the transaction starts and again when it ends. Consistency also requires that the database remain in a consistent state if part of the transaction fails: the other parts must be returned to their original state.

    Isolation: multiple users may access the same database at the same time. Isolation guarantees that changes made by a transaction cannot be seen from outside the transaction until it completes. Intermediate states, which would never exist if the transaction were aborted, cannot be accessed.

    Durability: durability means that a consistent state is guaranteed even if the system crashes. If the database system crashes, durability must ensure that transactions already committed have really been written to the database.
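    The account transfer described above can be sketched with Python's sqlite3 module. This is an illustration under assumptions, not the .NET or T-SQL code the text has in mind: the account table, the CHECK on the balance, and the helper names make_bank, transfer, and balances are all hypothetical.

```python
import sqlite3


def make_bank():
    # Autocommit mode, so that BEGIN / COMMIT / ROLLBACK are issued explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE account ("
        " name TEXT PRIMARY KEY NOT NULL,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO account VALUES ('A', 150), ('B', 0)")
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both steps happen or neither does."""
    conn.execute("BEGIN")
    try:
        # Step 1 deducts from src, step 2 adds to dst.
        for name, delta in ((src, -amount), (dst, amount)):
            cur = conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (delta, name),
            )
            if cur.rowcount != 1:
                raise LookupError("no such account: " + name)
        conn.execute("COMMIT")
        return True
    except (sqlite3.Error, LookupError):
        # Any failure undoes whatever earlier steps already changed.
        conn.execute("ROLLBACK")
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


conn = make_bank()
print(transfer(conn, "A", "B", 100), balances(conn))  # both steps succeed
print(transfer(conn, "A", "C", 10), balances(conn))   # step 2 fails, step 1 is undone
```

    In the second call the deduction from A has already been executed when the missing account C is detected; the rollback is what restores A's balance.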

2 Database transactions

    Transactions are used in many business applications because they bring robustness and predictability to the system. Generally, a software system uses a data source to store its data. To apply the concept of transactions in such a system, the data source must support transactions. Modern databases, such as Microsoft SQL Server 2005 and Oracle 9i, support transactions. For example, SQL Server 2005 provides T-SQL statements that support transactions, such as BEGIN TRANSACTION, SAVE TRANSACTION, COMMIT TRANSACTION, and ROLLBACK TRANSACTION. Data access APIs, such as ODBC, OLE DB, and ADO.NET, let developers use transactions in their applications. As long as a single database is used, the RDBMS and the data access API usually provide the transaction support needed. In large applications that involve multiple databases, the Microsoft Distributed Transaction Coordinator (MSDTC) may be needed. COM+ is a popular piece of middleware that uses MSDTC internally to help implement transactions across multiple databases, and even between different transaction-aware entities, commonly known as resource managers. Note that in .NET 2.0 the System.Transactions namespace can be used to set up distributed transactions in place of System.EnterpriseServices.

    Transactions are divided into two types, local and distributed. (1) Local transaction: a transaction that uses a single transaction-aware data source (for example, SQL Server). When a single database stores all the data involved in the transaction, it can enforce the ACID rules by itself. This means that on a single database server (such as SQL Server), local transactions can be used across databases as long as the same connection is used. (2) Distributed transaction: a transaction that spans several transaction-aware data sources. A distributed transaction might need to read messages from a message queue server, retrieve data from a SQL Server database, and write messages to other databases.

    Some software packages (such as MSDTC) implement distributed transactions programmatically. By using methods such as two-phase commit and rollback, they control commit or rollback behavior across all data sources and so ensure integrity. MSDTC can only be used with applications that are compatible with its transaction management interface. Applications currently available (known as resource managers) include MSMQ, SQL Server, Oracle, and Sybase.

    In a distributed transaction environment, the resource managers need to implement a reliable commit protocol, the most common of which is two-phase commit. In two-phase commit the actual commit work is divided into two phases. The first phase prepares the changes required for the commit: each resource manager communicates with the transaction coordinator and informs it that it is prepared to perform the commit, without actually committing. In the second phase, once all resource managers have told the transaction coordinator that preparation succeeded, the coordinator tells all participants to go ahead, and the changes are then performed. With two-phase commit, one or more databases can take part in a distributed transaction. In fact, any object enlisted in an MSDTC transaction can take part in a distributed transaction managed by MSDTC. For example, MSMQ can take part in a transaction together with two SqlConnection objects connected to two different databases.
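    The two phases can be illustrated with a toy simulation in plain Python. It is only a sketch of the voting logic: the ResourceManager class and the two_phase_commit function are hypothetical and do not model MSDTC's actual interfaces, logging, or failure recovery.

```python
class ResourceManager:
    """Toy stand-in for a database or queue enlisted in a distributed transaction."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "active"

    def prepare(self):
        # Phase 1: report readiness without making the change permanent.
        if self.can_prepare:
            self.state = "prepared"
        return self.can_prepare

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def two_phase_commit(participants):
    """Act as the transaction coordinator for the given resource managers."""
    # Phase 1: ask every participant to prepare and collect the votes.
    votes = [rm.prepare() for rm in participants]
    # Phase 2: commit only on a unanimous yes; otherwise undo everywhere.
    if all(votes):
        for rm in participants:
            rm.commit()
        return True
    for rm in participants:
        rm.rollback()
    return False


ok = [ResourceManager("queue"), ResourceManager("db1"), ResourceManager("db2")]
print(two_phase_commit(ok), [rm.state for rm in ok])

bad = [ResourceManager("db1"), ResourceManager("db2", can_prepare=False)]
print(two_phase_commit(bad), [rm.state for rm in bad])
```

    A single participant that cannot prepare is enough to roll back every other participant, including those that had already reported that they were ready.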


The concept of the stored procedure
       SQL Server provides a way to group a set of fixed operations together and have the SQL Server database server carry them out to accomplish a task. This mechanism is the stored procedure.
       A stored procedure is a precompiled collection of SQL statements and optional control-of-flow statements that is stored in the database and can be executed by an application through a single call. It allows user-declared variables, conditional execution, and other powerful programming features.
       Stored procedures in SQL Server fall into two categories: system-provided stored procedures and user-defined stored procedures.

Stored procedures can be used for any purpose for which SQL statements are used, and they have the following advantages:
       A series of SQL statements can be executed in a single stored procedure.
       Other stored procedures can be called from within a stored procedure, which can simplify a series of complex statements.
       A stored procedure is compiled on the server when it is created, so it executes faster than individual SQL statements and also reduces network traffic.
       Higher security.
Creating a stored procedure

       In SQL Server, a stored procedure can be created in three ways:
         Using the Create Stored Procedure wizard.
         Using SQL Server Enterprise Manager.
         Using the CREATE PROCEDURE command in a Transact-SQL statement.

The following describes creating a stored procedure with the Transact-SQL CREATE PROCEDURE command.
    When creating a stored procedure, consider the following:
     The CREATE PROCEDURE statement cannot be combined with other SQL statements in a single batch.
     Stored procedures can be nested, to a maximum depth of 32 levels.
     Permission to create stored procedures defaults to the database owner, who can grant it to other users.
     A stored procedure is a database object, and its name must follow the rules for identifiers.
     A stored procedure can only be created in the current database.
     The maximum size of a stored procedure is 128 MB.

What is a temporary table

 A temporary table is similar to a permanent table, but it is created in tempdb. It disappears only when the database connection ends or when it is dropped with the SQL DROP command; otherwise it continues to exist. Creating a temporary table generates entries in the SQL Server system log. Although temporary tables live in tempdb and are allocated in memory, they are also backed by physical disk, but users cannot see the files on the specified disk.

  Temporary tables are divided into local and global. The name of a local temporary table is prefixed with "#"; it is visible only in the current user connection and is deleted when the user disconnects from the instance. The name of a global temporary table is prefixed with "##"; once created it is visible to all users and is deleted when all users referencing the table have disconnected.

Comparison of temporary tables and table variables: both can be used with SQL SELECT, INSERT, UPDATE, and DELETE statements. They differ mainly in the following ways:

  1) Table variables are stored in memory, and SQL Server does not generate log records when a user accesses a table variable, whereas it does for temporary tables;

  2) Non-clustered indexes are not allowed on table variables;

  3) Table variables do not allow DEFAULT values, nor do they allow constraints;

  4) Statistics on temporary tables are sound and reliable, whereas statistics on table variables are not;

  5) Temporary tables have a locking mechanism, whereas table variables do not.

What is a table variable

 The syntax for creating a table variable is similar to that for a temporary table; the difference is that a name must be given at creation time. A table variable is a kind of variable. Variables are divided into local and global: the name of a local variable is prefixed with "@" and can be accessed only in the current user connection. The name of a global variable is prefixed with "@@"; these are generally system global variables, such as the commonly used @@Error, which holds the error number, and @@RowCount, which holds the number of rows affected.

 When choosing between a temporary table and a table variable, in most cases either will do, but the following guidelines help in choosing the appropriate one:

  1) The main consideration with table variables is the pressure the application puts on memory. If many instances of the code run concurrently, pay particular attention to the memory consumed by in-memory variables. Table variables are recommended for small data sets or computed results. If the result set is relatively large but is used only for temporary calculation in the code and is selected without grouping or aggregation, a table variable can also be considered.

  2) For large data sets, or where statistics are needed to optimize queries, a temporary table is generally recommended; indexes can also be created on it. Because temporary tables are stored in tempdb, whose default space allocation is generally small, tempdb needs to be tuned and its storage space increased.

Supplement:
      1. Table variables are not recorded in the transaction log. They therefore fall outside the scope of the transaction mechanism.

      2. Any stored procedure that uses a temporary table is not precompiled, whereas the execution plan of a stored procedure that uses table variables can be statically compiled in advance. The main advantage of precompiling a script is faster execution. The benefit is more significant for long stored procedures, because recompiling them is costly.

      3. Table variables exist only within the same scope in which variables can exist. Unlike temporary tables, they are not visible in inner stored procedures or in exec(string) statements. They also cannot be used in an INSERT/EXEC statement.


Distinguishing functions from stored procedures

1. Functions and procedures are both collections of subroutines; a function always has a return value, whereas a procedure may have none.

2. User-defined functions cannot be used to perform operations that modify the global state of the database.
3. User-defined functions are especially convenient for processing the fields of the same data row. A stored procedure could achieve the same query, but clearly not as conveniently as a function. Moreover, a stored procedure cannot be applied to each field of the same row within a SELECT query, because a stored procedure does not return a value that can be used in an expression and can only be called on its own, whereas a function can be placed anywhere an expression can appear.

Specific differences between stored procedures and functions

     Stored procedures: make managing a database and displaying information about the database and its users much easier. A stored procedure is a precompiled collection of SQL statements and optional control-of-flow statements, stored under a name and processed as a unit. Stored procedures are stored in the database and can be executed by an application through a single call; they allow user-declared variables, conditional execution, and other powerful programming features. A stored procedure can contain program flow, logic, and queries against the database. It can accept parameters, output parameters, return one or more result sets, and return a value.

    Stored procedures can be used for any purpose for which SQL statements are used, and they have the following advantages:

    (1) Powerful, with few restrictions.

    (2) A series of SQL statements can be executed in a single stored procedure.

    (3) Other stored procedures can be called from within a stored procedure, which can simplify a series of complex statements.

    (4) A stored procedure is compiled on the server when it is created, so it executes faster than individual SQL statements.

    (5) It can have multiple return values, that is, multiple output parameters, and can use SELECT to return a result set.

     Functions: a function is a subroutine made up of one or more SQL statements that can be used to encapsulate code for reuse. User-defined functions have many restrictions: many statements cannot be used in them and many tasks cannot be accomplished with them. A function can return a value directly or, by using a table variable, return a record set. However, user-defined functions cannot be used to perform operations that modify the global state of the database.
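The remark that a function can be placed anywhere an expression can appear can be illustrated with a small sketch. SQLite has no stored procedures or T-SQL user-defined functions, so the sketch assumes Python's sqlite3 module and registers an application-defined scalar function instead; the function pass_or_fail, the result table, and the pass mark of 60 are all hypothetical.

```python
import sqlite3


def pass_or_fail(score):
    """Scalar function: one value in, one value out, no change to the database."""
    return "pass" if score >= 60 else "fail"


def make_results():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    # Register the Python function so that SQL statements can call it by name.
    conn.create_function("pass_or_fail", 1, pass_or_fail)
    conn.execute("CREATE TABLE result (name TEXT, score INTEGER)")
    conn.execute("INSERT INTO result VALUES ('Li', 85), ('Wang', 42)")
    return conn


def graded(conn):
    # The function is applied to a field of every row inside the SELECT list.
    return conn.execute(
        "SELECT name, pass_or_fail(score) FROM result ORDER BY name"
    ).fetchall()


def passing(conn):
    # The same function also works inside a WHERE clause.
    rows = conn.execute(
        "SELECT name FROM result WHERE pass_or_fail(score) = 'pass'"
    )
    return [row[0] for row in rows]


conn = make_results()
print(graded(conn))
print(passing(conn))
```

A stored procedure, by contrast, would be invoked as a statement of its own and could not appear inside the SELECT list or the WHERE clause in this way.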

 

The key requirement of a data warehouse is to extract data from databases accurately, safely, and reliably, process it into regular, structured information, and then supply it to managers for analysis and use.

Data mining is the process of extracting implicit, previously unknown, and potentially useful information and knowledge from large volumes of incomplete, noisy, fuzzy, and random data.

Online analytical processing (OLAP, On-Line Analytical Processing) is a class of software technology that enables analysts, managers, or executives to gain insight into data through fast, consistent, interactive access to information that has been transformed from the raw data, can be truly understood by users, and reflects the real characteristics of the enterprise from multiple points of view.

What are the database optimization schemes

1. Use secondary data files

2. Configure automatic file growth (for large data volumes; not needed for small ones)

3. Store data files and log files separately on different disks

4. Partition tables (coarse partitioning first, then precise data partitioning)

5. Design a distributed database

6. Defragment the database

7. Design normalized tables to eliminate data redundancy

8. Add appropriate redundancy, such as computed columns

9. Use indexes

10. Define the necessary primary keys and foreign keys

11. Make appropriate use of stored procedures, views, and functions

12. Split tables to reduce table size

13. The legendary "three principles"

14. Field design principles

15. Database performance optimization, part three: program optimization

The three normal forms of database design

First normal form (1NF) is the most basic normal form. A table satisfies first normal form if all field values in the table are atomic values that cannot be decomposed further.

Second normal form (2NF) goes one step further than first normal form. It requires that every column in a table depend on the whole primary key, not on just part of it (this mainly concerns composite primary keys).

     Third normal form (3NF) requires that every column in a table depend directly on the primary key, not indirectly.
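As an illustration of second normal form, consider a hypothetical enrollment table keyed by (student_no, course_no). A student_name column in that table would depend only on student_no, that is, on part of the composite key, so it belongs in a separate student table. The sketch below assumes Python's sqlite3 module; all table and column names are invented for the example.

```python
import sqlite3


def make_normalized():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript(
        """
        -- student_name depends on student_no alone, so it lives here.
        CREATE TABLE student (
            student_no   TEXT PRIMARY KEY NOT NULL,
            student_name TEXT NOT NULL
        );
        -- score depends on the whole composite key (student_no, course_no).
        CREATE TABLE enrollment (
            student_no TEXT NOT NULL REFERENCES student(student_no),
            course_no  TEXT NOT NULL,
            score      INTEGER,
            PRIMARY KEY (student_no, course_no)
        );
        INSERT INTO student VALUES ('S001', 'Li');
        INSERT INTO enrollment VALUES ('S001', 'C01', 85), ('S001', 'C02', 92);
        """
    )
    return conn


def names_per_enrollment(conn):
    """Join the two tables back together, one row per enrollment."""
    return conn.execute(
        "SELECT e.course_no, s.student_name"
        " FROM enrollment AS e JOIN student AS s ON s.student_no = e.student_no"
        " ORDER BY e.course_no"
    ).fetchall()


conn = make_normalized()
# The name is stored once, so one UPDATE corrects it for every enrollment.
conn.execute("UPDATE student SET student_name = 'Lee' WHERE student_no = 'S001'")
print(names_per_enrollment(conn))
```

Had the name been repeated in every enrollment row, the same correction would have had to touch each row, and missing one would leave the data inconsistent.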

Character data types

Character data types are the most frequently used data types. They can store letters, digits, and special symbols. In general, character data must be enclosed in single quotes (') or double quotes (").

1, CHAR

The CHAR data type is defined in the form CHAR[(n)]. Each character or symbol stored in a CHAR column occupies one byte of storage. n is the storage size for all the characters; its value ranges from 1 to 8000, so up to 8000 ANSI characters can be stored. If n is not specified, the system default is 1. If the input data has fewer than n characters, the system pads it with trailing spaces to fill the defined space. If the input data is too long, the excess is truncated.

2, NCHAR

The NCHAR data type is defined in the form NCHAR[(n)]. It is similar to CHAR, except that n ranges from 1 to 4000, because NCHAR uses the Unicode standard character set. The Unicode standard specifies that each character occupies two bytes of storage, so NCHAR takes twice the storage space of the non-Unicode data types. The advantage of the Unicode standard is that, with two bytes per storage unit, the capacity of each unit is greatly increased and all the world's languages can be covered: Chinese, English, French, German, and other text can appear in the same data column without encoding conflicts.

3, VARCHAR

The VARCHAR data type is defined in the form VARCHAR[(n)]. It is similar to CHAR: n ranges from 1 to 8000, and input data that is too long is truncated. The difference is that VARCHAR is variable-length: the storage length of a VARCHAR value is the actual length of the value, so if the input data has fewer than n characters the system does not pad it with trailing spaces.

In general, because CHAR has a fixed length, it is processed faster than VARCHAR.

4, NVARCHAR

The NVARCHAR data type is defined in the form NVARCHAR[(n)]. It is similar to VARCHAR, except that NVARCHAR uses the Unicode standard character set and n ranges from 1 to 4000.


An index is a structure that sorts the values of one or more columns in a database table; indexes allow fast access to specific information in the table.

Indexes are divided into clustered indexes and non-clustered indexes. A clustered index orders the data according to its physical storage location, whereas a non-clustered index does not; a clustered index speeds up multi-row retrieval, whereas a non-clustered index is fast for single-row lookups.

Creating indexes can greatly improve system performance.

First, creating a unique index guarantees the uniqueness of each row of data in a table.

Second, indexes greatly speed up data retrieval, which is the main reason for creating them.

Third, indexes speed up joins between tables, which is particularly significant for enforcing referential integrity.

Fourth, when grouping and sorting clauses are used in data retrieval, indexes significantly reduce the time spent on grouping and sorting.

Fifth, with indexes the query optimizer can be exploited during query processing to improve system performance.

Adding indexes also has a number of disadvantages.

First, creating and maintaining indexes takes time, and this time increases as the amount of data grows.

Second, indexes take up physical space: in addition to the space occupied by the table data, every index occupies a certain amount of physical space, and a clustered index requires even more.

Third, when data in the table is inserted, deleted, or modified, the indexes must be maintained dynamically as well, which slows down data maintenance.
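The retrieval and uniqueness benefits listed above can be observed in a small sketch. It assumes Python's sqlite3 module, whose EXPLAIN QUERY PLAN statement reports whether a query scans the whole table or searches an index; the wording of that report varies between SQLite versions, and the clustered versus non-clustered distinction of SQL Server is not modeled. The employee table and the index names are hypothetical.

```python
import sqlite3

QUERY = "SELECT * FROM employee WHERE name = 'emp500'"


def make_table():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO employee (name, email) VALUES (?, ?)",
        [("emp%d" % i, "emp%d@example.com" % i) for i in range(1000)],
    )
    return conn


def query_plan(conn, sql):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # the last column holds the text


def insert_ok(conn, name, email):
    try:
        conn.execute("INSERT INTO employee (name, email) VALUES (?, ?)", (name, email))
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_table()
print(query_plan(conn, QUERY))  # without an index: a scan of the whole table

conn.execute("CREATE INDEX idx_employee_name ON employee (name)")
print(query_plan(conn, QUERY))  # with the index: a search via idx_employee_name

# A unique index also enforces uniqueness of the indexed column.
conn.execute("CREATE UNIQUE INDEX idx_employee_email ON employee (email)")
print(insert_ok(conn, "duplicate", "emp1@example.com"))  # rejected
```

The cost side is visible too: every INSERT after the two CREATE INDEX statements has to update both indexes in addition to the table itself.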

The difference between system functions and custom functions

System functions such as the aggregate functions are used to compute statistics over table data, for example SUM, COUNT, and AVG.
    Custom functions are written by developers to meet business needs and can be used in a variety of application scenarios.

UML diagrams: static diagrams and dynamic diagrams

    Static diagrams: use case diagram, class diagram, object diagram, component diagram, deployment diagram.

    Dynamic diagrams: sequence diagram, collaboration diagram, state diagram, activity diagram.

     UML is a modeling language, not a development tool.

Posted by Wright at November 19, 2013 - 5:03 AM