XML Study Notes 2 -- DTD

  The previous note classified XML documents, according to whether they use and comply with a DTD or Schema, into well-formed XML and valid XML. So what are DTD and Schema? Both are used to regulate an XML document by placing constraints on its structure and content; DTD is simple to use, while Schema is more powerful. In this note we learn about DTD (Document Type Definition).

1, How to use a DTD in an XML document

A DTD can be brought into an XML document in three ways. Each is shown below with its syntax, a description, and an example.

Internal DTD

<!DOCTYPE root-element-name [
     element declarations
]>

The DTD is defined inside the XML document itself, immediately after the XML declaration and any processing instructions.

For example:

<!DOCTYPE model-list [

     <!ELEMENT model-list (model)*>

]>
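
As a minimal sketch of how an internal DTD sits in a complete document, the following assumes a hypothetical model-list root whose model children hold plain text. The element names and values are illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The internal DTD follows the XML declaration and precedes the root element -->
<!DOCTYPE model-list [
  <!ELEMENT model-list (model)*>
  <!ELEMENT model (#PCDATA)>
]>
<model-list>
  <model>A100</model>
  <model>B200</model>
</model-list>
```

The name after DOCTYPE must match the root element of the document, here model-list.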

External DTD

<!DOCTYPE root-element-name SYSTEM "DTD URI">

The DTD is defined in a separate file and imported with the keyword SYSTEM.

For example: <!DOCTYPE model-list SYSTEM "relative or absolute path of the DTD file">
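
A sketch of the external form, assuming a hypothetical file named model-list.dtd stored next to the XML document. The file name and its contents are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- model-list.dtd (hypothetical, same directory) is assumed to contain:
       <!ELEMENT model-list (model)*>
       <!ELEMENT model (#PCDATA)>
-->
<!DOCTYPE model-list SYSTEM "model-list.dtd">
<model-list>
  <model>A100</model>
</model-list>
```

Keeping the DTD in its own file lets several documents share the same constraints.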

Public DTD

<!DOCTYPE root-element-name PUBLIC "DTD name" "public DTD URI">

A public DTD is specified by an authority for use by a particular industry or by the public, and is imported with the keyword PUBLIC.

For example: <!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN" "http://java.sun.com/dtd/web-app_2_3.dtd">

2, The structure of a DTD

A DTD document is not itself an XML document; it defines the constraints that an XML document must satisfy. The syntax of a DTD document is very simple and generally has the following structure:

(1) A first line holding the DTD document's declaration, which uses the same syntax as the XML declaration

(2) 0 or more comments; DTD comments use the same syntax as XML comments

(3) 0 or more <!ELEMENT ...> definitions, each of which defines one XML element

(4) 0 or more <!ATTLIST ...> definitions, each of which defines attributes for an XML element

(5) 0 or more <!ENTITY ...> definitions, each of which defines one entity

(6) 0 or more <!NOTATION ...> definitions, each of which defines one notation

The four kinds of definition, <!ELEMENT ...>, <!ATTLIST ...>, <!ENTITY ...> and <!NOTATION ...>, are entirely independent of each other and are never nested inside one another. Each is described below.
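
To make the structure concrete, here is a sketch of a small standalone DTD file containing each kind of definition once. All names (model-list, model, vendor, png) and values are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical DTD for a list of models -->
<!ELEMENT model-list (model)*>
<!ELEMENT model (#PCDATA)>
<!ATTLIST model id ID #REQUIRED>
<!ENTITY vendor "Example Corp">
<!NOTATION png SYSTEM "image/png">
```

Each definition stands alone at the top level of the file; none of them contains another.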

3, Define elements

(1) An element definition is formally called an Element Type Definition, abbreviated ETD

(2) Element types

Each entry below gives the element type, its definition format, and a description.

Any type: <!ELEMENT element-name ANY>
The element's content can be a string, can be empty, and can also contain child elements.

String value: <!ELEMENT element-name (#PCDATA)>
The element's content can only be a string (which may be empty); it cannot contain child elements.

Empty element: <!ELEMENT element-name EMPTY>
The element cannot have any content.

Contains child elements: (complex; see (3) below)
The order of the child elements and the number of times each may occur must be defined.

Mixed type: <!ELEMENT element-name (#PCDATA|child-element-1|child-element-2|...)*>
The content is limited to text and the listed child elements. This is a stronger constraint than the any type but still quite loose, so the mixed type should be used as sparingly as possible.
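
The following sketch puts three of these element types side by side, using hypothetical element names (record, title, separator, extra).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE record [
  <!ELEMENT record (title, separator, extra)>
  <!-- String value: text only -->
  <!ELEMENT title (#PCDATA)>
  <!-- Empty element: no content at all -->
  <!ELEMENT separator EMPTY>
  <!-- Any type: text and any declared elements -->
  <!ELEMENT extra ANY>
]>
<record>
  <title>A100</title>
  <separator/>
  <extra>free text <title>nested title</title></extra>
</record>
```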

When defining a mixed type, note the following points:

A: #PCDATA must be placed first

B: #PCDATA and the child elements can only be separated by a vertical bar (|), not by commas

C: The frequency modifiers ?, *, + cannot be applied to the individual child elements
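
A minimal sketch of a mixed type that follows the three points above, assuming hypothetical note, b and i elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!-- #PCDATA comes first, alternatives are separated by |, and only the whole group carries * -->
  <!ELEMENT note (#PCDATA|b|i)*>
  <!ELEMENT b (#PCDATA)>
  <!ELEMENT i (#PCDATA)>
]>
<note>Plain text, <b>bold text</b> and <i>italic text</i> can appear in any order.</note>
```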

(3) Definition of child elements

Syntax for defining child elements:

(child-element-1, child-element-2, ...)
Use commas to define ordered child elements.

(child-element-1|child-element-2|...)
Use a vertical bar to define mutually exclusive child elements.

((child-element-1, child-element-2) | (child-element-3, child-element-4))
Use parentheses to group child elements.

(child-element-1|child-element-2|...)+
Use a vertical bar together with a frequency modifier to define unordered child elements.

Frequency modifiers:

default (no modifier): must appear, and only once

?: 0 or 1 times

+: 1 or more times

*: 0 or more times
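
The sketch below combines ordering, grouping and frequency modifiers in one content model. The element names (model, name, alias, price, quote, tag) are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE model [
  <!-- name once, alias optional, then one or more price/quote in any order, then any number of tag -->
  <!ELEMENT model (name, alias?, (price|quote)+, tag*)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT alias (#PCDATA)>
  <!ELEMENT price (#PCDATA)>
  <!ELEMENT quote (#PCDATA)>
  <!ELEMENT tag (#PCDATA)>
]>
<model>
  <name>A100</name>
  <price>99</price>
  <tag>new</tag>
  <tag>sale</tag>
</model>
```

Here alias is omitted, which the ? modifier allows, while leaving out name would make the document invalid.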

4, Define attributes

In XML an attribute cannot exist on its own, so an attribute definition must name the element the attribute belongs to. The syntax for defining an attribute is as follows:

<!ATTLIST element-name attribute-name attribute-type [constraint on the element] [default value]>

(1) Attribute types

CDATA: the attribute value can only be string data.

(en1|en2|en3): the attribute value must be one of the enumerated values.

ID: the attribute value must be a valid identifier. It identifies the element, so it must be unique within the XML document.

IDREF: the attribute value must reference an existing value of an attribute of type ID.

IDREFS: the attribute value must be one or more references to existing values of type ID, separated by spaces.

NMTOKEN: the attribute value must be a valid XML name token. This is a stronger constraint than CDATA: the value can consist only of letters, digits, underscores, hyphens, dots and colons.

NMTOKENS: the attribute value must be one or more values of type NMTOKEN, separated by spaces.

ENTITY: the attribute value is the name of an external entity, such as a picture.

ENTITIES: the attribute value is one or more values of type ENTITY, separated by spaces.

NOTATION: the attribute value is a notation (NOTATION) declared in the DTD. This is regarded as an outdated feature; try to avoid using it.

xml: the attribute value is a predefined XML value.

(2) Constraints on the element and their relationship to default values

Not specified: a default value must be specified for the attribute.

#REQUIRED: the element must provide this attribute. A default value cannot be specified.

#IMPLIED: the attribute is optional. A default value cannot be specified.

#FIXED: the attribute value is fixed. If the element specifies the attribute, it must use the fixed value. A default value (the fixed value) must be specified.
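
As a sketch of how attribute types and constraints combine, the following declares four attributes on a hypothetical model element. The attribute names and values are assumptions for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE model-list [
  <!ELEMENT model-list (model)*>
  <!ELEMENT model (#PCDATA)>
  <!ATTLIST model
    id       ID                #REQUIRED
    replaces IDREF             #IMPLIED
    status   (active|retired)  "active"
    unit     CDATA             #FIXED "mm">
]>
<model-list>
  <!-- status and unit are omitted here, so their declared values apply -->
  <model id="m1">A100</model>
  <!-- replaces must match an existing ID; unit, if given, must be "mm" -->
  <model id="m2" replaces="m1" status="retired" unit="mm">B200</model>
</model-list>
```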

5, Define entities

An entity reference substitutes one string for another, similar to a macro in C. An earlier note mentioned the 5 entity references built into XML; here we look at how to define custom entities.

Each entry below gives the entity type, where it is used, its definition syntax, its reference syntax, and a description where one applies.

General entity (used in XML)
Definition: <!ENTITY entity-name "entity value">
Reference: &entity-name;

Parameter entity (used in DTD)
Definition: <!ENTITY % entity-name "entity value">
Reference: %entity-name;
Must be defined before it is used.

External entity (used in XML)
Definition: <!ENTITY entity-name SYSTEM "URI of the entity value file">
Reference: &entity-name;
The external file must have the structure of an XML text document.

Public external entity (used in XML)
Definition: <!ENTITY entity-name PUBLIC "public entity name" "URI of the entity value file">
Reference: &entity-name;

External parameter entity (used in DTD)
Definition: <!ENTITY % entity-name SYSTEM "URI of the entity value file">
Reference: %entity-name;

Public external parameter entity (used in DTD)
Definition: <!ENTITY % entity-name PUBLIC "public entity name" "URI of the entity value file">
Reference: %entity-name;

Unparsed entity (used in XML)
Definition: <!ENTITY entity-name SYSTEM "URI of the entity value file" NDATA notation-name>
Reference: through an attribute of type ENTITY.
Unlike the other types, an unparsed entity is not parsed by the XML processor; it has to be handled according to the named notation.

Public unparsed entity (used in XML)
Definition: <!ENTITY entity-name PUBLIC "public entity identifier" "URI of the entity value file" NDATA notation-name>
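
A sketch of a general entity and a parameter entity in use, with hypothetical names (vendor, model-decl). Note that inside an internal DTD a parameter entity reference may only stand where a whole declaration could appear, so the parameter entity here holds a complete declaration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE model-list [
  <!-- Parameter entity: defined before use, referenced only inside the DTD -->
  <!ENTITY % model-decl "<!ELEMENT model (#PCDATA)>">
  <!ELEMENT model-list (model)*>
  %model-decl;
  <!-- General entity: referenced from the document content -->
  <!ENTITY vendor "Example Corp">
]>
<model-list>
  <model>A100 by &vendor;</model>
</model-list>
```

After parsing, the text of the model element reads "A100 by Example Corp".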

6, Define notations

There are two syntax forms for defining a notation, the ordinary notation and the public notation:

Ordinary notation: <!NOTATION notation-name SYSTEM "notation value">

Public notation: <!NOTATION notation-name PUBLIC "public notation name" "notation value">

A notation value generally takes one of two forms:

(1) A MIME type: files of that MIME type are handled by the corresponding program

(2) The path of an external program: names the external program directly responsible for handling the external data in the XML document

Notations are usually used in the following ways:

(1) As shown above, a notation can be used to define an unparsed entity

(2) An unparsed entity defined this way can be used as the value of an attribute of type ENTITY or ENTITIES

(3) A notation can also be used as the value of an attribute of type NOTATION. The syntax for defining an attribute of type NOTATION is as follows:

<!ATTLIST element-name attribute-name NOTATION (notation-1|notation-2|...) constraint default-value>

Compared with an ordinary attribute definition, this adds a list of allowed notation values.
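
The sketch below ties these uses together: two notations, an unparsed entity that names one of them, and attributes of type ENTITY and NOTATION. The names (png, jpeg, photo, picture, format) and the file a100.png are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE model [
  <!-- Notations whose values are MIME types -->
  <!NOTATION png SYSTEM "image/png">
  <!NOTATION jpeg SYSTEM "image/jpeg">
  <!-- Unparsed entity: external data handled according to the png notation -->
  <!ENTITY photo SYSTEM "a100.png" NDATA png>
  <!ELEMENT model (#PCDATA)>
  <!ATTLIST model
    picture ENTITY               #IMPLIED
    format  NOTATION (png|jpeg)  #IMPLIED>
]>
<model picture="photo" format="png">A100</model>
```

The picture attribute refers to the unparsed entity by name, without the & and ; used for parsed entity references.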


Posted by Fitzgerald at November 15, 2013 - 1:10 PM