

Document Type Definition


A DTD defines the structure of the content of an XML document and allows you to store data in a consistent format. It specifies the elements that can be present in an XML document, the attributes of these elements, and their arrangement in relation to each other. It also allows you to specify whether an element or an attribute is optional or mandatory.

The DTD is responsible for performing three primary tasks:

1. Specifying the document's root element (for example, html is the root element of HTML documents).
2. Defining elements, attributes, and entities specific to the document.
3. Identifying an external DTD for the document.

Creating a DTD is similar to creating a table in a database. In a DTD you specify the structure of data by declaring elements to represent the data, which is similar to creating columns in a table. You can also specify whether providing a value for an element is mandatory or optional. You can then store the data in an XML document that conforms to the DTD for an application, which is similar to adding records to a table.

XML allows you to create your own DTDs for applications. This gives you complete control over the process of checking the content and structure of XML documents created for an application. This checking process is called validation. XML documents that conform to a DTD are considered valid documents.

A DTD allows you to specify the structure and type of a data element. The data stored in XML documents is enclosed within elements that describe the data. Therefore, to store data in a consistent format, you need to determine what information needs to be represented by elements.

Declaring Elements in a DTD

To store structured data in an XML document, you need to declare elements in a DTD. An XML document can then be checked against the DTD. In a DTD, elements are declared using the following syntax:

<!ELEMENT elementname (content-type or content-model)>

In this syntax, elementname specifies the name of an element.
Content-type or content-model specifies whether the element contains textual data or other elements.

While declaring elements or attributes, you must follow some naming conventions. These conventions include the following:

A name must start with a letter or an underscore. A letter may be in uppercase or lowercase.
One or more letters, digits, hyphens, underscores, or full stops can follow the initial character.
Spaces and tabs are not allowed in element names.
Apart from the underscore, only two punctuation signs, hyphen (-) and period (.), are allowed in element names. (The colon is also legal in XML names, but it is reserved for namespaces.)
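The naming conventions above can be turned into a quick check. The following is a minimal Python sketch, assuming ASCII-only names; the function name is_valid_name and the pattern are illustrative and not part of any XML library, and the full XML rules also admit colons and many non-ASCII letters.

```python
import re

# Assumption: an ASCII-only simplification of the naming conventions above.
# First character: letter or underscore.
# Remaining characters: letters, digits, hyphens, underscores, or periods.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_valid_name(name: str) -> bool:
    """Return True if the whole string follows the simplified naming rules."""
    return NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["BOOK", "_title", "first-name", "2ndAuthor", "book title"]:
        print(candidate, is_valid_name(candidate))
```

Names such as BOOK, _title, and first-name pass this check, while 2ndAuthor (starts with a digit) and "book title" (contains a space) do not.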

An element can be of the following types:

Empty: Empty elements have no content and are marked up as <emptyelement/>.
Unrestricted: The opposite of an empty element is an unrestricted element, which can contain any element declared elsewhere in a DTD.
Container: Container elements can contain character data and other elements.

Declaring Empty Elements: An empty element can be declared by specifying the content type as EMPTY. Consider the following example:

<!ELEMENT emptyelement EMPTY>

In this example, the element emptyelement is declared and the content type is specified as EMPTY. In this case, emptyelement can contain attributes. However, it cannot contain textual content or other elements.

Declaring Unrestricted Elements: An unrestricted element can be declared by specifying the content type as ANY. Consider the following example:

<!ELEMENT anyelement ANY>

In this example, the element anyelement is declared and its content type is specified as ANY. In this case, anyelement can contain any type of data, including other elements that are declared elsewhere in a DTD.

Declaring Container Elements: Using element declarations in a DTD, you can specify which other elements are allowed inside an element, how often they may appear, and in what order. You do this by specifying the element content model. Consider the following structure:

<BOOK>
<TITLE> LET US C </TITLE>
<AUTHOR> YASHWANT KANETKAR </AUTHOR>
</BOOK>

For this XML document to be valid, you need to create a DTD that contains declarations of three elements: BOOK, TITLE, and AUTHOR. In addition, you also need to decide whether TITLE and AUTHOR are mandatory or optional, whether they can be in any order or have to be in a specific order, and the number of times they can appear in an XML document. You can write element declarations for these decisions. For example, if both TITLE and AUTHOR have to be specified and TITLE should be followed by AUTHOR, the DTD would be written as:

<!ELEMENT BOOK (TITLE, AUTHOR)> <!-- Element content -->
<!ELEMENT TITLE (#PCDATA)> <!-- Character content -->
<!ELEMENT AUTHOR (#PCDATA)> <!-- Character content -->

In this code, the BOOK element is declared with TITLE and AUTHOR as child elements. The TITLE and AUTHOR elements have the content type PCDATA (Parsed Character Data). PCDATA is used to represent character content. PCDATA is prefixed with a hash (#) symbol so that it is not confused with a normal element name.

In a DTD, different symbols are used to specify whether an element is mandatory or optional and whether it can occur more than once.
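One way to see what a content model such as (TITLE, AUTHOR) means is to treat it as a pattern over the sequence of child element names. The following is a rough Python sketch, not how a real validating parser is implemented: the helper names model_to_regex and children_match are hypothetical, and the translation only handles element-only content models (no #PCDATA, EMPTY, or ANY).

```python
import re


def model_to_regex(model: str) -> "re.Pattern[str]":
    """Translate an element-only DTD content model into a regular expression.

    Each child is encoded as its name followed by ';', so the children
    TITLE, AUTHOR become the string 'TITLE;AUTHOR;'.
    """
    compact = re.sub(r"\s+", "", model)
    # Replace every element name with a group that matches 'NAME;'.
    body = re.sub(
        r"[A-Za-z_][A-Za-z0-9_.\-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + ";)",
        compact,
    )
    # A DTD comma means "followed by", which is plain concatenation in a regex.
    # The symbols |, ?, * and + carry the same meaning in both notations.
    body = body.replace(",", "")
    return re.compile(body)


def children_match(model: str, children: list) -> bool:
    """Return True if the list of child element names satisfies the model."""
    encoded = "".join(name + ";" for name in children)
    return model_to_regex(model).fullmatch(encoded) is not None


if __name__ == "__main__":
    print(children_match("(TITLE, AUTHOR)", ["TITLE", "AUTHOR"]))
    print(children_match("(TITLE, AUTHOR)", ["AUTHOR", "TITLE"]))
    print(children_match("(TITLE, AUTHOR+)", ["TITLE", "AUTHOR", "AUTHOR"]))
```

With the BOOK declaration above, a BOOK whose children are TITLE then AUTHOR matches the model, while AUTHOR followed by TITLE, or a TITLE alone, does not.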

The table given below lists the various symbols used while specifying the element content in a DTD.

Symbol   Meaning                                  Example          Description
,        and                                      TITLE,AUTHOR     TITLE and AUTHOR, in that order
|        or                                       TITLE|AUTHOR     TITLE or AUTHOR
?        optional; can occur only once within     AUTHOR?          AUTHOR need not be present, but if it is
         the parent element                                        present, it can occur only once
*        zero or multiple occurrences of the      (TITLE|AUTHOR)*  Any number of TITLE or AUTHOR elements
         element                                                   can be present, in any order
+        at least one occurrence of the element;  AUTHOR+          At least one AUTHOR element must be present;
         can have multiple occurrences within                      multiple AUTHOR elements are allowed
         the parent element

Declaring Attributes

In addition to declaring elements, you can also declare attributes in a DTD. These declarations are used during the process of validation to check the structure of an XML document. The syntax for declaring attributes in a DTD is:

<!ATTLIST elementname attributename valuetype [attributetype] [default]>

The attributename valuetype [attributetype] [default] section is repeated as often as necessary to create multiple attributes for an element. Each attribute declaration must include the attribute name and a value type. To assign values to an attribute, you must know the different types of values that can be assigned to attributes. The following table shows various value types that can be specified for an attribute in a DTD.

Table: Value Types used in a DTD

Value Type     Description
CDATA          Used to represent plain text values
ID             Used to assign a unique value to each element in the document; must begin with an alphabetic character
(enumerated)   Used to assign a specific range of values; values are specified within parentheses
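To see how a parser reads attribute declarations, the sketch below feeds a small document with an internal DTD to Python's built-in expat parser and records each declared attribute through its AttlistDeclHandler callback. The PRODUCT element, the NOTE attribute, and the helper name list_attribute_declarations are made up for illustration. Note that expat is a nonvalidating parser: it reports the declarations but does not enforce them.

```python
import xml.parsers.expat

# Hypothetical sample document; the DTD is embedded in the DOCTYPE declaration.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE PRODUCT [
<!ELEMENT PRODUCT (#PCDATA)>
<!ATTLIST PRODUCT
    PRODID ID    #REQUIRED
    NOTE   CDATA #IMPLIED>
]>
<PRODUCT PRODID="P001">Toy</PRODUCT>
"""


def list_attribute_declarations(xml_text: str) -> list:
    """Collect (element, attribute, value type, default, required) tuples."""
    declarations = []

    def on_attlist(elname, attname, type_, default, required):
        # expat calls this once for every attribute in an ATTLIST declaration.
        declarations.append((elname, attname, type_, default, bool(required)))

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = on_attlist
    parser.Parse(xml_text, True)
    return declarations


if __name__ == "__main__":
    for declaration in list_attribute_declarations(DOCUMENT):
        print(declaration)
```

Each ATTLIST entry yields one tuple, so a declaration listing two attributes for PRODUCT produces two records, one per attribute.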

In addition to specifying the value type of an attribute, you also need to specify whether the attribute is optional or mandatory. You can do so by setting the attribute type in a DTD. The attribute types are displayed in the following table.

Table: Attribute Types used in a DTD

Attribute Type   Description
REQUIRED         If the attribute of an element is specified as #REQUIRED, then the value for that attribute must be specified each time the element is used in an XML document. If the value for the REQUIRED attribute is not specified, the XML document will be invalid.
FIXED            If the attribute of an element is specified as #FIXED, then the value of the attribute cannot be changed in an XML document.
IMPLIED          If the attribute of an element is specified as #IMPLIED, then the attribute is optional. In other words, an IMPLIED attribute need not be used each time its associated element is used. An IMPLIED attribute can take text strings as its values.

Consider the example:

<!ATTLIST PRODUCT PRODID ID #REQUIRED>

An attribute called PRODID is declared for the PRODUCT element. The value type for this attribute is set to ID, which indicates that the value of PRODID is unique for each appearance of the PRODUCT element in an XML document. The attribute type #REQUIRED indicates that PRODID must be specified every time PRODUCT is used.

Types of DTDs

A DTD can be classified into two types:

1. Internal DTD
2. External DTD

The following table shows the difference between internal and external DTDs.

Internal DTD                                  External DTD
Is a part of an XML document.                 Is maintained as a separate file. A reference to this file is included in an XML document.
Can be used only by the document in which     Can be used across multiple documents.
it is created and cannot be used across
multiple documents.

Validating the Structure of Data

To validate the structure of the data stored in an XML document against a DTD, you need to use parsers. Parsers are software programs that check the syntax used in an XML file. There are two types of parsers:

1. Nonvalidating parsers
2. Validating parsers

Nonvalidating Parsers

A nonvalidating parser checks whether a document follows the XML syntax rules. It builds a tree structure from the tags used in an XML document and returns an error only when there is a problem with the syntax of the document. Nonvalidating parsers process a document faster than validating parsers because they do not have to check every element against a DTD. In other words, these parsers check whether an XML document adheres to the rules of well-formed documents.

Validating Parsers

A validating parser checks the syntax of the elements, builds the tree structure of an XML document, and compares the structure of the XML document with the structure specified in the DTD associated with the document. In other words, in addition to checking whether an XML document is well formed, validating parsers also check whether the XML document adheres to the rules in the DTD used by the XML document.

Example for DTD

Consider the following XML document, which does not yet have a DTD:

<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
<GREETING> Hello From XML </GREETING>
<MESSAGE> Welcome to the world of XML. </MESSAGE>
</DOCUMENT>

Most XML browsers will check your document to see whether it is well formed. Some of them can also check whether it is valid. An XML document is valid if there is a Document Type Definition (DTD) associated with it and if the document complies with that DTD. A document's DTD specifies the correct syntax of the document. DTDs can be stored in a separate file or in the document itself, using the <!DOCTYPE> declaration.

With an internal DTD added, the greeting document becomes:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="greeting.css"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (GREETING, MESSAGE)>
<!ELEMENT GREETING (#PCDATA)>
<!ELEMENT MESSAGE (#PCDATA)>
]>
<DOCUMENT>
<GREETING> Hello From XML </GREETING>
<MESSAGE> Welcome to the world of XML. </MESSAGE>
</DOCUMENT>

The DTD indicates that you can have <GREETING> and <MESSAGE> elements inside a <DOCUMENT> element, that the <DOCUMENT> element is the root element, and that the <GREETING> and <MESSAGE> elements can hold text.
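The difference between well-formed and valid can be tried out with Python's standard library, whose xml.etree.ElementTree module uses a nonvalidating parser. The sketch below is only an illustration: the variable names and the helpers is_well_formed and child_names are hypothetical, and the stylesheet line is left out for brevity. It parses the greeting document, a variant that breaks the DTD by omitting GREETING, and a variant with a missing end tag.

```python
import xml.etree.ElementTree as ET

PROLOG = b'<?xml version="1.0" encoding="UTF-8"?>\n'

# Internal DTD taken from the greeting example.
DTD = b"""<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (GREETING, MESSAGE)>
<!ELEMENT GREETING (#PCDATA)>
<!ELEMENT MESSAGE (#PCDATA)>
]>
"""

# Well formed and valid: matches the DTD.
VALID = PROLOG + DTD + (
    b"<DOCUMENT><GREETING>Hello From XML</GREETING>"
    b"<MESSAGE>Welcome to the world of XML.</MESSAGE></DOCUMENT>"
)

# Well formed but not valid: GREETING is missing.
INVALID = PROLOG + DTD + (
    b"<DOCUMENT><MESSAGE>Welcome to the world of XML.</MESSAGE></DOCUMENT>"
)

# Not well formed: GREETING is never closed.
MALFORMED = PROLOG + DTD + b"<DOCUMENT><GREETING>Hello From XML</DOCUMENT>"


def is_well_formed(data: bytes) -> bool:
    """Return True if the nonvalidating parser accepts the document."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


def child_names(data: bytes) -> list:
    """Return the tag names of the root element's children, in order."""
    return [child.tag for child in ET.fromstring(data)]


if __name__ == "__main__":
    print(is_well_formed(VALID), is_well_formed(INVALID), is_well_formed(MALFORMED))
    print(child_names(VALID))
```

Because this parser is nonvalidating, it accepts the document that omits GREETING even though the DTD requires it, and it rejects only the document with the syntax error. Catching the missing GREETING requires a validating parser.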
