MOD-CO normative XSD representation

From Meta-Omics Data of Collection Objects

XML is commonly used as a format for exchanging data. The structure of an XML document is defined by an XML schema document (XSD). Using dedicated tools (e.g. XML Validator), a specific XML document can be checked for compliance with the normative XML schema representation cited in the XML header.

The MOD-CO normative XML schema representation (http://schema.mod-co.net/MOD-CO_1.0.xsd) was generated from the MOD-CO schema representation in DiversityDescriptions by applying the following method:

  • Top level element "Datasets"

Datasets.gif

The "Datasets" element includes at least one element "Dataset" of the dedicated type "DatasetType".
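
As a rough illustration of this rule, the following Python sketch checks only that the root element of a document is "Datasets" and that it has at least one "Dataset" child. It is not a substitute for validating against the XSD. The function names and the sample documents are hypothetical. Namespaces are ignored by comparing local names only, which is an assumption about how instance documents are written.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def has_valid_top_level(xml_text: str) -> bool:
    """Return True if the root is 'Datasets' with at least one 'Dataset' child."""
    root = ET.fromstring(xml_text)
    if local_name(root.tag) != "Datasets":
        return False
    return any(local_name(child.tag) == "Dataset" for child in root)


# Hypothetical minimal documents, for illustration only.
sample_ok = "<Datasets><Dataset/></Datasets>"
sample_empty = "<Datasets/>"
```

A full check against the normative schema would still require an XSD-aware validator, since this sketch does not look inside the "Dataset" elements.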

  • Dataset type

DatasetType1.gif DatasetType2.gif

The "DatasetType" includes as its first mandatory element "DatasetIdentity", with the "Title" and optional "Details" (see left screenshot above). The "DatasetIdentity" is followed by the concepts of the five main schema hierarchy level collections (Level 0 to Level 4). Their sequence is determined by the concept's data field "Display order".

The XML element's name is derived from the concept's name:

  1. The prefix "modco:" is omitted.
  2. The rest is converted into "CamelCase", i.e. the character following a blank or underscore is capitalized and the blank or underscore is removed.

Example: modco:unit_domain_elementary_category becomes UnitDomainElementaryCategory (see left screenshot above).
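
The two naming steps can be sketched in Python as follows. This is a minimal illustration, not the code used by DiversityDescriptions. The function name is hypothetical, and two details are assumptions: the very first character is capitalized as well (as the example suggests), and runs of consecutive blanks or underscores are treated as a single separator.

```python
import re

PREFIX = "modco:"


def concept_to_element_name(concept_name: str) -> str:
    """Derive an XML element name from a MOD-CO concept name."""
    # Step 1: omit the "modco:" prefix if it is present.
    if concept_name.startswith(PREFIX):
        concept_name = concept_name[len(PREFIX):]
    # Step 2: split at blanks and underscores (dropping them), then
    # capitalize the first character of each part and keep the rest as-is.
    parts = [part for part in re.split(r"[ _]+", concept_name) if part]
    return "".join(part[0].upper() + part[1:] for part in parts)


element_name = concept_to_element_name("modco:unit_domain_elementary_category")
```

With the concept name from the example, `element_name` is `UnitDomainElementaryCategory`.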

Depending on the concept's "Data type", the following XML elements are included:

Data type "float"

The element "Numeric" contains the numeric value and the optional element "Notes" holds additional text.

Data type "text"

Depending on the expected data, two cases are possible:

  1. The element "Text" contains the text value and the optional element "Notes" holds additional text.
  2. The element "Sequence" contains the nucleotide or protein sequence text and the optional element "Notes" holds additional text. In the corresponding concepts an item "Sequence type" is included.

Data type "enumerated"

The element "Entity" contains one of the text values specified in the referenced enumeration type (see below). The optional element "EntityValue" contains the string value characterizing the "Entity" field and the optional element "Notes" holds additional text. In general, an "EntityValue" is expected if the corresponding concept name is marked with the suffix "<category> <string>".

Examples of all the cases specified above can be found in the screenshot above, e.g. UnitDomainElementaryCategory, UnitIdentifierString, UnitDigitalNucleotideString and UnitDigitalSequenceLengthValue.
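
The mapping from data type to child elements described above can be sketched with Python's standard library. This is an illustration under assumptions, not the generator's actual code: the function name and its parameters are hypothetical, the child element order (value first, "Notes" last) is assumed, namespaces are omitted, and all values are placeholders. Using UnitDigitalSequenceLengthValue for the "float" case is a guess based on the name.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_concept_element(
    name: str,
    data_type: str,
    value: object,
    *,
    entity_value: Optional[str] = None,
    notes: Optional[str] = None,
    is_sequence: bool = False,
) -> ET.Element:
    """Build one concept element with the children for its data type."""
    element = ET.Element(name)
    if data_type == "float":
        ET.SubElement(element, "Numeric").text = str(value)
    elif data_type == "text":
        # "Sequence" for nucleotide or protein sequences, otherwise "Text".
        child_name = "Sequence" if is_sequence else "Text"
        ET.SubElement(element, child_name).text = str(value)
    elif data_type == "enumerated":
        ET.SubElement(element, "Entity").text = str(value)
        if entity_value is not None:
            ET.SubElement(element, "EntityValue").text = entity_value
    else:
        raise ValueError(f"unsupported data type: {data_type!r}")
    # "Notes" is optional for every data type.
    if notes is not None:
        ET.SubElement(element, "Notes").text = notes
    return element


# Placeholder value; the pairing of name and data type is assumed.
length = build_concept_element("UnitDigitalSequenceLengthValue", "float", 1500)
length_xml = ET.tostring(length, encoding="unicode")
```

Here `length_xml` holds a "UnitDigitalSequenceLengthValue" element with a single "Numeric" child. Passing `notes="..."` would append a "Notes" child in the same way for any of the data types.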

Enumeration types

The enumeration types include the allowed string literals of the "Enumerated" and "Entity" elements. Their names consist of the name of the enclosing XML element followed by the suffix "Enum". As an example, see the type UnitPhysicalTransactionRepositoryLocationNameIdentifierCategoryStringEnum below.

DatasetType4.gif
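
The naming rule, together with one common way of expressing an enumeration in XSD, can be sketched as follows. It is an assumption that the MOD-CO schema declares these types as an xs:string restriction with xs:enumeration facets; the screenshot is the authoritative reference. The function names are hypothetical and the literals are placeholders, not values from the schema.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def enum_type_name(element_name: str) -> str:
    """Name of the enumeration type: element name plus the suffix 'Enum'."""
    return element_name + "Enum"


def build_enum_type(element_name: str, literals: Iterable[str]) -> ET.Element:
    """Build an xs:simpleType that restricts xs:string to the given literals."""
    simple_type = ET.Element(
        f"{{{XS}}}simpleType", {"name": enum_type_name(element_name)}
    )
    restriction = ET.SubElement(
        simple_type, f"{{{XS}}}restriction", {"base": "xs:string"}
    )
    for literal in literals:
        ET.SubElement(restriction, f"{{{XS}}}enumeration", {"value": literal})
    return simple_type


# Placeholder literals, for illustration only.
enum_type = build_enum_type(
    "UnitPhysicalTransactionRepositoryLocationNameIdentifierCategoryString",
    ["placeholder literal A", "placeholder literal B"],
)
```

The resulting type carries the name UnitPhysicalTransactionRepositoryLocationNameIdentifierCategoryStringEnum, matching the example above.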