


5 XML Sentence Description Files

Besides the SENTENCE class, Transtalo also uses XML to describe sentences.

The input and output modules use XML sentence files to communicate with each other. A user can also write a sentence entirely in XML and have an output module translate it.

This chapter describes the full format of these files.

It helps to have a sample XML sentence file at hand while reading this chapter. You can easily generate one by translating a sentence with Transtalo:

     transtalo lang2xml eo "^Cu en la arbo la grandaj knaboj vidas min tre bone?"

This command generates an XML sentence file for the sentence “Do the big boys see me very well in the tree?” and writes it to sentence.xml (the default file name).

5.1 XML Header

The sentence file begins with an XML header:

     <?xml version="1.0"?>

5.2 Toplevel Sentence

After the header, the first tag is the <sentence> tag. It spans the rest of the file and represents the whole sentence. It can contain the following tags:

<original-lang>
The original language code (consisting of 2 or 3 letters) is between these tags. Omit this tag if it is not applicable (e.g. when you create the XML sentence file manually).
<original-sentence>
The original sentence before the translation. Omit this tag if it is not applicable (e.g. when you create the XML sentence file manually).
<subject>
Represents one subject in the sentence, see the Object section.
<d-object>
Represents one direct object in the sentence, see the Object section.
<i-object>
Represents one indirect object in the sentence, see the Object section.
<subcomp>
Represents one subject complement in the sentence, see the Subject Complement section.
<predicate>
Represents the predicate of the sentence, see the Predicate section.
<adverbial-preposition>
Represents one adverbial with preposition and object in the sentence, see the Adverbial Preposition section.
<adverbial-adverb>
Represents one adverbial with adverb in the sentence, see the Adverbial Adverb section.
<negative />
Defined if the sentence is negative, e.g. “Low-flying penguins are not funny!”
<asking />
Defined if the sentence is a question, e.g. “Does that provide perspectives?”
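
As a rough illustration of the tags above, the following Python sketch parses a small sentence file with the standard xml.etree.ElementTree module and reports which toplevel tags are present. The sample XML is hand-written from the descriptions in this section and is an assumption, not real Transtalo output; summarize_sentence is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample: the layout is assumed from the tag descriptions,
# not copied from real Transtalo output. Child nodes are kept minimal.
SAMPLE = """<?xml version="1.0"?>
<sentence>
  <original-lang>eo</original-lang>
  <subject type="pers-pronoun"><person>1</person></subject>
  <predicate><time>1</time><person>1</person></predicate>
  <asking />
</sentence>"""


def summarize_sentence(xml_text):
    """Return a small summary of the toplevel tags of a sentence."""
    root = ET.fromstring(xml_text)
    if root.tag != "sentence":
        raise ValueError("expected <sentence> as the toplevel tag")
    return {
        # findtext() returns None when the optional tag is omitted.
        "original_lang": root.findtext("original-lang"),
        # Empty flag tags count as set simply by being present.
        "negative": root.find("negative") is not None,
        "asking": root.find("asking") is not None,
        "subjects": len(root.findall("subject")),
        "has_predicate": root.find("predicate") is not None,
    }


print(summarize_sentence(SAMPLE))
```

The flag tags are tested with "is not None" rather than by truth value, because an ElementTree element without children evaluates as false.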

5.3 Object

This node represents a subject, direct object, indirect object, object belonging to a subject complement or object belonging to an adverbial.

The tag for an object can be <subject>, <d-object>, <i-object> or <object>. It has one mandatory argument, type, which can be one of:

type="noun-object"
An object with a noun and possible adjectivals, adverbs, etc.
type="pers-pronoun"
An object with a personal pronoun.
type="ind-pronoun"
An object with an independent indicating (demonstrative) pronoun.

Only for type="noun-object":

<noun>
The noun in Esperanto; has argument unknown="true" if the word couldn't be translated by the input module. Mandatory.
<original-noun>
The noun before it was translated into Esperanto by the input module. Optional.
<definite />
Defined if the noun is definite, i.e. `the' instead of `a(n)' in English.
<plural />
Defined if the noun is plural.
<force-no-article />
Defined if the output module should not add an article to the object.
<little />
Defined if the noun is in its little (diminutive) form. In most languages this must be translated by adding the adjective `little'.

Only for type="pers-pronoun":

<person>
The person, can be 1, 2 or 3. Mandatory.
<plural />
Defined if the personal pronoun is plural.
<gender>
Gender, can be 0 (neuter), 1 (male) or 2 (female); only applicable if person is 2. Optional.
<definite />
If an adjectival or adverbial causes an article to be added, that article will be definite.
<polite />
Defined if the personal pronoun is polite, e.g. the German `Sie' instead of `du'.

Only for type="ind-pronoun":

<ind-pronoun type="type" />
Tag for the indicating pronoun. Type can be this or that.

For all types:

<adjectival>
Represents an adjectival, see section Adjectival.
<adverbial-preposition>
Represents an adverbial with a preposition and an object, see section Adverbial Preposition.
<ind-pronoun type="type" />
Represents an indicating pronoun. Type can be this or that.
<original-position begin="begin" end="end" />
Defines the position of the phrase in the original sentence before the translation.
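
To make the three object types more concrete, here is a minimal Python sketch that reads an object node and collects the fields described above. The sample XML is written by hand from this section and is an assumption about the layout; describe_object is a hypothetical helper, not part of Transtalo.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for "the boys" as a direct object (assumed layout).
OBJECT_XML = """<d-object type="noun-object">
  <noun>knabo</noun>
  <definite />
  <plural />
</d-object>"""


def describe_object(node):
    """Collect the main fields of a <subject>, <d-object>, <i-object> or <object>."""
    kind = node.get("type")
    info = {"role": node.tag, "type": kind}
    if kind == "noun-object":
        noun = node.find("noun")  # mandatory for this type
        info["noun"] = noun.text
        info["unknown"] = noun.get("unknown") == "true"
        info["definite"] = node.find("definite") is not None
        info["plural"] = node.find("plural") is not None
    elif kind == "pers-pronoun":
        info["person"] = int(node.findtext("person"))  # mandatory: 1, 2 or 3
        info["plural"] = node.find("plural") is not None
    elif kind == "ind-pronoun":
        info["pronoun"] = node.find("ind-pronoun").get("type")  # this or that
    else:
        raise ValueError("unknown object type: %r" % kind)
    return info


print(describe_object(ET.fromstring(OBJECT_XML)))
```

The sketch ignores the tags that apply to all types (adjectivals, adverbials and the original position); they could be read from the same node in the same way.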

5.4 Subject Complement

The <subcomp> tag has one mandatory argument: type, which can be one of the following:

type="adjectivals"
The subject complement consists of one or more adjectivals, e.g. “My cat is blue.”
type="object"
The subject complement consists of an object, e.g. “This is a nice green cat.”

The node consists of one or more <adjectival> tags or an <object> tag and can contain the <original-position> tag, which was already described at the end of the Object section.
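
The relation between the type argument and the child tags can be checked mechanically. The Python sketch below is one possible reading of the rule above: it assumes that adjectivals and an object are never mixed in one subject complement, which the text suggests but does not state outright. The sample XML and the name check_subcomp are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for the complement in "My cat is blue." (assumed layout).
SUBCOMP_XML = """<subcomp type="adjectivals">
  <adjectival><adjective degree="0">blua</adjective></adjectival>
</subcomp>"""


def check_subcomp(node):
    """Return True if the children of <subcomp> match its type argument."""
    kind = node.get("type")
    adjectivals = node.findall("adjectival")
    objects = node.findall("object")
    if kind == "adjectivals":
        # One or more adjectivals; assumed not to be mixed with an object.
        return len(adjectivals) >= 1 and not objects
    if kind == "object":
        # Exactly one object; assumed not to be mixed with adjectivals.
        return len(objects) == 1 and not adjectivals
    return False  # missing or unrecognized type


print(check_subcomp(ET.fromstring(SUBCOMP_XML)))
```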

5.5 Predicate

The <predicate> tag contains the full predicate of the sentence. It has the following tags:

<time>
The tense (time) of the finite form is between these tags: 1 = present, 2 = past, 3 = future.
<person>
The finite form's person is between these tags. Can be 1, 2 or 3.
<plural />
Defined if the finite form is plural.
<passive />
Defined if the finite form is passive.
<imperative />
Defined if the finite form is an imperative.
<perfect />
Defined if the finite form is perfect (i.e. finished).
<verb>
Represents one verb. Normally the first one is the finite form. See the Verb section for this.
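
The numeric codes and flag tags of a predicate can be decoded as in the following Python sketch. It assumes that both <time> and <person> are present, which this section does not guarantee, and it leaves the <verb> children aside. The sample XML and the name describe_predicate are hypothetical.

```python
import xml.etree.ElementTree as ET

# Codes for <time> as listed in this section.
TENSES = {1: "present", 2: "past", 3: "future"}

# Hand-written sample (assumed layout); <verb> children are left out.
PREDICATE_XML = """<predicate>
  <time>2</time>
  <person>3</person>
  <plural />
  <perfect />
</predicate>"""


def describe_predicate(node):
    """Decode the tense, person and flag tags of a <predicate> node."""
    # Assumption: <time> and <person> are both present.
    tense = TENSES[int(node.findtext("time"))]
    person = int(node.findtext("person"))
    flags = [
        name
        for name in ("plural", "passive", "imperative", "perfect")
        if node.find(name) is not None
    ]
    return {"tense": tense, "person": person, "flags": flags}


print(describe_predicate(ET.fromstring(PREDICATE_XML)))
```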

5.6 Verb

The <verb> node represents a verb and is always a child node of <predicate>.

<verb>
Verb in Esperanto is between these tags. It has the argument unknown="true" if the verb was not known by the input module.
<original-verb>
Original verb before the translation is between these tags. Optional.
<original-position />
See the Object section for this tag.

5.7 Adjectival

The <adjectival> node represents, as the name suggests, an adjectival.

<adjective>
The adjective in Esperanto is between these tags. It has two arguments. The first is degree="degree", which can be 0 (1st degree), 1 (2nd degree) or 2 (3rd degree). The second is optional: unknown="true", which is defined if the adjective wasn't known by the input module.
<original-adjective>
The original adjective before the translation is between these tags. Optional.
<original-position />
See the Object section for this tag.
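
A short Python sketch of reading an adjectival follows. The mapping of the degree codes to the names positive, comparative and superlative is an assumption based on the usual grammatical meaning of first, second and third degree; the sample XML and the name read_adjectival are hypothetical as well.

```python
import xml.etree.ElementTree as ET

# Assumed names for the three degree codes.
DEGREE_NAMES = {0: "positive", 1: "comparative", 2: "superlative"}

# Hand-written sample (assumed layout) for an adjective in the 2nd degree.
ADJECTIVAL_XML = '<adjectival><adjective degree="1">granda</adjective></adjectival>'


def read_adjectival(node):
    """Return the adjective, its degree name and the unknown flag."""
    adjective = node.find("adjective")
    degree = int(adjective.get("degree"))
    return {
        "adjective": adjective.text,
        "degree": DEGREE_NAMES[degree],
        # The unknown argument is optional; its absence means the word was known.
        "unknown": adjective.get("unknown") == "true",
    }


print(read_adjectival(ET.fromstring(ADJECTIVAL_XML)))
```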

5.8 Adverbial Preposition

The <adverbial-preposition> node represents an adverbial that consists of one or more prepositions with one or more objects.

<preposition>
The preposition in Esperanto is between these tags. Because there are so few prepositions, the unknown argument and the tag holding the original word are omitted. There may be more than one preposition within one adverbial preposition.
<object>
Represents one object. See the Object section for this.
<original-position />
See the Object section for this tag.
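
The adverbial “in the tree” from the sample sentence could look roughly like the XML in the following Python sketch, which lists the prepositions and the nouns of the contained objects. The XML is written by hand from this section and is an assumption, not real Transtalo output; read_adverbial_preposition is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for "en la arbo" (assumed layout).
ADVERBIAL_XML = """<adverbial-preposition>
  <preposition>en</preposition>
  <object type="noun-object">
    <noun>arbo</noun>
    <definite />
  </object>
</adverbial-preposition>"""


def read_adverbial_preposition(node):
    """Return the prepositions and the nouns of the objects in the node."""
    prepositions = [p.text for p in node.findall("preposition")]
    # findtext() gives None for objects without a <noun>, e.g. pronouns.
    nouns = [obj.findtext("noun") for obj in node.findall("object")]
    return prepositions, nouns


print(read_adverbial_preposition(ET.fromstring(ADVERBIAL_XML)))
```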

5.9 Adverbial Adverb

The <adverbial-adverb> node represents an adverbial that consists of an adverb.

<adverb adverb="adverb" original-adverb="original adverb" degree="degree">
The adverb. Degree is 0 (1st degree), 1 (2nd degree) or 2 (3rd degree).
<original-position />
See the Object section for this tag.
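
When a sentence file is written by a program instead of by hand, a node such as this one can be built with xml.etree.ElementTree. The Python sketch below is only an illustration: the name make_adverbial_adverb is hypothetical, and treating the original-adverb argument as optional is an assumption that this section does not confirm.

```python
import xml.etree.ElementTree as ET


def make_adverbial_adverb(adverb, degree=0, original=None):
    """Build an <adverbial-adverb> node with a single <adverb> child."""
    if degree not in (0, 1, 2):
        raise ValueError("degree must be 0, 1 or 2")
    node = ET.Element("adverbial-adverb")
    attributes = {"adverb": adverb}
    if original is not None:
        # Assumption: original-adverb may be left out, e.g. for hand-made files.
        attributes["original-adverb"] = original
    attributes["degree"] = str(degree)
    ET.SubElement(node, "adverb", attributes)
    return node


node = make_adverbial_adverb("bone", degree=0, original="well")
print(ET.tostring(node, encoding="unicode"))
```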