5 XML Sentence Description Files
Besides the SENTENCE class, Transtalo also uses XML to describe sentences.
The input and output modules use XML sentence files to communicate with each
other, and a user can also write a sentence entirely in XML and let an output
module translate it.
This chapter describes the full format of these files.
You might want to have a sample XML sentence file while reading this
chapter. You can easily generate it by translating a sentence with
Transtalo:
transtalo lang2xml eo "^Cu en la arbo la grandaj knaboj vidas min tre bone?"
This command generates an XML sentence file for the sentence
“Do the big boys see me very well in the tree?” and writes it to
sentence.xml (the default file name).
5.1 XML Header
The sentence file begins with an XML header:
<?xml version="1.0"?>
5.2 Toplevel Sentence
After the header, the first tag is the <sentence> tag. It spans the
rest of the file and represents the whole sentence. It can contain the
following tags:
<original-lang>
- The original language code (consisting of 2 or 3 letters) is between these
tags. Should be omitted if it is not applicable (e.g. when you create
the XML sentence file manually).
<original-sentence>
- The original sentence before the translation. Should be omitted if it
is not applicable (e.g. when you create the XML sentence file manually).
<subject>
- Represents one subject in the sentence, see the Object section.
<d-object>
- Represents one direct object in the sentence, see the Object section.
<i-object>
- Represents one indirect object in the sentence, see the Object section.
<subcomp>
- Represents one subject complement in the sentence, see the Subject Complement section.
<predicate>
- Represents the predicate of the sentence, see the Predicate section.
<adverbial-preposition>
- Represents one adverbial with preposition and object in the sentence, see the
Adverbial Preposition section.
<adverbial-adverb>
- Represents one adverbial with adverb in the sentence, see the Adverbial Adverb
section.
<negative />
- Is defined if the sentence is negative, e.g. “Low-flying penguins are not funny!”
<asking />
- Is defined if the sentence is a question, e.g. “Does that provide perspectives?”
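As a rough illustration of the toplevel layout, the following Python sketch reads a small hand-written sentence file with the standard xml.etree.ElementTree module. The XML fragment is a hypothetical, heavily shortened example and not actual Transtalo output, and the helper name summarize_sentence is an assumption made only for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sentence file; real Transtalo output may differ.
SAMPLE = """<?xml version="1.0"?>
<sentence>
  <original-lang>eo</original-lang>
  <subject type="pers-pronoun"><person>1</person></subject>
  <predicate><time>1</time><person>1</person><verb><verb>vidi</verb></verb></predicate>
  <asking />
</sentence>"""


def summarize_sentence(xml_text):
    """Return the toplevel tag names and the two sentence-wide flags."""
    root = ET.fromstring(xml_text)
    if root.tag != "sentence":
        raise ValueError("expected <sentence> as the toplevel tag")
    return {
        "tags": [child.tag for child in root],
        # Flag tags carry no content; only their presence matters.
        "negative": root.find("negative") is not None,
        "asking": root.find("asking") is not None,
    }


print(summarize_sentence(SAMPLE))
```

Because <negative /> and <asking /> are empty tags, the sketch tests only whether they are present and never looks at their content.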
5.3 Object
This node represents a subject, direct object, indirect object, object belonging to
a subject complement or object belonging to an adverbial.
The tag for an object can be <subject>, <d-object>, <i-object> or <object>.
It has a mandatory attribute type, which can be one of:
type="noun-object"
- An object with a noun and possible adjectivals, adverbs, etc.
type="pers-pronoun"
- An object with a personal pronoun
type="ind-pronoun"
- An object with an independent indicating (demonstrative) pronoun
Only for type="noun-object":
<noun>
- The noun in Esperanto; has the attribute unknown="true" if the word couldn't be
translated by the input module. Mandatory.
<original-noun>
- The noun before it was translated into Esperanto by the input module. Optional.
<definite />
- Defined if the noun is definite, i.e. `the' instead of `a(n)' in English.
<plural />
- Defined if the noun is plural.
<force-no-article />
- Defined if the output module should not add an article to the object.
<little />
- Defined if the noun is in its diminutive (`little') form. In most languages this
must be translated by adding the adjective `little'.
Only for type="pers-pronoun":
<person>
- The person, can be 1, 2 or 3. Mandatory.
<plural />
- Defined if the personal pronoun is plural.
<gender>
- Gender, can be 0 (neuter), 1 (male) or 2 (female); only applicable if person is 2. Optional.
<definite />
- Defined if the article should be definite, in case an article is added because of
an adjectival or adverbial.
<polite />
- Defined if the personal pronoun is polite, e.g. the German `Sie' instead of `du'.
Only for type="ind-pronoun":
<ind-pronoun type="type" />
- Tag for the indicating pronoun. Type can be this or that.
For all types:
<adjectival>
- Represents an adjectival, see section Adjectival
<adverbial-preposition>
- Represents an adverbial with a preposition and an object, see section Adverbial
Preposition
<ind-pronoun type="type" />
- Represents an indicating pronoun. Type can be this or that.
<original-position begin="begin" end="end" />
- Defines the position of the phrase in the original sentence before the translation.
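To make the structure of an object node more concrete, here is a minimal Python sketch that builds one of type "noun-object" with xml.etree.ElementTree. The helper make_noun_object is hypothetical, and the sketch assumes that the noun is stored in its base form and that flag tags are simply left out when they do not apply; neither assumption is confirmed by this chapter.

```python
import xml.etree.ElementTree as ET

# The four tags that may represent an object.
OBJECT_TAGS = {"subject", "d-object", "i-object", "object"}


def make_noun_object(tag, noun, definite=False, plural=False):
    """Build an object node of type "noun-object" (hypothetical helper)."""
    if tag not in OBJECT_TAGS:
        raise ValueError(f"not an object tag: {tag}")
    node = ET.Element(tag, {"type": "noun-object"})
    # The noun in Esperanto is mandatory for this type.
    ET.SubElement(node, "noun").text = noun
    # Flag tags are only added when the property holds.
    if definite:
        ET.SubElement(node, "definite")
    if plural:
        ET.SubElement(node, "plural")
    return node


# "the boys" as a subject; storing the base form "knabo" is an assumption.
subject = make_noun_object("subject", "knabo", definite=True, plural=True)
print(ET.tostring(subject, encoding="unicode"))
```

The same helper can produce a <d-object>, <i-object> or nested <object> node, since all four tags share one layout.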
5.4 Subject Complement
The <subcomp> tag has one mandatory attribute, type, which can be
one of the following:
type="adjectivals"
- The subject complement consists of one or more adjectivals, e.g. “My cat is blue.”
type="object"
- The subject complement consists of an object, e.g. “This is a nice green cat.”
The node consists of one or more <adjectival> tags or an <object> tag
and can contain the <original-position> tag, which was already described
at the end of the Object section.
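The relation between the type value and the allowed children can be sketched as a small check in Python. The function check_subcomp and the sample fragment are illustrative assumptions, not part of Transtalo; in particular, treating more than one <object> as a problem is this sketch's reading of “an <object> tag”.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for "My cat is blue."; "blua" is Esperanto for "blue".
SAMPLE = (
    '<subcomp type="adjectivals">'
    '<adjectival><adjective degree="0">blua</adjective></adjectival>'
    "</subcomp>"
)


def check_subcomp(node):
    """Return a list of problems found in a <subcomp> node (empty if none)."""
    problems = []
    kind = node.get("type")
    if kind == "adjectivals":
        if not node.findall("adjectival"):
            problems.append("type adjectivals needs at least one <adjectival>")
    elif kind == "object":
        if len(node.findall("object")) != 1:
            problems.append("type object needs exactly one <object>")
    else:
        problems.append(f"unknown or missing type: {kind!r}")
    return problems


print(check_subcomp(ET.fromstring(SAMPLE)))
```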
5.5 Predicate
The <predicate> tag contains the full predicate of the sentence. It has the
following tags:
<time>
- The tense of the finite form is between these tags. 1=present, 2=past, 3=future.
<person>
- The finite form's person is between these tags. Can be 1, 2 or 3.
<plural />
- Defined if the finite form is plural.
<passive />
- Defined if the finite form is passive.
<imperative />
- Defined if the finite form is an imperative.
<perfect />
- Defined if the finite form is perfect (i.e. finished).
<verb>
- Represents one verb. Normally the first one is the finite form. See the Verb
section for this.
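The numeric codes and flag tags of a predicate can be decoded as in the following Python sketch. The function read_predicate and the sample fragment are assumptions for illustration; the sketch also tolerates a missing <time> or <person> tag, although this chapter does not say whether they are optional.

```python
import xml.etree.ElementTree as ET

# Codes for <time> as listed above.
TIMES = {"1": "present", "2": "past", "3": "future"}
FLAGS = ("plural", "passive", "imperative", "perfect")

# Hypothetical fragment: third person plural, past tense, verb "vidi" (to see).
SAMPLE = (
    "<predicate><time>2</time><person>3</person><plural />"
    "<verb><verb>vidi</verb></verb></predicate>"
)


def read_predicate(node):
    """Decode the tense, person and flag tags of a <predicate> node."""
    time_code = (node.findtext("time") or "").strip()
    person = (node.findtext("person") or "").strip()
    info = {
        "time": TIMES.get(time_code),  # None if missing or unknown
        "person": int(person) if person else None,
    }
    # A flag is set when its empty tag is a direct child of <predicate>.
    for flag in FLAGS:
        info[flag] = node.find(flag) is not None
    return info


print(read_predicate(ET.fromstring(SAMPLE)))
```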
5.6 Verb
The <verb> node represents a verb and is always a child node of <predicate>.
<verb>
- The verb in Esperanto is between these tags. It has the attribute unknown="true"
if the verb was not known by the input module.
<original-verb>
- Original verb before the translation is between these tags. Optional.
<original-position />
- See the Object section for this tag.
5.7 Adjectival
The <adjectival> node represents, as the name suggests, an adjectival.
<adjective>
- The adjective in Esperanto is between these tags. It has two attributes:
degree="degree", which can be 0 (1st degree), 1 (2nd degree) or 2 (3rd degree),
and the optional unknown="true", which is defined if the adjective wasn't known
by the input module.
<original-adjective>
- The original adjective before the translation is between these tags. Optional.
<original-position />
- See the Object section for this tag.
5.8 Adverbial Preposition
The <adverbial-preposition> node represents an adverbial that consists of
one or more prepositions with one or more objects.
<preposition>
- The preposition in Esperanto is between these tags. Because so few prepositions
exist, the unknown attribute and the original-word tag used for other words are
omitted here. There may be more than one preposition within one adverbial preposition.
<object>
- Represents one object. See the Object section for this.
<original-position />
- See the Object section for this tag.
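A node of this kind could be read as in the Python sketch below, which collects all prepositions and the nouns of all objects. The fragment for “in the tree” and the function read_adverbial_preposition are hypothetical; only objects of type "noun-object" yield a noun here, other object types give None.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for "in the tree" ("en la arbo").
SAMPLE = """<adverbial-preposition>
  <preposition>en</preposition>
  <object type="noun-object"><noun>arbo</noun><definite /></object>
</adverbial-preposition>"""


def read_adverbial_preposition(node):
    """Collect the prepositions and object nouns of an adverbial preposition."""
    return {
        # There may be more than one <preposition> tag.
        "prepositions": [(p.text or "").strip() for p in node.findall("preposition")],
        # findtext returns None for objects without a <noun> child.
        "nouns": [obj.findtext("noun") for obj in node.findall("object")],
    }


print(read_adverbial_preposition(ET.fromstring(SAMPLE)))
```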
5.9 Adverbial Adverb
The <adverbial-adverb> node represents an adverbial that consists of
an adverb.
<adverb adverb="adverb" original-adverb="original adverb" degree="degree">
- The adverb. Degree is 0 (1st degree), 1 (2nd degree) or 2 (3rd degree).
<original-position />
- See the Object section for this tag.
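Unlike the other word tags, <adverb> keeps its data in attributes rather than in its content. A minimal Python sketch for reading it follows; the function read_adverbial_adverb and the sample fragment are assumptions, as is the fallback to degree 0 when the attribute is absent.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for the adverb "bone" ("well").
SAMPLE = (
    "<adverbial-adverb>"
    '<adverb adverb="bone" original-adverb="bone" degree="0" />'
    "</adverbial-adverb>"
)


def read_adverbial_adverb(node):
    """Read the attributes of the <adverb> tag inside an <adverbial-adverb>."""
    adverb = node.find("adverb")
    if adverb is None:
        raise ValueError("missing <adverb> tag")
    return {
        "adverb": adverb.get("adverb"),
        "original": adverb.get("original-adverb"),
        # Falling back to degree 0 is an assumption of this sketch.
        "degree": int(adverb.get("degree", "0")),
    }


print(read_adverbial_adverb(ET.fromstring(SAMPLE)))
```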