4 The SENTENCE Class and Its Children

The class SENTENCE is the central part of each module. The goal of an input module is to fill a SENTENCE object with all the information about the sentence, so that an output module can use it to generate a sentence.

Input modules export the object to an XML sentence file, which output modules read to reconstruct the object.

But what's in the class? The full parsed sentence. It contains the subject, object, adverbials, finite forms and so on. This chapter will explain the way the class is built up.

While studying the class you may want to read the appropriate header file as well. It is transtalo_objects.h. Because it is automatically included by transtalo_modules.h, you don't need to include it yourself if you are writing a translation module.

Note: while reading this chapter, keep in mind that the classes aren't complete yet and may change at any time.

4.1 Abbreviations

The SENTENCE class and its children use many abbreviations for the names of phrases. These are listed here:

i_object
indirect object
d_object
direct object
subcomp
subject complement
adverbial_adverb
adverbial consisting of an adverb
adverbial_preposition
adverbial consisting of a preposition with an object
pers_pronoun
personal pronoun
ind_pronoun
indicating pronoun (demonstrative pronoun, such as `this' or `that')

4.2 SENTENCE

This is the main class that describes the whole sentence. Let's start with the declaration.

     class SENTENCE
     {
       public:
         vector<OBJECT> subjects, i_objects, d_objects;
         vector<SUBCOMP> subcomps;
         PREDICATE predicate;
     
         vector<ADVERBIAL_ADVERB> adverbials_adverb;
         vector<ADVERBIAL_PREPOSITION> adverbials_preposition;
     
         bool negative;
         bool asking;
     
         string original_language;
         string original_sentence;
     
         void add_subject(OBJECT subject);
         void add_i_object(OBJECT i_object);
         void add_d_object(OBJECT d_object);
         void add_subcomp(SUBCOMP subcomp);
         void add_finform(FINFORM finform);
         void add_infinitive(INFINITIVE infinitive);
         void add_adverbial_adverb(ADVERBIAL_ADVERB adverbial_adverb);
         void add_adverbial_preposition(ADVERBIAL_PREPOSITION adverbial_preposition);
     
         int save(string filename);
         void load(string filename);
     
         SENTENCE();
     };

Most members are self-explanatory. The following list describes the members that need more explanation:

original_language
contains the original language before the translation (ISO language code).
original_sentence
contains the original sentence before the translation.
negative
is true if the sentence is negative, e.g. in “He isn't able to sell fish.”
asking
is true if the sentence is asking, e.g. in “Do you see him?”
int save(string filename);
saves the SENTENCE to an XML file named filename.
void load(string filename);
loads the XML file named filename. Throws the parse_error exception if an XML parsing error is encountered.
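
To make the flags a bit more concrete, here is a minimal sketch of how an input module might fill a SENTENCE for the question "Do you see him?". The types below are cut-down stand-ins for the real declarations in transtalo_objects.h, the bodies of add_subject and add_d_object are assumptions (they simply append to the vectors), and build_question is a hypothetical helper. The predicate is left out.

```cpp
#include <string>
#include <vector>

// Cut-down stand-ins for the classes in transtalo_objects.h.
// Only the members used in this sketch are declared.
struct PERS_PRONOUN {
  int person;
  bool plural;
  int gender;
  PERS_PRONOUN(int person = 0, bool plural = false, int gender = 0)
    : person(person), plural(plural), gender(gender) {}
};

struct OBJECT {
  PERS_PRONOUN pers_pronoun;
  OBJECT(PERS_PRONOUN p) : pers_pronoun(p) {}
};

struct SENTENCE {
  std::vector<OBJECT> subjects, d_objects;
  bool negative = false;
  bool asking = false;
  std::string original_language;
  std::string original_sentence;

  // Assumed behaviour: the add_* members append to the matching vector.
  void add_subject(OBJECT subject) { subjects.push_back(subject); }
  void add_d_object(OBJECT d_object) { d_objects.push_back(d_object); }
};

// Hypothetical helper: fills a SENTENCE for "Do you see him?".
SENTENCE build_question() {
  SENTENCE s;
  s.original_language = "en";            // ISO language code
  s.original_sentence = "Do you see him?";
  s.asking = true;                       // it is a question
  s.negative = false;                    // and not negated
  s.add_subject(OBJECT(PERS_PRONOUN(2, false, 0)));   // "you"
  s.add_d_object(OBJECT(PERS_PRONOUN(3, false, 1)));  // "him"
  return s;
}
```

With the real library, a call to save would presumably follow at this point to write the XML sentence file.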

4.3 OBJECT

OBJECT is the general class for objects. It can be a subject, an indirect object, a direct object, an object belonging to a subject complement, or the object that follows the preposition in an adverbial.

An object can take one of three forms: an object with a noun (e.g. `the big cat'), which is called a noun object; a personal pronoun (e.g. `you'); or an independent indicating pronoun (`this' or `that').

     namespace object_type {
       enum OBJECT_TYPE {
         undefined,
         noun_object,
         pers_pronoun,
         ind_pronoun
       };
     }
     
     class OBJECT {
       public:
         object_type::OBJECT_TYPE type;
         NOUN_OBJECT noun_object;
         PERS_PRONOUN pers_pronoun;
         IND_PRONOUN ind_pronoun;
     
         RELATION relation;
     
         OBJECT(object_type::OBJECT_TYPE type = object_type::undefined);
         OBJECT(NOUN_OBJECT noun_object);
         OBJECT(PERS_PRONOUN pers_pronoun);
         OBJECT(IND_PRONOUN ind_pronoun);
     };

An OBJECT should use only one of noun_object, pers_pronoun and ind_pronoun. type must match the member that is actually used.

The last three constructors store the argument they are given and set type accordingly. The first constructor only sets the type.

See the section RELATION for information about relation.
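
The following sketch shows how the constructors and type could work together, and how an output module might dispatch on type. The classes are simplified stand-ins, the constructor bodies are an assumption based on the description above, and describe is a hypothetical helper.

```cpp
#include <string>

namespace object_type {
  enum OBJECT_TYPE { undefined, noun_object, pers_pronoun, ind_pronoun };
}

// Simplified stand-ins; the real classes have many more members.
struct NOUN_OBJECT { std::string noun; };
struct PERS_PRONOUN { int person = 0; };
struct IND_PRONOUN {};

class OBJECT {
  public:
    object_type::OBJECT_TYPE type;
    NOUN_OBJECT noun_object;
    PERS_PRONOUN pers_pronoun;
    IND_PRONOUN ind_pronoun;

    // Assumed implementation: each constructor stores its argument
    // and sets type to the matching enumerator.
    OBJECT(object_type::OBJECT_TYPE type = object_type::undefined)
      : type(type) {}
    OBJECT(NOUN_OBJECT n)
      : type(object_type::noun_object), noun_object(n) {}
    OBJECT(PERS_PRONOUN p)
      : type(object_type::pers_pronoun), pers_pronoun(p) {}
    OBJECT(IND_PRONOUN i)
      : type(object_type::ind_pronoun), ind_pronoun(i) {}
};

// Hypothetical helper: tells which member of the OBJECT is in use.
std::string describe(const OBJECT &o) {
  switch (o.type) {
    case object_type::noun_object:  return "noun object";
    case object_type::pers_pronoun: return "personal pronoun";
    case object_type::ind_pronoun:  return "indicating pronoun";
    default:                        return "undefined";
  }
}
```

Checking type first, as describe does, avoids reading a member that was never filled in.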

4.4 NOUN_OBJECT

This class is always a child of OBJECT and represents an object with a noun and, optionally, adjectivals, adverbials and indicating pronouns.

     class NOUN_OBJECT
     {
       public:
         string noun, noun_orig;
         bool noun_known;
     
         vector<ADJECTIVAL> adjectivals;
         vector<ADVERBIAL_PREPOSITION> adverbials_preposition;
     
         vector<IND_PRONOUN> ind_pronouns;
     
         bool definite;
         bool force_no_article;
         bool plural;
         bool little;
     
         int begin_pos, end_pos;
     
         void add_adjectival(ADJECTIVAL adjectival);
         void add_adverbial_preposition(ADVERBIAL_PREPOSITION adverbial_preposition);
         void add_ind_pronoun(IND_PRONOUN ind_pronoun);
         NOUN_OBJECT();
     };

The following members need more information:

noun
the noun in Esperanto as it was translated by the input module
noun_orig
the original noun before the translation
noun_known
is true if the noun was known by the input module
definite
is true if the noun is definite, i.e. it has been mentioned before; in English this means the article `the' is used instead of `a(n)'
force_no_article
is true if the output module must not use an article
little
is true if the noun is in its diminutive (little) form. In most languages, including English, this has to be translated by adding the adjective `little'. Some languages have their own suffixes for it, e.g. `-chen' and `-lein' in German.
begin_pos
the position of the character at the beginning of this phrase in the original sentence
end_pos
the position of the character at the end of this phrase in the original sentence
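
As an illustration of begin_pos and end_pos, the sketch below cuts the phrase out of the original sentence. It assumes that positions are zero-based and that end_pos points at the last character of the phrase (inclusive); the text above does not state this explicitly, so check it against the real behaviour. phrase_text is a hypothetical helper and NOUN_OBJECT is reduced to the members used here.

```cpp
#include <string>

// Stand-in with only the members this sketch needs.
struct NOUN_OBJECT {
  std::string noun_orig;
  int begin_pos = 0, end_pos = 0;
};

// Returns the part of the original sentence covered by the phrase,
// or an empty string if the positions do not fit the sentence.
// Assumption: zero-based positions, end_pos inclusive.
std::string phrase_text(const std::string &original_sentence,
                        const NOUN_OBJECT &n) {
  const int length = static_cast<int>(original_sentence.size());
  if (n.begin_pos < 0 || n.end_pos < n.begin_pos || n.end_pos >= length)
    return "";
  return original_sentence.substr(n.begin_pos, n.end_pos - n.begin_pos + 1);
}
```

This can be handy for error messages, for instance to show the user which phrase contained an unknown noun.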

4.5 PERS_PRONOUN

This class is always a child of OBJECT and represents an object consisting of a personal pronoun.

     class PERS_PRONOUN {
       public:
         int person;
         bool plural;
         int gender;
         bool definite;
         bool polite;
     
         vector<ADJECTIVAL> adjectivals;
         vector<ADVERBIAL_PREPOSITION> adverbials_preposition;
     
         vector<IND_PRONOUN> ind_pronouns;
     
         int begin_pos, end_pos;
     
         void add_adjectival(ADJECTIVAL adjectival);
         void add_adverbial_preposition(ADVERBIAL_PREPOSITION adverbial_preposition);
         void add_ind_pronoun(IND_PRONOUN ind_pronoun);
         PERS_PRONOUN(int person=0, bool plural=false, int gender=0);
     };

The following members need more information:

person
the person; can be 1, 2 or 3 (1st, 2nd, 3rd person)
plural
is true if the number is plural
gender
the gender; can be 0 (neuter), 1 (male), 2 (female)
definite
the meaning of this member is currently undocumented
polite
is true if the polite form of the personal pronoun should be used; most languages only support this for the 2nd person (the German `Sie' instead of `du'), and many languages, including English, don't support it at all

See the explanation of begin_pos and end_pos in the NOUN_OBJECT section.
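
As a small example of how an output module could read person, plural and gender, here is a sketch that picks the English subject pronoun. english_subject_pronoun is a hypothetical helper, not part of Transtalo, and PERS_PRONOUN is reduced to the three members it needs.

```cpp
#include <string>

// Stand-in with the three members used below.
struct PERS_PRONOUN {
  int person;
  bool plural;
  int gender;
  PERS_PRONOUN(int person = 0, bool plural = false, int gender = 0)
    : person(person), plural(plural), gender(gender) {}
};

// Hypothetical helper: maps the fields to an English subject pronoun.
// Returns an empty string if person is not 1, 2 or 3.
std::string english_subject_pronoun(const PERS_PRONOUN &p) {
  switch (p.person) {
    case 1: return p.plural ? "we" : "I";
    case 2: return "you";                 // same in singular and plural
    case 3:
      if (p.plural) return "they";
      if (p.gender == 1) return "he";     // 1 = male
      if (p.gender == 2) return "she";    // 2 = female
      return "it";                        // 0 = neuter
    default: return "";
  }
}
```

English ignores polite, which matches the remark above that many languages don't support a polite form.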

4.6 IND_PRONOUN

This class can be a child of OBJECT, but also of NOUN_OBJECT or PERS_PRONOUN. It represents an indicating pronoun, which is either dependent or independent (`this', `that', `these' in English).

     namespace ind_pronoun_type {
       enum IND_PRONOUN_TYPE {
         undefined, t_this, t_that
       };
     }
     
     class IND_PRONOUN {
       public:
         ind_pronoun_type::IND_PRONOUN_TYPE type;
     
         int begin_pos, end_pos;
         RELATION relation;
     
         IND_PRONOUN(ind_pronoun_type::IND_PRONOUN_TYPE type = ind_pronoun_type::undefined);
     };

type must be either ind_pronoun_type::t_this or ind_pronoun_type::t_that. It should not be left at ind_pronoun_type::undefined.
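
The sketch below shows one way an English output module might turn the type into a word. english_ind_pronoun is a hypothetical helper, and its plural parameter is an assumption: IND_PRONOUN has no number of its own, so the caller would have to take it from the noun the pronoun belongs to.

```cpp
#include <string>

namespace ind_pronoun_type {
  enum IND_PRONOUN_TYPE { undefined, t_this, t_that };
}

// Hypothetical helper: returns the English indicating pronoun.
// An undefined type yields an empty string, which the caller can
// treat as an error.
std::string english_ind_pronoun(ind_pronoun_type::IND_PRONOUN_TYPE type,
                                bool plural) {
  switch (type) {
    case ind_pronoun_type::t_this: return plural ? "these" : "this";
    case ind_pronoun_type::t_that: return plural ? "those" : "that";
    default:                       return "";
  }
}
```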

4.7 ADJECTIVAL

This class represents an adjectival.

     class ADJECTIVAL {
       public:
         string adjective;
         string adjective_orig;
         bool adjective_known;
     
         int degree;
     
         vector<ADVERBIAL_ADVERB> adverbials_adverb;
     
         int begin_pos, end_pos;
         RELATION relation;
     
         void add_adverbial_adverb(ADVERBIAL_ADVERB adverbial_adverb);
         ADJECTIVAL();
     };

adjective is the adjective in Esperanto, adjective_orig the original word. adjective_known indicates whether it was known by the input module.

degree is one of 0 (positive, `big'), 1 (comparative, `bigger') or 2 (superlative, `biggest').
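
Since adjective holds the Esperanto word, degree can be illustrated with Esperanto itself, where the comparative is formed with `pli' and the superlative with `plej'. The sketch below is only an illustration: esperanto_adjective is a hypothetical helper, and ADJECTIVAL is reduced to the two members it reads.

```cpp
#include <string>

// Stand-in with only the members used below.
struct ADJECTIVAL {
  std::string adjective;   // the adjective in Esperanto
  int degree = 0;          // 0, 1 or 2 as described above
};

// Hypothetical helper: renders the adjective with its degree in Esperanto.
std::string esperanto_adjective(const ADJECTIVAL &a) {
  switch (a.degree) {
    case 1:  return "pli " + a.adjective;    // comparative
    case 2:  return "plej " + a.adjective;   // superlative
    default: return a.adjective;             // positive
  }
}
```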

4.8 ADVERBIAL_ADVERB

This class represents an adverbial with an adverb. It can only be a child of ADJECTIVAL or SENTENCE. An example is `really' in `It is a really big file' or `well' in `She sings well'.

     class ADVERBIAL_ADVERB {
       public:
         ADJECTIVAL adverb;
     
         int begin_pos, end_pos;
         RELATION relation;
     
         ADVERBIAL_ADVERB();
     };

As you can see, the adverb is stored in an ADJECTIVAL object. This may look odd, but an adverb has the same properties as an adjectival, so the class can be reused.

4.9 ADVERBIAL_PREPOSITION

This class represents an adverbial with a preposition and an object. It can be a child of an OBJECT or of SENTENCE. An example is `in the room' in `The children are in the room'.

     namespace place_in_sentence {
       enum PLACE_IN_SENTENCE {
         undefined,
         start_of_sentence,
         after_subject,
         after_d_object,
         after_i_object,
         after_finform,
         after_subcomp,
         end_of_sentence
       };
     }
     
     class ADVERBIAL_PREPOSITION {
       public:
         place_in_sentence::PLACE_IN_SENTENCE place;
         vector<OBJECT> objects;
         vector<PREPOSITION> prepositions;
     
         int begin_pos, end_pos;
         RELATION relation;
     
         void add_object(OBJECT object);
         void add_preposition(PREPOSITION preposition);
         ADVERBIAL_PREPOSITION();
     };

place is a PLACE_IN_SENTENCE value that records where the adverbial was placed in the original sentence. Output modules will try to keep this place, but sometimes that is impossible, because some languages restrict where particular adverbials may be placed.

Of course place has no meaning if the adverbial is a child of an object.

4.10 PREPOSITION

The PREPOSITION class is only used as a child of ADVERBIAL_PREPOSITION. It represents, as the name suggests, a preposition.

     class PREPOSITION {
       public:
         string preposition;
         RELATION relation;
         PREPOSITION(string preposition="");
     };

preposition is a simple string containing the preposition translated into Esperanto. The original preposition is omitted as there are so few prepositions.

4.11 PREDICATE

A PREDICATE contains the full predicate of the sentence.

     class PREDICATE {
       public:
         vector<VERB> verbs;
     
         int time;
         int person;
         bool plural;
         bool perfect;
         bool passive;
         bool imperative;
     
         int begin_pos, end_pos;
     
         VERB last_verb() const;
         bool exists() const;
         void add_verb(const VERB &verb);
         PREDICATE();
     };

The following members need more information:

verbs
All verbs belonging to the predicate. The first verb is the finite form.
time
the tense; can be 1 (present), 2 (past) or 3 (future)
person
the person; can be 1 (1st person), 2 (2nd person) or 3 (3rd person)
plural
is true if the finite form is plural
perfect
is true if the finite form is perfect
passive
is true if the finite form is passive
imperative
is true if the finite form is an imperative
VERB last_verb()
returns the last verb
bool exists()
returns true if there is at least one verb, i.e. if the predicate exists

See the explanation of begin_pos and end_pos in the NOUN_OBJECT section.
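
To show how verbs, exists and last_verb relate, here is a sketch of how these members could be implemented. This is an assumption for illustration, not the actual code from Transtalo; in particular, returning an empty VERB from last_verb when there are no verbs is a guess. The classes are reduced to what the sketch needs, and can_sell is a hypothetical helper.

```cpp
#include <string>
#include <vector>

// Stand-ins with only the members used below.
struct VERB {
  std::string verb;
};

class PREDICATE {
  public:
    std::vector<VERB> verbs;   // verbs[0] is the finite form

    // Assumed implementations of the documented members.
    bool exists() const { return !verbs.empty(); }
    VERB last_verb() const { return verbs.empty() ? VERB() : verbs.back(); }
    void add_verb(const VERB &verb) { verbs.push_back(verb); }
};

// Hypothetical helper: a predicate with a finite form and an infinitive,
// as in "is able to sell" (Esperanto `povi' and `vendi').
PREDICATE can_sell() {
  PREDICATE p;
  VERB finite;
  finite.verb = "povi";
  VERB infinitive;
  infinitive.verb = "vendi";
  p.add_verb(finite);       // first verb added is the finite form
  p.add_verb(infinitive);
  return p;
}
```

Because the first verb is the finite form, the order in which add_verb is called matters.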

4.12 VERB

This class represents one verb of the predicate.

     class VERB {
       public:
         string verb;
         string verb_orig;
         bool verb_known;
     
         int begin_pos, end_pos;
         RELATION relation;
     
         VERB();
     };

4.13 RELATION

A RELATION is an enumeration type which indicates the relation of a subject, object, adjectival or other phrase with its predecessor.

     enum RELATION {
       r_none,
       r_and,
       r_or,
       r_but,
       r_but_also
     };

Imagine the following sentence: “The ugly and dirty cat sees your father.” In this sentence, the second adjectival has the relation r_and, and the first adjectival has the relation r_none.
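
The example sentence can be sketched in code as follows. join_adjectivals and relation_word are hypothetical helpers that turn each relation back into an English conjunction, and ADJECTIVAL is reduced to the two members they read.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum RELATION { r_none, r_and, r_or, r_but, r_but_also };

// Stand-in with only the members used below.
struct ADJECTIVAL {
  std::string adjective_orig;
  RELATION relation;
};

// Hypothetical helper: the English word for a relation ("" for r_none).
std::string relation_word(RELATION r) {
  switch (r) {
    case r_and:      return "and";
    case r_or:       return "or";
    case r_but:      return "but";
    case r_but_also: return "but also";
    default:         return "";
  }
}

// Hypothetical helper: joins the adjectivals, putting each relation
// word between an adjectival and its predecessor.
std::string join_adjectivals(const std::vector<ADJECTIVAL> &adjectivals) {
  std::string out;
  for (std::size_t i = 0; i < adjectivals.size(); ++i) {
    if (i > 0) {
      out += " ";
      const std::string word = relation_word(adjectivals[i].relation);
      if (!word.empty()) out += word + " ";
    }
    out += adjectivals[i].adjective_orig;
  }
  return out;
}
```

The relation of the first adjectival is never read, because it has no predecessor.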

4.14 More Subjects in One Sentence?

As you can see, SENTENCE contains vectors for subjects, direct objects, indirect objects and subject complements. This means that one sentence can have, for example, more than one subject.

In conventional grammar, a sentence cannot have more than one subject. The subject can have more than one part, but it is still one phrase:

The big cat and the small dog are in the room.

Normally there would be only one subject in this sentence: “The big cat and the small dog”. In Transtalo, however, there are two subjects: “The big cat” and “the small dog”. This is because each subject has its own article, adjectivals and number (singular or plural). The two subjects are completely separate from each other.
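
In code, the two subjects of this sentence might be built as sketched below. The classes are cut-down stand-ins, the add_* bodies are assumptions, and make_noun and cat_and_dog are hypothetical helpers. That the second subject gets the relation r_and follows from the description in the RELATION section, where a relation links a phrase to its predecessor.

```cpp
#include <string>
#include <vector>

enum RELATION { r_none, r_and, r_or, r_but, r_but_also };

// Cut-down stand-ins for the real classes.
struct ADJECTIVAL {
  std::string adjective_orig;
};

struct NOUN_OBJECT {
  std::string noun_orig;
  std::vector<ADJECTIVAL> adjectivals;
  bool definite = false;
  bool plural = false;
  void add_adjectival(ADJECTIVAL a) { adjectivals.push_back(a); }
};

struct OBJECT {
  NOUN_OBJECT noun_object;
  RELATION relation = r_none;
  OBJECT(NOUN_OBJECT n) : noun_object(n) {}
};

struct SENTENCE {
  std::vector<OBJECT> subjects;
  void add_subject(OBJECT subject) { subjects.push_back(subject); }
};

// Hypothetical helper: a definite singular noun with one adjectival.
NOUN_OBJECT make_noun(const std::string &noun, const std::string &adjective) {
  NOUN_OBJECT n;
  n.noun_orig = noun;
  n.definite = true;          // both nouns carry the article `the'
  ADJECTIVAL a;
  a.adjective_orig = adjective;
  n.add_adjectival(a);
  return n;
}

// "The big cat and the small dog are in the room." (subjects only)
SENTENCE cat_and_dog() {
  SENTENCE s;
  OBJECT cat(make_noun("cat", "big"));
  OBJECT dog(make_noun("dog", "small"));
  dog.relation = r_and;       // linked to its predecessor by `and'
  s.add_subject(cat);
  s.add_subject(dog);
  return s;
}
```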

Let's take another, different, example:

The big cats and dogs are in the room.

Transtalo treats this as only one subject, just like in normal linguistics. This is because here the nouns (`cats' and `dogs') are not separated—the article (`the') as well as the adjectivals (`big') belong to both nouns.

Both forms can be combined, too:

The big cats and dogs and a yellow rabbit are in the room.

There are two subjects in this sentence: “The big cats and dogs” and “a yellow rabbit”.