The Complete Lojban Language (1997)/Chapter 21
The following two listings constitute the formal grammar of Lojban. The first version is written in the YACC language, which is used to describe parsers, and has been used to create a parser for Lojban texts. This parser is available from the Logical Language Group. The second listing is in Extended Backus-Naur Form (EBNF) and represents the same grammar in a more human-readable form. (In case of discrepancies, the YACC version is official.) There is a cross-reference listing for each format that shows, for each selma'o and rule, which rules refer to it.
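The opening comment of the YACC listing describes preprocessing that happens before YACC ever sees the text. As a rough illustration of three of those rules (the "zo" quote of step 2b, the "si" eraser of step 2e, and the FAhO cutoff of step 3), here is a minimal Python sketch. It assumes the input is already split into words, treats only the literal words zo, si, and fa'o specially, and ignores zoi, lo'u, sa, and su entirely. The function name `filter_words` and the placeholder label `WORD` are hypothetical; only `any_word_698` and the selma'o name ZO come from the grammar.

```python
def filter_words(words):
    """Simplified sketch of rules 2b (zo), 2e (si) and step 3 (fa'o).

    `words` is a list of already-separated Lojban words. The result is a
    list of (label, word) pairs. "WORD" is a stand-in label (an assumption
    of this sketch) for a word that still awaits selma'o classification.
    """
    out = []
    i = 0
    while i < len(words):
        word = words[i]
        if word == "zo" and i + 1 < len(words):
            # Rule 2b: the word after "zo" becomes an any_word_698 token
            # and is not interpreted, even if it is "si" or "fa'o".
            out.append(("ZO", word))
            out.append(("any_word_698", words[i + 1]))
            i += 2
            continue
        if word == "si":
            # Rule 2e: erase "si" together with the previous word or token.
            if out:
                out.pop()
            i += 1
            continue
        if word == "fa'o":
            # Step 3: FAhO ends the text; everything after it is ignored.
            break
        out.append(("WORD", word))
        i += 1
    return out


example = filter_words("mi klama si cadzu fa'o do".split())
# example == [("WORD", "mi"), ("WORD", "cadzu")]
```

Because a quoted word is stored as its own token, erasing a complete zo quotation takes two uses of "si" in this sketch, which matches the remark on metalinguistic erasers in the grammar's comments.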
YACC Grammar of Lojban
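Step 4 in the header comment of the listing below (absorption of grammar-free tokens) can be illustrated with a short Python sketch. This is a simplification under stated assumptions, not the Logical Language Group parser: tokens are assumed to arrive as (selma'o, text) pairs, absorbed words are simply appended to the text of the absorbing token, and the names `absorb`, `_absorb_into_previous`, and `INDICATORS` are hypothetical.

```python
# Selma'o absorbed into the previous token by rule 4e.
INDICATORS = {"DAhO", "FUhO", "FUhE", "UI", "Y", "CAI"}


def _absorb_into_previous(tokens, should_absorb, relabel=None):
    """Merge each token for which should_absorb(sel, prev_sel) is true
    into the token before it, optionally relabelling the merged token."""
    out = []
    for sel, text in tokens:
        if out and should_absorb(sel, out[-1][0]):
            prev_sel, prev_text = out[-1]
            out[-1] = (relabel or prev_sel, f"{prev_text} {text}")
        else:
            out.append((sel, text))
    return out


def absorb(tokens):
    """Apply simplified versions of rules 4a through 4e, in order."""
    # 4a: any - (ZEI - any) ... becomes one BRIVLA token.
    merged, i = [], 0
    while i < len(tokens):
        sel, text = tokens[i]
        while i + 2 < len(tokens) and tokens[i + 1][0] == "ZEI":
            text = f"{text} {tokens[i + 1][1]} {tokens[i + 2][1]}"
            sel = "BRIVLA"
            i += 2
        merged.append((sel, text))
        i += 1

    # 4b: BAhE is absorbed into the following token; at end of text it stays.
    out, pending = [], []
    for sel, text in merged:
        if sel == "BAhE":
            pending.append((sel, text))
        else:
            prefix = " ".join(t for _, t in pending)
            out.append((sel, f"{prefix} {text}" if prefix else text))
            pending = []
    out.extend(pending)

    # 4c: BU is absorbed into the previous token, which is relabelled BY.
    out = _absorb_into_previous(out, lambda s, p: s == "BU", relabel="BY")
    # 4d: NAI directly after UI or CAI is absorbed into that token.
    out = _absorb_into_previous(
        out, lambda s, p: s == "NAI" and p in ("UI", "CAI"))
    # 4e: indicators are absorbed into the previous token; an indicator
    # at the very start of the text has no previous token and is kept.
    out = _absorb_into_previous(out, lambda s, p: s in INDICATORS)
    return out
```

Running the rules in this fixed order matters: NAI has to be attached to its UI or CAI (4d) before that indicator is itself absorbed into the preceding word (4e).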
/* Lojban Machine Grammar, Final Baseline

The Lojban Machine Grammar document is explicitly dedicated to the public domain by its author, The Logical Language Group, Inc.

grammar.300 */

/* The Lojban machine parsing algorithm is a multi-step process. The YACC machine grammar presented here is an amalgam of those steps, concatenated so as to allow YACC to verify the syntactic unambiguity of the grammar. YACC is used to generate a parser for a portion of the grammar, which is LALR1 (the type of grammar that YACC is designed to identify and process successfully), but most of the rest of the grammar must be parsed using some language-coded processing.

Step 1 - Lexing

From phonemes, stress, and pause, it is possible to resolve Lojban unambiguously into a stream of words. Any machine processing of speech will have to have some way to deal with ’non-Lojban’ failures of fluent speech, of course. The resolved words can be expressed as a text file using Lojban’s phonetic spelling rules.

The following steps assume that there is the possibility of non-Lojban text within the Lojban text (delimited appropriately). Such non-Lojban text may not be reducible from speech phonetically. However, step 2 allows the filtering of a phonetically transcribed text stream, to recognize such portions of non-Lojban text where properly delimited, without interference with the parsing algorithm.

Step 2 - Filtering

From start to end, perform the following filtering and lexing tasks using the given order of precedence in case of conflict:

a. If the Lojban word “zoi” (selma'o ZOI) is identified, take the following Lojban word (which should be end delimited with a pause for separation from the following non-Lojban text) as an opening delimiter. Treat all text following that delimiter, until that delimiter recurs after a pause, as grammatically a single token (labelled ’anything_699’ in this grammar).
There is no need for processing within this text except as necessary to find the closing delimiter.

b. If the Lojban word “zo” (selma'o ZO) is identified, treat the following Lojban word as a token labelled ’any_word_698’, instead of lexing it by its normal grammatical function.

c. If the Lojban word “lo'u” (selma'o LOhU) is identified, search for the closing delimiter “le'u” (selma'o LEhU), ignoring any such closing delimiters absorbed by the previous two steps. The text between the delimiters should be treated as the single token ’any_words_697’.

d. Categorize all remaining words into their Lojban selma'o category, including the various delimiters mentioned in the previous steps. In all steps after step 2, only the selma'o token type is significant for each word.

e. If the word “si” (selma'o SI) is identified, erase it and the previous word (or token, if the previous text has been condensed into a single token by one of the above rules).

f. If the word “sa” (selma'o SA) is identified, erase it and all preceding text as far back as necessary to make what follows attach to what precedes. (This rule is hard to formalize and may receive further definition later.)

g. If the word “su” (selma'o SU) is identified, erase it and all preceding text back to and including the first preceding token word which is in one of the selma'o: NIhO, LU, TUhE, and TO. However, if speaker identification is available, a SU shall only erase to the beginning of a speaker’s discourse, unless it occurs at the beginning of a speaker’s discourse. (Thus, if the speaker has said something, two adjacent uses of “su” are required to erase the entire conversation.)

Step 3 - Termination

If the text contains a FAhO, treat that as the end-of-text and ignore everything that follows it.

Step 4 - Absorption of Grammar-Free Tokens

In a new pass, perform the following absorptions (absorption means that the token is removed from the grammar for processing in following steps, and optionally reinserted, grouped with the absorbing token after parsing is completed).

a. Token sequences of the form any - (ZEI - any) ..., where there may be any number of ZEIs, are merged into a single token of selma'o BRIVLA.

b. Absorb all selma'o BAhE tokens into the following token. If they occur at the end of text, leave them alone (they are errors).

c. Absorb all selma'o BU tokens into the previous token. Relabel the previous token as selma'o BY.

d. If selma'o NAI occurs immediately following any of tokens UI or CAI, absorb the NAI into the previous token.

e. Absorb all members of selma'o DAhO, FUhO, FUhE, UI, Y, and CAI into the previous token. All of these null grammar tokens are permitted following any word of the grammar, without interfering with that word’s grammatical function, or causing any effect on the grammatical interpretation of any other token in the text. Indicators at the beginning of text are explicitly handled by the grammar.

Step 5 - Insertion of Lexer Lexemes

Lojban is not in itself LALR1. There are words whose grammatical function is determined by following tokens. As a result, parsing of the YACC grammar must take place in two steps. In the first step, certain strings of tokens with defined grammars are identified, and either

a. are replaced by a single specified ’lexer token’ for step 6, or

b. the lexer token is inserted in front of the token string to identify it uniquely.

The YACC grammar included herein is written to make YACC generation of a step 6 parser easy regardless of whether a. or b. is used. The strings of tokens to be labelled with lexer tokens are found in rule terminals labelled with numbers between 900 and 1099.
These rules are defined with the lexer tokens inserted, with the result that it can be verified that the language is LALR1 under option b. after steps 1 through 4 have been performed. Alternatively, if option a. is to be used, these rules are commented out, and the rule terminals labelled from 800 to 900 refer to the lexer tokens without the strings of defining tokens. Two sets of lexer tokens are defined in the token set so as to be compatible with either option.

In this step, the strings must be labelled with the appropriate lexer tokens. Order of inserting lexer tokens IS significant, since some shorter strings that would be marked with a lexer token may be found inside longer strings. If the tokens are inserted before or in place of the shorter strings, the longer strings cannot be identified.

If option a. is chosen, the following order of insertion works correctly (it is not the only possible order):

A, C, D, B, U, E, H, I, J, K, M, N, G, O, V, W, F, P, R, T, S, Y, L, Q.

This ensures that the longest rules will be processed first; a PA+MAI will not be seen as a PA with a dangling MAI at the end, for example.

Step 6 - YACC Parsing

YACC should now be able to parse the Lojban text in accordance with the rule terminals labelled from 1 to 899 under option 5a, or 1 to 1099 under option 5b. Comment out the rules beyond 900 if option 5a is used, and comment out the 700-series of lexer-tokens, while restoring the series of lexer tokens numbered from 900 up.
*/ %token A_501 /* eks; basic afterthought logical connectives */ %token BAI_502 /* modal operators */ %token BAhE_503 /* next word intensifier */ %token BE_504 /* sumti link to attach sumti to a selbri */ %token BEI_505 /* multiple sumti separator between BE, BEI */ %token BEhO_506 /* terminates BE/BEI specified descriptors */ %token BIhI_507 /* interval component of JOI */ %token BO_508 /* joins two units with shortest scope */ %token BRIVLA_509 /* any brivla */ %token BU_511 /* turns any word into a BY lerfu word */ %token BY_513 /* individual lerfu words */ %token CAhA_514 /* specifies actuality/potentiality of tense */ %token CAI_515 /* afterthought intensity marker */ %token CEI_516 /* pro-bridi assignment operator */ %token CEhE_517 /* afterthought term list connective */ %token CMENE_518 /* names; require consonant end, then pause no LA or DOI selma'o embedded, pause before if vowel initial and preceded by a vowel */ %token CO_519 /* tanru inversion */ %token COI_520 /* vocative marker permitted inside names; must always be followed by pause or DOI */ %token CU_521 /* separator between head sumti and selbri */ %token CUhE_522 /* tense/modal question */ %token DAhO_524 /* cancel anaphora/cataphora assignments */ %token DOI_525 /* vocative marker */ %token DOhU_526 /* terminator for DOI-marked vocatives */ %token FA_527 /* modifier head generic case tag */ %token FAhA_528 /* superdirections in space */ %token FAhO_529 /* normally elided ’done pause’ to indicate end of utterance string */ %token FEhE_530 /* space interval mod flag */ %token FEhU_531 /* ends bridi to modal conversion */ %token FIhO_532 /* marks bridi to modal conversion */ %token FOI_533 /* end compound lerfu */ %token FUhE_535 /* open long scope for indicator */ %token FUhO_536 /* close long scope for indicator */ %token GA_537 /* geks; forethought logical connectives */ %token GEhU_538 /* marker ending GOI relative clauses */ %token GI_539 /* forethought medial marker */ %token GIhA_541 /* 
logical connectives for bridi-tails */ %token GOI_542 /* attaches a sumti modifier to a sumti */ %token GOhA_543 /* pro-bridi */ %token GUhA_544 /* GEK for tanru units, corresponds to JEKs */ %token I_545 /* sentence link */ %token JA_546 /* jeks; logical connectives within tanru */ %token JAI_547 /* modal conversion flag */ %token JOI_548 /* non-logical connectives */ %token KEhE_550 /* right terminator for KE groups */ %token KE_551 /* left long scope marker */ %token KEI_552 /* right terminator, NU abstractions */ %token KI_554 /* multiple utterance scope for tenses */ %token KOhA_555 /* sumti anaphora */ %token KU_556 /* right terminator for descriptions, etc. */ %token KUhO_557 /* right terminator, NOI relative clauses */ %token LA_558 /* name descriptors */ %token LAU_559 /* lerfu prefixes */ %token LAhE_561 /* sumti qualifiers */ %token LE_562 /* sumti descriptors */ %token LEhU_565 /* possibly ungrammatical text right quote */ %token LI_566 /* convert number to sumti */ %token LIhU_567 /* grammatical text right quote */ %token LOhO_568 /* elidable terminator for LI */ %token LOhU_569 /* possibly ungrammatical text left quote */ %token LU_571 /* grammatical text left quote */ %token LUhU_573 /* LAhE close delimiter */ %token ME_574 /* converts a sumti into a tanru_unit */ %token MEhU_575 /* terminator for ME */ %token MOhI_577 /* motion tense marker */ %token NA_578 /* bridi negation */ %token NAI_581 /* attached to words to negate them */ %token NAhE_583 /* scalar negation */ %token NIhO_584 /* new paragraph; change of subject */ %token NOI_585 /* attaches a subordinate clause to a sumti */ %token NU_586 /* abstraction */ %token NUhI_587 /* marks the start of a termset */ %token NUhU_588 /* marks the middle and end of a termset */ %token PEhE_591 /* afterthought termset connective prefix */ %token PU_592 /* directions in time */ %token RAhO_593 /* flag for modified interpretation of GOhI */ %token ROI_594 /* converts number to extensional tense */ %token 
SA_595 /* metalinguistic eraser to the beginning of the current utterance */ %token SE_596 /* conversions */ %token SEI_597 /* metalinguistic bridi insert marker */ %token SEhU_598 /* metalinguistic bridi end marker */ %token SI_601 /* metalinguistic single word eraser */ %token SOI_602 /* reciprocal sumti marker */ %token SU_603 /* metalinguistic eraser of the entire text */ %token TAhE_604 /* tense interval properties */ %token TEI_605 /* start compound lerfu */ %token TO_606 /* left discursive parenthesis */ %token TOI_607 /* right discursive parenthesis */ %token TUhE_610 /* multiple utterance scope mark */ %token TUhU_611 /* multiple utterance end scope mark */ %token UI_612 /* attitudinals, observationals, discursives */ %token VA_613 /* distance in space-time */ %token VAU_614 /* end simple bridi or bridi-tail */ %token VEhA_615 /* space-time interval size */ %token VIhA_616 /* space-time dimensionality marker */ %token VUhO_617 /* glue between logically connected sumti and relative clauses */ %token XI_618 /* subscripting operator */ %token Y_619 /* hesitation */ %token ZAhO_621 /* event properties - inchoative, etc. 
*/ %token ZEhA_622 /* time interval size tense */ %token ZEI_623 /* lujvo glue */ %token ZI_624 /* time distance tense */ %token ZIhE_625 /* conjoins relative clauses */ %token ZO_626 /* single word metalinguistic quote marker */ %token ZOI_627 /* delimited quote marker */ %token ZOhU_628 /* prenex terminator (not elidable) */ %token BIhE_650 /* prefix for high-priority MEX operator */ %token BOI_651 /* number or lerfu-string terminator */ %token FUhA_655 /* reverse Polish flag */ %token GAhO_656 /* open/closed interval markers for BIhI */ %token JOhI_657 /* flags an array operand */ %token KUhE_658 /* MEX forethought delimiter */ %token MAI_661 /* change numbers to utterance ordinals */ %token MAhO_662 /* change MEX expressions to MEX operators */ %token MOI_663 /* change number to selbri */ %token MOhE_664 /* change sumti to operand, inverse of LI */ %token NAhU_665 /* change a selbri into an operator */ %token NIhE_666 /* change selbri to operand; inverse of MOI */ %token NUhA_667 /* change operator to selbri; inverse of MOhE */ %token PA_672 /* numbers and numeric punctuation */ %token PEhO_673 /* forethought (Polish) flag */ %token TEhU_675 /* closing gap for MEX constructs */ %token VEI_677 /* left MEX bracket */ %token VEhO_678 /* right MEX bracket */ %token VUhU_679 /* MEX operator */ %token any_words_697 /* a string of lexable Lojban words */ %token any_word_698 /* any single lexable Lojban words */ %token anything_699 /* a possibly unlexable phoneme string */ /* The following tokens are the actual lexer tokens. The _900 series tokens are duplicates that allow limited testing of lexer rules in the context of the total grammar. They are used in the actual parser, where the 900 series rules are found in the lexer. 
*/ %token lexer_A_701 /* flags a MAI utterance ordinal */ %token lexer_B_702 /* flags an EK unless EK_BO, EK_KE */ %token lexer_C_703 /* flags an EK_BO */ %token lexer_D_704 /* flags an EK_KE */ %token lexer_E_705 /* flags a JEK */ %token lexer_F_706 /* flags a JOIK */ %token lexer_G_707 /* flags a GEK */ %token lexer_H_708 /* flags a GUhEK */ %token lexer_I_709 /* flags a NAhE_BO */ %token lexer_J_710 /* flags a NA_KU */ %token lexer_K_711 /* flags an I_BO (option. JOIK/JEK lexer tags)*/ %token lexer_L_712 /* flags a PA, unless MAI (then lexer A) */ %token lexer_M_713 /* flags a GIhEK_BO */ %token lexer_N_714 /* flags a GIhEK_KE */ %token lexer_O_715 /* flags a modal operator BAI or compound */ %token lexer_P_716 /* flags a GIK */ %token lexer_Q_717 /* flags a lerfu_string unless MAI (then lexer_A)*/ %token lexer_R_718 /* flags a GIhEK, not BO or KE */ %token lexer_S_719 /* flags simple I */ %token lexer_T_720 /* flags I_JEK */ %token lexer_U_721 /* flags a JEK_BO */ %token lexer_V_722 /* flags a JOIK_BO */ %token lexer_W_723 /* flags a JOIK_KE */ /* %token lexer_X_724 /* null */ %token lexer_Y_725 /* flags a PA_MOI */ /* %token lexer_A_905 /* : lexer_A_701 utt_ordinal_root_906 */ /* %token lexer_B_910 /* : lexer_B_702 EK_root_911 */ /* %token lexer_C_915 /* : lexer_C_703 EK_root_911 BO_508 */ /* %token lexer_D_916 /* : lexer_D_704 EK_root_911 KE_551 */ /* %token lexer_E_925 /* : lexer_E_705 JEK_root_926 */ /* %token lexer_F_930 /* : lexer_F_706 JOIK_root_931 */ /* %token lexer_G_935 /* : lexer_G_707 GA_537 */ /* %token lexer_H_940 /* : lexer_H_708 GUhA_544 */ /* %token lexer_I_945 /* : lexer_I_709 NAhE_583 BO_508 */ /* %token lexer_J_950 /* : lexer_J_710 NA_578 KU_556 */ /* %token lexer_K_955 /* : lexer_K_711 I_432 BO_508 */ /* %token lexer_L_960 /* : lexer_L_712 number_root_961 */ /* %token lexer_M_965 /* : lexer_M_713 GIhEK_root_991 BO_508 */ /* %token lexer_N_966 /* : lexer_N_714 GIhEK_root_991 KE_551 */ /* %token lexer_O_970 /* : lexer_O_715 
simple_tense_modal_972 */ /* %token lexer_P_980 /* : lexer_P_716 GIK_root_981 */ /* %token lexer_Q_985 /* : lexer_Q_717 lerfu_string_root_986 */ /* %token lexer_R_990 /* : lexer_R_718 GIhEK_root_991 */ /* %token lexer_S_995 /* : lexer_S_719 I_545 */ /* %token lexer_T_1000 /* : lexer_T_720 I_545 simple_JOIK_JEK_957 */ /* %token lexer_U_1005 /* : lexer_U_721 JEK_root_926 BO_508 */ /* %token lexer_V_1010 /* : lexer_V_722 JOIK_root_931 BO_508 */ /* %token lexer_W_1015 /* : lexer_W_723 JOIK_root_931 KE_551 */ /* %token lexer_X_1020 /* null */ /* %token lexer_Y_1025 /* : lexer_Y_725 number_root_961 MOI_663 */ %start text_0 %% text_0 : text_A_1 | indicators_411 text_A_1 | free_modifier_32 text_A_1 | cmene_404 text_A_1 | indicators_411 free_modifier_32 text_A_1 | NAI_581 text_0 ; text_A_1 : JOIK_JEK_422 text_B_2 /* incomplete JOIK_JEK without preceding I */ /* compare note on paragraph_10 */ | text_B_2 ; text_B_2 : I_819 text_B_2 | I_JEK_820 text_B_2 | I_BO_811 text_B_2 | para_mark_410 text_C_3 | text_C_3 ; text_C_3 : paragraphs_4 /* Only indicators which follow certain selma'o: cmene, TOI_607, LU_571, and the lexer_K and lexer_S I_roots and compounds, and at the start of text(_0), will survive the lexer; all other valid ones will be absorbed. The only strings for which indicators generate a potential ambiguity are those which contain NAI. An indicator cannot be inserted in between a token and its negating NAI, else you can’t tell whether it is the indicator or the original token being negated. */ | /* empty */ /* An empty text is legal; formerly this was handled by the explicit appearance of FAhO_529, but this is now absorbed by the preparser. 
*/ ; paragraphs_4 : paragraph_10 | paragraph_10 para_mark_410 paragraphs_4 ; paragraph_10 : statement_11 | fragment_20 | paragraph_10 I_819 statement_11 | paragraph_10 I_819 fragment_20 | paragraph_10 I_819 /* this last fixes an erroneous start to a sentence, and permits incomplete JOIK_JEK after I, as well in answer to questions on those connectives */ ; statement_11 : statement_A_12 | prenex_30 statement_11 ; statement_A_12 : statement_B_13 | statement_A_12 I_JEK_820 statement_B_13 | statement_A_12 I_JEK_820 ; statement_B_13 : statement_C_14 | statement_C_14 I_BO_811 statement_B_13 | statement_C_14 I_BO_811 ; statement_C_14 : sentence_40 | TUhE_447 text_B_2 TUhU_gap_454 | tag_491 TUhE_447 text_B_2 TUhU_gap_454 ; fragment_20 : EK_802 | NA_445 | GIhEK_818 | quantifier_300 | terms_80 VAU_gap_456 /* answer to ma */ /* mod_head_490 requires both gap_450 and VAU_gap_456 but needs no extra rule to accomplish this */ | relative_clauses_121 | links_161 | linkargs_160 | prenex_30 ; prenex_30 : terms_80 ZOhU_492 ; free_modifier_32 : free_modifier_A_33 | free_modifier_A_33 free_modifier_32 ; free_modifier_A_33 : vocative_35 | parenthetical_36 | discursive_bridi_34 | subscript_486 | utterance_ordinal_801 ; discursive_bridi_34 : SEI_440 selbri_130 SEhU_gap_459 | SOI_498 sumti_90 SEhU_gap_459 | SOI_498 sumti_90 sumti_90 SEhU_gap_459 | SEI_440 terms_80 front_gap_451 selbri_130 SEhU_gap_459 | SEI_440 terms_80 selbri_130 SEhU_gap_459 ; vocative_35 : DOI_415 selbri_130 DOhU_gap_457 | DOI_415 selbri_130 relative_clauses_121 DOhU_gap_457 | DOI_415 relative_clauses_121 selbri_130 DOhU_gap_457 | DOI_415 relative_clauses_121 selbri_130 relative_clauses_121 DOhU_gap_457 | DOI_415 cmene_404 DOhU_gap_457 | DOI_415 cmene_404 relative_clauses_121 DOhU_gap_457 | DOI_415 relative_clauses_121 cmene_404 DOhU_gap_457 | DOI_415 relative_clauses_121 cmene_404 relative_clauses_121 DOhU_gap_457 | DOI_415 sumti_90 DOhU_gap_457 | DOI_415 DOhU_gap_457 ; parenthetical_36 : TO_606 text_0 TOI_gap_468 ; 
sentence_40 : bridi_tail_50 /* bare observative or mo answer */ | terms_80 front_gap_451 bridi_tail_50 | terms_80 bridi_tail_50 ; subsentence_41 : sentence_40 | prenex_30 subsentence_41 ; bridi_tail_50 : bridi_tail_A_51 | bridi_tail_A_51 GIhEK_KE_814 bridi_tail_50 KEhE_gap_466 tail_terms_71 ; bridi_tail_A_51 : bridi_tail_B_52 | bridi_tail_A_51 GIhEK_818 bridi_tail_B_52 tail_terms_71 ; bridi_tail_B_52 : bridi_tail_C_53 | bridi_tail_C_53 GIhEK_BO_813 bridi_tail_B_52 tail_terms_71 ; bridi_tail_C_53 : gek_sentence_54 | selbri_130 tail_terms_71 ; gek_sentence_54 : GEK_807 subsentence_41 GIK_816 subsentence_41 tail_terms_71 | tag_491 KE_493 gek_sentence_54 KEhE_gap_466 | NA_445 gek_sentence_54 ; tail_terms_71 : terms_80 VAU_gap_456 | VAU_gap_456 ; terms_80 : terms_A_81 | terms_80 terms_A_81 ; terms_A_81 : terms_B_82 | terms_A_81 PEhE_494 JOIK_JEK_422 terms_B_82 ; terms_B_82 : term_83 | terms_B_82 CEhE_495 term_83 ; term_83 : sumti_90 | modifier_84 | term_set_85 | NA_KU_810 ; modifier_84 : mod_head_490 gap_450 | mod_head_490 sumti_90 ; term_set_85 : NUhI_496 terms_80 NUhU_gap_460 | NUhI_496 GEK_807 terms_80 NUhU_gap_460 GIK_816 terms_80 NUhU_gap_460 ; sumti_90 : sumti_A_91 | sumti_A_91 VUhO_497 relative_clauses_121 ; sumti_A_91 : sumti_B_92 | sumti_B_92 EK_KE_804 sumti_90 KEhE_gap_466 | sumti_B_92 JOIK_KE_823 sumti_90 KEhE_gap_466 ; sumti_B_92 : sumti_C_93 | sumti_B_92 JOIK_EK_421 sumti_C_93 ; sumti_C_93 : sumti_D_94 | sumti_D_94 EK_BO_803 sumti_C_93 | sumti_D_94 JOIK_BO_822 sumti_C_93 ; sumti_D_94 : sumti_E_95 | GEK_807 sumti_90 GIK_816 sumti_D_94 ; sumti_E_95 : sumti_F_96 | sumti_F_96 relative_clauses_121 /* indefinite sumti */ | quantifier_300 selbri_130 gap_450 | quantifier_300 selbri_130 gap_450 relative_clauses_121 ; sumti_F_96 : sumti_G_97 /* outer-quantified sumti */ | quantifier_300 sumti_G_97 ; sumti_G_97 : qualifier_483 sumti_90 LUhU_gap_463 | qualifier_483 relative_clauses_121 sumti_90 LUhU_gap_463 /*sumti grouping, set/mass/individual conversion; also sumti 
scalar negation */ | anaphora_400 | LA_499 cmene_404 | LA_499 relative_clauses_121 cmene_404 | LI_489 MEX_310 LOhO_gap_472 | description_110 | quote_arg_432 ; description_110 : LA_499 sumti_tail_111 gap_450 | LE_488 sumti_tail_111 gap_450 ; sumti_tail_111 : sumti_tail_A_112 /* inner-quantified sumti relative clause */ | relative_clauses_121 sumti_tail_A_112 /* pseudo-possessive (an abbreviated inner restriction); note that sumti cannot be quantified */ | sumti_G_97 sumti_tail_A_112 /* pseudo-possessive with outer restriction */ | sumti_G_97 relative_clauses_121 sumti_tail_A_112 ; sumti_tail_A_112 : selbri_130 | selbri_130 relative_clauses_121 /* explicit inner quantifier */ | quantifier_300 selbri_130 /* quantifier both internal to a description, and external to a sumti thereby made specific */ | quantifier_300 selbri_130 relative_clauses_121 | quantifier_300 sumti_90 ; relative_clauses_121 : relative_clause_122 | relative_clauses_121 ZIhE_487 relative_clause_122 ; relative_clause_122 : GOI_485 term_83 GEhU_gap_464 | NOI_484 subsentence_41 KUhO_gap_469 ; selbri_130 : tag_491 selbri_A_131 | selbri_A_131 ; selbri_A_131 : selbri_B_132 | NA_445 selbri_130 ; selbri_B_132 : selbri_C_133 | selbri_C_133 CO_443 selbri_B_132 ; selbri_C_133 : selbri_D_134 | selbri_C_133 selbri_D_134 ; selbri_D_134 : selbri_E_135 | selbri_D_134 JOIK_JEK_422 selbri_E_135 | selbri_D_134 JOIK_KE_823 selbri_C_133 KEhE_gap_466 ; selbri_E_135 : selbri_F_136 | selbri_F_136 JEK_BO_821 selbri_E_135 | selbri_F_136 JOIK_BO_822 selbri_E_135 ; selbri_F_136 : tanru_unit_150 | tanru_unit_150 BO_479 selbri_F_136 | GUhEK_selbri_137 | NAhE_482 GUhEK_selbri_137 ; GUhEK_selbri_137 : GUhEK_808 selbri_130 GIK_816 selbri_F_136 ; tanru_unit_150 : tanru_unit_A_151 | tanru_unit_150 CEI_444 tanru_unit_A_151 ; tanru_unit_A_151 : tanru_unit_B_152 | tanru_unit_B_152 linkargs_160 ; tanru_unit_B_152 : bridi_valsi_407 | KE_493 selbri_C_133 KEhE_gap_466 | SE_480 tanru_unit_B_152 | JAI_478 tag_491 tanru_unit_B_152 | JAI_478 
tanru_unit_B_152 | ME_477 sumti_90 MEhU_gap_465 | ME_477 sumti_90 MEhU_gap_465 MOI_476 | NUhA_475 MEX_operator_374 | NAhE_482 tanru_unit_B_152 | NU_425 subsentence_41 KEI_gap_453 ; linkargs_160 : BE_446 term_83 BEhO_gap_467 | BE_446 term_83 links_161 BEhO_gap_467 ; links_161 : BEI_442 term_83 | BEI_442 term_83 links_161 ; /* Main entry point for MEX; everything but a number must be in parens. */ quantifier_300 : number_812 BOI_gap_461 | left_bracket_470 MEX_310 right_bracket_gap_471 ; /* Entry point for MEX used after LI; no parens needed, but LI now has an elidable terminator. (This allows us to express the difference between “the expression a + b” and “the expression (a + b)” ) */ /* This rule supports left-grouping infix expressions and reverse Polish expressions. To handle infix monadic, use a null operand; to handle infix with more than two operands (whatever that means) use an extra operator or an array operand. */ MEX_310 : MEX_A_311 | MEX_310 operator_370 MEX_A_311 | FUhA_441 rp_expression_330 ; /* Support for right-grouping (short scope) infix expressions with BIhE. */ MEX_A_311 : MEX_B_312 | MEX_B_312 BIhE_439 operator_370 MEX_A_311 ; /* Support for forethought (Polish) expressions. These begin with a forethought flag, then the operator and then the argument(s). */ MEX_B_312 : operand_381 | operator_370 MEX_C_313 MEX_gap_452 | PEhO_438 operator_370 MEX_C_313 MEX_gap_452 ; MEX_C_313 : MEX_B_312 | MEX_C_313 MEX_B_312 ; /* Reverse Polish expressions always have exactly two operands. To handle one operand, use a null operand; to handle more than two operands, use a null operator. */ rp_expression_330 : rp_operand_332 rp_operand_332 operator_370 ; rp_operand_332 : operand_381 | rp_expression_330 ; /* Operators may be joined by logical connectives. 
*/ operator_370 : operator_A_371 | operator_370 JOIK_JEK_422 operator_A_371 | operator_370 JOIK_KE_823 operator_370 KEhE_gap_466 ; operator_A_371 : operator_B_372 | GUhEK_808 operator_A_371 GIK_816 operator_B_372 | operator_B_372 JOIK_BO_822 operator_A_371 | operator_B_372 JEK_BO_821 operator_A_371 ; operator_B_372 : MEX_operator_374 | KE_493 operator_370 KEhE_gap_466 ; MEX_operator_374 : VUhU_679 | VUhU_679 free_modifier_32 | SE_480 MEX_operator_374 /* changes argument order */ | NAhE_482 MEX_operator_374 /* scalar negation */ | MAhO_430 MEX_310 TEhU_gap_473 | NAhU_429 selbri_130 TEhU_gap_473 ; operand_381 : operand_A_382 | operand_A_382 EK_KE_804 operand_381 KEhE_gap_466 | operand_A_382 JOIK_KE_823 operand_381 KEhE_gap_466 ; operand_A_382 : operand_B_383 | operand_A_382 JOIK_EK_421 operand_B_383 ; operand_B_383 : operand_C_385 | operand_C_385 EK_BO_803 operand_B_383 | operand_C_385 JOIK_BO_822 operand_B_383 ; operand_C_385 : quantifier_300 | lerfu_string_817 BOI_gap_461 /* lerfu string as operand - classic math variable */ | NIhE_428 selbri_130 TEhU_gap_473 /* quantifies a bridi - inverse of -MOI */ | MOhE_427 sumti_90 TEhU_gap_473 /* quantifies a sumti - inverse of LI */ | JOhI_431 MEX_C_313 TEhU_gap_473 | GEK_807 operand_381 GIK_816 operand_C_385 | qualifier_483 operand_381 LUhU_gap_463 ; /* _400 series constructs are mostly specific strings, some of which may also be used by the lexer; the lexer should not use any reference to terminals numbered less than _400, as they have grammars composed on non-deterministic strings of selma'o. Some above _400 also are this way, so care should be taken; this is especially true for those that reference free_modifier_32. 
*/ anaphora_400 : KOhA_555 | KOhA_555 free_modifier_32 | lerfu_string_817 BOI_gap_461 ; cmene_404 : cmene_A_405 | cmene_A_405 free_modifier_32 ; cmene_A_405 : CMENE_518 /* pause */ | cmene_A_405 CMENE_518 /* pause*/ /* multiple CMENE are identified morphologically (by the lexer) -- separated by consonant & pause */ ; bridi_valsi_407 : bridi_valsi_A_408 | bridi_valsi_A_408 free_modifier_32 ; bridi_valsi_A_408 : BRIVLA_509 | PA_MOI_824 | GOhA_543 | GOhA_543 RAhO_593 ; para_mark_410 : NIhO_584 | NIhO_584 free_modifier_32 | NIhO_584 para_mark_410 ; indicators_411 : indicators_A_412 | FUhE_535 indicators_A_412 ; indicators_A_412 : indicator_413 | indicators_A_412 indicator_413 ; indicator_413 : UI_612 | CAI_515 | UI_612 NAI_581 | CAI_515 NAI_581 | Y_619 | DAhO_524 | FUhO_536 ; DOI_415 : DOI_525 | COI_416 | COI_416 DOI_525 ; COI_416 : COI_A_417 | COI_416 COI_A_417 ; COI_A_417 : COI_520 | COI_520 NAI_581 ; JOIK_EK_421 : EK_802 | JOIK_806 | JOIK_806 free_modifier_32 ; JOIK_JEK_422 : JOIK_806 | JOIK_806 free_modifier_32 | JEK_805 | JEK_805 free_modifier_32 ; XI_424 : XI_618 | XI_618 free_modifier_32 ; NU_425 : NU_A_426 | NU_425 JOIK_JEK_422 NU_A_426 ; NU_A_426 : NU_586 | NU_586 NAI_581 | NU_586 free_modifier_32 | NU_586 NAI_581 free_modifier_32 ; MOhE_427 : MOhE_664 | MOhE_664 free_modifier_32 ; NIhE_428 : NIhE_666 | NIhE_666 free_modifier_32 ; NAhU_429 : NAhU_665 | NAhU_665 free_modifier_32 ; MAhO_430 : MAhO_662 | MAhO_662 free_modifier_32 ; JOhI_431 : JOhI_657 | JOhI_657 free_modifier_32 ; quote_arg_432 : quote_arg_A_433 | quote_arg_A_433 free_modifier_32 ; quote_arg_A_433 : ZOI_quote_434 | ZO_quote_435 | LOhU_quote_436 | LU_571 text_0 LIhU_gap_448 ; /* The quoted material in the following three terminals must be identified by the lexer, but no additional lexer processing is needed. 
*/

ZOI_quote_434 : ZOI_627 any_word_698 /*pause*/ anything_699 /*pause*/ any_word_698
              ;
/* ’pause’ is morphemic, represented by ’.’ The lexer assembles anything_699 */

ZO_quote_435 : ZO_626 any_word_698
             ;
/* ’word’ may not be a compound; but it can be any valid Lojban selma'o value, including ZO, ZOI, SI, SA, SU. The preparser will not lex the word per its normal selma'o. */

LOhU_quote_436 : LOhU_569 any_words_697 LEhU_565
               ;
/* ’words’ may be any Lojban words, with no claim of grammaticality; the preparser will not lex the individual words per their normal selma'o; used to quote ungrammatical Lojban, equivalent to the * or ? writing convention for such text. */

/* The preparser needs one bit of sophistication for this rule. A quoted string should be able to contain other quoted strings - this is only a problem for a LOhU quote itself, since the LEhU closing this quote would otherwise close the outer quotes, which is incorrect. For this purpose, we will cheat on the use of ZO in such a quote (since this is ungrammatical text, it is a sign ignored by the parser). Use ZO to mark any nested quotation LOhU. The preparser then will absorb it by the ZO rule, before testing for LOhU. This is obviously not the standard usage for ZO, which would otherwise cause the result to be a sumti. But, since the result will be part of an unparsed string anyway, it doesn’t matter. */

/* It may be seen that any of the ZO/ZOI/LOhU trio of quotation markers may contain the powerful metalinguistic erasers. Since these quotations are not parsed internally, these operators are ignored within the quote. To erase a ZO, then, two SI’s are needed after giving a quoted word of any type. ZOI takes four SI’s, with the ENTIRE BODY OF THE QUOTE treated as a single ’word’ since it is one selma'o. Thus one for the quote body, two for the single word delimiters, and one for the ZOI. In LOhU, the entire body is treated as a single word, so three SI’s can erase it.
*/ /* All rule terminator names with ’gap’ in them are potentially elidable, where such elision does not cause an ambiguity. This is implemented through use of the YACC ’error’ token, which effectively recovers from an elision. */ FIhO_437 : FIhO_532 | FIhO_532 free_modifier_32 ; PEhO_438 : PEhO_673 | PEhO_673 free_modifier_32 ; BIhE_439 : BIhE_650 | BIhE_650 free_modifier_32 ; SEI_440 : SEI_597 | SEI_597 free_modifier_32 ; FUhA_441 : FUhA_655 | FUhA_655 free_modifier_32 ; BEI_442 : BEI_505 | BEI_505 free_modifier_32 ; CO_443 : CO_519 | CO_519 free_modifier_32 ; CEI_444 : CEI_516 | CEI_516 free_modifier_32 ; NA_445 : NA_578 | NA_578 free_modifier_32 ; BE_446 : BE_504 | BE_504 free_modifier_32 ; TUhE_447 : TUhE_610 | TUhE_610 free_modifier_32 ; LIhU_gap_448 : LIhU_567 | error ; gap_450 : KU_556 | KU_556 free_modifier_32 | error ; front_gap_451 : CU_521 | CU_521 free_modifier_32 ; MEX_gap_452 : KUhE_658 | KUhE_658 free_modifier_32 | error ; KEI_gap_453 : KEI_552 | KEI_552 free_modifier_32 | error ; TUhU_gap_454 : TUhU_611 | TUhU_611 free_modifier_32 | error ; VAU_gap_456 : VAU_614 | VAU_614 free_modifier_32 | error ; /* redundant to attach a free modifier on the following */ DOhU_gap_457 : DOhU_526 | error ; FEhU_gap_458 : FEhU_531 | FEhU_531 free_modifier_32 | error ; SEhU_gap_459 : SEhU_598 | error /* a free modifier on a discursive should be somewhere within the discursive. 
See SEI_440 */ ;
NUhU_gap_460 : NUhU_588 | NUhU_588 free_modifier_32 | error ;
BOI_gap_461 : BOI_651 | BOI_651 free_modifier_32 | error ;
sub_gap_462 : BOI_651 | error ;
LUhU_gap_463 : LUhU_573 | LUhU_573 free_modifier_32 | error ;
GEhU_gap_464 : GEhU_538 | GEhU_538 free_modifier_32 | error ;
MEhU_gap_465 : MEhU_575 | MEhU_575 free_modifier_32 | error ;
KEhE_gap_466 : KEhE_550 | KEhE_550 free_modifier_32 | error ;
BEhO_gap_467 : BEhO_506 | BEhO_506 free_modifier_32 | error ;
TOI_gap_468 : TOI_607 | error ;
KUhO_gap_469 : KUhO_557 | KUhO_557 free_modifier_32 | error ;
left_bracket_470 : VEI_677 | VEI_677 free_modifier_32 ;
right_bracket_gap_471 : VEhO_678 | VEhO_678 free_modifier_32 | error ;
LOhO_gap_472 : LOhO_568 | LOhO_568 free_modifier_32 | error ;
TEhU_gap_473 : TEhU_675 | TEhU_675 free_modifier_32 | error ;
right_br_no_free_474 : VEhO_678 | error ;
NUhA_475 : NUhA_667 | NUhA_667 free_modifier_32 ;
MOI_476 : MOI_663 | MOI_663 free_modifier_32 ;
ME_477 : ME_574 | ME_574 free_modifier_32 ;
JAI_478 : JAI_547 | JAI_547 free_modifier_32 ;
BO_479 : BO_508 | BO_508 free_modifier_32 ;
SE_480 : SE_596 | SE_596 free_modifier_32 ;
FA_481 : FA_527 | FA_527 free_modifier_32 ;
NAhE_482 : NAhE_583 | NAhE_583 free_modifier_32 ;
qualifier_483 : LAhE_561 | LAhE_561 free_modifier_32 | NAhE_BO_809 ;
NOI_484 : NOI_585 | NOI_585 free_modifier_32 ;
GOI_485 : GOI_542 | GOI_542 free_modifier_32 ;
subscript_486 : XI_424 number_812 sub_gap_462 | XI_424 left_bracket_470 MEX_310 right_br_no_free_474 | XI_424 lerfu_string_817 sub_gap_462 ;
ZIhE_487 : ZIhE_625 | ZIhE_625 free_modifier_32 ;
LE_488 : LE_562 | LE_562 free_modifier_32 ;
LI_489 : LI_566 | LI_566 free_modifier_32 ;
mod_head_490 : tag_491 | FA_481 ;
tag_491 : tense_modal_815 | tag_491 JOIK_JEK_422 tense_modal_815 ;
ZOhU_492 : ZOhU_628 | ZOhU_628 free_modifier_32 ;
KE_493 : KE_551 | KE_551 free_modifier_32 ;
PEhE_494 : PEhE_591 | PEhE_591 free_modifier_32 ;
CEhE_495 : CEhE_517 | CEhE_517 free_modifier_32 ;
NUhI_496 : NUhI_587 |
NUhI_587 free_modifier_32 ;
VUhO_497 : VUhO_617 | VUhO_617 free_modifier_32 ;
SOI_498 : SOI_602 | SOI_602 free_modifier_32 ;
LA_499 : LA_558 | LA_558 free_modifier_32 ;
utterance_ordinal_801 : lexer_A_905 ;
EK_802 : lexer_B_910 | lexer_B_910 free_modifier_32 ;
EK_BO_803 : lexer_C_915 | lexer_C_915 free_modifier_32 ;
EK_KE_804 : lexer_D_916 | lexer_D_916 free_modifier_32 ;
JEK_805 : lexer_E_925 ;
JOIK_806 : lexer_F_930 ;
GEK_807 : lexer_G_935 | lexer_G_935 free_modifier_32 ;
GUhEK_808 : lexer_H_940 | lexer_H_940 free_modifier_32 ;
NAhE_BO_809 : lexer_I_945 | lexer_I_945 free_modifier_32 ;
NA_KU_810 : lexer_J_950 | lexer_J_950 free_modifier_32 ;
I_BO_811 : lexer_K_955 | lexer_K_955 free_modifier_32 ;
number_812 : lexer_L_960 ;
GIhEK_BO_813 : lexer_M_965 | lexer_M_965 free_modifier_32 ;
GIhEK_KE_814 : lexer_N_966 | lexer_N_966 free_modifier_32 ;
tense_modal_815 : lexer_O_970 | lexer_O_970 free_modifier_32 | FIhO_437 selbri_130 FEhU_gap_458 ;
GIK_816 : lexer_P_980 | lexer_P_980 free_modifier_32 ;
lerfu_string_817 : lexer_Q_985 ;
GIhEK_818 : lexer_R_990 | lexer_R_990 free_modifier_32 ;
I_819 : lexer_S_995 | lexer_S_995 free_modifier_32 ;
I_JEK_820 : lexer_T_1000 | lexer_T_1000 free_modifier_32 ;
JEK_BO_821 : lexer_U_1005 | lexer_U_1005 free_modifier_32 ;
JOIK_BO_822 : lexer_V_1010 | lexer_V_1010 free_modifier_32 ;
JOIK_KE_823 : lexer_W_1015 | lexer_W_1015 free_modifier_32 ;
PA_MOI_824 : lexer_Y_1025 ;
/* The following rules are used only in lexer processing. They have been tested for ambiguity at various levels in the YACC grammar, but are in the recursive descent lexer in the current parser. The lexer inserts the lexer tokens before the processed strings, but leaves the original tokens.
*/
lexer_A_905 : lexer_A_701 utt_ordinal_root_906 ;
utt_ordinal_root_906 : lerfu_string_root_986 MAI_661 | number_root_961 MAI_661 ;
lexer_B_910 : lexer_B_702 EK_root_911 ;
EK_root_911 : A_501 | SE_596 A_501 | NA_578 A_501 | A_501 NAI_581 | SE_596 A_501 NAI_581 | NA_578 A_501 NAI_581 | NA_578 SE_596 A_501 | NA_578 SE_596 A_501 NAI_581 ;
lexer_C_915 : lexer_C_703 EK_root_911 BO_508 | lexer_C_703 EK_root_911 simple_tag_971 BO_508 ;
lexer_D_916 : lexer_D_704 EK_root_911 KE_551 | lexer_D_704 EK_root_911 simple_tag_971 KE_551 ;
lexer_E_925 : lexer_E_705 JEK_root_926 ;
JEK_root_926 : JA_546 | JA_546 NAI_581 | NA_578 JA_546 | NA_578 JA_546 NAI_581 | SE_596 JA_546 | SE_596 JA_546 NAI_581 | NA_578 SE_596 JA_546 | NA_578 SE_596 JA_546 NAI_581 ;
lexer_F_930 : lexer_F_706 JOIK_root_931 ;
JOIK_root_931 : JOI_548 | JOI_548 NAI_581 | SE_596 JOI_548 | SE_596 JOI_548 NAI_581 | interval_932 | GAhO_656 interval_932 GAhO_656 ;
interval_932 : BIhI_507 | BIhI_507 NAI_581 | SE_596 BIhI_507 | SE_596 BIhI_507 NAI_581 ;
lexer_G_935 : lexer_G_707 GA_537 | lexer_G_707 SE_596 GA_537 | lexer_G_707 GA_537 NAI_581 | lexer_G_707 SE_596 GA_537 NAI_581 | lexer_G_707 simple_tag_971 GIK_root_981 | lexer_G_707 JOIK_root_931 GI_539 ;
lexer_H_940 : lexer_H_708 GUhA_544 | lexer_H_708 SE_596 GUhA_544 | lexer_H_708 GUhA_544 NAI_581 | lexer_H_708 SE_596 GUhA_544 NAI_581 ;
lexer_I_945 : lexer_I_709 NAhE_583 BO_508 ;
lexer_J_950 : lexer_J_710 NA_578 KU_556 ;
lexer_K_955 : lexer_K_711 I_root_956 BO_508 | lexer_K_711 I_root_956 simple_tag_971 BO_508 ;
I_root_956 : I_545 | I_545 simple_JOIK_JEK_957 ;
simple_JOIK_JEK_957 : JOIK_806 | JEK_805 ;
/* no freemod in this version; cf. JOIK_JEK_422 */
/* this reference to a version of JOIK and JEK which already have the lexer tokens attached prevents shift/reduce errors. The problem is resolved in a hard-coded parser implementation which builds lexer_K, before lexer_S, before lexer_E and lexer_F.
*/
lexer_L_960 : lexer_L_712 number_root_961 ;
number_root_961 : PA_672 | number_root_961 PA_672 | number_root_961 lerfu_word_987 ;
lexer_M_965 : lexer_M_713 GIhEK_root_991 BO_508 | lexer_M_713 GIhEK_root_991 simple_tag_971 BO_508 ;
lexer_N_966 : lexer_N_714 GIhEK_root_991 KE_551 | lexer_N_714 GIhEK_root_991 simple_tag_971 KE_551 ;
lexer_O_970 : lexer_O_715 simple_tense_modal_972 ;
/* the following rule is a lexer version of non-terminal_815 for compounding PU/modals; it disallows the lexer picking out FIhO clauses, which would require it to have knowledge of the main parser grammar */
simple_tag_971 : simple_tense_modal_972 | simple_tag_971 simple_JOIK_JEK_957 simple_tense_modal_972 ;
simple_tense_modal_972 : simple_tense_modal_A_973 | NAhE_583 simple_tense_modal_A_973 | KI_554 | CUhE_522 ;
simple_tense_modal_A_973 : modal_974 | modal_974 KI_554 | tense_A_977 ;
modal_974 : modal_A_975 | modal_A_975 NAI_581 ;
modal_A_975 : BAI_502 | SE_596 BAI_502 ;
tense_A_977 : tense_B_978 | tense_B_978 KI_554 ;
tense_B_978 : tense_C_979 | CAhA_514 | tense_C_979 CAhA_514 ;
/* specifies actuality/potentiality of the bridi */
/* puca'a = actually was */
/* baca'a = actually will be */
/* bapu'i = can and will have */
/* banu'o = can, but won’t have yet */
/* canu'ojebapu'i = can, hasn’t yet, but will */
tense_C_979 : time_1030
  /* time-only */
  /* space defaults to time-space reference space */
  | space_1040
  /* can include time if specified with VIhA; otherwise time defaults to the time-space reference time */
  | time_1030 space_1040
  /* time and space - If space_1040 is marked with VIhA for space-time the tense may be self-contradictory */
  /* interval prop before space_time is for time distribution */
  | space_1040 time_1030 ;
lexer_P_980 : lexer_P_716 GIK_root_981 ;
GIK_root_981 : GI_539 | GI_539 NAI_581 ;
lexer_Q_985 : lexer_Q_717 lerfu_string_root_986 ;
lerfu_string_root_986 : lerfu_word_987 | lerfu_string_root_986 lerfu_word_987 | lerfu_string_root_986 PA_672 ;
lerfu_word_987 :
BY_513 | LAU_559 lerfu_word_987 | TEI_605 lerfu_string_root_986 FOI_533 ;
lexer_R_990 : lexer_R_718 GIhEK_root_991 ;
GIhEK_root_991 : GIhA_541 | SE_596 GIhA_541 | NA_578 GIhA_541 | GIhA_541 NAI_581 | SE_596 GIhA_541 NAI_581 | NA_578 GIhA_541 NAI_581 | NA_578 SE_596 GIhA_541 | NA_578 SE_596 GIhA_541 NAI_581 ;
lexer_S_995 : lexer_S_719 I_545 ;
lexer_T_1000 : lexer_T_720 I_545 simple_JOIK_JEK_957 ;
lexer_U_1005 : lexer_U_721 JEK_root_926 BO_508 | lexer_U_721 JEK_root_926 simple_tag_971 BO_508 ;
lexer_V_1010 : lexer_V_722 JOIK_root_931 BO_508 | lexer_V_722 JOIK_root_931 simple_tag_971 BO_508 ;
lexer_W_1015 : lexer_W_723 JOIK_root_931 KE_551 | lexer_W_723 JOIK_root_931 simple_tag_971 KE_551 ;
lexer_Y_1025 : lexer_Y_725 number_root_961 MOI_663 | lexer_Y_725 lerfu_string_root_986 MOI_663 ;
time_1030 : ZI_624 | ZI_624 time_A_1031 | time_A_1031 ;
time_A_1031 : time_B_1032 | time_interval_1034 | time_B_1032 time_interval_1034 ;
time_B_1032 : time_offset_1033 | time_B_1032 time_offset_1033 ;
time_offset_1033 : time_direction_1035 | time_direction_1035 ZI_624 ;
time_interval_1034 : ZEhA_622 | ZEhA_622 time_direction_1035 | time_int_props_1036 | ZEhA_622 time_int_props_1036 | ZEhA_622 time_direction_1035 time_int_props_1036 ;
time_direction_1035 : PU_592 | PU_592 NAI_581 ;
time_int_props_1036 : interval_property_1051 | time_int_props_1036 interval_property_1051 ;
space_1040 : space_A_1042 | space_motion_1041 | space_A_1042 space_motion_1041 ;
space_motion_1041 : MOhI_577 space_offset_1045 ;
space_A_1042 : VA_613 | VA_613 space_B_1043 | space_B_1043 ;
space_B_1043 : space_C_1044 | space_intval_1046 | space_C_1044 space_intval_1046 ;
space_C_1044 : space_offset_1045 | space_C_1044 space_offset_1045 ;
space_offset_1045 : space_direction_1048 | space_direction_1048 VA_613 ;
space_intval_1046 : space_intval_A_1047 | space_intval_A_1047 space_direction_1048 | space_int_props_1049 | space_intval_A_1047 space_int_props_1049 | space_intval_A_1047 space_direction_1048
space_int_props_1049 ;
space_intval_A_1047 : VEhA_615 | VIhA_616 | VEhA_615 VIhA_616 ;
space_direction_1048 : FAhA_528 | FAhA_528 NAI_581 ;
space_int_props_1049 : space_int_props_A_1050 | space_int_props_1049 space_int_props_A_1050 ;
space_int_props_A_1050 : FEhE_530 interval_property_1051 ;
/* This terminal gives an interval size in space-time (VEhA), and possibly a dimensionality of the interval. The dimensionality may also be used with the interval size left unspecified. When this terminal is used for the spacetime origin, then barring any overriding VIhA, a VIhA here defines the dimensionality of the space-time being discussed. */
interval_property_1051 : number_root_961 ROI_594 | number_root_961 ROI_594 NAI_581 | TAhE_604 | TAhE_604 NAI_581 | ZAhO_621 | ZAhO_621 NAI_581 ;
/* extensional/intensional interval parameters */
/* These may be appended to any defined interval, or may stand in place of either time or space tenses. If no other tense is present, this terminal stands for the time-space interval parameter with an unspecified interval. */
/* roroi = always and everywhere */
/* roroiku'avi = always here (ku'a = intersection) */
/* puroroi = always in the past */
/* paroi = once upon a time (somewhere) */
/* paroiku'avi = once upon a time here */
/* The following are “Lexer-only rules”, covered by steps 1-4 described at the beginning. The grammar of these constructs is nonexistent, except possibly in cases where they interact with each other. Even there, however, the effects are semantic rather than grammatical. Where it is believed possible that conflicts could exist, the grammar of these constructs has been put in the above grammar, even though the lexer/Preparser will actually prevent these from being passed thru to the parse routine. (Otherwise we have to put unacceptably fancy code in the PreParser to determine just when these can be passed thru, and when they can’t.) Constructs in this category include quotes and indicators as defined above.
(The above grammar handles utterance scope (free_modifier) and clause scope (gap) applications of the latter, however, and indicators should be allowed to be absorbed into almost any word without changing its grammar.) SI_601, SA_595, and SU_603 are metalinguistic erasers.
token_1100 : any_word_698 | BAhE_503 any_word_698 | anything_699 | any_word_698 BU_511 | any_word_698 DAhO_524 | any_word_698 FUhO_536 | any_word_698 FUhE_535 | any_word_698 UI_612 | any_word_698 UI_612 NAI_581 | any_word_698 Y_619 | any_word_698 CAI_515 | any_word_698 CAI_515 NAI_581 | UI_612 NAI_581 | CAI_515 NAI_581 ;
null_1101 : any_word_698 SI_601
  | possibly_unlexable_word (PAUSE) SI_601
  | utterance_20 SA_595
  | possibly unlexable string (PAUSE) SA_595
    erases back to the last individual token I or NIhO or start of text, ignoring the insides of ZOI, ZO, and LOhU/LEhU quotes. Start of text is defined for SU below.
  | text_C_3 SU_603
  | possibly unparsable text (PAUSE) SU_603
    erases back to start of text which is the beginning of a speaker’s statement, a parenthesis (TO/TOI), a LU/LIhU quote, or a TUhE/TUhU utterance string.
  ;
*/
%%
2.
YACC Grammar Cross-Reference A_501 EK_root_911 anaphora_400 sumti_G_97 anything_699 token_1100, ZOI_quote_434 any_word_698 null_1101, token_1100, ZOI_quote_434, ZO_quote_435 any_words_697 LOhU_quote_436 BAhE_503 token_1100 BAI_502 modal_A_975 BE_446 linkargs_160 BE_504 BE_446 BEhO_506 BEhO_gap_467 BEhO_gap_467 linkargs_160 BEI_442 links_161 BEI_505 BEI_442 BIhE_439 MEX_A_311 BIhE_650 BIhE_439 BIhI_507 interval_932 BO_479 selbri_F_136 BO_508 BO_479, lexer_C_915, lexer_I_945, lexer_K_955, lexer_M_965, lexer_U_1005, lexer_V_1010 BOI_651 BOI_gap_461, sub_gap_462 BOI_gap_461 anaphora_400, operand_C_385, quantifier_300 bridi_tail_50 bridi_tail_50, sentence_40 bridi_tail_A_51 bridi_tail_50, bridi_tail_A_51 bridi_tail_B_52 bridi_tail_A_51, bridi_tail_B_52 bridi_tail_C_53 bridi_tail_B_52 bridi_valsi_407 tanru_unit_B_152 bridi_valsi_A_408 bridi_valsi_407 BRIVLA_509 bridi_valsi_A_408 BU_511 token_1100 BY_513 lerfu_word_987 CAhA_514 tense_B_978 CAI_515 indicator_413, token_1100 CEhE_495 terms_B_82 CEhE_517 CEhE_495 CEI_444 tanru_unit_150 CEI_516 CEI_444 cmene_404 sumti_G_97, text_0, vocative_35 CMENE_518 cmene_A_405 cmene_A_405 cmene_404, cmene_A_405 CO_443 selbri_B_132 CO_519 CO_443 COI_416 COI_416, DOI_415 COI_520 COI_A_417 COI_A_417 COI_416 CU_521 front_gap_451 CUhE_522 simple_tense_modal_972 DAhO_524 indicator_413, token_1100 description_110 sumti_G_97 discursive_bridi_34 free_modifier_A_33 DOhU_526 DOhU_gap_457 DOhU_gap_457 vocative_35 DOI_415 vocative_35 DOI_525 DOI_415 EK_802 fragment_20, JOIK_EK_421 EK_BO_803 operand_B_383, sumti_C_93 EK_KE_804 operand_381, sumti_A_91 EK_root_911 lexer_B_910, lexer_C_915, lexer_D_916 error BEhO_gap_467, BOI_gap_461, DOhU_gap_457, FEhU_gap_458, gap_450, GEhU_gap_464, KEhE_gap_466, KEI_gap_453, KUhO_gap_469, LIhU_gap_448, LOhO_gap_472, LUhU_gap_463, MEhU_gap_465, MEX_gap_452, NUhU_gap_460, right_bracket_gap_471, right_br_no_free_474, SEhU_gap_459, sub_gap_462, TEhU_gap_473, TOI_gap_468, TUhU_gap_454, VAU_gap_456 FA_481 mod_head_490 
FA_527 FA_481 FAhA_528 space_direction_1048 FEhE_530 space_int_props_A_1050 FEhU_531 FEhU_gap_458 FEhU_gap_458 tense_modal_815 FIhO_437 tense_modal_815 FIhO_532 FIhO_437 FOI_533 lerfu_word_987 fragment_20 paragraph_10 free_modifier_32 anaphora_400, BE_446, BEhO_gap_467, BEI_442, BIhE_439, BO_479, BOI_gap_461, bridi_valsi_407, CEhE_495, CEI_444, cmene_404, CO_443, EK_802, EK_BO_803, EK_KE_804, FA_481, FEhU_gap_458, FIhO_437, free_modifier_32, front_gap_451, FUhA_441, gap_450, GEhU_gap_464, GEK_807, GIhEK_818, GIhEK_BO_813, GIhEK_KE_814, GIK_816, GOI_485, GUhEK_808, I_819, I_BO_811, I_JEK_820, JAI_478, JEK_BO_821, JOhI_431, JOIK_BO_822, JOIK_EK_421, JOIK_JEK_422, JOIK_KE_823, KE_493, KEhE_gap_466, KEI_gap_453, KUhO_gap_469, LA_499, LE_488, left_bracket_470, LI_489, LOhO_gap_472, LUhU_gap_463, MAhO_430, ME_477, MEhU_gap_465, MEX_gap_452, MEX_operator_374, MOhE_427, MOI_476, NA_445, NAhE_482, NAhE_BO_809, NAhU_429, NA_KU_810, NIhE_428, NOI_484, NU_A_426, NUhA_475, NUhI_496, NUhU_gap_460, para_mark_410, PEhE_494, PEhO_438, qualifier_483, quote_arg_432, right_bracket_gap_471, SE_480, SEI_440, SOI_498, TEhU_gap_473, tense_modal_815, text_0, TUhE_447, TUhU_gap_454, VAU_gap_456, VUhO_497, XI_424, ZIhE_487, ZOhU_492 free_modifier_A_33 free_modifier_32 front_gap_451 discursive_bridi_34, sentence_40 FUhA_441 MEX_310 FUhA_655 FUhA_441 FUhE_535 indicators_411, token_1100 FUhO_536 indicator_413, token_1100 GA_537 lexer_G_935 GAhO_656 JOIK_root_931 gap_450 description_110, modifier_84, sumti_E_95 GEhU_538 GEhU_gap_464 GEhU_gap_464 relative_clause_122 GEK_807 gek_sentence_54, operand_C_385, sumti_D_94, term_set_85 gek_sentence_54 bridi_tail_C_53, gek_sentence_54 GI_539 GIK_root_981, lexer_G_935 GIhA_541 GIhEK_root_991 GIhEK_818 bridi_tail_A_51, fragment_20 GIhEK_BO_813 bridi_tail_B_52 GIhEK_KE_814 bridi_tail_50 GIhEK_root_991 lexer_M_965, lexer_N_966, lexer_R_990 GIK_816 gek_sentence_54, GUhEK_selbri_137, operand_C_385, operator_A_371, sumti_D_94, term_set_85 GIK_root_981 
lexer_G_935, lexer_P_980 GOhA_543 bridi_valsi_A_408 GOI_485 relative_clause_122 GOI_542 GOI_485 GUhA_544 lexer_H_940 GUhEK_808 GUhEK_selbri_137, operator_A_371 GUhEK_selbri_137 selbri_F_136 I_545 I_root_956, lexer_S_995, lexer_T_1000 I_819 paragraph_10, text_B_2 I_BO_811 statement_B_13, text_B_2 I_JEK_820 statement_A_12, text_B_2 indicator_413 indicators_A_412 indicators_411 text_0 indicators_A_412 indicators_411, indicators_A_412 interval_932 JOIK_root_931 interval_property_1051 space_int_props_A_1050, time_int_props_1036 I_root_956 lexer_K_955 JA_546 JEK_root_926 JAI_478 tanru_unit_B_152 JAI_547 JAI_478 JEK_805 JOIK_JEK_422, simple_JOIK_JEK_957 JEK_BO_821 operator_A_371, selbri_E_135 JEK_root_926 lexer_E_925, lexer_U_1005 JOhI_431 operand_C_385 JOhI_657 JOhI_431 JOI_548 JOIK_root_931 JOIK_806 JOIK_EK_421, JOIK_JEK_422, simple_JOIK_JEK_957 JOIK_BO_822 operand_B_383, operator_A_371, selbri_E_135, sumti_C_93 JOIK_EK_421 operand_A_382, sumti_B_92 JOIK_JEK_422 NU_425, operator_370, selbri_D_134, tag_491, terms_A_81, text_A_1 JOIK_KE_823 operand_381, operator_370, selbri_D_134, sumti_A_91 JOIK_root_931 lexer_F_930, lexer_G_935, lexer_V_1010, lexer_W_1015 KE_493 gek_sentence_54, operator_B_372, tanru_unit_B_152 KE_551 KE_493, lexer_D_916, lexer_N_966, lexer_W_1015 KEhE_550 KEhE_gap_466 KEhE_gap_466 bridi_tail_50, gek_sentence_54, operand_381, operator_370, operator_B_372, selbri_D_134, sumti_A_91, tanru_unit_B_152 KEI_552 KEI_gap_453 KEI_gap_453 tanru_unit_B_152 KI_554 simple_tense_modal_972, simple_tense_modal_A_973, tense_A_977 KOhA_555 anaphora_400 KU_556 gap_450, lexer_J_950 KUhE_658 MEX_gap_452 KUhO_557 KUhO_gap_469 KUhO_gap_469 relative_clause_122 LA_499 description_110, sumti_G_97 LA_558 LA_499 LAhE_561 qualifier_483 LAU_559 lerfu_word_987 LE_488 description_110 LE_562 LE_488 left_bracket_470 quantifier_300, subscript_486 LEhU_565 LOhU_quote_436 lerfu_string_817 anaphora_400, operand_C_385, subscript_486 lerfu_string_root_986 lerfu_string_root_986, 
lerfu_word_987, lexer_Q_985, lexer_Y_1025, utt_ordinal_root_906 lerfu_word_987 lerfu_string_root_986, lerfu_word_987, number_root_961 lexer_A_701 lexer_A_905 lexer_A_905 utterance_ordinal_801 lexer_B_702 lexer_B_910 lexer_B_910 EK_802 lexer_C_703 lexer_C_915 lexer_C_915 EK_BO_803 lexer_D_704 lexer_D_916 lexer_D_916 EK_KE_804 lexer_E_705 lexer_E_925 lexer_E_925 JEK_805 lexer_F_706 lexer_F_930 lexer_F_930 JOIK_806 lexer_G_707 lexer_G_935 lexer_G_935 GEK_807 lexer_H_708 lexer_H_940 lexer_H_940 GUhEK_808 lexer_I_709 lexer_I_945 lexer_I_945 NAhE_BO_809 lexer_J_710 lexer_J_950 lexer_J_950 NA_KU_810 lexer_K_711 lexer_K_955 lexer_K_955 I_BO_811 lexer_L_712 lexer_L_960 lexer_L_960 number_812 lexer_M_713 lexer_M_965 lexer_M_965 GIhEK_BO_813 lexer_N_714 lexer_N_966 lexer_N_966 GIhEK_KE_814 lexer_O_715 lexer_O_970 lexer_O_970 tense_modal_815 lexer_P_716 lexer_P_980 lexer_P_980 GIK_816 lexer_Q_717 lexer_Q_985 lexer_Q_985 lerfu_string_817 lexer_R_718 lexer_R_990 lexer_R_990 GIhEK_818 lexer_S_719 lexer_S_995 lexer_S_995 I_819 lexer_T_1000 I_JEK_820 lexer_T_720 lexer_T_1000 lexer_U_1005 JEK_BO_821 lexer_U_721 lexer_U_1005 lexer_V_1010 JOIK_BO_822 lexer_V_722 lexer_V_1010 lexer_W_1015 JOIK_KE_823 lexer_W_723 lexer_W_1015 lexer_Y_1025 PA_MOI_824 lexer_Y_725 lexer_Y_1025 LI_489 sumti_G_97 LI_566 LI_489 LIhU_567 LIhU_gap_448 LIhU_gap_448 quote_arg_A_433 linkargs_160 fragment_20, tanru_unit_A_151 links_161 fragment_20, linkargs_160, links_161 LOhO_568 LOhO_gap_472 LOhO_gap_472 sumti_G_97 LOhU_569 LOhU_quote_436 LOhU_quote_436 quote_arg_A_433 LU_571 quote_arg_A_433 LUhU_573 LUhU_gap_463 LUhU_gap_463 operand_C_385, sumti_G_97 MAhO_430 MEX_operator_374 MAhO_662 MAhO_430 MAI_661 utt_ordinal_root_906 ME_477 tanru_unit_B_152 ME_574 ME_477 MEhU_575 MEhU_gap_465 MEhU_gap_465 tanru_unit_B_152 MEX_310 MEX_310, MEX_operator_374, quantifier_300, subscript_486, sumti_G_97 MEX_A_311 MEX_310, MEX_A_311 MEX_B_312 MEX_A_311, MEX_C_313 MEX_C_313 MEX_B_312, MEX_C_313, operand_C_385 MEX_gap_452 MEX_B_312 
MEX_operator_374 MEX_operator_374, operator_B_372, tanru_unit_B_152 modal_974 simple_tense_modal_A_973 modal_A_975 modal_974 mod_head_490 modifier_84 modifier_84 term_83 MOhE_427 operand_C_385 MOhE_664 MOhE_427 MOhI_577 space_motion_1041 MOI_476 tanru_unit_B_152 MOI_663 lexer_Y_1025, MOI_476 NA_445 fragment_20, gek_sentence_54, selbri_A_131 NA_578 EK_root_911, GIhEK_root_991, JEK_root_926, lexer_J_950, NA_445 NAhE_482 MEX_operator_374, selbri_F_136, tanru_unit_B_152 NAhE_583 lexer_I_945, NAhE_482, simple_tense_modal_972 NAhE_BO_809 qualifier_483 NAhU_429 MEX_operator_374 NAhU_665 NAhU_429 NAI_581 COI_A_417, EK_root_911, GIhEK_root_991, GIK_root_981, indicator_413, interval_932, interval_property_1051, JEK_root_926, JOIK_root_931, lexer_G_935, lexer_H_940, modal_974, NU_A_426, space_direction_1048, text_0, time_direction_1035, token_1100 NA_KU_810 term_83 NIhE_428 operand_C_385 NIhE_666 NIhE_428 NIhO_584 para_mark_410 NOI_484 relative_clause_122 NOI_585 NOI_484 NU_425 NU_425, tanru_unit_B_152 NU_586 NU_A_426 NU_A_426 NU_425 NUhA_475 tanru_unit_B_152 NUhA_667 NUhA_475 NUhI_496 term_set_85 NUhI_587 NUhI_496 NUhU_588 NUhU_gap_460 NUhU_gap_460 term_set_85 number_812 quantifier_300, subscript_486 number_root_961 interval_property_1051, lexer_L_960, lexer_Y_1025, number_root_961, utt_ordinal_root_906 operand_381 MEX_B_312, operand_381, operand_C_385, rp_operand_332 operand_A_382 operand_381, operand_A_382 operand_B_383 operand_A_382, operand_B_383 operand_C_385 operand_B_383, operand_C_385 operator_370 MEX_310, MEX_A_311, MEX_B_312, operator_370, operator_B_372, rp_expression_330 operator_A_371 operator_370, operator_A_371 operator_B_372 operator_A_371 PA_672 lerfu_string_root_986, number_root_961 PA_MOI_824 bridi_valsi_A_408 paragraph_10 paragraph_10, paragraphs_4 paragraphs_4 paragraphs_4, text_C_3 para_mark_410 paragraphs_4, para_mark_410, text_B_2 parenthetical_36 free_modifier_A_33 PEhE_494 terms_A_81 PEhE_591 PEhE_494 PEhO_438 MEX_B_312 PEhO_673 PEhO_438 prenex_30 
fragment_20, statement_11, subsentence_41 PU_592 time_direction_1035 qualifier_483 operand_C_385, sumti_G_97 quantifier_300 fragment_20, operand_C_385, sumti_E_95, sumti_F_96, sumti_tail_A_112 quote_arg_432 sumti_G_97 quote_arg_A_433 quote_arg_432 RAhO_593 bridi_valsi_A_408 relative_clause_122 relative_clauses_121 relative_clauses_121 fragment_20, relative_clauses_121, sumti_90, sumti_E_95, sumti_G_97, sumti_tail_111, sumti_tail_A_112, vocative_35 right_bracket_gap_471 quantifier_300 right_br_no_free_474 subscript_486 ROI_594 interval_property_1051 rp_expression_330 MEX_310, rp_operand_332 rp_operand_332 rp_expression_330 SA_595 null_1101 SE_480 MEX_operator_374, tanru_unit_B_152 SE_596 EK_root_911, GIhEK_root_991, interval_932, JEK_root_926, JOIK_root_931, lexer_G_935, lexer_H_940, modal_A_975, SE_480 SEhU_598 SEhU_gap_459 SEhU_gap_459 discursive_bridi_34 SEI_440 discursive_bridi_34 SEI_597 SEI_440 selbri_130 bridi_tail_C_53, discursive_bridi_34, GUhEK_selbri_137, MEX_operator_374, operand_C_385, selbri_A_131, sumti_E_95, sumti_tail_A_112, tense_modal_815, vocative_35 selbri_A_131 selbri_130 selbri_B_132 selbri_A_131, selbri_B_132 selbri_C_133 selbri_B_132, selbri_C_133, selbri_D_134, tanru_unit_B_152 selbri_D_134 selbri_C_133, selbri_D_134 selbri_E_135 selbri_D_134, selbri_E_135 selbri_F_136 GUhEK_selbri_137, selbri_E_135, selbri_F_136 sentence_40 statement_C_14, subsentence_41 SI_601 null_1101 simple_JOIK_JEK_957 I_root_956, lexer_T_1000, simple_tag_971 simple_tag_971 lexer_C_915, lexer_D_916, lexer_G_935, lexer_K_955, lexer_M_965, lexer_N_966, lexer_U_1005, lexer_V_1010, lexer_W_1015, simple_tag_971 simple_tense_modal_972 lexer_O_970, simple_tag_971 simple_tense_modal_A_973 simple_tense_modal_972 SOI_498 discursive_bridi_34 SOI_602 SOI_498 space_1040 tense_C_979 space_A_1042 space_1040 space_B_1043 space_A_1042 space_C_1044 space_B_1043, space_C_1044 space_direction_1048 space_intval_1046, space_offset_1045 space_int_props_1049 space_int_props_1049, 
space_intval_1046 space_int_props_A_1050 space_int_props_1049 space_intval_1046 space_B_1043 space_intval_A_1047 space_intval_1046 space_motion_1041 space_1040 space_offset_1045 space_C_1044, space_motion_1041 statement_11 paragraph_10, statement_11 statement_A_12 statement_11, statement_A_12 statement_B_13 statement_A_12, statement_B_13 statement_C_14 statement_B_13 SU_603 null_1101 sub_gap_462 subscript_486 subscript_486 free_modifier_A_33 subsentence_41 gek_sentence_54, relative_clause_122, subsentence_41, tanru_unit_B_152 sumti_90 discursive_bridi_34, modifier_84, operand_C_385, sumti_A_91, sumti_D_94, sumti_G_97, sumti_tail_A_112, tanru_unit_B_152, term_83, vocative_35 sumti_A_91 sumti_90 sumti_B_92 sumti_A_91, sumti_B_92 sumti_C_93 sumti_B_92, sumti_C_93 sumti_D_94 sumti_C_93, sumti_D_94 sumti_E_95 sumti_D_94 sumti_F_96 sumti_E_95 sumti_G_97 sumti_F_96, sumti_tail_111 sumti_tail_111 description_110 sumti_tail_A_112 sumti_tail_111 tag_491 gek_sentence_54, mod_head_490, selbri_130, statement_C_14, tag_491, tanru_unit_B_152 TAhE_604 interval_property_1051 tail_terms_71 bridi_tail_50, bridi_tail_A_51, bridi_tail_B_52, bridi_tail_C_53, gek_sentence_54 tanru_unit_150 selbri_F_136, tanru_unit_150 tanru_unit_A_151 tanru_unit_150 tanru_unit_B_152 tanru_unit_A_151, tanru_unit_B_152 TEhU_675 TEhU_gap_473 TEhU_gap_473 MEX_operator_374, operand_C_385 TEI_605 lerfu_word_987 tense_A_977 simple_tense_modal_A_973 tense_B_978 tense_A_977 tense_C_979 tense_B_978 tense_modal_815 tag_491 term_83 linkargs_160, links_161, relative_clause_122, terms_B_82 terms_80 discursive_bridi_34, fragment_20, prenex_30, sentence_40, tail_terms_71, terms_80, term_set_85 terms_A_81 terms_80, terms_A_81 terms_B_82 terms_A_81, terms_B_82 term_set_85 term_83 text_0 parenthetical_36, quote_arg_A_433, text_0 text_A_1 text_0 text_B_2 statement_C_14, text_A_1, text_B_2 text_C_3 null_1101, text_B_2 time_1030 tense_C_979 time_A_1031 time_1030 time_B_1032 time_A_1031, time_B_1032 time_direction_1035 
time_interval_1034, time_offset_1033 time_interval_1034 time_A_1031 time_int_props_1036 time_interval_1034, time_int_props_1036 time_offset_1033 time_B_1032 TO_606 parenthetical_36 TOI_607 TOI_gap_468 TOI_gap_468 parenthetical_36 TUhE_447 statement_C_14 TUhE_610 TUhE_447 TUhU_611 TUhU_gap_454 TUhU_gap_454 statement_C_14 UI_612 indicator_413, token_1100 utterance_20 null_1101 utterance_ordinal_801 free_modifier_A_33 utt_ordinal_root_906 lexer_A_905 VA_613 space_A_1042, space_offset_1045 VAU_614 VAU_gap_456 VAU_gap_456 fragment_20, tail_terms_71 VEhA_615 space_intval_A_1047 VEhO_678 right_bracket_gap_471, right_br_no_free_474 VEI_677 left_bracket_470 VIhA_616 space_intval_A_1047 vocative_35 free_modifier_A_33 VUhO_497 sumti_90 VUhO_617 VUhO_497 VUhU_679 MEX_operator_374 XI_424 subscript_486 XI_618 XI_424 Y_619 indicator_413, token_1100 ZAhO_621 interval_property_1051 ZEhA_622 time_interval_1034 ZI_624 time_1030, time_offset_1033 ZIhE_487 relative_clauses_121 ZIhE_625 ZIhE_487 ZO_626 ZO_quote_435 ZOhU_492 prenex_30 ZOhU_628 ZOhU_492 ZOI_627 ZOI_quote_434 ZOI_quote_434 quote_arg_A_433 ZO_quote_435 quote_arg_A_433
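The cross-reference above lists, for each selma'o and rule, the rules that refer to it. As a rough illustration of how such a listing can be derived from the YACC rules, here is a minimal Python sketch. The function name `build_cross_reference` and the three-rule `SAMPLE` are assumptions made for the example; the sketch only handles the simple `name : alternatives ;` layout with `/* ... */` comments and is not the tool that produced the listing above.

```python
import re
from collections import defaultdict


def build_cross_reference(grammar_text):
    """Map each symbol to the sorted list of rules whose bodies mention it."""
    # Drop /* ... */ comments so their words are not mistaken for symbols.
    text = re.sub(r"/\*.*?\*/", " ", grammar_text, flags=re.DOTALL)
    referrers = defaultdict(set)
    for rule in text.split(";"):
        if ":" not in rule:
            continue
        head, body = rule.split(":", 1)
        head = head.strip()
        # Alternatives are separated by "|"; symbols by whitespace.
        for symbol in body.replace("|", " ").split():
            referrers[symbol].add(head)
    return {symbol: sorted(heads) for symbol, heads in referrers.items()}


# Three rules copied from the YACC listing, used here as sample input.
SAMPLE = """
FIhO_437 : FIhO_532 | FIhO_532 free_modifier_32 ;
FEhU_gap_458 : FEhU_531 | FEhU_531 free_modifier_32 | error ;
tense_modal_815 : lexer_O_970 | lexer_O_970 free_modifier_32
                | FIhO_437 selbri_130 FEhU_gap_458 ;
"""

xref = build_cross_reference(SAMPLE)
for symbol in sorted(xref):
    print(symbol, ", ".join(xref[symbol]))
```

On this sample the entries for FIhO_437, FIhO_532, and FEhU_gap_458 agree with the corresponding entries in the cross-reference, while free_modifier_32 naturally shows only the three sample rules rather than its full list.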
EBNF Grammar of Lojban
Lojban Machine Grammar, EBNF Version, Final Baseline
This EBNF document is explicitly dedicated to the public domain by its author, The Logical Language Group, Inc. Contact that organization at: 2904 Beau Lane, Fairfax VA 22031 USA 703-385-0273 (intl: +1 703 385 0273)
Explanation of notation: All rules have the form:
name<number> = bnf-expression
which means that the grammatical construct “name” is defined by “bnf-expression”. The number cross-references this grammar with the rule numbers in the YACC grammar. The names are the same as those in the YACC grammar, except that subrules are labeled with A, B, C, ... in the YACC grammar and with 1, 2, 3, ... in this grammar. In addition, rule 971 is “simple_tag” in the YACC grammar but “stag” in this grammar, because of its frequent appearance.
- Names in lower case are grammatical constructs.
- Names in UPPER CASE are selma'o (lexeme) names, and are terminals.
- Concatenation is expressed by juxtaposition with no operator symbol.
- | represents alternation (choice).
- [] represents an optional element.
- & represents and/or (“A & B” is the same as “A | B | A B”).
- ... represents optional repetition of the construct to the left. Left-grouping is implied; right-grouping is shown by explicit self-referential recursion with no “...”
- () serves to indicate the grouping of the other operators. Otherwise, “...” binds closer than &, which binds closer than |.
- # is shorthand for “[free ...]”, a construct which appears in many places.
- // encloses an elidable terminator, which may be omitted (without change of meaning) if no grammatical ambiguity results.
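The and/or operator is the least familiar piece of this notation, so a small worked expansion may help. The Python sketch below is illustrative only: the helper name `and_or` is an assumption, and the treatment of chains of three or more operands assumes that & groups like an ordinary binary operator, so that "A & B & C" means "(A & B) & C". Under that assumption every alternative is a non-empty selection of the operands in their original order.

```python
from itertools import combinations


def and_or(*operands):
    """Expand "A & B & ..." into the list of alternatives it stands for.

    Each alternative keeps the chosen operands in their original
    left-to-right order, and at least one operand is always present.
    """
    alternatives = []
    for size in range(1, len(operands) + 1):
        for chosen in combinations(operands, size):
            alternatives.append(" ".join(chosen))
    return alternatives


# Two operands reproduce the definition given in the list above.
print(" | ".join(and_or("A", "B")))

# Three operands give seven alternatives under the grouping assumption.
print(len(and_or("A", "B", "C")))
```

The two-operand case reproduces “A | B | A B” exactly as defined above; the three-operand case is a consequence of the grouping assumption rather than something the notation states outright.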
text0 = [NAI ...] [CMENE ... # | (indicators & free ...)] [joik-jek] text-1
text-12 = [(I [jek | joik] [[stag] BO] #) ... | NIhO ... #] [paragraphs]
paragraphs4 = paragraph [NIhO ... # paragraphs]
paragraph10 = (statement | fragment) [I # [statement | fragment]] ...
statement11 = statement-1 | prenex statement
statement-112 = statement-2 [I joik-jek [statement-2]] ...
statement-213 = statement-3 [I [jek | joik] [stag] BO # [statement-2]]
statement-314 = sentence | [tag] TUhE # text-1 /TUhU#/
fragment20 = ek # | gihek # | quantifier | NA # | terms /VAU#/ | prenex | relative-clauses | links | linkargs
prenex30 = terms ZOhU #
sentence40 = [terms [CU #]] bridi-tail
subsentence41 = sentence | prenex subsentence
bridi-tail50 = bridi-tail-1 [gihek [stag] KE # bridi-tail /KEhE#/ tail-terms]
bridi-tail-151 = bridi-tail-2 [gihek # bridi-tail-2 tail-terms] ...
bridi-tail-252 = bridi-tail-3 [gihek [stag] BO # bridi-tail-2 tail-terms]
bridi-tail-353 = selbri tail-terms | gek-sentence
gek-sentence54 = gek subsentence gik subsentence tail-terms | [tag] KE # gek-sentence /KEhE#/ | NA # gek-sentence
tail-terms71 = [terms] /VAU#/
terms80 = terms-1 ...
terms-181 = terms-2 [PEhE # joik-jek terms-2] ...
terms-282 = term [CEhE # term] ...
term83 = sumti | (tag | FA #) (sumti | /KU#/) | termset | NA KU #
termset85 = NUhI # gek terms /NUhU#/ gik terms /NUhU#/ | NUhI # terms /NUhU#/
sumti90 = sumti-1 [VUhO # relative-clauses]
sumti-191 = sumti-2 [(ek | joik) [stag] KE # sumti /KEhE#/]
sumti-292 = sumti-3 [joik-ek sumti-3] ...
sumti-393 = sumti-4 [(ek | joik) [stag] BO # sumti-3]
sumti-494 = sumti-5 | gek sumti gik sumti-4
sumti-595 = [quantifier] sumti-6 [relative-clauses] | quantifier selbri /KU#/ [relative-clauses]
sumti-697 = (LAhE # | NAhE BO #) [relative-clauses] sumti /LUhU#/ | KOhA # | lerfu-string /BOI#/ | LA # [relative-clauses] CMENE ... # | (LA | LE) # sumti-tail /KU#/ | LI # mex /LOhO#/ | ZO any-word # | LU text /LIhU#/ | LOhU any-word ... LEhU # | ZOI any-word anything any-word #
sumti-tail111 = [sumti-6 [relative-clauses]] sumti-tail-1 | relative-clauses sumti-tail-1
sumti-tail-1112 = [quantifier] selbri [relative-clauses] | quantifier sumti
relative-clauses121 = relative-clause [ZIhE # relative-clause] ...
relative-clause122 = GOI # term /GEhU#/ | NOI # subsentence /KUhO#/
selbri130 = [tag] selbri-1
selbri-1131 = selbri-2 | NA # selbri
selbri-2132 = selbri-3 [CO # selbri-2]
selbri-3133 = selbri-4 ...
selbri-4134 = selbri-5 [joik-jek selbri-5 | joik [stag] KE # selbri-3 /KEhE#/] ...
selbri-5135 = selbri-6 [(jek | joik) [stag] BO # selbri-5]
selbri-6136 = tanru-unit [BO # selbri-6] | [NAhE #] guhek selbri gik selbri-6
tanru-unit150 = tanru-unit-1 [CEI # tanru-unit-1] ...
tanru-unit-1151 = tanru-unit-2 [linkargs]
tanru-unit-2152 = BRIVLA # | GOhA [RAhO] # | KE # selbri-3 /KEhE#/ | ME # sumti /MEhU#/ [MOI #] | (number | lerfu-string) MOI # | NUhA # mex-operator | SE # tanru-unit-2 | JAI # [tag] tanru-unit-2 | any-word (ZEI any-word) ... | NAhE # tanru-unit-2 | NU [NAI] # [joik-jek NU [NAI] #] ... subsentence /KEI#/
linkargs160 = BE # term [links] /BEhO#/
links161 = BEI # term [links]
quantifier300 = number /BOI#/ | VEI # mex /VEhO#/
mex310 = mex-1 [operator mex-1] ... | FUhA # rp-expression
mex-1311 = mex-2 [BIhE # operator mex-1]
mex-2312 = operand | [PEhO #] operator mex-2 ... /KUhE#/
rp-expression330 = rp-operand rp-operand operator
rp-operand332 = operand | rp-expression
operator370 = operator-1 [joik-jek operator-1 | joik [stag] KE # operator /KEhE#/] ...
operator-1371 = operator-2 | guhek operator-1 gik operator-2 | operator-2 (jek | joik) [stag] BO # operator-1
operator-2372 = mex-operator | KE # operator /KEhE#/
mex-operator374 = SE # mex-operator | NAhE # mex-operator | MAhO # mex /TEhU#/ | NAhU # selbri /TEhU#/ | VUhU #
operand381 = operand-1 [(ek | joik) [stag] KE # operand /KEhE#/]
operand-1382 = operand-2 [joik-ek operand-2] ...
operand-2383 = operand-3 [(ek | joik) [stag] BO # operand-2]
operand-3385 = quantifier | lerfu-string /BOI#/ | NIhE # selbri /TEhU#/ | MOhE # sumti /TEhU#/ | JOhI # mex-2 ... /TEhU#/ | gek operand gik operand-3 | (LAhE # | NAhE BO #) operand /LUhU#/
number812 = PA [PA | lerfu-word] ...
lerfu-string817 = lerfu-word [PA | lerfu-word] ...
lerfu-word987 = BY | any-word BU | LAU lerfu-word | TEI lerfu-string FOI
ek802 = [NA] [SE] A [NAI]
gihek818 = [NA] [SE] GIhA [NAI]
jek805 = [NA] [SE] JA [NAI]
joik806 = [SE] JOI [NAI] | interval | GAhO interval GAhO
interval932 = [SE] BIhI [NAI]
joik-ek421 = joik # | ek #
joik-jek422 = joik # | jek #
gek807 = [SE] GA [NAI] # | joik GI # | stag gik
guhek808 = [SE] GUhA [NAI] #
gik816 = GI [NAI] #
tag491 = tense-modal [joik-jek tense-modal] ...
stag971 = simple-tense-modal [(jek | joik) simple-tense-modal] ...
tense-modal815 = simple-tense-modal # | FIhO # selbri /FEhU#/
simple-tense-modal972 = [NAhE] [SE] BAI [NAI] [KI] | [NAhE] (time [space] | space [time]) & CAhA [KI] | KI | CUhE
time1030 = ZI & time-offset ... & ZEhA [PU [NAI]] & interval-property ...
time-offset1033 = PU [NAI] [ZI]
space1040 = VA & space-offset ... & space-interval & (MOhI space-offset)
space-offset1045 = FAhA [NAI] [VA]
space-interval1046 = ((VEhA & VIhA) [FAhA [NAI]]) & space-int-props
space-int-props1049 = (FEhE interval-property) ...
interval-property1051 = number ROI [NAI] | TAhE [NAI] | ZAhO [NAI]
free32 = SEI # [terms [CU #]] selbri /SEhU/ | SOI # sumti [sumti] /SEhU/ | vocative [relative-clauses] selbri [relative-clauses] /DOhU/ | vocative [relative-clauses] CMENE ... # [relative-clauses] /DOhU/ | vocative [sumti] /DOhU/ | (number | lerfu-string) MAI | TO text /TOI/ | XI # (number | lerfu-string) /BOI/ | XI # VEI # mex /VEhO/
vocative415 = (COI [NAI]) ... & DOI
indicators411 = [FUhE] indicator ...
indicator413 = (UI | CAI) [NAI] | Y | DAhO | FUhO
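As a reading aid for the shortest rules in the listing, such as ek802 = [NA] [SE] A [NAI], the following Python sketch shows how a rule made only of optional (bracketed) and required selma'o can be checked against a token stream. It assumes the words have already been classified into selma'o names; the function name `match_rule` and the pattern table `EK` are illustrative and are not part of the official parser.

```python
def match_rule(tokens, pattern):
    """Match a flat rule of optional and required selma'o, left to right.

    tokens  -- list of selma'o names, e.g. ["NA", "A", "NAI"]
    pattern -- list of (selmaho, required) pairs (hypothetical encoding)
    Returns the number of tokens consumed, or None if a required
    element is missing.
    """
    pos = 0
    for selmaho, required in pattern:
        if pos < len(tokens) and tokens[pos] == selmaho:
            pos += 1          # element present: consume it
        elif required:
            return None       # required element absent: no match
    return pos


# ek802 = [NA] [SE] A [NAI]  (only A is required)
EK = [("NA", False), ("SE", False), ("A", True), ("NAI", False)]

print(match_rule(["NA", "A", "NAI"], EK))
print(match_rule(["NA", "SE"], EK))
```

The same table shape would cover gihek818 and jek805 by swapping A for GIhA or JA. Rules that use alternation (|), repetition (...), or elidable terminators (/.../) need a fuller parser than this sketch.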
The following rules are non-formal:
word1100 = [BAhE] any-word [indicators]
any-word = “any single word (no compound cmavo)”
anything = “any text at all, whether Lojban or not”
null1101 = any-word SI | utterance SA | text SU
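The first alternative of null1101 (any-word SI) can be sketched as a pass over the word stream in which each SI removes the word before it. This is a minimal Python illustration only: it assumes the input is already a list of words, it handles neither SA nor SU, and it ignores any interaction with quotation. The function name `apply_si` is hypothetical.

```python
def apply_si(words):
    """Drop each 'si' together with the word that precedes it.

    Simplified sketch of 'any-word SI'; SA, SU and quoted
    words are deliberately not handled here.
    """
    kept = []
    for word in words:
        if word == "si":
            if kept:
                kept.pop()    # erase the preceding surviving word
        else:
            kept.append(word)
    return kept


print(apply_si(["mi", "klama", "si", "citka"]))
```

Using a stack means that consecutive SI words erase successively earlier words, which is the behaviour this sketch assumes for repeated erasure.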
FAhO is a universal terminator and signals the end of parsable input.
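Because FAhO ends parsable input, a preprocessing pass can simply cut the word stream at its first occurrence. The Python sketch below assumes the stream is a list of words and that FAhO is realized by the single cmavo fa'o; it does not consider a fa'o that appears inside quoted text. The function name `truncate_at_faho` is illustrative.

```python
def truncate_at_faho(words):
    """Return the words before the first fa'o (selma'o FAhO).

    Everything from fa'o onward is treated as outside the
    parsable input. Quoted occurrences are not special-cased.
    """
    for index, word in enumerate(words):
        if word == "fa'o":
            return list(words[:index])
    return list(words)


print(truncate_at_faho(["mi", "klama", "fa'o", "ignored", "text"]))
```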
EBNF Cross-Reference
A
- ek802
BAI
- simple-tense-modal972
BAhE
- word1100
BE
- linkargs160
BEI
- links161
BEhO
- linkargs160
BIhE
- mex-1311
BIhI
- interval932
BO
- bridi-tail-252, operand-2383, operand-3385, operator-1371, selbri-5135, selbri-6136, statement-213, sumti-393, sumti-697, text-12
BOI
- free32, operand-3385, quantifier300, sumti-697
BRIVLA
- tanru-unit-2152
BU
- lerfu-word987
BY
- lerfu-word987
CAI
- indicator413
CAhA
- simple-tense-modal972
CEI
- tanru-unit150
CEhE
- terms-282
CMENE
- free32, sumti-697, text0
CO
- selbri-2132
COI
- vocative415
CU
- free32, sentence40
CUhE
- simple-tense-modal972
DAhO
- indicator413
DOI
- vocative415
DOhU
- free32
FA
- term83
FAhA
- space-interval1046, space-offset1045
FEhE
- space-int-props1049
FEhU
- tense-modal815
FIhO
- tense-modal815
FOI
- lerfu-word987
FUhA
- mex310
FUhE
- indicators411
FUhO
- indicator413
GA
- gek807
GAhO
- joik806
GEhU
- relative-clause122
GI
- gek807, gik816
GIhA
- gihek818
GOI
- relative-clause122
GOhA
- tanru-unit-2152
GUhA
- guhek808
I
- paragraph10, statement-112, statement-213, text-12
JA
- jek805
JAI
- tanru-unit-2152
JOI
- joik806
JOhI
- operand-3385
KE
- bridi-tail50, gek-sentence54, operand381, operator-2372, operator370, selbri-4134, sumti-191, tanru-unit-2152
KEI
- tanru-unit-2152
KEhE
- bridi-tail50, gek-sentence54, operand381, operator-2372, operator370, selbri-4134, sumti-191, tanru-unit-2152
KI
- simple-tense-modal972
KOhA
- sumti-697
KU
- sumti-595, sumti-697, term83
KUhE
- mex-2312
KUhO
- relative-clause122
LA
- sumti-697
LAU
- lerfu-word987
LAhE
- operand-3385, sumti-697
LE
- sumti-697
LEhU
- sumti-697
LI
- sumti-697
LIhU
- sumti-697
LOhO
- sumti-697
LOhU
- sumti-697
LU
- sumti-697
LUhU
- operand-3385, sumti-697
MAI
- free32
MAhO
- mex-operator374
ME
- tanru-unit-2152
MEhU
- tanru-unit-2152
MOI
- tanru-unit-2152
MOhE
- operand-3385
MOhI
- space1040
NA
- ek802, fragment20, gek-sentence54, gihek818, jek805, selbri-1131, term83
NAI
- ek802, gek807, gihek818, gik816, guhek808, indicator413, interval-property1051, interval932, jek805, joik806, simple-tense-modal972, space-interval1046, space-offset1045, tanru-unit-2152, text0, time-offset1033, time1030, vocative415
NAhE
- mex-operator374, operand-3385, selbri-6136, simple-tense-modal972, sumti-697, tanru-unit-2152
NAhU
- mex-operator374
NIhE
- operand-3385
NIhO
- paragraphs4, text-12
NOI
- relative-clause122
NU
- tanru-unit-2152
NUhA
- tanru-unit-2152
NUhI
- termset85
NUhU
- termset85
PA
- lerfu-string817, number812
PEhE
- terms-181
PEhO
- mex-2312
PU
- time-offset1033, time1030
RAhO
- tanru-unit-2152
ROI
- interval-property1051
SA
- null1101
SE
- ek802, gek807, gihek818, guhek808, interval932, jek805, joik806, mex-operator374, simple-tense-modal972, tanru-unit-2152
SEI
- free32
SEhU
- free32
SI
- null1101
SOI
- free32
SU
- null1101
TAhE
- interval-property1051
TEI
- lerfu-word987
TEhU
- mex-operator374, operand-3385
TO
- free32
TOI
- free32
TUhE
- statement-314
TUhU
- statement-314
UI
- indicator413
VA
- space-offset1045, space1040
VAU
- fragment20, tail-terms71
VEI
- free32, quantifier300
VEhA
- space-interval1046
VEhO
- free32, quantifier300
VIhA
- space-interval1046
VUhO
- sumti90
VUhU
- mex-operator374
XI
- free32
Y
- indicator413
ZAhO
- interval-property1051
ZEI
- tanru-unit-2152
ZEhA
- time1030
ZI
- time-offset1033, time1030
ZIhE
- relative-clauses121
ZO
- sumti-697
ZOI
- sumti-697
ZOhU
- prenex30
any-word
- lerfu-word987, null1101, sumti-697, tanru-unit-2152, word1100
anything
- sumti-697
bridi-tail
- bridi-tail50, sentence40
bridi-tail-1
- bridi-tail50
bridi-tail-2
- bridi-tail-151, bridi-tail-252
bridi-tail-3
- bridi-tail-252
ek
- fragment20, joik-ek421, operand-2383, operand381, sumti-191, sumti-393
fragment
- paragraph10
free
- text0
gek
- gek-sentence54, operand-3385, sumti-494, termset85
gek-sentence
- bridi-tail-353, gek-sentence54
gihek
- bridi-tail-151, bridi-tail-252, bridi-tail50, fragment20
gik
- gek-sentence54, gek807, operand-3385, operator-1371, selbri-6136, sumti-494, termset85
guhek
- operator-1371, selbri-6136
indicator
- indicators411
indicators
- text0, word1100
interval
- joik806
interval-property
- space-int-props1049, time1030
jek
- joik-jek422, operator-1371, selbri-5135, stag971, statement-213, text-12
joik
- gek807, joik-ek421, joik-jek422, operand-2383, operand381, operator-1371, operator370, selbri-4134, selbri-5135, stag971, statement-213, sumti-191, sumti-393, text-12
joik-ek
- operand-1382, sumti-292
joik-jek
- operator370, selbri-4134, statement-112, tag491, tanru-unit-2152, terms-181, text0
lerfu-string
- free32, lerfu-word987, operand-3385, sumti-697, tanru-unit-2152
lerfu-word
- lerfu-string817, lerfu-word987, number812
linkargs
- fragment20, tanru-unit-1151
links
- fragment20, linkargs160, links161
mex
- free32, mex-operator374, quantifier300, sumti-697
mex-1
- mex-1311, mex310
mex-2
- mex-1311, mex-2312, operand-3385
mex-operator
- mex-operator374, operator-2372, tanru-unit-2152
number
- free32, interval-property1051, quantifier300, tanru-unit-2152
operand
- mex-2312, operand-3385, operand381, rp-operand332
operand-1
- operand381
operand-2
- operand-1382, operand-2383
operand-3
- operand-2383, operand-3385
operator
- mex-1311, mex-2312, mex310, operator-2372, operator370, rp-expression330
operator-1
- operator-1371, operator370
operator-2
- operator-1371
paragraph
- paragraphs4
paragraphs
- paragraphs4, text-12
prenex
- fragment20, statement11, subsentence41
quantifier
- fragment20, operand-3385, sumti-595, sumti-tail-1112
relative-clause
- relative-clauses121
relative-clauses
- fragment20, free32, sumti-595, sumti-697, sumti-tail-1112, sumti-tail111, sumti90
rp-expression
- mex310, rp-operand332
rp-operand
- rp-expression330
selbri
- bridi-tail-353, free32, mex-operator374, operand-3385, selbri-1131, selbri-6136, sumti-595, sumti-tail-1112, tense-modal815
selbri-1
- selbri130
selbri-2
- selbri-1131, selbri-2132
selbri-3
- selbri-2132, selbri-4134, tanru-unit-2152
selbri-4
- selbri-3133
selbri-5
- selbri-4134, selbri-5135
selbri-6
- selbri-5135, selbri-6136
sentence
- statement-314, subsentence41
simple-tense-modal
- stag971, tense-modal815
space
- simple-tense-modal972
space-int-props
- space-interval1046
space-interval
- space1040
space-offset
- space1040
stag
- bridi-tail-252, bridi-tail50, gek807, operand-2383, operand381, operator-1371, operator370, selbri-4134, selbri-5135, statement-213, sumti-191, sumti-393, text-12
statement
- paragraph10, statement11
statement-1
- statement11
statement-2
- statement-112, statement-213
statement-3
- statement-213
subsentence
- gek-sentence54, relative-clause122, subsentence41, tanru-unit-2152
sumti
- free32, operand-3385, sumti-191, sumti-494, sumti-697, sumti-tail-1112, tanru-unit-2152, term83
sumti-1
- sumti90
sumti-2
- sumti-191
sumti-3
- sumti-292, sumti-393
sumti-4
- sumti-393, sumti-494
sumti-5
- sumti-494
sumti-6
- sumti-595, sumti-tail111
sumti-tail
- sumti-697
sumti-tail-1
- sumti-tail111
tag
- gek-sentence54, selbri130, statement-314, tanru-unit-2152, term83
tail-terms
- bridi-tail-151, bridi-tail-252, bridi-tail-353, bridi-tail50, gek-sentence54
tanru-unit
- selbri-6136
tanru-unit-1
- tanru-unit150
tanru-unit-2
- tanru-unit-1151, tanru-unit-2152
tense-modal
- tag491
term
- linkargs160, links161, relative-clause122, terms-282
terms
- fragment20, free32, prenex30, sentence40, tail-terms71, termset85
terms-1
- terms80
terms-2
- terms-181
termset
- term83
text
- free32, null1101, sumti-697
text-1
- statement-314, text0
time
- simple-tense-modal972
time-offset
- time1030
utterance
- null1101
vocative
- free32
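A cross-reference of this kind can be derived mechanically from the rule bodies. The Python sketch below is one possible approach, not the procedure used to produce the listing above: it assumes the rules are held in a dictionary keyed by rule name (with the rule numbers left out), scans each body for selma'o and rule names, and records which rules mention each one. The names `SYMBOL` and `build_cross_reference` are hypothetical.

```python
import re
from collections import defaultdict

# A symbol is a run of letters, optionally followed by hyphenated
# parts such as "-tail" or "-1" (e.g. NAhE, joik-ek, bridi-tail-2).
SYMBOL = re.compile(r"[A-Za-z]+(?:-[A-Za-z0-9]+)*")


def build_cross_reference(rules):
    """Map each symbol to the sorted list of rules whose body uses it.

    rules -- dict of rule name -> rule body text (assumed layout)
    """
    xref = defaultdict(set)
    for name, body in rules.items():
        for symbol in SYMBOL.findall(body):
            xref[symbol].add(name)
    return {symbol: sorted(names) for symbol, names in sorted(xref.items())}


rules = {
    "ek": "[NA] [SE] A [NAI]",
    "jek": "[NA] [SE] JA [NAI]",
    "joik-ek": "joik # | ek #",
}
for symbol, users in build_cross_reference(rules).items():
    print(symbol, "-", ", ".join(users))
```

With only these three sample rules, NA is reported as used by ek and jek, and ek as used by joik-ek, which agrees with the corresponding entries in the full listing as far as those rules go.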