Organizers: Roussanka Loukanova, Erkki Luuk, Erik Palmgren
Aarne Ranta will give an introduction to his system GF, a grammatical framework designed to deal with multiple languages in parallel. He is a professor of Computer Science at the University of Gothenburg and is well known for his seminal work on using dependent type theory as a semantics for natural language and on employing it for precise multilingual translation. The GF software system, which implements its own GF programming language, is based on these ideas. It is freely available.
Schedule: The tutorials and seminars will take place in room 16, building 5, Kräftriket.
We will explain the GF formalism and its purpose, as well as show with practical examples how GF grammars are created and used. We will together build some simple GF grammars and test them on a computer.
The mainstream approach in machine translation is to build systems that can translate everything, but without any guarantee of quality. An alternative is to build systems that aim at precision but have limited coverage. Combining wide coverage with high precision is considered unrealistic. Most wide-coverage systems are based on statistics, whereas precision-oriented domain-specific systems are typically based on grammars, which guarantee translation quality by some kind of formal semantics.
This talk introduces a technique that combines wide coverage with high precision, by embedding a high-precision semantic grammar inside a wide-coverage syntactic grammar, which in turn is backed up by a chunking grammar. The system can thus reach good quality whenever the input matches the semantics; but if it doesn't, the user will still get a rough translation. The levels of confidence can be indicated by using colours, whence the title of the talk.
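The layered fallback described above can be illustrated with a minimal Python sketch. It is not the GF implementation: the toy lexicons, the function names, and the colour labels (green, yellow, red) are all assumptions made for illustration. Each layer either returns a translation or declines, and the first layer that succeeds determines the confidence colour.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical toy data standing in for real grammars (English to French).
SEMANTIC_PHRASES = {"good morning": "bonjour"}
LEXICON = {"the": "le", "cat": "chat", "sleeps": "dort"}


def semantic_layer(text: str) -> Optional[str]:
    """High precision, narrow coverage: only known domain phrases."""
    return SEMANTIC_PHRASES.get(text)


def syntactic_layer(text: str) -> Optional[str]:
    """Wider coverage: succeeds only if every word is analysable
    (a crude stand-in for a complete syntactic parse)."""
    words = text.split()
    if words and all(w in LEXICON for w in words):
        return " ".join(LEXICON[w] for w in words)
    return None


def chunk_layer(text: str) -> Optional[str]:
    """Last resort: translate known pieces, pass unknown words through."""
    return " ".join(LEXICON.get(w, w) for w in text.split())


# Layers ordered from most to least reliable; the colours are assumed labels.
LAYERS: List[Tuple[str, Callable[[str], Optional[str]]]] = [
    ("green", semantic_layer),
    ("yellow", syntactic_layer),
    ("red", chunk_layer),
]


def translate(text: str) -> Tuple[str, str]:
    """Return (translation, colour) from the first layer that succeeds."""
    for colour, layer in LAYERS:
        result = layer(text)
        if result is not None:
            return result, colour
    raise ValueError("no layer produced a translation")


if __name__ == "__main__":
    for sentence in ["good morning", "the cat sleeps", "the cat snores"]:
        print(sentence, "->", translate(sentence))
```

Because the chunk layer always returns something, the user gets at least a rough translation for any input, while the colour signals how much to trust it.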
The talk will explain the main ideas in this technique, based on GF (Grammatical Framework) and also inspired by statistical methods (probabilistic grammars) and the Apertium system (chunk-based translation), boosted by freely available dictionaries (WordNet, Wiktionary), and built by a community of over 50 active developers. The current system covers 11 languages and is available both as a web service and as an Android application. (extended version of my talk at Vienna Summer of Logic, http://www.easychair.org/smart-program/VSL2014/NLSR-2014-07-18.html#talk:3396)
The strata of language representation in computational linguistics and human language processing (e.g., phonology, phonetics, morphology, syntax, semantics, pragmatics, discourse) rely on mathematical models and methods. We begin with a brief historical overview of the areas shaped by such layers of representation and corresponding methods.
We present Chomsky's criteria for adequacy and related developments in syntactic theories. From the perspective of model-theoretic approaches to language theory, Barwise and Perry (1983) identified semantic universals of human languages that serve as criteria for the adequacy of theories of meaning. We present these universals by pointing to corresponding tasks in computational linguistics.
We give an overview of the major approaches to computational syntax, semantics, and syntax-semantics interfaces. To provide background for the forthcoming lectures and seminars, we introduce linguistic concepts such as words vs. phrases, the head of a phrase, compositionality, and underspecification.
We consider contributions from the theories of formal languages, formal grammar, first-order logic (FOL), and higher-order logic (HOL). FOL is a valuable and sophisticated area in logic and its applications. During the series, we will be returning to FOL to point to some tasks and problems in its applications to linguistics. We will also point to logics used as semantic representations in syntactic theories and syntactic approaches.
Slides for the lecture | Handouts for the lecture