kriptia.com

COMPUTATIONAL LINGUISTICS

16 theses on 1 page
  • ACQUISITION OF LEXICAL AND MORPHOSYNTACTIC INFORMATION FROM UNANNOTATED CORPORA: APPLICATION TO RUSSIAN AND CROATIAN.
    Author: OLIVER GONZÁLEZ ANTONIO.
    Year: 2003.
    University: BARCELONA [www.ub.es].
    Place of defense: FILOLOGÍA.
    Place of preparation: UNIVERSITAT DE BARCELONA.
    Summary: This thesis presents several methodologies for the automatic acquisition of lexical and morphosyntactic information and for the unsupervised learning of morphology from unannotated corpora. The methodologies were tested on two Slavic languages, Russian and Croatian, which are characterized by a rich, predominantly concatenative morphology. This feature was exploited in the design of the algorithms, which can readily be adapted to other languages that have a relatively rich morphology and whose main morphological processes, whether suffixal or prefixal, can be described concatenatively. The methodologies were evaluated exhaustively and shown to work very well for these languages. Because they operate on unannotated corpora, they are well suited to creating new lexical resources or expanding existing ones. The algorithms can also search the Internet for information not present in the corpus, so the process can be applied without collecting large corpora.
  • TERMINOLOGICAL DATA BANK FOR MERCOSUR: A PROPOSED MODEL.
    Author: AMARO DE MELO BIANCA.
    Year: 2003.
    University: POMPEU FABRA [www.upf.edu].
    Place of defense: INSTITUTO UNIVERSITARIO DE LINGÜÍSTICA APLICADA.
    Place of preparation: UNIVERSIDAD POMPEU FABRA.
    Summary: Starting from the principles of the Communicative Theory of Terminology, this thesis proposes a model for building and managing a terminological database for the Southern Common Market (MERCOSUR). To formulate the model, it analyzes the recommendations found in the existing literature on the subject, as well as the state of development of the languages of the MERCOSUR member countries and of communication within that common market. It also presents a suggested format for entering and exchanging terminological data, the characteristics that the proposed system must have, and an assessment of the feasibility of each stage of its construction.
  • A PROPOSED ARCHITECTURE FOR INTEGRATING HETEROGENEOUS LEXICAL RESOURCES. A CONTRIBUTION FROM THE FIELD OF DATA INTEGRATION
    Author: SOROA ECHAVE AITOR.
    Year: 2004.
    University: PAÍS VASCO [www.ehu.es].
    Place of defense: FACULTAD DE INFORMATICA.
    Place of preparation: FACULTAD DE INFORMATICA.
    Summary: This thesis aims to build a system for integrating lexical resources. In this proposal the integrated resources retain a very high degree of autonomy, i.e. they need not change their design in any way to take part in the system. The integrated lexical sources can thus be said to have "a life of their own"; indeed, they may not even "know" that they are being integrated. The proposal is based on a general model for representing lexical information, called the Global Conceptual Model (GCM). The GCM has two main functions in the system. On the one hand, it ensures communication with the user, who uses the concepts and relations of the GCM to express queries. On the other hand, the GCM is the general schema onto which the various lexical sources are integrated: the schema of each integrated resource is linked to the GCM through semantic integration rules. The thesis presents ELHISA (Ezagutza Lexikal Heterogeneoen Integrazio Sistema, System for the Integration of Heterogeneous Lexical Knowledge). Through ELHISA the user, whether a human or an NLP application, gains unified access to different lexical resources. The user sends a query to the system, which translates it into the schema of each source relevant to the answer, forwards it to those integrated lexical sources, collects their responses, and finally presents a unified response to the user.
  • SYNTACTIC-SEMANTIC ANALYSIS OF THE DEFINITIONS IN EUSKAL HIZTEGIA TO BUILD A LEXICAL KNOWLEDGE BASE. LEXICAL-SEMANTIC RELATIONS BETWEEN WORDS: DEFINITION PATTERNS, DERIVATION AND POSTPOSITIONS
    Author: LERSUNDI AYESTARAN MIKEL.
    Year: 2004.
    University: PAÍS VASCO [www.ehu.es].
    Place of defense: FACULTAD DE CIENCIAS SOCIALES Y DE LA COMUNICACION.
    Place of preparation: FACULTAD DE FILOLOGIA, GEOGRAFIA E HISTORIA.
    Summary: Syntactic-semantic analysis of the definitions in the Euskal Hiztegia dictionary in order to build a lexical knowledge base. Lexical-semantic relations between words: definition patterns, derivation and postpositions. This thesis is the first step in building a Knowledge Base of Basque. We analyzed a monolingual Basque dictionary (Euskal Hiztegia; Sarasola, 1996) and extracted the lexical-semantic relations that hold between the words in it. Marking the relations between words will make tasks such as text summarization and machine translation easier.
To extract the relations, we analyzed the definition patterns used in the dictionary in order to identify the relations between each headword and the words in its definition. These are not the only relations we extracted: we also determined the relations between the words within a definition, by analyzing the lexical patterns in the definition and the postpositions that link its words. We connected this work with the definitions found in dictionaries of other languages, which required a study of Spanish and English prepositions; for this we used the Lexical Conceptual Structures (LCS) for prepositions developed at the University of Maryland. We also worked on derivation: when an entry is a derived word, we tried to determine the relation between the derived word and its base.
  • SUPERVISED WORD SENSE DISAMBIGUATION FACING CURRENT CHALLENGES
    Author: MARTINEZ IRAOLA DAVID.
    Year: 2004.
    University: PAÍS VASCO [www.ehu.es].
    Place of defense: FACULTAD DE INFORMATICA.
    Place of preparation: FACULTAD DE INFORMATICA.
    Summary: This thesis addresses Word Sense Disambiguation (WSD), a task of great importance in Natural Language Processing (NLP). The problem has been studied widely and from many perspectives; our main contribution is an in-depth study of supervised disambiguation systems, analyzing their limitations and proposing solutions. Supervised systems need hand-tagged examples (that is, examples supervised by people) in order to learn their models with statistical algorithms. These are the methods that have achieved the best results for all languages in the evaluation campaigns that have spread in recent years. In our work we focus in particular on the following topics: rich representations of context, "smoothing" techniques to address the data sparseness problem, methods for acquiring examples automatically, and the portability problems of WSD systems.
  • AUTOMATIC CONSTRUCTION OF WIDE-COVERAGE DOMAIN-INDEPENDENT LEXICO-CONCEPTUAL ONTOLOGIES
    Author: FARRERES DE LA MORENA JAVIER.
    Year: 2004.
    University: POLITÉCNICA DE CATALUÑA [www.upc.edu].
    Place of defense: DEPARTAMENTO LLENGUATGES I SISTEMES INFORMÁTICS.
    Place of preparation: EDIFICI C6 CAMPUS NORD.
  • METALINGUISTIC INFORMATION EXTRACTION FROM SPECIALIZED TEXTS TO ENRICH COMPUTATIONAL LEXICONS
    Author: Rodríguez Penagos Carlos.
    Year: 2004.
    University: POMPEU FABRA [www.upf.edu].
    Place of defense: Departamento de Traducción y Filología.
    Place of preparation: Departamento de Traducción y Filología.
    Summary: This thesis presents an empirical study of the use and role of metalanguage in scientific and specialist discourse in English, with special attention to how the shared terminology of the specialists in each field is established, modified and negotiated. Through discourse utterances called Explicit Metalinguistic Operations, it formalizes and analyzes the dynamic nature of the conceptual structures that scientific sublanguages convey. It also presents the implementation of an automated system for extracting metalinguistic information from specialized texts. The MOP (Metalinguistic Operation Processor) system extracts metalinguistic content and definitions from specialized documents, using both finite-state automata and machine learning algorithms. The system builds semi-structured terminological information bases called Metalinguistic Information Databases (MIDs), useful for specialized lexicography, natural language processing and the empirical study of the evolution of scientific knowledge, among other applications.
  • LINGUISTIC KNOWLEDGE IN AUTOMATIC SEMANTIC DISAMBIGUATION.
    Author: NICA MIHAELA IULIANA.
    Year: 2004.
    University: BARCELONA [www.ub.es].
    Place of defense: FACULTAD DE FILOLOGÍA.
    Place of preparation: FACULTAD DE FILOLOGÍA UNIVERSIDAD DE BARCELONA.
    Summary: Automatic Semantic Disambiguation (DSA) is an open problem in Natural Language Processing. This thesis explores specific ways of integrating linguistic knowledge into the DSA process, with the aim of improving the quality of the task. It proposes a new strategy in which words are disambiguated not in isolation but as part of lexico-syntactic patterns. This integration makes it possible to extract from the corpus paradigmatic and syntagmatic information related to a given ambiguous occurrence. The thesis also derives a new lexical source from EuroWordNet, the Sense Discriminators, and defines on top of it a new DSA algorithm, the Commutability Test. Applying this algorithm and the Specificity Mark algorithm to the information extracted from the corpus yields a DSA system with several heuristics. The results obtained in testing (an accuracy of approximately 95%) indicate that the approach is suitable for improving DSA performance. The thesis opens a new line of research in DSA.
  • REPRESENTING DISCOURSE FOR AUTOMATIC TEXT SUMMARIZATION VIA SHALLOW NLP TECHNIQUES.
    Author: ALONSO ALEMANY LAURA.
    Year: 2004.
    University: BARCELONA [www.ub.es].
    Place of defense: FACULTAD DE FILOLOGÍA.
    Place of preparation: FACULTAD DE PSICOLOGÍA (UNIVERSIDAD DE BARCELONA).
    Summary: This thesis addresses the problem of automatic summarization from a linguistic perspective. It observes that many successful approaches to the problem rely on general characteristics of texts. The thesis shows that some general properties of the discourse organization of texts can be identified at a shallow level; these properties provide objective evidence for theories of text organization and are useful for improving existing approaches to automatic summarization. To support this claim, relevant discourse cues were identified and systematized, and their usefulness was validated empirically.
  • AN ENRICHED COMPUTABLE MODEL FOR THE ACQUISITION AND REPRESENTATION OF LINGUISTIC KNOWLEDGE FOR MULTIDIRECTIONAL PROCESSING OF THE SPANISH LANGUAGE
    Author: GALLARDO PEREZ CAROLINA.
    Year: 2005.
    University: POLITÉCNICA DE MADRID [www.upm.es].
    Place of defense: FACULTAD DE INFORMATICA.
    Place of preparation: FACULTAD DE INFORMATICA UNIVERSIDAD POLITECNICA DE MADRID.
    Summary: The traditional approach to formalizing and processing linguistic knowledge has relied on formal grammars derived from linguistic theories. However, the needs of NLP systems for real applications have exposed the shortcomings of systems based on linguistic theories. Such systems are suitable for building theoretical research tools but inadequate for designing real applications, because the theoretical framework of a linguistic theory imposes excessive rigidity on linguistic knowledge. Linguistic theories are not the only "knowledge" that linguistics provides. Descriptive grammars, although not designed for computational use, offer an overview of language use and greater coverage of grammatical knowledge, despite their lower degree of formalization. Because of this lower degree of formalization, NLP has focused on the more directly computable models derived from theoretical grammars, to the detriment of descriptive grammars, which are oriented far more toward observed data than toward computational application; as a result, the wealth of grammatical knowledge contained in a descriptive grammar goes unused. In this context, this work aims to define a model that makes it possible to elicit the knowledge of the Spanish language contained in the most comprehensive reference work to date, the "Gramática Descriptiva de la Lengua Española" (GDLE, Descriptive Grammar of the Spanish Language) of the Real Academia Española, and to represent it in a computable form that is independent of its final implementation. The source (GDLE) was analyzed thoroughly using knowledge engineering methodologies, and its contents were modeled through a knowledge representation based on a static model (objects and classes) and a dynamic model (rules, operations and processes). In other words, a Knowledge-Based Systems development method was followed to elicit knowledge from a descriptive grammar.
This representation is the only one that can guarantee that the model is useful and maintainable: its modularity makes it scalable (its linguistic coverage can be extended very easily) and reversible, meaning that the rules do not depend on the direction of the two core processes, language analysis and language generation. The processes involved were grouped into a few functional units (Lexical, Syntactic and Semantic), which formed the basis for defining a computational architecture suited to distributed systems, namely the blackboard architecture. The model was validated against a set of test cases designed to verify the linguistic coverage defined in the working hypotheses and the correctness of the model, especially with regard to reversibility. The model has no precedent for the Spanish language and opens a new line of work in which computational grammars can be developed with great ease. The model can also be exported to other languages: its architecture and structures are flexible enough to allow rapid and coherent adaptation to them (except for the knowledge extraction process).
  • A MODEL FOR THE DESIGN, ANALYSIS AND EXPLOITATION OF A CORPUS. SYNCHRONY AND DIACHRONY OF THE SPANISH SUFFIX -OSO
    Author: DÍAZ GARCÍA MARÍA TERESA.
    Year: 2005.
    University: SANTIAGO DE COMPOSTELA [www.usc.es].
    Place of defense: FACULTAD DE FILOLOGÍA.
    Place of preparation: FACULTAD DE FILOLOGÍA.
    Summary: Advances in the information society are reflected in the growing use of new technologies to communicate and to gather information of all kinds, including linguistic information. Deriv@, the result of this thesis, is a system for making intelligent use of such information. The data managed by the application concern derivational morphology, since the derived word is the starting point for establishing relations with other linguistic elements. Deriv@ is a linguistic database with two representation models, one for Spanish words formed by derivation and another for the corresponding Latin words. It offers a grammatically analyzed corpus, from both synchronic and diachronic points of view, that supports queries tailored to the user's interests and makes it possible to create a dictionary of derived words. Deriv@ provides easy, unhindered access to the information contained in the two databases. The model applies to any derived form in Spanish (or other Romance languages), and its Web version leaves open the possibility of real-time access by any user, who can interact with the system and contribute to its refinement (www.derivar.com).
  • BENCHMARK FOR DECIDING ON OPTIMAL STORAGE IN XML DATABASE MANAGEMENT SYSTEMS.
    Author: SERNA NOCEDAL AINHOA.
    Year: 2005.
    University: MONDRAGÓN UNIBERTSITATEA [www.mondragon.edu].
    Place of defense: MONDRAGON GOI ESKOLA POLITEKNIKOA.
    Place of preparation: MONDRAGON GOI ESKOLA POLITEKNIKOA MONDRAGON UNIBERTSITATEA.
    Summary: Driven by DBMS vendors' efforts to unify the worlds of data and content management, existing DBMSs have added support for XML; in parallel, native XML systems known as NXDs (Native XML Databases) have been developed in both industry and academia. A comparative performance study of these approaches is of considerable scientific interest. The experiments conducted identified several criteria for evaluating these solutions through the development of a benchmark that implements different metrics to measure the performance of applications working with data in XML format. Four key criteria were studied: relevance, portability, scalability and simplicity.
  • WEB SERVICES AS A MODEL FOR COMMUNICATION BETWEEN APPLICATIONS. APPLICABILITY, INTEROPERABILITY AND ADOPTION.
    Author: GUERRICAGOITIA ARRIEN JON KEPA.
    Year: 2005.
    University: MONDRAGÓN UNIBERTSITATEA [www.mondragon.edu].
    Place of defense: MONDRAGON GOI ESKOLA POLITEKNIKOA.
    Place of preparation: MONDRAGON GOI ESKOLA POLITEKNIKOA MONDRAGON UNIBERTSITATEA.
    Summary: In 2001 the software industry, under the auspices of the W3C, agreed on the foundations of a model for communication between heterogeneous applications over the Internet, called Web Services. Vendors proclaimed that it would transform the industry, emphasizing the technology's future promise and ignoring its limitations and drawbacks. This motivated an investigation into the reality of this new technological paradigm. The main contribution is guidance for the technology adoption process in the research context, which comprises the companies associated with Mondragon Unibertsitatea. To that end, the thesis investigates the technology, its applicability and its interoperability.
  • COMBINING MACHINE LEARNING AND RULE-BASED APPROACHES IN SPANISH SYNTACTIC GENERATION
    Author: MELERO NOGUÉS MARÍA TERESA.
    Year: 2005.
    University: POMPEU FABRA [www.upf.edu].
    Place of defense: INSTITUTO UNIVERSITARIO DE LINGÜÍSTICA APLICADA.
    Place of preparation: INSTITUTO UNIVERSITARIO DE LINGÜÍSTICA APLICADA.
    Summary: This thesis describes a syntactic generation grammar that combines hand-written rules with machine learning techniques. The grammar belongs to a commercial-quality Machine Translation system developed at Microsoft Research. The first part of the thesis describes the grammar and the linguistic strategies underlying its rules. Real use of the generator in everyday MT situations requires a high degree of robustness, which is achieved by adding a pre-generation module. This module guarantees the integrity of the input without having to incorporate ad hoc elements into the grammar rules. The second part explores the use of decision tree classifiers to learn automatically some of the operations carried out in the pre-generation module, specifically the lexical selection of the copular verb in Spanish (ser or estar). We demonstrate that the contexts of this non-trivial linguistic phenomenon can be inferred from examples with great precision.
  • LEARNING VOCABULARY AND ELAO: EXPERIMENTAL ANALYSIS AND COMPARISON BETWEEN TRADITIONAL AND MULTIMEDIA RESOURCES IN THE TEACHING OF ENGLISH AS A FOREIGN LANGUAGE
    Author: MARTÍN IGLESIAS JOAQUÍN.
    Year: 2006.
    University: NACIONAL DE EDUCACIÓN A DISTANCIA [www.uned.es].
    Place of defense: FILOLOGÍA.
    Place of preparation: UNIVERSIDAD NACIONAL DE EDUCACIÓN A DISTANCIA.
    Summary: This doctoral thesis builds on research into vocabulary acquisition in a second language and updates it with the use of multimedia and ICT in the secondary-school English classroom. It analyzes software designed for this purpose, develops a consistent protocol for designing multimedia educational activities, designs such activities, and runs experiments on lexical acquisition, short- and long-term retention of vocabulary in English as a foreign language, motivation, and the variables that affect these processes. The results, although not generalizable given the number of participants, show that the use of multimedia favors natural acquisition, the creation of robust mental lexical schemas, the integration of new vocabulary into prior knowledge, and its subsequent use in the student's linguistic production and comprehension.
  • THE GENERATION OF TENSE AND ASPECT IN ENGLISH AND SPANISH: A CONTRASTIVE FUNCTIONAL STUDY.
    Author: ZAMORANO MANSILLA JUAN RAFAEL.
    Year: 2006.
    University: COMPLUTENSE DE MADRID [www.ucm.es].
    Place of defense: FILOLOGÍA.
    Place of preparation: FILOLOGÍA.
    Summary: This thesis provides a contrastive analysis of the tense and aspect systems of English and Spanish. The analysis is based on a corpus study (CREA for Spanish and BNC for English) and follows the principles of systemic functional grammar. The results of this analysis were then applied to the creation of linguistic resources for the automatic generation of tense and aspect in sentences of the two languages with the KPML generator.