Publications

Aproximación híbrida para la Generación del Lenguaje Natural

This thesis project proposes a hybrid approach to natural language generation that aims to improve the quality of the generated text while favouring independence from the domain, the textual genre, and the final application in which it is used. As a first step towards this goal, a statistical approach focused on the realisation phase has been implemented; its input guides the generation process, and it can be easily adapted to generate texts for different domains and languages.

Semántica y pragmática como factores clave en el desarrollo de un sistema de Generación de Lenguaje Natural

Language is studied from various disciplines that, in general, regard it as a form of communication that pursues a specific objective. Automatic natural language generation is the discipline responsible for adequately presenting given processed information in order to achieve that communicative objective. On the one hand, it has to determine what is to be communicated; on the other, it must decide how to say it.

Document semantic profile definition

Nowadays many users browse the Internet by means of search engines, searching among millions of web records with the aim of finding specific documents. How can documents be identified according to their needs? It is necessary to find a correct way of extracting relevant information about documents and representing it as metadata. Natural Language Processing (NLP) technologies are required to automatically extract document information and set specific metadata attributes.

An Active Ingredients Entity Recogniser System Based on Profiles

This paper describes an active ingredients named entity recogniser. Our machine learning system, which is language and domain independent, employs unsupervised feature generation and weighting from the training data. The proposed automatic feature extraction process is based on generating a profile for the given entity without traditional knowledge resources (such as dictionaries). Our results (F1 87.3% [95% CI: 82.07–92.53]) show that unsupervised feature generation can achieve high performance for this task.

Analysing the Integration of Semantic Web Features for Document Planning across Genres

Language is usually studied and analysed from different disciplines, generally on the premise that it constitutes a form of communication which pursues a specific objective. Discourse, in that sense, can be understood as a text which is constructed to express such an objective. When a discourse is created, its production is related to some textual genre, usually connected with some pragmatic features, like the intention of the writer or the audience to whom it is addressed, both conditioning the use of language.

Generating sets of related sentences from input seed features

The Semantic Web (SW) can provide Natural Language Generation (NLG) with technologies capable of facilitating access to structured Web data. This type of data can be useful to this research area, which aims to automatically produce human utterances, in its different subtasks, such as content selection or content structuring. NLG has been widely applied to several fields, for instance to the generation of recommendations (Lim-Cheng et al., 2014). However, generation systems are currently designed for very specific domains (Ramos-Soto et al., 2015) and pre-defined purposes (Ge et al., 2015).

Content Selection through Paraphrase Detection: Capturing different Semantic Realisations of the Same Idea

Summarisation can be seen as an instance of Natural Language Generation (NLG), where “what to say” corresponds to the identification of relevant information, and “how to say it” is associated with the final creation of the summary. When dealing with data coming from the Semantic Web (e.g., RDF triples), the challenge arises of how a good summary can be produced. For instance, given the RDF properties from an infobox of a Wikipedia page, how could a summary expressed in natural language text be generated?

Cross-document event ordering through temporal, lexical and distributional knowledge

In this paper we present a system that automatically builds ordered timelines of events from different written texts in English. The system deals with problems such as automatic event extraction, cross-document temporal relation extraction and cross-document event coreference resolution. Its main characteristic is the application of three different types of knowledge: temporal knowledge, lexical-semantic knowledge and distributional-semantic knowledge, in order to anchor and order the events in the timeline. It has been evaluated within the framework of SemEval 2015.
