A webcast is available of George Lakoff's presentation on Friday, March 13, 2015, “Why linguists are needed: The severe limitations of big data analysis of linguistic corpora”, arguing that “big data statistical methods by themselves were hopeless” in a multi-million dollar project on analyzing the conceptual metaphors in a vast corpus of US intelligence documents. George READ MORE
Category: Blog posts
— NLP analysis of “The art of the Humanities – Graduation day 2014”
Introduction For every master's student, there is one thing standing between them and the end of their student life: “the thesis”. For months, or even years, a student tries to analyse a certain topic in the best way possible. And after this long and difficult journey, the student receives his reward at his graduation: READ MORE
— Open Source Dutch WordNet @CLIN 2015
At CLIN 2015, we presented Open Source Dutch WordNet (odwn_clin_2015). The project website can be found at: project website. Open Source Dutch WordNet is an open source version of Cornetto (Vossen et al., 2013). Cornetto is currently not distributed as open source, because a large portion of the database originates from the commercial publisher Van READ MORE
— Similarity, co-occurrence, functional relation, part-whole relation, subcategorization, what else?
In word sense disambiguation and named-entity disambiguation, an important assumption is that a document consists of related concepts and entities. There are millions of concepts and entities; what makes some related but not others? This question is difficult and I don’t have the definitive answer. But it is a good start to list some classes READ MORE
— SensEval/Semeval output from participant systems (WSD)
If you are interested in the individual token-level output of all the participant systems in the last SensEval/SemEval WSD tasks, you can now find it in a simple and homogeneous XML format that is easy to process. You will find more information in our results section or at https://github.com/rubenIzquierdo/sval_systems READ MORE
— DBPEDIA spotlight for KAF/NAF
If you are interested in extracting entities and their links to DBpedia entries, you should take a look at this module: https://github.com/rubenIzquierdo/dbpedia_ner It takes a KAF or NAF file with just tokens and terms, calls the DBpedia online web service, and extracts the entities and their DBpedia links automatically. Given this portion of READ MORE
— Sense annotated corpora in NAF
If you work in WSD or you are simply interested in sense-annotated corpora, you should take a look at this GitHub repository. You will find some well-known corpora widely used in WSD tasks, manually annotated with WordNet senses and converted to our NAF format, which makes it very easy to use all our modules and pipelines. READ MORE
— Demo Wsd4Kids
The Wsd4Kids demo implements a very simple Word Sense Disambiguation system and a graphical interface for interacting with it. The underlying WSD system is based on a machine learning engine (Support Vector Machines) and uses a bag-of-words feature model. Word sense disambiguation (WSD) for Kids — Demo by Rubén Izquierdo Beviá Release first version READ MORE