If you work on word sense disambiguation (WSD), or are simply interested in sense-annotated corpora, you should take a look at this GitHub repository. It contains several well-known corpora widely used for the WSD task, manually annotated with WordNet senses and converted to our NAF format, which makes them very easy to use with all our modules and pipelines. The following corpora are currently available:
- SemCor
- SenseVal-2 all-words task
- SenseVal-3 all-words task
- SemEval-2010 task #17: All-words Word Sense Disambiguation on a Specific Domain
- SemEval-2007 task #17: English all words
- SemEval-2013 Task 12: Multilingual Word Sense Disambiguation (langs en,es,fr,it,de)
You will find the repository with the data here: http://github.com/rubenIzquierdo/wsd_corpora
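
To give an idea of what working with the NAF files could look like, here is a minimal Python sketch that lists the sense annotations in a NAF document. It assumes the corpora follow the usual NAF term layer, where each `term` element may carry `externalRef` elements pointing to a sense. The sample fragment, the resource name `WordNet-3.0`, the reference value `example-sense-key` and the helper `read_sense_annotations` are all made up for illustration; check the actual files in the repository for the real attribute values.

```python
import xml.etree.ElementTree as ET

# Hypothetical NAF fragment, written by hand for this example.
# The resource and reference values are placeholders, not real corpus data.
SAMPLE_NAF = """<NAF version="v3">
  <text>
    <wf id="w1" sent="1">The</wf>
    <wf id="w2" sent="1">bank</wf>
  </text>
  <terms>
    <term id="t1" lemma="the" pos="D">
      <span><target id="w1"/></span>
    </term>
    <term id="t2" lemma="bank" pos="N">
      <span><target id="w2"/></span>
      <externalReferences>
        <externalRef resource="WordNet-3.0" reference="example-sense-key"/>
      </externalReferences>
    </term>
  </terms>
</NAF>
"""


def read_sense_annotations(naf_xml):
    """Return (term id, lemma, resource, reference) for every annotated term."""
    root = ET.fromstring(naf_xml)
    annotations = []
    for term in root.iter("term"):
        # A term without externalRef children simply contributes nothing.
        for ref in term.iter("externalRef"):
            annotations.append(
                (term.get("id"), term.get("lemma"),
                 ref.get("resource"), ref.get("reference"))
            )
    return annotations


if __name__ == "__main__":
    for annotation in read_sense_annotations(SAMPLE_NAF):
        print(annotation)
```

To try it on a file from the repository, read the file contents and pass them to `read_sense_annotations`, or swap `ET.fromstring` for `ET.parse(path).getroot()`.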