UNC-Corpus: a UML-diagram corpus to solve completeness problems in software engineering
Keywords
annotated corpus, UML diagrams, XMI, repository, metamodelling, NLP, information extraction
Abstract
Computational corpora are used as tools in Natural Language Processing (NLP) to solve disambiguation, translation, and automated text generation problems. To accomplish these tasks, the main feature of computational corpora (the fact that they contain attested uses of a language) is combined with statistical analysis and with information extraction methods based on neural networks or genetic algorithms. In software engineering, there is no evidence of the use of computational corpora of diagrams. Diagram repositories serve a similar purpose, since they work with real examples of diagrams (mainly for reuse), but they use neither statistics nor heuristic methods for information extraction. This paper proposes the UNC-Corpus, a tool for managing a corpus of UML (Unified Modelling Language) diagrams that applies traditional NLP techniques to solve completeness problems in software engineering.