UNC-Corpus: a UML-diagram corpus to solve completeness problems in software engineering

Carlos M. Zapata J.
Juan C. Hernández P.
Raúl A. Zuluaga

Keywords

annotated corpus, UML diagrams, XMI, repository, metamodelling, NLP, information extraction

Abstract

Computational corpora are used in Natural Language Processing (NLP) to solve disambiguation, translation, and automated text generation problems. To accomplish these tasks, the main feature of computational corpora (they record attested uses of a language) is combined with statistical analysis and with information extraction methods based on neural networks or genetic algorithms. In software engineering, there is no evidence of the use of computational corpora of diagrams. Diagram repositories serve a similar purpose, since they work with real examples of diagrams (mainly for reuse), but they use neither statistics nor heuristic methods for information extraction. This paper proposes the UNC-Corpus, a tool for managing a corpus of UML (Unified Modelling Language) diagrams that applies traditional NLP techniques to solve completeness problems in software engineering.
