On the Analysis of Identifier Names in Software Systems
Software Quality, Names, Static Analysis
One of the main concerns of a software developer is writing code that other developers can easily read and understand, and choosing good identifier names contributes directly to that goal. Identifier names are a topic of great interest in Software Engineering, with studies claiming that the names found in code are crucial for understanding it and make it easier to write and maintain. Identifier names serve as code documentation, and poorly chosen names can lead to confusion and even implementation errors. Because naming is so important yet often underestimated, choosing identifier names well is essential for writing quality code, and checking whether the chosen names are in fact good should be a constant concern of software developers. Therefore, in this work we conducted three studies to assess the current quality of identifier naming in open source software systems. First, we considered the semantic similarity between identifiers that appear in similar scopes. Second, we checked the pronunciation of names, which impacts code review discussions. Finally, we describe some naming practices and verify their recurrence in these software systems. To this end, 1,421,607 identifier names were extracted from the source code of 40 projects and analyzed. The results of this research can contribute to the quality analysis of identifiers in real projects from different perspectives on naming quality, and to the development of a tool to support software development.