Prediction of school dropout at the Federal Institute of Minas Gerais with the support of Machine Learning techniques
School Dropout, Machine Learning, Data Mining, IFMG
School dropout is influenced by many variables, which makes it difficult to identify the factors that lead a student to leave their academic institution. Over the last decade, the offer of higher education courses in Federal Education Institutions has expanded considerably, driven largely by public policies that improved the physical infrastructure and staffing of educational units. This expansion has allowed individuals with widely varied profiles to begin their studies, which in turn makes the task of understanding school dropout more complex for managers. In parallel, Machine Learning has extended its range of applications to many areas, including education, providing different ways of analyzing and understanding the data generated within each institution or organization. This dissertation aimed to use Machine Learning techniques to predict the risk of school dropout in undergraduate courses at the Federal Institute of Education, Science and Technology of Minas Gerais (IFMG), as well as to identify which attributes are most associated with this phenomenon at the institution. The work was structured and organized according to the CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology. Three phases of experiments were conducted: the first addressed the balancing of the dataset, the second applied Feature Selection techniques, and the third applied a semi-supervised learning strategy to improve the performance metrics obtained. The main result is a model capable of classifying dropout with 90% accuracy and an F1 score of 86%, which indicates considerable potential to complement institutional actions aimed at controlling dropout levels at the IFMG.
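To illustrate why dataset balancing and the F1 score matter alongside accuracy in a dropout setting, the following minimal Python sketch may help. It is not the pipeline used in this work: random oversampling is assumed here only as one simple balancing strategy, and the function names (random_oversample, accuracy_and_f1) and the toy labels are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate rows of smaller classes at random until all classes
    have as many rows as the largest class (illustrative strategy only)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement to fill the gap up to the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) with `positive` treated as the dropout class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Toy data (hypothetical): 8 students who stayed (0), 2 who dropped out (1).
    rows = [[i] for i in range(10)]
    labels = [0] * 8 + [1] * 2

    # A model that never predicts dropout looks accurate but has zero F1.
    print(accuracy_and_f1(labels, [0] * 10))

    # After oversampling, both classes are equally represented.
    _, balanced_labels = random_oversample(rows, labels)
    print(Counter(balanced_labels))
```

On the toy labels, a classifier that never predicts dropout is right for 8 of 10 students yet identifies no dropout case, which is why accuracy and F1 are reported together. In practice, oversampling would be applied to the training split only, so that duplicated rows do not leak into the evaluation data.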