Research Articles

Vol. 6 No. 11 (2026): Revista Simón Rodríguez

Academic performance prediction through data mining in statistics students

Original title: Predicción del rendimiento académico mediante minería de datos en estudiantes de estadística
Published
2026-02-02

Predicting university academic performance with data mining techniques has emerged as a tool for optimizing educational processes and improving student outcomes. This study uses data mining to predict the academic performance of statistics students at the Universidad Nacional de Piura over the period 2010-2018. The study is applied and quantitative, with a non-experimental, retrospective longitudinal design. The population comprised 510 academic records, all of which were analyzed (census sample). The instruments were the Integrated Academic Management System (Sistema Integrado de Gestión Académica) and IBM SPSS v.27. After data cleaning, normalization, and partitioning, the records were analyzed with artificial neural networks and with multiple linear regression, including validation of the regression assumptions. Multiple linear regression was more effective for predicting the weighted average (mean squared error, CME = 0.761; R² = 95.3%), while neural networks were more effective for predicting specific grades (CME = 1.095). Course difficulty level 1 was the most important variable (100% normalized importance). The study concludes that the two techniques are complementary and viable for predicting academic performance, and that the results provide empirical evidence for student support systems based on educational analytics.
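The three figures reported in the abstract (CME, R², and normalized importance) can be illustrated with a minimal Python sketch. It is not the authors' SPSS procedure. It assumes that CME is the plain average of squared residuals (the study may instead divide by the residual degrees of freedom) and that normalized importance divides each importance by the largest one, which is why the top variable always reads 100%. All grades, variable names, and importance values below are hypothetical.

```python
from statistics import mean


def mean_squared_error(actual, predicted):
    """Average of squared residuals (assumption: no degrees-of-freedom correction)."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def normalized_importance(importances):
    """Rescale importances so that the largest one equals 100 percent."""
    top = max(importances.values())
    return {name: 100 * (value / top) for name, value in importances.items()}


# Hypothetical observed and predicted grades (not data from the study).
actual = [12.0, 14.0, 10.0, 16.0, 13.0]
predicted = [11.0, 14.0, 11.0, 15.0, 13.0]

mse = mean_squared_error(actual, predicted)
r2 = r_squared(actual, predicted)

# Hypothetical raw importances; the names are illustrative only.
importance = normalized_importance(
    {"difficulty_1": 0.40, "credits": 0.10, "semester": 0.05}
)

print(f"MSE = {mse:.3f}, R^2 = {r2:.1%}")
for name, pct in importance.items():
    print(f"{name}: {pct:.0f}%")
```

Under these assumptions, a lower CME means predictions sit closer to the observed grades, and an R² of 95.3% means the regression accounts for about 95% of the variance in the weighted average.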
