C1 Insights | Correlation One Blog

DS4A Capstone Project Spotlight | Analyzing and Predicting Academic Dropout at the Universidad de Córdoba

Written by Correlation One | November 2, 2021

In this blog series, we’re proud to shine a light on some of the top Capstone projects from the fifth graduating class of Data Science for All / Colombia. Capstone projects are a critical component of the program’s curriculum. Teams work on projects together to apply what they learn during the program to a real-world data problem. These projects were sourced directly from public and private entities in Colombia and solve a real problem these entities are facing. Through these projects, our graduates learn practical, job-oriented data skills and give back to their community using the power of data science & AI.

Meet the Team

Dennis García

Dennis is an electronic engineer with a master's degree in IT who works as a project manager in innovation, digital transformation, technology adoption, e-government, and other ICT-related projects. After graduating from the DS4A program, he hopes to help more companies and people through data science, and to improve the performance and monitoring of future teams with better tools for visualizing and understanding data.

Juan José Rodríguez

Juan is a final-year Civil Engineering student at Universidad Nacional de Colombia. He decided to join DS4A to enhance his data science and analytics skills. Juan also works as a Business Intelligence Analyst at Teleperformance, where he applies the knowledge gained during the program to text-mining projects that use natural language processing techniques. He plans to continue learning about data science and machine learning for his career development.

Fabio Andrés Bombiela Ramírez

Fabio is an Electronic Engineer who is currently pursuing a master's degree in information science and working as an automation tester in IT. Fabio has always been interested in acquiring new knowledge and took the opportunity to apply to DS4A. His goals are to apply what he learned during the program to areas such as electronics and related subjects, and to keep learning about data science. He is an IoT enthusiast.


Erika Ximena Rojas González

Erika is a final-semester Industrial Engineering student with a special interest in telling stories through data. She currently works in People Recruitment Analytics at a company in the financial sector.

Iván David Molina Naizir

Mathematician with a master's degree in Systems Engineering with an emphasis on Software Engineering. I currently work as a Business Intelligence developer at Gases del Caribe. I chose to take part in the DS4A course to strengthen my knowledge of data science and put it into practice in my work environment.

Carolina Albarracín Hernández

Mathematician with a master's degree in Mathematical Sciences and PhD candidate with a great interest in partial differential equations, mathematical modeling, and programming. I am currently dedicated to teaching and to learning about mathematical applications.

Daniel Otero

Daniel is an Electronic Engineer who holds an M.Sc. in Electronics and a Ph.D. in Applied Mathematics. He currently works as a Research Professor at the Tecnológico de Monterrey. He joined the DS4A program to expand his knowledge of Data Science and Machine Learning, the two fields he is currently most interested in.

About the Project: Analyzing and Predicting Academic Dropout at the Universidad de Córdoba

Project Overview

Given the circumstances created by the pandemic, it was in the university’s interest to study, analyze, and predict students' academic performance based on their online activity on both the LMS and the academic platform Power Campus. The university also wanted technological tools that provide a deep understanding of students' academic behavior and support institutional strategies for continuous improvement. This led to the following research question:

How can students' academic performance be predicted from their online activity on both the LMS and the academic platform Power Campus so that student dropout can be reduced?

What were the most exciting/surprising findings from your project?

While exploring the data, we were surprised to find that academic dropout is not a serious issue at the Universidad de Córdoba, which is good news. However, it also meant that our training sets were imbalanced, which made developing our models a challenge.
We also noticed that some students were not so “young,” which shows that “learning has no age.”

What were some challenges you faced and how did you overcome them?

Having a large amount of data can be an advantage for training models, but preprocessing and cleaning it, and keeping the interface fast enough for a good user experience, become difficult as the data grows. We overcame these challenges by using a variety of tools to process the data, such as PostgreSQL, Google Cloud, and Java.

We also had to deal with imbalanced training sets, which are not ideal for training machine learning models. Assigning weights to the classes and choosing appropriate performance metrics for model selection were two key factors in obtaining good results.
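To make the class-weighting and metric-selection ideas concrete, here is a minimal, library-free Python sketch. It is not the team's actual code: the labels are made-up (95 students who stayed, 5 who dropped out), the function names are hypothetical, and the "balanced" weighting formula shown is one common heuristic, assumed here for illustration only.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the positive (minority) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical labels: 1 = dropped out, 0 = stayed (illustrative data only).
y_true = [0] * 95 + [1] * 5

# The rare "dropout" class receives a much larger weight than the common class.
weights = balanced_class_weights(y_true)

# A naive model that predicts "nobody drops out" looks accurate
# but never identifies a single at-risk student.
y_naive = [0] * 100
naive_accuracy = accuracy(y_true, y_naive)
naive_precision, naive_recall, naive_f1 = precision_recall_f1(y_true, y_naive)

print(weights)
print(naive_accuracy, naive_recall, naive_f1)
```

In this toy setting, accuracy alone would favor the naive model, while recall and F1 on the dropout class expose that it is useless for early detection. That is the kind of reasoning behind picking metrics suited to imbalanced data.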

Who is your team’s mentor and how did he/she help?

Our teaching assistants were Luis Rojas and Esteban Betancur. They guided us through the different phases of the project and helped us find technical assistance for building the user interface of our app with the Dash framework.

What do you view as the impact of your project?

We think our solution will mainly contribute to the early detection of dropout in the university population and, consequently, help mitigate and prevent it. In addition, we developed an interactive dashboard that visualizes the available information, helping the university's professors and administrative personnel use data efficiently for decision-making.

Furthermore, this project will help administrative staff and teachers see how the Moodle platform is used by its different actors: students, teachers, and editing teachers. Users of the app can also analyze and compare campuses, programs, and courses by their Moodle usage and by the academic performance of their students. Moreover, teachers will find the app useful for detecting which students are at high risk of dropping out so that they can give them special assistance.

Congratulations to this team, their mentors, and their TAs on this accomplishment!

If you're interested in joining our Data Science for All mission, whether by recruiting our Data Science for All fellows or by becoming a Mentor, please get in touch.