Data Mining Application for Determining Students’ Academic Performance (a Case Study of Kwara State Polytechnic, Ilorin)
CHAPTER ONE
AIM AND OBJECTIVES OF THE STUDY
The aim of this project is to design a computer-based application that summarizes the assessment and performance-monitoring attributes of students and which, when expanded, holds key information that answers questions about students’ academic performance. The objectives of the study are as follows:
- To observe and compare individual, segmented and aggregated student performance variables by analyzing the activities of the whole student base and then building a single predictive model.
- To provide a continuous “Just-In-Time” student performance assessment model that predicts performance with a reasonable degree of accuracy, thereby enhancing the monitoring of students’ academic pursuits, and serving other stakeholders’ interests, at any point and for any student during the student’s tenure at the educational institution.
- To develop an effective computer-based modeling process that integrates all the data objects and rules needed for performance prediction, allowing for quality control in the institution.
CHAPTER TWO
LITERATURE REVIEW
REVIEW OF PAST WORK ON THE SUBJECT
As stated in the project undertaken by Rahmon Mosunmola Mariam (2010) titled “Timetable Generation System for Academic Purpose”: “Heuristic ordering based methods, very similar to those used for graph colouring problems, have long been applied successfully to the examination timetabling problem. Despite the success of these methods on real life problems, even with limited computing resources, the approach has the fundamental flaw that it is only as effective as the heuristic that is used. One of the motivations of this paper is to attempt to develop approaches that can operate at a higher level of generality and that can adapt heuristics to suit the particular problems in hand”.
In a project carried out by Abdulafeez Olaitan (2000) titled “Computerization of Student Result of Academics Records, a case study of Kwara State Polytechnic Ilorin”, the processing of students’ results was attempted but not solved, because the work was implemented using an input/output (I/O) file method. This method of managing records is an old approach that offers no data or information security. Among the stated advantages of the project are the entry of students’ results across a network of computers and the elimination of redundancy and record duplication. However, the programming interface displayed in the project does not show a networked system of data entry; moreover, duplication and redundancy cannot be handled well by an I/O file record-keeping method.
Adademo Christiana Iyabo (1998) carried out a project titled “Computerization of Issuance of Final Year Students’ Result”, which is limited to the production of National Diploma (ND) students’ results only. The work covers both result processing and the issuance of results for final-year ND students. One major problem stated in her work that prompted the development of the system was the student population, which had outgrown the resources available for result processing. A setback of this project lies in the programming language used for implementation: the writer used the dBase IV programming language to develop the proposed result-entry application, which is a DOS-based program without a well-defined user-friendly interface.
REVIEW OF GENERAL TEXT
Several authors have written on data mining and its application in several fields of study. Data mining is applicable in medicine, academics, engineering, etc. Frawley defined data mining as “The nontrivial extraction of implicit, previously unknown, and potentially useful information from data”. It uses machine learning, statistical and visualization techniques to discover and present knowledge in a form that is easily comprehensible to humans. Data mining evolved from a simple extraction of raw data to an analytical process of exploring large amounts of data in order to identify common denominators or patterns (Frawley, 1991).
Kantardzic further emphasized that manual extraction of patterns from data has occurred for centuries. Early methods of identifying patterns in data include Bayes’ theorem and regression analysis. The proliferation, ubiquity and increasing power of computer technology have increased data collection, storage and manipulation. As data sets have grown in size and complexity, direct hands-on data analysis has increasingly been augmented with indirect, automatic data processing. He stressed that data mining has been aided by other discoveries in computer science, such as neural networks, clustering, genetic algorithms, decision trees and support vector machines. Data mining is the process of applying these methods to data with the intention of uncovering hidden patterns. It has been used for many years by businesses, scientists and governments to sift through volumes of data such as airline passenger trip records, census data and supermarket scanner data to produce market research reports (Kantardzic, 2003).
A primary reason for using data mining is to assist in the analysis of collections of data. Such data are vulnerable to collinearity because of unknown interrelations. An unavoidable fact of data mining is that the subsets of data being analysed may not be representative of the whole domain, and therefore may not contain examples of certain critical relationships and behaviours that exist across other parts of the domain. To address this sort of issue, the analysis may be augmented using experiment-based and other approaches, such as choice modeling for human-generated data. In these situations, inherent correlations can be either controlled for, or removed altogether, during the construction of the experimental design (Miller and Han, 2001).
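The collinearity concern can be made concrete with a small check. The following Python sketch computes the Pearson correlation between every pair of predictor variables and flags the pairs that are almost perfectly correlated. The variable names (test_score, exam_score, attendance), the sample values and the 0.9 threshold are hypothetical and chosen only for illustration; they are not taken from the project data.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def collinear_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair of columns with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical predictor variables for five students (illustrative values only).
columns = {
    "test_score": [10, 12, 15, 18, 20],
    "exam_score": [40, 48, 60, 72, 80],  # exactly 4 * test_score, so collinear
    "attendance": [90, 60, 85, 70, 75],
}

for name_a, name_b, r in collinear_pairs(columns):
    print(f"{name_a} and {name_b} are strongly correlated (r = {r})")
```

A pair flagged in this way carries largely the same information, so one of the two variables could be dropped or the pair combined before a predictive model is built.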
CHAPTER THREE
PROJECT METHODOLOGY
METHOD OF DATA COLLECTION
The data collection methods employed in gathering information for this project are as follows:
Personal Observation
This was conducted through physical observation within the organization. This form of investigation was introduced to measure the extent to which the organization actually lacks computer operational facilities, and the constraints resulting from its managerial protocols.
The Internet
An Internet survey helped in gathering related online journals and texts. Relevant information was extracted from these materials and used to aid the analysis and design of this project.
Literature Review
Relevant information was also extracted from the literature review. This refers to the reading of textbooks and past projects for related information.
CHAPTER FOUR
DESIGN, IMPLEMENTATION AND DOCUMENTATION OF THE SYSTEM
DESIGN OF THE SYSTEM
System design describes the output design, input design, procedure design and database design. Output design describes the program outputs. Input design gives details of the input medium and interface. Database design gives a detailed specification of the database structure, while procedure design describes the modules that make up the program.
CHAPTER FIVE
CONCLUSION AND RECOMMENDATIONS
SUMMARY
Performance monitoring involves assessments, which play a vital role in providing information that helps students, teachers, administrators and policy makers to take decisions (Council, 2001). Changing factors in contemporary education have led to a quest to monitor student performance in educational institutions effectively and efficiently. This monitoring is now moving away from traditional measurement and evaluation techniques towards data mining techniques, which employ various intrusive data penetration and investigation methods to isolate vital implicit or hidden information.
Assessment, as a dynamic process, produces data from which stakeholders derive reasonable conclusions for decision making that is expected to impact students’ learning outcomes. By extracting useful, valid patterns from the higher education database environment, the data mining methodology contributes to proactively ensuring that students maximize their academic output. This project has developed a methodology, from the derivation of performance prediction indicators to the deployment of a simple student performance assessment and monitoring system within a teaching and learning environment, focusing mainly on monitoring students’ examination scores in order to predict their final achievement status upon graduation. Based on various data mining techniques and the application of machine learning processes, rules are derived that enable the classification of students into their predicted classes. The deployed computer-based solution integrates measuring, ‘recycling’ and reporting procedures in the new system to optimize prediction accuracy.
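The idea of classifying students into predicted classes by means of rules can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the rule set derived in the project: the rules here are hand-written threshold rules on a running grade point average, and the cut-off values, class labels and function names (RULES, running_gpa, predict_class) are hypothetical.

```python
# Hypothetical rules: (minimum running grade point average, predicted class).
# The cut-offs and labels are illustrative assumptions, not the project's rules.
RULES = [
    (3.50, "Distinction"),
    (3.00, "Upper Credit"),
    (2.50, "Lower Credit"),
    (2.00, "Pass"),
]
DEFAULT_CLASS = "Fail"


def running_gpa(results):
    """Credit-weighted mean grade point; results is a list of (units, grade_point)."""
    total_units = sum(units for units, _ in results)
    if total_units == 0:
        raise ValueError("no course units recorded")
    return sum(units * grade_point for units, grade_point in results) / total_units


def predict_class(results):
    """Apply the rules in order and return the first class whose minimum is met."""
    gpa = running_gpa(results)
    for minimum, label in RULES:
        if gpa >= minimum:
            return label
    return DEFAULT_CLASS


# Hypothetical examination record so far: (course units, grade point earned).
student = [(3, 4.0), (2, 3.0), (3, 3.5)]
print(predict_class(student))
```

Because the prediction depends only on the results recorded so far, the same function can be called again after each new set of examination scores, which matches the continuous monitoring described above.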
EXPERIENCE GAINED AND PROBLEMS ENCOUNTERED
The research work has helped to improve my knowledge in several ways. These include understanding the basic concepts of data mining and how it is applicable in different fields of study such as academics, medicine, engineering, etc. I gained more experience of how important Microsoft Visual Basic is in application development. The project made me realize that computer programming goes far beyond computing the sum and average of numbers. Surfing the Internet for related literature on data mining revealed many ideas that had previously been hidden from me. I realized how complicated data mining techniques are, their role in academic settings and how they can be used to enhance administrative performance through effective decision making.
It is almost impossible for anyone to carry out research work, individually or collectively, without encountering some problems along the way. The problems encountered ranged from limited resources to time constraints. Data mining is a very wide area to cover within the stipulated time, and gathering resources for the project initially looked difficult. Thanks, however, go to the WWW Consortium, which has made the Internet a world knowledge bank for everybody, where one can borrow without collateral.
CONCLUSION
The results of this project indicate that the capabilities of data mining tools provided an effective means of monitoring student academic performance, with an overall 94% success rating, and that fine-tuning the derived variables improves rule quality and hence performance.
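A success rating of this kind is commonly computed as the proportion of students whose predicted class matches the class they actually obtained. The Python sketch below shows that calculation on the assumption that this is what the rating measures; the source does not state the exact formula, and the sample labels and the function name accuracy are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted class equals the actual class."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical predicted and actual final classes for five students.
predicted = ["Pass", "Upper Credit", "Distinction", "Pass", "Lower Credit"]
actual = ["Pass", "Upper Credit", "Upper Credit", "Pass", "Lower Credit"]

print(f"Success rating: {accuracy(predicted, actual):.0%}")
```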
The various reporting tools that this system offers serve mainly to compare changes in performance over time, as affected by the different rules available; together with other well-chosen variables, they expose the systematic structures required to improve performance monitoring. A computer-based implementation with dynamic and efficient reporting capabilities is perceived as a better solution and is recommended for very large student databases in an Oracle or MS SQL Server database environment.
RECOMMENDATIONS FOR FUTURE WORK
The encouraging results obtained from the application of knowledge discovery call for a comprehensive strategic implementation. The results of other research efforts in areas such as instructor assessment and performance, curriculum, course relevance, student attitude, demographics, etc., and their impact on the student learning process, must be evaluated and integrated into any future performance monitoring prototype. Data mining tools also have potential for performance monitoring at high school and other levels of education, offering historical perspectives on students’ performance. The results may both complement and supplement tertiary education performance monitoring and assessment implementations.
REFERENCES
- Guazzelli A., Lin W.-C., & Jena T. (2010). PMML in Action: Unleashing the Power of Open Standards for Data Mining and Predictive Analytics. CreateSpace.
- Arnold K. E. (2010). Signals: Applying academic analytics. Educause Quarterly, 33.
- Black E. W., Dawson K., & Priem J. (2008). Data for free: Using LMS activity logs to measure community in online courses. Internet and Higher Education.
- Campbell J. P. (2007). Utilizing student data within the course management system to determine undergraduate student academic success: An exploratory study. Unpublished doctoral dissertation, Purdue University.
- Campbell J. P., DeBlois P. B., & Oblinger D. G. (2007). Academic analytics: A new tool for a new era. Educause Review.
- Castro F., Vellido A., Nebot A., & Mugica F. (2007). Applying data mining techniques to e-learning problems. Studies in Computational Intelligence.
- Cook C. E., Wright M., & O’Neal C. (2007). Action research for instructional improvement: Using data to enhance student learning at your institution. To Improve the Academy.