DATA MINING APPLICATION FOR DETERMINING STUDENT’S ACADEMIC PERFORMANCE (A CASE STUDY OF KWARA STATE POLYTECHNIC, ILORIN)
ABSTRACT
The project focuses on developing an application for extracting or retrieving information from a pool of data (i.e. a large database) to form a basis for decision making. Information extracted from the database in the course of the data mining process can be presented graphically, in the form of graphs, patterns, histograms, etc., as well as in text format. The project is motivated by the need to use computer software to uphold academic standards through computer-based decision making. A data mining package can present clearly the reasons and factors that affect students’ performance, and hence allow administrators to derive strategic means of tackling such issues. The package will be developed in a .NET integrated development environment (.NET IDE). This IDE was chosen because the extracted information needs to be presented in an enhanced pictorial/graphical format, and because it offers easy communication with the database and program flexibility on the Windows platform.
TABLE OF CONTENTS
Title page
Certification
Dedication
Acknowledgment
Abstract
Table of Contents
CHAPTER ONE
1.0 GENERAL INTRODUCTION
1.1 Introduction
1.2 Statement of the problem
1.3 Aims and objectives
1.4 Significance of the study
1.5 Scope and limitations
1.6 Organization of report
1.7 Definition of terms/acronyms
CHAPTER TWO
2.0 LITERATURE REVIEW
2.1 Data mining in higher education
2.2 Review of general text
2.3 Research and evolution of data mining
2.4 Data mining process
2.5 Academic analytics
2.6 Data mining in higher education
CHAPTER THREE
3.0 PROJECT METHODOLOGY
3.1 Methods of data collection
3.2 Description of the existing system
3.3 Problems of the existing system
3.4 Description of the proposed system
3.5 Advantages of the proposed system
3.6 Design and implementation methodologies
CHAPTER FOUR
4.0 DESIGN, IMPLEMENTATION AND DOCUMENTATION OF THE SYSTEM
4.1 Design of the system
4.2 Output design
4.3 Input Design
4.4 Database design
4.5 Procedure Design
4.6 Implementation of the system
4.6.1 Hardware support
4.6.2 Software support
4.7 Documentation of the system
4.7.1 Operating the system
4.7.2 Maintaining the system
CHAPTER FIVE
5.0 SUMMARY AND CONCLUSION
5.1 Summary
5.2 Conclusions
5.3 Recommendation
REFERENCES
APPENDICES
1. System flowchart
2. Program flowchart
3. Source Program listing
4. Computer output
CHAPTER ONE
GENERAL INTRODUCTION
1.1 INTRODUCTION
Data mining is a branch of computer science which deals with the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Modern businesses increasingly see data mining as an important tool for transforming data into business intelligence, giving them an informational advantage. It is currently used in a wide range of profiling practices, such as marketing, surveillance, fraud detection and scientific discovery (Clifton, 2010).
The related terms data dredging, data fishing and data snooping refer to the use of data mining techniques to sample portions of the larger population data set that are (or may be) too small for reliable statistical inferences to be made about the validity of any patterns discovered. These techniques can, however, be used in the creation of new hypotheses to test against the larger data populations. (Clifton, 2010)
Performance monitoring involves assessments, which play a vital role in providing information that helps students, teachers, administrators and policy makers to take decisions (Counsil, 2001). Changing factors in contemporary education have led to a quest to monitor students’ performance in educational institutions effectively and efficiently. Monitoring is now moving away from traditional measurement and evaluation techniques towards Data Mining Techniques, which employ various intrusive data penetration and investigation methods to isolate vital implicit or hidden information. Several new technologies have generated huge amounts of explicit knowledge, leaving implicit knowledge unobserved and stacked away within large volumes of data. The main attribute of data mining is that it subsumes Knowledge Discovery, which according to Frawley (1991) is a nontrivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data. It thereby contributes to predicting trends and outcomes by profiling performance attributes that support effective decision making. This project applies the theory and practice of data mining to the students’ performance monitoring programme at Kwara State Polytechnic, Ilorin.
Technological developments and new programming techniques have improved the understanding and use of Artificial Intelligence (AI). Data mining isolates hidden data and exposes the relationships embedded within it, without prior knowledge of the nature of any inherent relationship. This leads [Rubenking 2001] to assert that data mining is a logical evolution of database technology: with the development of enhanced query tools such as SQL, database managers are capable of querying data more flexibly. Rules derived from various algorithms during the implementation of Data Mining Tools in research support this opinion.
Recently, educational institutions have begun to target activities within their organizations with computer-based tools that handle and store the huge amounts of data available in educational processes, and search it for hidden patterns. The face-value assessment of students at the point of entry can only be confirmed or dispelled by dynamic follow-up monitoring of students’ performance during the course of study. Such monitoring serves as an indicator of the suitability or unsuitability of students, both before admission and during their course of study.
Fuzzy Set Theory is used in applications involving educational assessment and performance, as it is regarded as efficient and effective in uncertain situations involving performance assessment. [Nolan 1998] noted that expert fuzzy scoring systems help teachers make assessments in less time and with a level of accuracy that compares favorably with the best teacher examiner. The package will be developed using the .NET Framework (C#) coupled with a MySQL database. Graphics will be used in this project to give a quick view of students’ level of performance, with records fetched from the database.
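The fuzzy scoring idea mentioned above can be illustrated with a small sketch. It is written in Python purely for brevity and is not the project's C# implementation. The category names, the 0 to 100 score scale and the breakpoints are all hypothetical values chosen for illustration; a real scoring system would take them from the institution's grading policy.

```python
def trapezoid(x, a, b, c, d):
    """Membership that rises on [a, b], equals 1 on [b, c] and falls on [c, d]."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


# Hypothetical fuzzy sets over a 0-100 score scale (assumed breakpoints).
FUZZY_SETS = {
    "poor": (0, 0, 30, 50),
    "average": (30, 50, 60, 75),
    "good": (60, 75, 100, 100),
}


def fuzzify(score):
    """Return the degree of membership of a score in every fuzzy set."""
    return {
        label: round(trapezoid(score, *params), 2)
        for label, params in FUZZY_SETS.items()
    }


def best_label(score):
    """Return the label with the highest membership for this score."""
    memberships = fuzzify(score)
    return max(memberships, key=memberships.get)


if __name__ == "__main__":
    for score in (25, 40, 70, 90):
        print(score, fuzzify(score), best_label(score))
```

Unlike a crisp cut-off, a score near a boundary belongs partly to two categories, which is the property that makes fuzzy sets attractive for uncertain assessment situations.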
1.2 STATEMENT OF THE PROBLEM
The ideal goal of higher education is to maintain sustainable, increasing graduation rates and growth, using the most efficient procedures that allow input resources to be accounted for. The quality of students raises the pertinent issue of how to enhance and evaluate it through overt and covert processes. Data mining for knowledge depends on the quality, characteristics and preparation of the data, and supports and facilitates thorough examination of the different aspects of the data for knowledge discovery in tertiary education processes. The result helps Kwara State Polytechnic, Ilorin to predict the likelihood of a student’s persistence and learning outcomes in terms of performance. By using computer-based evaluation tools, meaningful learning outcome topologies are created using charts and graphical representations. Other studies have shown that some techniques are particularly beneficial for the various sub-processes.
1.3 AIM AND OBJECTIVES OF THE STUDY
The aim of this project is to design a computer-based application that summarizes all the qualities of assessment and performance monitoring of students and which, when expanded, holds key information that answers questions about students’ academic performance. The objectives are as follows:
I. To observe and compare individual, segmented and aggregated students’ performance variables by analyzing the activities of the whole student base and then building one predictive model.
II. To provide a continuous “Just-In-Time” student performance assessment model that predicts performance with a reasonable degree of accuracy, thereby enhancing the monitoring of students’ academic pursuits, and serving any other stakeholders’ interests, at any point and for any student during the student’s tenure at the educational institution.
III. To develop a computer-based modeling process that is effective and integrates all the data objects and rules needed for performance prediction, allowing for quality control in the institution, using the .NET IDE.
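As a rough illustration of what a performance prediction model in the sense of these objectives might look like, the sketch below fits a least-squares trend line to a student's past semester grade points and extrapolates one semester ahead. This is only one possible technique and is not claimed to be the model used in the project. The function names and the 4.0 grade-point ceiling are assumptions made for the example, and Python is used only to keep the sketch short.

```python
def fit_line(ys):
    """Least-squares line through the points (1, ys[0]), (2, ys[1]), ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need grade points for at least two semesters")
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next_gp(gps, max_gp=4.0):
    """Extrapolate the trend one semester ahead, clamped to [0, max_gp].

    max_gp=4.0 is an assumed grading ceiling, not taken from the report.
    """
    slope, intercept = fit_line(gps)
    estimate = slope * (len(gps) + 1) + intercept
    return min(max(estimate, 0.0), max_gp)


if __name__ == "__main__":
    history = [2.0, 2.5, 3.0]  # made-up grade points for three semesters
    print(round(predict_next_gp(history), 2))
```

A trend line uses only the student's own history, so it can be recomputed whenever a new semester result is stored, which fits the "Just-In-Time" idea in objective II. A production model would likely draw on more variables than past grade points.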
1.4 SIGNIFICANCE OF THE STUDY
Data mining is a system of searching through large amounts of data. It is a relatively new concept which is directly related to computer science. Despite this, it can be used with a number of older computer techniques such as statistics.
There are a number of software products designed for those who wish to use data mining techniques. Once you are able to search through large amounts of information, you will be able to analyze it in many different ways. Once you have analyzed the information, you can draw conclusions and make decisions which are based on logic. While the term data mining is new, the practice of searching through data for patterns is not. Many large institutions have powerful computers that allow them to search through information to analyze reports over a given period of time.
What sets data mining apart from these older research methods is that data mining is a result of the advancement of computer processing power. In addition to this, the storage capabilities of contemporary computers have allowed data mining to be much more accurate than techniques that were used in the past. Because most data mining tools come in the form of software, the costs involved with searching and analyzing information have greatly dropped.
1.5 SCOPE AND LIMITATIONS OF THE STUDY
Data mining in academics needs a large volume of data from which analytical conclusions can be drawn. This project focuses on developing an application that takes live data of students’ grade points (GP) per semester and stores it in a database. Analyses for monitoring and evaluating students’ academic performance can then be drawn from the stored results. Reports of the performance evaluation can be extracted and categorized according to the user’s choice: per semester or per session, with students’ performance grouped by department, by institute, or for the polytechnic as a whole.
The proposed system does not evaluate the factors affecting students’ performance. Nor is the system targeted at computing students’ results or functioning as record-keeping software for students in the institution. Because of the difficulties foreseen in covering the entire polytechnic as a case study, the research coverage is limited to the Institute of Basic and Applied Sciences (IBAS).
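The kind of grouped report described in this scope can be sketched as a single aggregate query. The example below uses Python's built-in sqlite3 module as a stand-in for the MySQL database named in this report, so that it runs without a server. The table name, column names and sample rows are invented for illustration and do not come from the project's database design.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a few made-up semester GP rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE semester_gp ("
        " matric_no TEXT, department TEXT, session TEXT,"
        " semester INTEGER, gp REAL)"
    )
    rows = [
        ("ST001", "Computer Science", "2010/2011", 1, 3.20),
        ("ST002", "Computer Science", "2010/2011", 1, 2.80),
        ("ST003", "Statistics", "2010/2011", 1, 3.00),
        ("ST001", "Computer Science", "2010/2011", 2, 3.40),
    ]
    conn.executemany("INSERT INTO semester_gp VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def average_gp_by_department(conn, session, semester):
    """Return (department, average GP, student count) for one semester."""
    cur = conn.execute(
        "SELECT department, ROUND(AVG(gp), 2), COUNT(*)"
        " FROM semester_gp WHERE session = ? AND semester = ?"
        " GROUP BY department ORDER BY department",
        (session, semester),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in average_gp_by_department(conn, "2010/2011", 1):
        print(row)
```

Changing the GROUP BY column, for example to an institute column, would give the institute-level or polytechnic-wide views mentioned in the scope, and the result rows are in a shape that can be fed straight into a chart.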
1.6 ORGANISATION OF THE REPORT
This report describes the design of an application for mining data on students’ academic performance. The project consists of five chapters. The preliminaries contain the title page, table of contents and abstract.
Chapter one contains the introduction to the study, the statement of the research problem, the aim and objectives of the study, the significance of the study, the scope and limitations, the organization of the report and the definition of terms.
Chapter two contains the literature review on data mining and its implementation in academic and students’ performance analysis. It also discusses issues related to data mining and its use for academic performance monitoring in higher institutions.
Chapter three contains the analysis of the existing and proposed systems. It covers the methods employed in gathering facts, the description of the existing system, the problems of the existing system, the description of the proposed system, the advantages and disadvantages of the proposed system, the implementation techniques and the choice of programming language.
Chapter four contains the design, implementation and documentation of the system. It covers the output design, input design, file design and procedure design, the implementation techniques, the programming language, the hardware and software support, and the documentation of the system.
Chapter five contains the summary, experience gained, problems encountered, recommendations and conclusion.
1.7 DEFINITION OF TERMS
SQL: SQL, often referred to as Structured Query Language, is a declarative database language designed for managing data in relational database management systems (RDBMS), and originally based upon relational algebra and tuple relational calculus. [http://en.wikipedia.org/wiki/SQL]
Data Mining: Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information - information that can be used to increase revenue, cut costs, or both.
(http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm)
Neural network: The term neural network was traditionally used to refer to a network or circuit of biological neurons. The modern usage of the term often refers to artificial neural networks, which are composed of artificial neurons or nodes.
Collinearity: A set of points is collinear (also co-linear or colinear) if they lie on a single straight line or a projective line (for example, a projective line over any field). [http://en.wikipedia.org/wiki/collinearity]
Data Modeling: Data modeling is a method used to define and analyze data requirements needed to support the business processes of an organization. The data requirements are recorded as a conceptual data model with associated data definitions. [http://en.wikipedia.org/wiki/data_modelling]