Machine Learning Basics

Definition: A common definition of machine learning is (Mitchell, 1997): “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.”

Training Set

The act of creating a prediction model from previously known data is called training, and the data used for it is called the training data or training set. After the model is created, it must be applied to a different data set to test its effectiveness. Data used for this purpose is called test data or a test set.
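
A minimal sketch of separating known data into a training set and a test set is given below. The function name, the 25% test fraction, the fixed seed and the sample records are all assumptions made for illustration; they are not taken from the study.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle the records and split them into (training set, test set)."""
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical records: (previous GPA, attendance %) -> next-semester result
records = [((3.2, 90), "pass"), ((1.8, 55), "fail"), ((2.9, 80), "pass"),
           ((2.0, 60), "fail"), ((3.6, 95), "pass"), ((1.5, 40), "fail"),
           ((2.7, 75), "pass"), ((2.1, 65), "fail")]

train_set, test_set = train_test_split(records)
print(len(train_set), "training records,", len(test_set), "test records")
```

The model is built only from train_set, and test_set is held back so that the model is checked on data it has not seen.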

Educational Data Mining

Data mining is a process of sorting data and extracting information from existing databases. With the help of pattern mining and data analysis, hidden information can be obtained from huge datasets. The strategy of data mining is now applied in the field of education by researchers.

Researchers are exploring many dimensions of the education sector, and this work is now known as educational data mining. Data mining is applied in the educational sector by examining the performance of students and determining their standing from their academic records.

Educational datasets are collected from various sources such as interactive learning systems, computer-supported collaborative systems, and the administrative datasets of schools, colleges and universities. Data mining methods are now implemented in well-known universities to analyze patterns of student performance in these datasets, so that information can be extracted and decision making becomes easier for the management of institutions.

With the growing use of technology everywhere, educational institutions are now looking for hidden trends and patterns in their larger datasets. From these sources, a dataset can easily be collected once authorization is granted. One purpose of extracting information from an institution's own dataset is to strengthen its prestige among other educational institutions. Another purpose is to help build students' careers.

Data mining is often used to build predictive or inference models that aim to predict future trends or behaviors based on the analysis of structured data. In this context, prediction means constructing a model and using it to assess the class of an unlabeled example, or to assess the value or value range of an attribute.

We have proposed a data mining process for the evaluation of school dropout and failure. The experiment was done on real information from 200 university students of Mehran University of Engineering and Technology. Data mining should work the same way as a human brain: it uses historical information (experience) to learn. However, in order for data mining technology to get information out of the database, the user must “tell it” what the information looks like (i.e. what problem the user would like to solve).

It uses the description of that information to look for similar examples in the database, and uses these pieces of information from the past to develop a predictive model of what will happen in the future. The essential ingredient in building a successful predictive model is to have some information in the database that describes what has happened in the past. Data mining tools are designed to “learn” from these past successes and failures (theoretically as a human being would), and then to predict what is going to happen next.

However, one of the major advantages of a data mining tool over a human mind is that the tool can automatically go through a very large database quickly and find even small patterns that may help in making a better prediction.

The main objectives of this proposed work are:

  • To understand, analyze and then find the differences between the prediction techniques of data mining in education.
  • To identify and understand the student attributes that are mainly used for predicting student performance.

Predicting Student Performance

Predicting students’ performance by using data mining techniques to extract information from the academic datasets of universities has become a state-of-the-art research topic in the scientific community. Universities nowadays face challenges in analyzing the performance of their students. Being active in class is not enough to analyze student performance, which is why we create a system that tries to help improve it.

We are focusing on students’ profiles and characteristics to make the university management aware of students’ performance and overall academic results. Another dimension of student performance is that student retention depends on it. To minimize the problem of student retention in universities, different researchers have proposed different methods to predict the performance of students in their future semester based on their performance in the previous one.

Student Data Attributes

To predict a student's next-semester academic performance from the student's previous academic record, we have so far taken the data of two batches of Computer System (15CS & 16CS) and considered the following attributes in our project:

  • Roll No
  • Subject Marks
  • Attendance Marks
  • Sessional Marks
  • Mid–term Marks
  • Practical Marks
  • GPA

Based on these parameters, the proposed system can be trained to predict the grades of students accurately in any educational institution. We used the KNN algorithm to predict student academic performance.
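
One possible way to hold these attributes in code is sketched below. The class name, field types, method name and the sample values (including the roll number) are hypothetical; the text lists the attributes but does not say how they are stored.

```python
from dataclasses import dataclass

@dataclass
class StudentRecord:
    roll_no: str             # identifier only, not used as a predictive feature
    subject_marks: float
    attendance_marks: float
    sessional_marks: float
    midterm_marks: float
    practical_marks: float
    gpa: float

    def feature_vector(self):
        """Numeric attributes in a fixed order; the roll number is left out."""
        return (self.subject_marks, self.attendance_marks, self.sessional_marks,
                self.midterm_marks, self.practical_marks, self.gpa)

# Made-up example record
record = StudentRecord("16CS-000", 68.0, 9.0, 17.0, 24.0, 42.0, 3.1)
print(record.feature_vector())
```

Keeping the roll number out of the feature vector matters for a distance-based method, because an identifier carries no information about performance but would still contribute to the distance.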

K – Nearest Neighbor

K – Nearest Neighbor (KNN) is a supervised learning algorithm. It is a classic method for classifying samples based on similarity, and it is a non-parametric learning algorithm widely used in data mining. Its purpose is to use a database in which the data points are separated into several classes to predict the class of a new sample point by matching it with previous data.

We have used the KNN algorithm to obtain more accurate results. KNN works by measuring the distance between a new sample and the samples in a data set. Classification is the process of analyzing input and building a model that assigns it to a class.

The K-NN algorithm can be used for:

  • Regression: predicting what number value a variable will have (if it is a variable that varies with time, it’s called ‘time series’ prediction).
  • Classification: predicting what category or class a case falls into.
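
The regression case can be sketched as follows, assuming the prediction is the plain average of the target values of the k nearest neighbors under Euclidean distance. The function name, the choice k=3 and the data are illustrative only.

```python
import math

def knn_regress(training, query, k=3):
    """Predict a number as the mean target of the k nearest training points."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    nearest_targets = [target for _, target in by_distance[:k]]
    return sum(nearest_targets) / len(nearest_targets)

# Hypothetical data: (mid-term marks, sessional marks) -> GPA
training = [((28.0, 18.0), 3.6), ((25.0, 16.0), 3.2), ((20.0, 14.0), 2.8),
            ((15.0, 10.0), 2.2), ((10.0, 8.0), 1.8)]

predicted_gpa = knn_regress(training, (24.0, 15.0), k=3)
print(f"predicted GPA: {predicted_gpa:.2f}")
```

Here the three nearest students have GPAs of 3.2, 2.8 and 3.6, so the prediction is their mean.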

An alternate way of understanding KNN is to think of it as calculating a decision boundary (or several boundaries when there are more than 2 classes), which is then used to classify new points.
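
A minimal sketch of KNN classification is shown here, assuming Euclidean distance and a simple majority vote among the k nearest points. The function name, k=3 and the (GPA, attendance %) data are made up for illustration and are not the authors' implementation.

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical labeled points: (previous GPA, attendance %) -> result
training = [((3.5, 90.0), "pass"), ((3.1, 85.0), "pass"), ((2.8, 80.0), "pass"),
            ((1.9, 55.0), "fail"), ((1.6, 50.0), "fail"), ((2.0, 60.0), "fail")]

print(knn_classify(training, (3.0, 82.0)))
print(knn_classify(training, (1.8, 52.0)))
```

No explicit boundary is ever computed. The boundary is implied by where the vote among the nearest neighbors changes from one class to another. Note that in this toy data attendance is on a much larger scale than GPA, so it dominates the distance.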

  • Feature Extraction

Feature extraction is the transformation of high-dimensional input data into a meaningful representation of reduced dimensionality. In other words, transforming the input data into a particular set of features is called feature extraction. The extracted representation often improves the accuracy of a classifier. Feature extraction is performed on the raw data before the k-NN algorithm is applied to the transformed data in feature space.
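
As a simple illustration, the sketch below reduces a list of per-subject marks to three hand-picked summary features. The choice of features (mean, minimum, spread) and the sample marks are assumptions; the text does not say which extraction method was used, and other techniques could be substituted.

```python
from statistics import mean

def extract_features(subject_marks):
    """Reduce a list of per-subject marks to three summary features."""
    lowest = min(subject_marks)
    highest = max(subject_marks)
    return (mean(subject_marks),   # overall level
            lowest,                # weakest subject
            highest - lowest)      # spread across subjects

raw = [72, 65, 80, 58, 90, 77]     # hypothetical marks in six subjects
features = extract_features(raw)
print(len(raw), "raw values reduced to", len(features), "features:", features)
```

The classifier then measures distances in this three-dimensional feature space instead of the original six-dimensional one.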

Issues Regarding Classification

  • Missing Data

Missing data values cause problems both during the training phase and during the classification process itself. For example, data may be unavailable due to:

  • Equipment malfunction
  • Deletion due to inconsistency with other recorded data

This missing data can be handled using the following approaches:

  1. Data miners can ignore the missing data
  2. Data miners can replace all missing values with a single global constant
  3. Data miners can replace a missing value with its feature mean for the given class
  4. Data miners and domain experts, together, can manually examine samples with missing values and enter a reasonable, probable or expected value
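
The first three approaches can be sketched as below. The function names, the use of None to mark a missing value, the constant 0.0 and the sample rows are assumptions for illustration. The fourth approach is manual and is not shown.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (attendance marks, or None if missing; class label)
rows = [(8.0, "pass"), (None, "pass"), (9.0, "pass"),
        (4.0, "fail"), (None, "fail"), (5.0, "fail")]

def drop_missing(rows):
    """Approach 1: ignore rows that contain a missing value."""
    return [(v, label) for v, label in rows if v is not None]

def fill_constant(rows, constant=0.0):
    """Approach 2: replace every missing value with one global constant."""
    return [(constant if v is None else v, label) for v, label in rows]

def fill_class_mean(rows):
    """Approach 3: replace a missing value with the feature mean of its class."""
    by_class = defaultdict(list)
    for v, label in rows:
        if v is not None:
            by_class[label].append(v)
    class_mean = {label: mean(vals) for label, vals in by_class.items()}
    return [(class_mean[label] if v is None else v, label) for v, label in rows]

print(drop_missing(rows))
print(fill_constant(rows))
print(fill_class_mean(rows))
```

In this sketch fill_class_mean raises a KeyError if a class has no observed value at all, so such a case would need separate handling.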

In our case, the chances of getting missing values in the training data are very low. The training data is retrieved from the admission records of a particular institute, and the attributes used as input to the classification process are mandatory for each student. A tuple found to have a missing value for any attribute is dropped from the training set, as the missing values cannot be predicted or set to some default value. Given the low chance of missing data occurring, ignoring it will not adversely affect the accuracy.

Methodology of the algorithm:

  1. First, real data of almost 200 students is gathered and pre-processed.
  2. The data set is then used to train and test a particular algorithm.
  3. The K-NN algorithm is applied to the data set to build prediction models, and the predictions made by these models are compared using common evaluation criteria such as accuracy, precision, and recall.
  4. The test data set is evaluated against the model built from the training data set to check the accuracy of the algorithm.
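
The evaluation criteria named above can be computed as in the following sketch, which assumes a two-class problem with "pass" treated as the positive class. The function name, the labels and the sample predictions are made up and are not results from the study.

```python
def evaluate(actual, predicted, positive="pass"):
    """Return (accuracy, precision, recall) for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # true positives
    fp = sum(a != positive and p == positive for a, p in pairs)  # false positives
    fn = sum(a == positive and p != positive for a, p in pairs)  # false negatives
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical known results and model predictions for eight test students
actual    = ["pass", "pass", "fail", "fail", "pass", "fail", "pass", "fail"]
predicted = ["pass", "fail", "fail", "pass", "pass", "fail", "pass", "fail"]

accuracy, precision, recall = evaluate(actual, predicted)
print(accuracy, precision, recall)
```

Accuracy is the share of all predictions that are correct, precision is the share of predicted passes that are real passes, and recall is the share of real passes that the model found.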

Advantages of K-NN:

KNN has several main advantages: simplicity, effectiveness, intuitiveness and competitive classification performance in many domains. It is robust to noisy training data and is effective when the training data is large.

Disadvantages of K-NN:

Despite the advantages given above, KNN has a few limitations. KNN can have poor run-time performance when the training set is large. It is very sensitive to irrelevant or redundant features because all features contribute to the similarity and thus to the classification.

Two other disadvantages of the method are:

  1. With distance-based learning, it is not clear which type of distance and which attributes to use to produce the best results.
  2. The computation cost is quite high because we need to compute the distance from each query instance to all training samples.
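
The first point can be illustrated with two common distance measures, Euclidean and Manhattan. The points are chosen, purely for illustration, so that the nearest neighbor changes with the measure.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute differences along each attribute."""
    return sum(abs(x - y) for x, y in zip(a, b))

query = (0.0, 0.0)
p = (3.0, 3.0)
q = (0.0, 5.0)

# Under Euclidean distance p is nearer to the query; under Manhattan q is nearer.
print(euclidean(query, p), euclidean(query, q))
print(manhattan(query, p), manhattan(query, q))
```

Both functions visit every attribute once, and a query must be compared with every training sample, which is the source of the computation cost mentioned in the second point.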

Applications of K-NN:

KNN as a data mining technique has a wide variety of applications in classification as well as regression. Some of the applications of this method are mentioned below:

  • Text Mining
  • Agriculture
  • Finance
  • Medicine

Student attrition and retention

Over time, the number of private educational institutions has grown to a remarkable extent. These institutions have become both a source of higher learning and business entities. Therefore, maximum student enrollment is their lifeline. For the survival of private institutions, profitability, proper management and alignment are mandatory. In this respect, retaining students until the completion of their degree is essential.

That is why institutions are looking for the factors that ultimately cause student attrition. After analyzing those factors, it is important for educational institutions to make strategic adjustments accordingly to improve student retention. The problem of student attrition and retention is not new for educational institutions.

It has been highlighted by researchers from the fields of data mining and information visualization, and it has now become a very common research problem. Researchers took note of student attrition and retention when the attrition rate rose to 50% in the colleges of Ontario. To reduce attrition rates, institutions should focus on student retention.


University students in all degree programs are motivated to enroll by a desire for personal accomplishment and the completion of a previously set goal. Mature-age students in all degree programs are often believed to be highly motivated to return to university for promotion in their employment and improvement of their professional skills.

Kantanis (1999) observed that some mature-age students engage in studies because they want to enjoy personal advancement and achieve a higher status in their professional positions. Hence, motivation to embark on a career is clearly linked to expectations that the career will bring about the desired rewards and prestige.

Science is considered to be challenging; hence, students doing Science Education feel proud once they achieve their goal of successfully completing the program. For instance, interest in the subject, perception of its usefulness, general desire to achieve, self-confidence, self-esteem, patience and persistence are factors motivating students to engage in studies.

In Science Education, some students are motivated to choose a program in this area by approval from significant others, while other students are motivated by the desire to overcome the perceived challenges in these programs as they acquire new knowledge and skills.

Social factor

Social support is a factor that can affect the academic performance of students both negatively and positively. Social support networks have great value in enhancing academic performance, as students form friendship groups to exchange information on assignments and find out about tutorials and lecture schedules. Peer support and relationships have been found to enhance the persistence of students both directly and indirectly. Parker and Johnson (1981) note that student-to-student interactions have been shown to be an extremely effective form of learning.
