Machine Learning Basics

Definition: A common definition of machine learning is (Mitchell, 1997): “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.”

Training Set

The act of creating a prediction model from previously known data is called training, and such data is called the training data or training set. After the model is created, it must be applied to another data set to test its effectiveness. Data used for this purpose is called the test data or test set.
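
The split between training and test data can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in this work: the function name train_test_split, the 25% test fraction and the integer stand-ins for student records are all assumptions made for the example.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle the records and split them into (training set, test set)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical stand-ins for 20 student records.
records = list(range(20))
train_set, test_set = train_test_split(records)
print(len(train_set), len(test_set))
```

Keeping the two sets disjoint matters: a model evaluated on records it was trained on will look more effective than it really is.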

Educational Data Mining

Data mining is a process of sorting data and extracting information from existing databases. With the help of pattern mining and data analysis, hidden information can be obtained from huge datasets. The strategy of data mining is now applied in the field of education by researchers.

Researchers are exploring many dimensions of the education sector, and this work is now known as educational data mining. Data mining is applied in the education sector by examining the performance of students and determining their standing from their academic records.

Educational datasets are collected from various sources such as interactive learning systems, computer-supported collaborative systems, and administrative datasets of schools, colleges and universities. Data mining methods are now implemented in well-known universities to analyze patterns of student performance in these datasets, so that information can be extracted and decision making becomes easier for the management of institutions.

With the steady growth in the use of technology everywhere, educational institutions are now looking for hidden trends and patterns in their large datasets. With the help of the sources mentioned above, a dataset can easily be collected once authorization has been granted. One purpose of an institution extracting information from its own dataset is to strengthen its prestige among other educational institutions. Another purpose is to help build students' careers.

Data mining is often used to build predictive or inference models that aim to predict future trends or behaviors based on the analysis of structured data. In this context, prediction means constructing a model and using it to assess the class of an unlabeled example, or to estimate the value or value range of an attribute.

We propose a data mining process for evaluating school dropout and failure. The experiment was conducted on real data from 200 university students of Mehran University of Engineering and Technology. Data mining should work in the same way as a human brain: it uses historical information (experience) to learn. However, for data mining technology to get information out of the database, the user must “tell it” what the information looks like (i.e. what problem the user would like to solve).

It uses the description of that information to look for similar examples in the database, and uses these pieces of information from the past to develop a predictive model of what will happen in the future. The essential ingredient in building a successful predictive model is having some information in the database that describes what has happened in the past. Data mining tools are designed to “learn” from these past successes and failures (theoretically as a human being would), and then to predict what is going to happen next.

However, one of the major advantages of a data mining tool over a human mind is that the tool can automatically go through a very large database quickly and find even small patterns that may help in making a better prediction.

The main objectives of this proposed work are:

To understand, analyze and compare the different prediction techniques of data mining in education.
To identify and understand the student attributes that are mainly used for predicting student performance.

Predicting Student Performance

Predicting student performance by using data mining techniques to extract information from the academic datasets of universities has become an active research area in the scientific community. Universities now face challenges in analyzing the performance of their students. Being active in class is not enough on its own to assess student performance, which is why we built a system intended to help improve it.

We focus on students' profiles and characteristics to make the university management aware of each student's performance and overall academic result. Another dimension of student performance is that student retention depends on it. To reduce the problem of student retention in universities, researchers have proposed various methods to predict the performance of students in a future semester based on their performance in the previous one.

Student Data Attributes

To predict a student's academic performance in the next semester from that student's previous academic record, we have so far taken data from two batches of Computer Systems (15CS and 16CS) and have considered the following attributes in our project:

Roll No
Subject Marks
Attendance Marks
Sessional Marks
Mid–term Marks
Practical Marks
GPA

Based on these parameters, the proposed system can be trained to predict the grades of students at any educational institution. We used the KNN algorithm to predict student academic performance.

K – Nearest Neighbor

K – Nearest Neighbor (KNN) is a supervised learning algorithm. It is a classic method for classifying samples based on similarity, and it is a non-parametric learning algorithm widely used in data mining. Its purpose is to use a database in which the data points are separated into several classes to predict the class of a new sample point by matching it against previous data.

We used the KNN algorithm to obtain more accurate results. KNN works by measuring the distance between a new sample and the points of an existing data set. Classification is the process of analyzing input data and building a model that assigns each input to a class.
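
The idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names euclidean and knn_classify, the choice of Euclidean distance, k = 3, and the small data set of (attendance percentage, mid-term marks) pairs are all hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(training_set, query, k=3):
    """Return the majority label among the k training points nearest to query.

    training_set is a list of (feature_vector, label) pairs.
    """
    neighbors = sorted(training_set, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical data: (attendance %, mid-term marks) -> grade.
training_set = [
    ((95, 27), "A"), ((90, 25), "A"), ((88, 26), "A"),
    ((70, 15), "C"), ((65, 12), "C"), ((60, 14), "C"),
]
print(knn_classify(training_set, (85, 24)))  # the three nearest points are all "A"
```

Note that there is no separate training step: the "model" is simply the stored training set, and all the work happens when a new point is classified.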

The K-NN algorithm can be used for:

Regression: predicting the numeric value a variable will take (if the variable changes over time, this is called ‘time series’ prediction).
Classification: predicting which category or class a case falls into.
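
For the regression case, a common choice is to predict the mean target value of the k nearest neighbors. The sketch below assumes that choice along with Euclidean distance and k = 3. The function name knn_regress and the (sessional marks, mid-term marks) to GPA data are hypothetical and are not taken from the data set used in this work.

```python
import math

def knn_regress(training_set, query, k=3):
    """Predict a numeric value as the mean target of the k nearest neighbors.

    training_set is a list of (feature_vector, target_value) pairs.
    """
    by_distance = sorted(training_set, key=lambda item: math.dist(item[0], query))
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Hypothetical data: (sessional marks, mid-term marks) -> GPA.
training_set = [
    ((18, 27), 3.8), ((17, 25), 3.6), ((16, 26), 3.7),
    ((10, 15), 2.4), ((9, 12), 2.1), ((8, 14), 2.2),
]
predicted_gpa = knn_regress(training_set, (17, 26))
print(round(predicted_gpa, 2))
```

The only difference from classification is the final step: the neighbors' values are averaged instead of counted as votes.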

An alternative way of understanding KNN is to think of it as computing decision boundaries between the classes, which are then used to classify new points.

Feature Extraction

Feature extraction is the transformation of high-dimensional input data into a meaningful representation of reduced dimensionality. In other words, transforming the input data into a particular set of features is called feature extraction. The extracted representation often improves the accuracy of a classifier. Feature extraction is performed on the raw data before the k-NN algorithm is applied to the transformed data in feature space.
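
As a small illustration, the sketch below reduces five raw mark components to two features. The grouping into a "coursework share" and an "exam share", the function name extract_features and the maximum marks are assumptions made for the example. The source does not say which transformation was used, and statistical methods such as principal component analysis are a common alternative.

```python
def extract_features(record, max_marks):
    """Map a raw record of component marks to a 2-dimensional feature vector.

    Returns (coursework share, exam share), each between 0 and 1.
    The grouping of components is a hypothetical choice for illustration.
    """
    coursework = ("attendance", "sessional", "practical")
    exams = ("mid_term", "subject")

    def share(keys):
        # Marks obtained in this group divided by the maximum obtainable.
        return sum(record[k] for k in keys) / sum(max_marks[k] for k in keys)

    return (share(coursework), share(exams))

# Hypothetical maximum marks and one hypothetical student record.
max_marks = {"attendance": 10, "sessional": 10, "practical": 50,
             "mid_term": 30, "subject": 70}
record = {"attendance": 8, "sessional": 7, "practical": 40,
          "mid_term": 24, "subject": 56}
features = extract_features(record, max_marks)
print(features)
```

Putting both features on the same 0 to 1 scale also keeps a component with a large maximum from dominating the distance computation in k-NN.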

Issues Regarding Classification

Missing Data

Missing data values cause problems during both the training phase and the classification process itself. Data may be unavailable for reasons such as:

Equipment malfunction

Deletion due to inconsistency with other recorded data

Missing data can be handled using the following approaches:

Data miners can ignore the missing data
Data miners can replace all missing values with a single global constant
Data miners can replace a missing value with its feature mean for the given class
Data miners and domain experts, together, can manually examine samples with missing values and enter a reasonable, probable or expected value
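
The first three approaches above can be sketched as follows. The function names, the use of None to mark a missing value and the sample rows are assumptions made for illustration only.

```python
from collections import defaultdict

MISSING = None  # assumed marker for a missing value

def drop_incomplete(rows):
    """Approach 1: ignore rows that contain any missing value."""
    return [(features, label) for features, label in rows
            if all(value is not MISSING for value in features)]

def fill_with_constant(rows, constant=0.0):
    """Approach 2: replace every missing value with one global constant."""
    return [([constant if v is MISSING else v for v in features], label)
            for features, label in rows]

def fill_with_class_mean(rows):
    """Approach 3: replace a missing value with its feature mean for that class.

    Assumes each class has at least one observed value for every feature.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for features, label in rows:
        for i, value in enumerate(features):
            if value is not MISSING:
                sums[(label, i)] += value
                counts[(label, i)] += 1
    return [([sums[(label, i)] / counts[(label, i)] if v is MISSING else v
              for i, v in enumerate(features)], label)
            for features, label in rows]

# Hypothetical rows: ([attendance %, mid-term marks], outcome).
rows = [([90, 25], "pass"), ([80, None], "pass"), ([70, 27], "pass"),
        ([40, 10], "fail"), ([None, 12], "fail")]
print(len(drop_incomplete(rows)))
print(fill_with_class_mean(rows)[1])
```

Dropping rows is the simplest option and loses little when missing values are rare, whereas the class mean keeps every row at the cost of introducing estimated values.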

In our case, the chances of getting missing values in the training data are very low. The training data is retrieved from the admission records of a particular institute, and the attributes used as input to the classification process are mandatory for each student. Any tuple found to have a missing value for any attribute will be excluded from the training set, as the missing values cannot be predicted or set to some default value. Given the low chance of missing data occurring, ignoring it will not adversely affect accuracy.

Methodology of the algorithm:

First, real data is gathered for almost 200 students and pre-processed.

A model is then trained and tested on that data set using a particular algorithm.

The K-NN algorithm is then applied to the data set to build prediction models, and the predictions made by these models are compared using common evaluation criteria such as accuracy, precision, and recall.

Predictions on the test data set are compared with the actual outcomes to check the accuracy of the algorithm.
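
The evaluation criteria named above can be computed from the actual and predicted labels of the test set. The sketch below is a minimal illustration: the function name evaluate, the choice of "pass" as the positive class and the label lists are hypothetical and are not results from this work.

```python
def evaluate(actual, predicted, positive):
    """Return (accuracy, precision, recall) for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)  # true positives
    fp = sum(1 for a, p in pairs if a != positive and p == positive)  # false positives
    fn = sum(1 for a, p in pairs if a == positive and p != positive)  # false negatives
    accuracy = sum(1 for a, p in pairs if a == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical test-set labels and model predictions.
actual    = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "fail"]
predicted = ["pass", "fail", "fail", "pass", "pass", "fail", "pass", "fail"]
accuracy, precision, recall = evaluate(actual, predicted, positive="pass")
print(accuracy, precision, recall)
```

Accuracy is the share of all predictions that are correct. Precision is the share of predicted passes that are real passes, and recall is the share of real passes that the model found.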

Advantages of K-NN:

KNN has several main advantages: simplicity, effectiveness, intuitiveness and competitive classification performance in many domains. It is robust to noisy training data, particularly when k is not too small, and is effective when the training data is large.

Disadvantages of K-NN:

Despite the advantages given above, KNN has a few limitations. KNN can have poor run-time performance when the training set is large. It is very sensitive to irrelevant or redundant features because all features contribute to the similarity and thus to the classification.

Two other disadvantages of the method are:

With distance-based learning, it is not clear which type of distance and which attributes to use to produce the best results.
The computation cost is quite high because the distance from each query instance to all training samples must be computed.
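
Both points can be seen in a small sketch. Euclidean and Manhattan distance are two common choices, used here only as examples. The function names and the sample points are hypothetical.

```python
import math

def euclidean(a, b):
    """Straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute differences along each attribute."""
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest(points, query, distance):
    """Scan every stored point, so each query costs one distance per training sample."""
    return min(points, key=lambda p: distance(p, query))

points = [(3, 3), (0, 5)]
query = (0, 0)
print(nearest(points, query, euclidean))  # (3, 3): about 4.24 versus 5.0
print(nearest(points, query, manhattan))  # (0, 5): 5 versus 6
```

The same query gets a different nearest neighbor under each distance, so the choice of measure can change the prediction. The full scan inside nearest is also the reason the cost grows with the size of the training set.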

Applications of K-NN:

KNN as a data mining technique has a wide variety of applications in classification as well as regression. Some of the applications of this method are mentioned below:

Text Mining
Agriculture
Finance
Medicine

Student attrition and retention

Over time, the number of private educational institutions has grown to a remarkable extent. These institutions have become both sources of higher learning and business entities, so maximizing student enrollment is their lifeline. For private institutions to survive, profitability, proper management and alignment are essential. In this respect, retaining students until they complete their degrees is necessary.

That is why institutions are trying to identify the factors that ultimately cause student attrition. After analyzing those factors, educational institutions need to make strategic adjustments to improve student retention. The problem of student attrition and retention is not new for educational institutions.

It has been highlighted by researchers from the fields of data mining and information visualization, and it has become a very common research problem. Researchers took notice of student attrition and retention when the attrition rate reached 50% at colleges in Ontario. To reduce attrition rates, institutions should focus on student retention.

Prestige

University students in all degree programs are motivated to enroll by a desire for personal accomplishment and for the completion of a previously set goal. Mature-age students in all degree programs are often believed to be highly motivated to return to university to gain promotion in their employment or to improve their professional skills.

Kantanis (1999) observed that some mature-age students engage in studies because they want to enjoy personal advancement and achieve a higher status in their professional positions. Hence, motivation to embark on a career is clearly linked to expectations that the career will bring about the desired rewards and prestige.

Science is considered to be challenging, so students in Science Education feel proud once they achieve their goal of successfully completing the program. Interest in the subject, perception of its usefulness, a general desire to achieve, self-confidence, self-esteem, patience and persistence are all factors that motivate students to engage in their studies.

In Science Education, some students are motivated to choose a program in this area by approval from significant others, while other students are motivated by the desire to overcome the perceived challenges in these programs as they acquire new knowledge and skills.

Social factor

Social support is a factor that can affect the academic performance of students both negatively and positively. Social support networks are of great value in enhancing academic performance, as students form friendship groups to exchange information on assignments and find out about tutorials and lecture schedules. Peer support and relationships have been found to enhance the persistence of students both directly and indirectly. Parker and Johnson (1981) note that student-to-student interactions have been shown to be an extremely effective form of learning.
