College of Computing and Informatics

PROJECT

Deadline: Thursday 09/12/2021 @ 23:59

[Total Mark for this Project is 10]

Data Mining and Data Warehouse

IT446

Instructions:

This project report must be submitted on Blackboard (WORD format only) via the allocated folder.

You are advised to make your work clear and well-presented; marks may be reduced for poor presentation.

Email submission will not be accepted.

Late submission will result in ZERO marks.

The work should be your own; copying from other students or other resources will result in ZERO marks.

Use Times New Roman font for all your answers.

Student Details:

Name: ###

ID: ###

CRN: ###

1 Mark

Learning Outcome(s): 1

Explain different data mining tasks, problems and the algorithms most appropriate for addressing them.

Question One

Select a dataset from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/) or from Kaggle (https://www.kaggle.com/datasets), or use your own dataset if available.

1 Mark

Learning Outcome(s): 2

Apply and evaluate data mining algorithms with respect to problems they are specifically designed for.

Question Two

The dataset should meet the following requirements (data description):

Number of instances: between 300 and 500

Number of attributes: between 10 and 15

0.5 Marks

Learning Outcome(s): 1

Explain different data mining tasks, problems and the algorithms most appropriate for addressing them.

Question Three

Prepare the data as a CSV or ARFF file.
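As an illustration only, the sketch below writes a few records as CSV text and converts them to a minimal ARFF document using the Python standard library. The function names `rows_to_csv` and `csv_to_arff`, the relation name, and the three sample records are all hypothetical. It assumes there are no missing values and that attribute names and values contain no spaces or commas.

```python
import csv
import io

def rows_to_csv(header, rows):
    """Serialize a header and data rows to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

def _is_number(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def csv_to_arff(csv_text, relation="dataset"):
    """Convert CSV text to a minimal ARFF document.

    A column whose values all parse as numbers becomes a numeric attribute;
    any other column becomes a nominal attribute listing its distinct values.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [row for row in reader if row]
    lines = [f"@relation {relation}", ""]
    for i, name in enumerate(header):
        column = [row[i] for row in rows]
        if all(_is_number(v) for v in column):
            lines.append(f"@attribute {name} numeric")
        else:
            labels = ",".join(sorted(set(column)))
            lines.append(f"@attribute {name} {{{labels}}}")
    lines += ["", "@data"]
    lines += [",".join(row) for row in rows]
    return "\n".join(lines) + "\n"

# Invented sample records, used only to exercise the functions.
header = ["age", "income", "buys"]
rows = [[25, 30000, "no"], [40, 52000, "yes"], [31, 41000, "yes"]]
csv_text = rows_to_csv(header, rows)
arff_text = csv_to_arff(csv_text)
print(arff_text)
```

Either text can then be saved to a file with the matching extension (.csv or .arff).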

0.5 Marks

Learning Outcome(s): 1

Explain different data mining tasks, problems and the algorithms most appropriate for addressing them.

Question Four

Load the dataset in Weka or, if you prefer, in a Python tool such as Google Colaboratory (https://research.google.com/colaboratory/).
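For the Python route, loading could look roughly like the sketch below, which reads CSV data with the standard library and checks it against the size limits given in Question Two. The helper names `load_csv` and `meets_size_requirements` and the generated data are hypothetical. With a real file, an opened file object would take the place of the in-memory text.

```python
import csv
import io

def load_csv(source):
    """Read CSV from a file-like object into a list of dict rows."""
    reader = csv.DictReader(source)
    rows = list(reader)
    return reader.fieldnames, rows

def meets_size_requirements(n_instances, n_attributes):
    """Check the limits from Question Two: 300-500 instances, 10-15 attributes."""
    return 300 <= n_instances <= 500 and 10 <= n_attributes <= 15

# Synthetic stand-in for a real file: 10 columns named a0..a9 and 300 rows.
columns = [f"a{i}" for i in range(10)]
text = ",".join(columns) + "\n" + "".join(
    ",".join(str(r * c) for c in range(10)) + "\n" for r in range(300)
)

fieldnames, rows = load_csv(io.StringIO(text))
print(len(rows), "instances,", len(fieldnames), "attributes")
print("Within required size:", meets_size_requirements(len(rows), len(fieldnames)))
```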

2 Marks

Learning Outcome(s): 2

Apply and evaluate data mining algorithms with respect to problems they are specifically designed for.

Question Five

Perform basic preprocessing on the dataset, such as data cleaning, data reduction, or normalization, where required.
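A minimal sketch of two of these steps, cleaning and min-max normalization, is shown below in plain Python. The function names `clean` and `min_max_normalize`, the choice of "?" and empty strings as missing-value markers, and the sample rows are assumptions made for illustration. Data reduction is not covered here.

```python
def clean(rows, missing=("", "?", None)):
    """Drop rows containing a missing value, then drop exact duplicates (order kept)."""
    seen = set()
    result = []
    for row in rows:
        if any(v in missing for v in row):
            continue
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            result.append(list(row))
    return result

def min_max_normalize(rows):
    """Rescale each numeric column to [0, 1]; a constant column maps to 0.0."""
    columns = list(zip(*rows))
    scaled = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(r) for r in zip(*scaled)]

# Invented rows: one has a missing value ("?") and one is a duplicate.
raw = [[20, 1000], [30, 3000], ["?", 2000], [20, 1000], [40, 2000]]
cleaned = clean(raw)
normalized = min_max_normalize(cleaned)
print(cleaned)
print(normalized)
```

Min-max scaling is only one possible normalization; z-score standardization would be an equally reasonable choice, depending on the algorithm applied afterwards.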

Learning Outcome(s): 2

Apply and evaluate data mining algorithms with respect to problems they are specifically designed for.

2 Marks

Question Six

Run the Apriori algorithm on your dataset with different support and confidence values, and discuss the generated rules.
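To show what varying the two thresholds involves, here is a small pure-Python sketch of Apriori and rule generation. It is a teaching sketch under stated assumptions, not a replacement for Weka's implementation: the function names `apriori` and `association_rules` and the five market-basket transactions are invented for the example. A tabular dataset would first need its attributes expressed as nominal items, for example by discretizing numeric columns.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) for each qualifying rule."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent), sup, conf))
    return rules

# Invented transactions, used only to exercise the functions.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

for min_support in (0.4, 0.6):
    frequent = apriori(transactions, min_support)
    for min_confidence in (0.6, 0.8):
        rules = association_rules(frequent, min_confidence)
        print(f"support>={min_support}, confidence>={min_confidence}: "
              f"{len(frequent)} itemsets, {len(rules)} rules")
```

Raising either threshold shrinks the rule set, which is the effect the discussion of generated rules should comment on.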

Learning Outcome(s): 2

Apply and evaluate data mining algorithms with respect to problems they are specifically designed for.

1 Mark

Question Seven

Apply the SVM data mining algorithm to your selected dataset.

Provide the results and accuracy of the algorithm, and discuss them with supporting screenshots.
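For those using Python, one possible way to train an SVM and report its accuracy is sketched below, assuming scikit-learn is installed. The data is synthetic, generated with `make_classification` as a stand-in for a real dataset. The 70/30 split, the RBF kernel, and `C=1.0` are illustrative choices, not requirements of the project.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in sized like the required dataset (400 instances, 12 attributes).
X, y = make_classification(n_samples=400, n_features=12, n_informative=6,
                           random_state=0)

# Hold out 30% of the instances for testing, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Standardize features first, since SVMs are sensitive to feature scale.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.3f}")
print("Confusion matrix:")
print(confusion_matrix(y_test, y_pred))
```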

Learning Outcome(s): 2

Apply and evaluate data mining algorithms with respect to problems they are specifically designed for.

1 Mark

Question Eight

Apply the decision tree data mining algorithm to your selected dataset with different parameter settings, and record the accuracies.
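Recording accuracies across parameter settings could be organized as in the sketch below, assuming scikit-learn is installed. The data is synthetic, and the parameters varied (`criterion` and `max_depth`) and the 10-fold cross-validation are illustrative choices. Other settings, or the corresponding options of a Weka tree classifier, would serve equally well.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in sized like the required dataset (400 instances, 12 attributes).
X, y = make_classification(n_samples=400, n_features=12, n_informative=6,
                           random_state=0)

# Mean 10-fold cross-validated accuracy for each parameter combination.
results = {}
for criterion in ("gini", "entropy"):
    for max_depth in (2, 4, None):  # None lets the tree grow until leaves are pure
        tree = DecisionTreeClassifier(criterion=criterion, max_depth=max_depth,
                                      random_state=0)
        scores = cross_val_score(tree, X, y, cv=10)
        results[(criterion, max_depth)] = scores.mean()

for (criterion, max_depth), acc in results.items():
    print(f"criterion={criterion:<8} max_depth={str(max_depth):<5} accuracy={acc:.3f}")
```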

Learning Outcome(s): 2

Apply and evaluate data mining algorithms with respect to problems they are specifically designed for.

1 Mark

Question Nine

Apply the K-means algorithm to the dataset (with k = 4) and study the clusters formed.
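A minimal Python sketch of clustering with k = 4 follows, assuming scikit-learn is installed. The data is synthetic, generated with `make_blobs`, and standardizing the features beforehand is an assumption rather than a stated requirement. Cluster sizes, centroids, and the within-cluster sum of squares are some of the quantities that can be examined when studying the clusters.

```python
from collections import Counter

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in sized like the required dataset (400 instances, 12 attributes).
X, _ = make_blobs(n_samples=400, n_features=12, centers=4, random_state=0)

# Put all attributes on a comparable scale before computing distances.
X_scaled = StandardScaler().fit_transform(X)

# k = 4 as specified in the question; n_init restarts reduce sensitivity to initialization.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

cluster_sizes = Counter(labels.tolist())
print("Cluster sizes:", dict(sorted(cluster_sizes.items())))
print("Within-cluster sum of squares (inertia):", round(kmeans.inertia_, 2))
print("Centroid array shape:", kmeans.cluster_centers_.shape)
```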