CS5483 Data Warehousing and Data Mining

Part I

Course Duration: One semester
Credit Units: 3
Level: P5
Medium of Instruction: English
Prerequisites: CS3402 Database Systems, or equivalent
Precursors: Nil
Equivalent Courses: Nil
Exclusive Courses: Nil

Part II

Course Aims
This course aims to introduce students to a new frontier in database technology, "data warehousing and data mining", by studying their principles, algorithms, implementation methodology, and applications. It will analyze the components of a data warehouse, including data source and transformation tools, metadata management, query reporting, and OLAP; provide a comprehensive introduction to data mining, including data selection, cleaning, coding, the use of different pattern recognition techniques, and reporting; and introduce students to the applications of data warehousing and data mining by using commercial tools to create business applications.

Course Intended Learning Outcomes (CILOs)
Upon successful completion of this course, students should be able to:

No. | CILOs | Weighting (if applicable)
1. | identify the main characteristics of different data mining techniques; |
2. | differentiate between data warehousing and data mining; |
3. | evaluate the performance of different data warehousing and data mining approaches; |
4. | apply data warehousing or data mining techniques to real-world problems. |

Teaching and Learning Activities (TLAs)
(Indicative of likely activities and tasks designed to facilitate students’ achievement of the CILOs. Final details will be provided to students in their first week of attendance in this course)

  
Teaching pattern:
Suggested lecture/tutorial/laboratory mix: 2 hrs. lecture; 1 hr. tutorial.

CILO No. | TLAs | Hours/week (if applicable)
CILO 1, 2, 3 | Lecture: The lectures will introduce data warehousing and data mining techniques and their applications in different domains. |
CILO 1, 2, 3 | Tutorial: Students are required to complete a set of exercise questions and present their solutions in class. |
CILO 4 | Project: Students are required to implement a data warehousing or data mining approach and apply it to a real-world problem. |

Assessment Tasks/Activities
(Indicative of likely activities and tasks designed to assess how well the students achieve the CILOs. Final details will be provided to students in their first week of attendance in this course)

  
Examination duration: 2 hours
Percentage of coursework, examination, etc.: 50% CW; 50% Exam

CILO No. | Type of Assessment Tasks/Activities | Weighting (if applicable) | Remarks
CILO 1 | Coursework: Students' ability to propose suitable solutions to the tutorial exercise questions will be used to assess this CILO. Examination: The final examination will include questions that assess students' ability to identify the distinguishing features of different data warehousing and data mining techniques. | |
CILO 2 | Coursework: In the tutorial exercises, students are required to select suitable data warehousing or data mining techniques for a particular application based on its requirements; their ability to choose the correct technique will be used to assess this CILO. Examination: The final examination will include questions that assess students' ability to identify the differences between data warehousing and data mining. | |
CILO 3 | Coursework: Students are required to characterize the performance of different data warehousing and data mining algorithms based on suitable metrics. Their ability to identify the merits and shortcomings of the different approaches will be used to assess this CILO. Examination: The final examination will include questions that assess students' ability to perform a critical comparison of different data warehousing and data mining techniques. | |
CILO 4 | Coursework: Students are required to implement a data warehousing or data mining approach and apply it to a real-world problem. The effectiveness and efficiency of the implemented approach will be used to assess this CILO. | |

Grading of Student Achievement: Refer to Grading of Courses in the Academic Regulations
Grading pattern: Standard (A+, A, A-…F)
To pass the course, a student must obtain at least 30% of the maximum mark for the examination.

Part III

Keyword Syllabus:

Data extraction, data cleansing, data transformation, metadata, on-line analytical processing (OLAP), star schema, decision trees, neural networks, nearest neighbor and clustering, genetic algorithms, rule induction, data visualization, knowledge discovery in databases.

Syllabus

Data Warehousing
XML database, schema translation, schema integration, star schema and data cube, data conversion and integration, Online Analytical Processing (OLAP)

Data Mining
Association rules, web mining, decision trees, clustering, neural networks, genetic algorithms
