CS5481 Data Engineering

Part I

Course Duration: One semester
Credit Units: 3
Level: P5
Medium of Instruction: English
Prerequisites: CS3402 Database Systems, or equivalent
Precursors: Nil
Equivalent Courses: Nil
Exclusive Courses: Nil

Part II

Course Aims
This course aims to provide a formal and analytical treatment of distributed databases. Different types of data models and data manipulation languages are reviewed.

Course Intended Learning Outcomes (CILOs)
Upon successful completion of this course, students should be able to:

No. | CILOs | Weighting (if applicable)
1. | design a distributed database system; | 20%
2. | complete a two-member project on a distributed database system; | 20%
3. | understand distributed online transaction management; | 20%
4. | demonstrate replication in a distributed database system; | 10%
5. | optimize SQL operations; | 10%
6. | develop a distributed database system in a case study. | 20%

Teaching and Learning Activities (TLAs)
(Indicative of likely activities and tasks designed to facilitate students’ achievement of the CILOs. Final details will be provided to students in their first week of attendance in this course)

  
Teaching pattern:
Suggested lecture/tutorial/laboratory mix: 2 hrs. lecture; 1 hr. tutorial.

CILO No. | TLAs | Hours/week (if applicable)
CILO 1, 3, 5 | Lecture: Lectures focus on distributed database design and distributed transaction management. |
CILO 1, 3, 5 | Tutorial: Students are required to complete a set of exercise questions and present their solutions in class. |
CILO 2, 4, 6 | Project: Students are required to implement a distributed database system and apply it to a real-world problem. |

Assessment Tasks/Activities
(Indicative of likely activities and tasks designed to assess how well the students achieve the CILOs. Final details will be provided to students in their first week of attendance in this course)

  
Examination duration: 2 hours
Percentage of coursework, examination, etc.: 30% CW; 70% Exam

CILO No. | Type of Assessment Tasks/Activities | Weighting (if applicable) | Remarks
CILO 1 | Coursework: Students are assessed on their ability to design a data model for a distributed database and to answer questions on distributed transaction management. Examination: The final examination covers the theory of data modelling for distributed databases and of distributed online transaction management. | |
CILO 2 | Coursework: Students are required to develop a distributed database application system in a two-member project. Upon completion of the project, students submit a report documenting the conceptual and logical design of the database and the application of the distributed database to solve a real-world problem. | |
CILO 3 | Coursework: Students are assessed on their ability to answer tutorial questions on applying distributed database theory to solve distributed transaction problems in case studies. Examination: The final examination covers applications of data modelling for distributed databases and of distributed transaction management. | |
CILO 4 | Coursework: Students are required to perform hands-on exercises on developing replicated databases in a distributed system environment during tutorial sessions. | |
CILO 5 | Coursework: Students are required to answer questions on the query optimization process. Examination: Students will be asked how to optimize a global query expressed in SQL. | |
CILO 6 | Coursework: Students are required to demonstrate a prototype of the distributed database system developed in their project assignment. Examination: Students will be asked to design a global schema in a case study. | |

Grading of Student Achievement: Refer to Grading of Courses in the Academic Regulations
Grading pattern: Standard (A+, A, A-…F)
For a student to pass the course, at least 30% of the maximum mark for the examination must be obtained.
   

Part III

Keyword Syllabus:

Distributed database, Global schema, Fragmentation, Allocation, Replication, Two-Phase Commit, Distributed Deadlock, Distributed Concurrency Control, Two-Phase Locking, Distributed query optimization, Client-Server, Timestamp, XML.

Syllabus

Distributed data architectures; architectural approach (via data modelling) to database design; data fragmentation, allocation and partitioning; schema design.
Distributed files: transparencies, shared files, multi-server implementation; replicated files, distributed updates.
Transaction management: distributed transactions; update propagation; view updates; recovery.
Concurrency control: deadlocks; concurrency control methods.
Reliability: basic concepts; two- and three-phase commit protocols; detection and resolution of network inconsistency.
Query processing: query decomposition and optimization; query translation.
Distributed databases and administration: catalogue management; integrity and consistency; security and access control; heterogeneous and homogeneous databases.

Related Links
Department of Computer Science