Project Description I

---------------------

A data warehouse extracts data from different data sources (called

operational databases) and stores them in the warehouse as aggregates.

For example, a simple academic database stores relational tables

about course grades for students in different courses. If any user

wants to know the number of students with an "A" in a course, they need

to pose an SQL query to count the number of "A"'s in the base table. With

the data warehouse, this aggregate information (number of "A"'s) in

each course is pre-extracted from the base database and stored as

aggregates. A data warehouse also allows integration of a number of

independent databases (possibly in different data models and on different

hardware/software platforms).
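
The "A" count example above can be sketched as follows. This is only an
illustration: Python's built-in sqlite3 module stands in for the real
operational database, and the table names (grades, course_a_counts), the
course codes and the sample rows are all made up for the sketch.

```python
import sqlite3

# In-memory SQLite stands in for the operational (base) database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (student_id INTEGER, course TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?)",
    [
        (1, "DB101", "A"),
        (2, "DB101", "B"),
        (3, "DB101", "A"),
        (1, "OS201", "A"),
        (2, "OS201", "C"),
    ],
)


def count_a_from_base(course):
    """Without a warehouse: scan the base table every time."""
    row = conn.execute(
        "SELECT COUNT(*) FROM grades WHERE course = ? AND grade = 'A'",
        (course,),
    ).fetchone()
    return row[0]


# With a warehouse: the per-course count is extracted once and stored.
conn.execute(
    """
    CREATE TABLE course_a_counts AS
    SELECT course,
           SUM(CASE WHEN grade = 'A' THEN 1 ELSE 0 END) AS num_a
    FROM grades
    GROUP BY course
    """
)


def count_a_from_warehouse(course):
    """Look up the pre-computed aggregate instead of the base rows."""
    row = conn.execute(
        "SELECT num_a FROM course_a_counts WHERE course = ?", (course,)
    ).fetchone()
    return row[0] if row else 0


print(count_a_from_base("DB101"), count_a_from_warehouse("DB101"))
```

Both functions return the same number; the second one reads a single
stored row rather than recounting the base table.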

Past 499 students and graduate students have written a number of

warehouse extraction tools. However, some extensions and features

still need to be developed and these may include:

1. Creating a more general data extractor capable of integrating

data from a variety of data sources (e.g., from the web and

from MS Access).

2. Getting a data mining tool to query the created data warehouse.

We currently have a data mining tool on UNIX called MineSet.

However, getting it to talk to the Oracle 7.x version we have

is not possible since it works with a newer version of Oracle.

3. Issues associated with building and developing comprehensive

metadata for the warehouse and the sources.

4. Other graphical user interfaces for access to the warehouses

which may be developed with tools like PowerBuilder and

Designer 2000 or others.

5. Implementing warehouse algorithms (published) and running

experiments.

6. Other projects that may be defined by students on warehousing

and mining or object-orientation.

The work involved is not complicated at all but needs commitment and hard work.

Knowledge of relational databases and SQL, and of Oracle on UNIX and/or

MS Access on PC, is required; background on data warehousing can be read up.

The project may entail creating two or more source databases in MS Access, on

the web, or in Oracle on UNIX. Next, the student is expected to write database

programs (SQL embedded in a programming language, e.g., Pro*C) that will create

aggregate tables from the base databases to be stored in a data warehouse.

Then, the student may need to write an application to query the warehouse.
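
The three steps above (source databases, extraction programs, query
application) might look roughly like this. It is a sketch under loud
assumptions: Python with sqlite3 replaces Pro*C and Oracle/MS Access, and
every table, column and function name (grades, enrolment, grade_counts,
load_warehouse, students_with_grade) is hypothetical.

```python
import sqlite3
from collections import Counter


def make_source_a():
    """Hypothetical source 1, e.g. a database kept by one department."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE grades (student_id INTEGER, course TEXT, grade TEXT)")
    db.executemany(
        "INSERT INTO grades VALUES (?, ?, ?)",
        [(1, "DB101", "A"), (2, "DB101", "B"), (3, "OS201", "A")],
    )
    return db


def make_source_b():
    """Hypothetical source 2 with the same facts under a different schema."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE enrolment (sid INTEGER, course_code TEXT, letter TEXT)")
    db.executemany(
        "INSERT INTO enrolment VALUES (?, ?, ?)",
        [(7, "DB101", "A"), (8, "NET301", "A"), (9, "NET301", "C")],
    )
    return db


def load_warehouse(sources):
    """Run each source's extraction query and store the merged aggregates."""
    totals = Counter()
    for db, extract_sql in sources:
        # Each extraction query must yield (course, grade, count) rows.
        for course, grade, n in db.execute(extract_sql):
            totals[(course, grade)] += n

    wh = sqlite3.connect(":memory:")
    wh.execute(
        "CREATE TABLE grade_counts ("
        "course TEXT, grade TEXT, num_students INTEGER, "
        "PRIMARY KEY (course, grade))"
    )
    wh.executemany(
        "INSERT INTO grade_counts VALUES (?, ?, ?)",
        [(c, g, n) for (c, g), n in sorted(totals.items())],
    )
    wh.commit()
    return wh


def students_with_grade(wh, course, grade):
    """Tiny query application: read one aggregate from the warehouse."""
    row = wh.execute(
        "SELECT num_students FROM grade_counts WHERE course = ? AND grade = ?",
        (course, grade),
    ).fetchone()
    return row[0] if row else 0


warehouse = load_warehouse(
    [
        (make_source_a(),
         "SELECT course, grade, COUNT(*) FROM grades GROUP BY course, grade"),
        (make_source_b(),
         "SELECT course_code, letter, COUNT(*) FROM enrolment "
         "GROUP BY course_code, letter"),
    ]
)
print(students_with_grade(warehouse, "DB101", "A"))
```

Each source gets its own extraction query, so differences in schema are
handled at extraction time and the warehouse table keeps one common layout.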

Project Description II

-----------------------

Data mining systems are used as front ends to warehousing systems

to provide more intelligent and efficient querying of massive data

in order to assist business management in their use of online

analytical processing systems (like those used to access

warehouse data). We have a data mining system on

SGI on UNIX. Students can work on learning and using

this system and on providing quick documentation on how it can be

easily used by others. They can also work on interfacing a MineSet system

with a warehouse system. Of course, there is a need to do some preliminary

literature review on data mining.

You may also want to know that these new database systems are hot

and people having experience in these areas are in high demand in

the industry.