Project Description I
---------------------
A data warehouse extracts data from different data sources (called
operational databases) and stores it in the warehouse as aggregates.
For example, a simple academic database stores relational tables
of the grades students received in different courses. A user who
wants to know the number of students with an "A" in a course must
pose an SQL query that counts the "A"s in the base table. With
a data warehouse, this aggregate information (the number of "A"s in
each course) is pre-extracted from the base database and stored as
aggregates. A data warehouse also allows integration of a number of
independent databases (possibly in different data models and on different
hardware/software platforms).
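
The "A" count example above can be sketched as follows. This is only an
illustration under stated assumptions: Python's built-in sqlite3 module
stands in for the real operational database, and the table and column
names (grades, course_a_counts, num_a) are hypothetical.

```python
import sqlite3


def build_grade_aggregates(conn):
    """Pre-compute the number of "A" grades per course into an aggregate table."""
    # Rebuild the aggregate from scratch so it always reflects the base table.
    conn.execute("DROP TABLE IF EXISTS course_a_counts")
    conn.execute(
        """
        CREATE TABLE course_a_counts AS
        SELECT course, COUNT(*) AS num_a
        FROM grades
        WHERE grade = 'A'
        GROUP BY course
        """
    )
    conn.commit()


# Hypothetical base table: one row per student per course.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (student_id INTEGER, course TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?)",
    [(1, "DB", "A"), (2, "DB", "A"), (3, "DB", "B"), (1, "OS", "A"), (2, "OS", "C")],
)

build_grade_aggregates(conn)

# Users now read the small aggregate table instead of scanning the base table.
rows = conn.execute(
    "SELECT course, num_a FROM course_a_counts ORDER BY course"
).fetchall()
print(rows)
```

Note that in this sketch a course with no "A" grades simply has no row in
the aggregate table.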
Past 499 students and graduate students have written a number of
warehouse extraction tools. However, some extensions and features
still need to be developed and these may include:
1. Creating a more general data extractor capable of integrating
data from a variety of data sources (e.g., from the web and
from MS Access).
2. Getting a data mining tool to query the created data warehouse.
We currently have a data mining tool on UNIX called MineSet.
However, it cannot yet talk to the Oracle 7.x version we have,
since it works with a newer version of Oracle.
3. Issues associated with building and developing comprehensive
metadata for the warehouse and the sources.
4. Other graphical user interfaces for access to the warehouses,
which may be developed with tools like PowerBuilder and
Designer 2000 or others.
5. Implementing warehouse algorithms (published) and running
experiments.
6. Other projects that may be defined by students on warehousing
and mining or object-orientation.
The work involved is not complicated, but it needs commitment and hard work.
Knowledge of relational databases and SQL, and of Oracle on UNIX and/or
MS Access on PC, is required; information on data warehousing can be read up.
The project may entail creating two or more source databases in MS Access, on
the web, or in Oracle on UNIX. Next, the student is expected to write database
programs (SQL embedded in a programming language, e.g., Pro*C) that will create
aggregate tables from the base databases to be stored in a data warehouse.
Then, the student may need to write an application to query the warehouse.
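
The workflow described above (several source databases, an extraction
program that builds aggregate tables, and an application that queries the
warehouse) might look roughly like the sketch below. It is not Pro*C and
does not touch Oracle or MS Access: as an assumption for illustration,
in-memory SQLite databases stand in for the sources and the warehouse, and
all names (make_source, grade_counts, count_grade, and so on) are
hypothetical.

```python
import sqlite3


def make_source(rows):
    """Create a stand-in source database holding a 'grades' base table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE grades (student_id INTEGER, course TEXT, grade TEXT)")
    conn.executemany("INSERT INTO grades VALUES (?, ?, ?)", rows)
    return conn


def extract_grade_counts(source):
    """Aggregate one source: number of students per (course, grade)."""
    return source.execute(
        "SELECT course, grade, COUNT(*) FROM grades GROUP BY course, grade"
    ).fetchall()


def load_warehouse(warehouse, sources):
    """Reload the warehouse aggregate table from every named source."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS grade_counts ("
        "source TEXT, course TEXT, grade TEXT, num_students INTEGER)"
    )
    warehouse.execute("DELETE FROM grade_counts")
    for name, conn in sources.items():
        rows = [(name, c, g, n) for c, g, n in extract_grade_counts(conn)]
        warehouse.executemany("INSERT INTO grade_counts VALUES (?, ?, ?, ?)", rows)
    warehouse.commit()


def count_grade(warehouse, course, grade):
    """Query application: total students with a grade in a course, all sources."""
    row = warehouse.execute(
        "SELECT COALESCE(SUM(num_students), 0) FROM grade_counts "
        "WHERE course = ? AND grade = ?",
        (course, grade),
    ).fetchone()
    return row[0]


# Two hypothetical sources, labelled after the platforms named in the text.
sources = {
    "access": make_source([(1, "DB", "A"), (2, "DB", "B")]),
    "oracle": make_source([(3, "DB", "A"), (4, "OS", "A")]),
}
warehouse = sqlite3.connect(":memory:")
load_warehouse(warehouse, sources)

print(count_grade(warehouse, "DB", "A"))
```

Keeping a source column in the aggregate table is one possible design
choice: it lets the warehouse answer both per-source and combined queries
without going back to the base databases.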
Project Description II
-----------------------
Data mining systems are used as front ends to warehousing systems
to provide more intelligent and efficient querying of massive data
in order to assist business management in their use of online
analytical processing systems (like those used to access
warehouse data). We have a data mining system on
SGI on UNIX. Students can work on learning and using
this system, providing quick documentation on how it can be
easily used by others, and on interfacing the MineSet system
with a warehouse system. Of course, some preliminary
literature review on data mining is needed.
You may also want to know that these new database systems are hot,
and people with experience in these areas are in high demand in
industry.