Course information 2025, Vol 1
Hello Class,
Welcome to COMP 8390. You can download course notes and get information regarding the course from the course web site, found through the UWindsor Brightspace site (https://brightspace.uwindsor.ca/d2l/login) or directly at:
http://cezeife.myweb.cs.uwindsor.ca/courses/60-539/539index.htm.
Ensure you have all materials so far handed out (or to be handed out) in class, which are:
1. course outline,
2. course information sheet (https://cezeife.myweb.cs.uwindsor.ca/courses/60-539/announce/crseinfo_v1.htm),
3. seminar topic list to choose from (also available through the seminar link on this course page),
4. seminar presentation schedule (posted or to be posted on the web page),
5. SQL_some document in support of course material, part I (also found through https://cezeife.myweb.cs.uwindsor.ca/courses/60-539/databases/index.htm),
6. More Course Information sheet (on test, seminar and project) (https://cezeife.myweb.cs.uwindsor.ca/courses/60-539/announce/crseinfo_v2.htm),
7. Practice Midterm Test (using Winter 2021 test) (in a Brightspace folder),
8. project presentation schedule (updated version yet to be posted).
Read the course information below carefully, as well as the seminar topic list from three conference proceedings of ACM SIGKDD, ACM SIGMOD and VLDB 2023, which is in the file handed out in class and found through the 60-539 course web link (seminar topics). They take time to compile.
Here is a summary of course expectations as presented in the course outline and discussed in class.
****
A. On Test
As announced in class, the midterm test is on Monday, Mar. 3, 2025, at 2:30 pm in the course classroom. A copy of a sample past midterm exam will be handed out in class or through Brightspace, only for practice and as a model.
****
B. On Seminar presentations
I have posted a text file that has all the seminar topics, from which I would like each student to pick one. Remember that each student is responsible for providing copies of the paper that they will deliver to the class (that is, to the students who will participate in grading that student's seminar; student grading of the seminar will be worth 25% or 0%, while my grading makes up 75% or 100%).
Regardless, students' grading of other students' seminars as part of course work is important for their seminar contribution marks, even if it is only my seminar grading that counts for each student's seminar mark.
For the seminar, students are expected to have chosen their seminar topic by Feb. 3, 2025. No seminar topic is to be picked by more than one student. That is, each topic is to be picked and presented at most once. Thus, if you pick a topic already picked by someone, you need to choose another from the remaining topics. Email your topic to me (cezeife@uwindsor.ca) or discuss it during class or office hours.
I will wait for the complete list to be able to place you into seminar grading groups. I will provide a tentative list below. Reading the seminar papers in your seminar group is part of course work, and helps with meaningful contribution during seminar presentations.
Note that it is every student's responsibility to make copies of their seminar paper available to all students who will be grading them. Copies of seminar papers should be provided to student graders no later than Feb. 3, 2025.
Seminars begin on Monday, Mar. 10, 2025, as announced earlier (and in the course outline).
Your one seminar report should be 3 to 5 pages (double-spaced, 12-point font) on just your own seminar paper. The seminar report should clearly state the title of the paper, the authors, the proceedings or journal it came from, the year of publication, the name of the student and the seminar group. Then include, clearly: the problem addressed by the paper, the contributions of the paper, the solution provided with clear algorithms and a running example, the limitations and advantages of the solution, and your opinion of the work. Seminar reports are due on the day of presentation of the paper or, at the latest, on the last day of seminar presentations.
****
C. On Project Topics
Students should have chosen project topics by Feb. 3, 2025. Projects can be worked on individually or in a group. If it is a group project, each group member's contribution has to be clearly indicated. Projects can be research or application based. You can select your own project topic on issues related to course material and present it for approval. However, here are some project topic suggestions. A generic project is to design and build an enterprise data warehouse system with decision support querying capabilities, or an information system for an application domain (e.g., hospital data warehouse information system, student information system, airline data warehouse information system, etc.).
Research-Based
Big Data Mining Problems and Techniques for varying domains
Addressing Big Data Mining size and variety issues using Map-Reduce paradigm
Big Data Integration Problems and Techniques for varying domains
Object oriented mining
Collaborative Mining Approaches for emerging unsolved issues
Web Content Mining and automatic wrapper generation
New Research issues and applications of Sequential Pattern Mining
Web Recommendation Systems
High Utility Pattern Mining Problems and applications
Social Web Mining and Analysis
Mining Social Web Communities
Implementation of Popular sequential pattern Mining algorithms like PrefixSpan, etc.
A Comparative Analysis of recent Existing Warehouse View Materialization Techniques
New Research issues in Data Warehouses
Web Warehousing Research and Systems
Approaches for Improving Warehouse Performance through Query Optimization
Appropriate Index and Access Methods for Warehouses
Statistical Analysis of Organizations Using Existing Warehouses and Data Mining Systems: Use and Robustness
Use of Neural Network Techniques in Data Mining
Statistical Approaches to Data Mining
Text Mining Approaches
Stream mining
Sensor Data Mining approaches and Applications
Semantic Web Mining
New Research Problems in Data Mining
Any other, propose for approval
****
Application-Based
This includes implementing data mining and warehousing information systems and querying, as well as implementing algorithms that you may have read about in your research paper, or applying them to a problem. You can also learn a mining or warehousing tool and implement some process with the tool.
1. Implementation of Mining and Warehousing algorithms (not just downloaded from the web) for solving a simple problem or to be used in good thesis work. Examples are Sequential Mining and Web Content Mining techniques, Clustering, and others.
Algorithms to be implemented and tested include:
A.1. Sequential Mining algorithms like GSP, PrefixSpan and other recent ones of importance.
A.2. Association Rule mining algorithms like those we do not have.
A.3. Clustering algorithms like K-Means, Online K-Means, etc.
A.4. Outlier Mining algorithms like LOF, LSC, etc.
A.5. Web Content Mining Algorithms: e.g., algorithms for dynamically extracting heterogeneous data types (e.g., text, images, video, list, etc.) from a set of B2C (E-Commerce) web sites: extending our WebOMiner and WebOMiner_S systems.
A.6. Decision Support GUI front end recommendation system for comparative analysis of B2C web sites based on the WebOMiner_S system (similar to, say, Google Shopping).
A.7. Web Recommendation Systems for E-Commerce sites: building their data sets, algorithms, querying systems, etc. (similar to established web recommendation systems like GroupLens, MovieLens).
A.8. NoSQL Databases: exploring a specific one (its storage structure, querying system, handling of data mining and OLAP querying, growing data or big data, etc.).
B. Exploring use and application of Data Mining Systems like:
B.1. WEKA Data miner
B.2. SAS data miner
B.3. MineSet data miner
B.4. SPSS ??
B.5. Any other data mining system in use and available
Project involves thorough learning of the data mining system, and exploring how the mining algorithms and techniques like association rule, decision tree classification and others are implemented and used to solve real life problems. Show what the advantages and limitations of this system are. How can this system be extended to handle the limitations?
2. Implementing warehouse systems using PHP, PL/SQL, Pro*C, SQLJ, or others (e.g.)
2.1. Student Information System
3. Building various components of warehouses on Unix/Linux
- that integrate PC databases
- that integrate web databases
- that integrate video or image databases
- that integrate knowledge based systems
4. Building a database generator for testing data for warehouse/mining components such as cleaning algorithms.
5. Building OLAP systems for querying warehouses
6. Building warehouse meta data
7. Extending Oracle SQL by implementing Data Cube operator
8. Implementing any good algorithms you find in seminar papers or in my work that I can make use of.
9. Implementing Data Mining algorithms and Mining huge data.
10. Building a web warehouse for any two online retail stores, similar to WebOMiner
11. Web yellow page classified for structured web links of topics
12. Automating web log cleaner
13. CS student database system
14. Any other you compose and that is approved by me.
As announced in class, work on projects should proceed gradually starting now, as project reports and complete implementation results are due on the last day of classes. Project presentations will occur during the last 2 classes. You are encouraged to show me the outline of how you wish to proceed with your project once you have it ready.
Tentative Seminar Grading Groups and Papers so far Picked:
Seminar Class (for Monday, 2:30pm – 5:20pm)
(GRADING GROUP A) SEMINAR PAPER NUMBER
Amin,Affan
Barnoff,Aaron
Eren,Ahmet Furkan
Hoque,Tanjina
Hussain,Anisa
Lakdawala,Faizaan Naeem
Masih,Akanksha
Patel,Dhruv Sejalbhai
Patel,Harshkumar Bharatbhai
Seminar Class (for Monday, 2:30pm – 5:20pm)
(GRADING GROUP B) SEMINAR PAPER NUMBER
Patel,Jainil
Patel,Kirtankumar Sunilbhai
Patel,Kishankumar Mahendrabhai
Picchioni,Joshua Alexander
Rawal,Vrushank Hiten
Shirodkar,Gauthami Ulhas
Siddhapura,Om Vijaybhai
Singh,Prabhdeep
I have used the updated UWinsite class list to form the groups.
It is your responsibility to provide copies of your own paper to the other student graders of your research seminar. But provide copies only after your picked topic has been confirmed.
Everyone needs to give me a copy of their seminar paper.
***************************
Dr. Christie Ezeife
Professor
School of Computer Science
University of Windsor
Ont N9B 3P4.
Phone: (519) 253-3000 ext. 3012
email: cezeife@uwindsor.ca
***