Course information 2024, Vol 1

 

Hello Class,

   Welcome to COMP 8390. You can download the course notes and get information about the course from the course web site, found through the UWindsor Brightspace site (https://brightspace.uwindsor.ca/d2l/login) or directly through:

http://cezeife.myweb.cs.uwindsor.ca/courses/60-539/539index.htm.

Ensure you have all the materials handed out so far (or to be handed out) in class, which are:

1. course outline,

2. course information sheet (https://cezeife.myweb.cs.uwindsor.ca/courses/60-539/announce/crseinfo_v1.htm),

3. seminar topic list to choose from (also available through the seminar link on this course page),

4. seminar presentation schedule (posted or to be posted on the web page),

5. SQL_some document in support of course material, part I (also found through https://cezeife.myweb.cs.uwindsor.ca/courses/60-539/databases/index.htm),

6. More Course Information sheet (on test, seminar and project) (https://cezeife.myweb.cs.uwindsor.ca/courses/60-539/announce/crseinfo_v2.htm),

7. Practice Midterm Test (using Winter 2008 test) (in a Brightspace folder),

8. project presentation schedule (updated version yet to be posted)

 

 

Carefully read the course information below and the seminar topic list, which is drawn from three conference proceedings of ACM SIGKDD, ACM SIGMOD and VLDB 2023. The list is in the file handed out in class and is also found through the 60-539 course web link (seminar topics). They take time to compile.

 

Here is a summary of course expectations as presented in the

course outline and discussed in class.

 

****

A. On Test

  As announced in class, the midterm test is on Monday, Mar. 4, 2024, at 2:30 pm in the course classroom. A copy of a sample past midterm exam will be handed out in class or through Brightspace, for practice and as a model only.

 

****

B.  On Seminar presentations

I have posted a text file listing all the seminar topics; each student should pick one of them. Remember that each student is responsible for providing copies of the paper they will present to the class, that is, to the students who will participate in grading that student's seminar. Student grading of the seminar will be worth 25% or 0%, while my grading makes up 75% or 100%. Regardless, grading other students' seminars is part of course work and is important for your seminar contribution marks, even if only my seminar grading counts toward each student's seminar mark.

 

For the seminar, students are expected to have chosen their seminar topic by Feb. 4, 2024. No seminar topic is to be picked by more than one student; that is, each topic is to be picked and presented only once. Thus, if you pick a topic already picked by someone else, you need to choose another from the remaining topics. Email your topic to me (cezeife@uwindsor.ca) or discuss it with me during class or office hours.

 

I need the complete list of chosen topics before I can place you into seminar grading groups; a tentative list is provided below. Reading the seminar papers in your seminar group is part of course work and helps you contribute meaningfully during seminar presentations.

 

Note that it is every student's responsibility to make copies of their seminar paper available to all students who will be grading them. Copies of seminar papers should be provided to student graders no later than Feb. 4, 2024.

 

Seminars begin on Monday, Mar. 11, 2024, as announced earlier (and in the course outline). Your one seminar report should be 3 to 5 pages (double-spaced, 12-point font) and should cover only your own seminar paper.

 

The seminar report should clearly state the title of the paper, its authors, the proceedings or journal it came from, the year of publication, the name of the student and the seminar group. Then clearly present the problem addressed by the paper, the contributions of the paper, the solution provided (with clear algorithms and a running example), the limitations and advantages of the solution, and your opinion of the work. Seminar reports are due on the day the paper is presented or, at the latest, on the last day of seminar presentations.

 

****

C. On Project Topics

 

Students should have chosen project topics by Feb. 4, 2024. Projects can be worked on individually or in a group. If it is a group project, each group member's contribution has to be clearly indicated. Projects can be research or application based. You can select your own project topic on issues related to course material and present it for approval. However, here are some project topic suggestions. A generic project is to design and build an enterprise data warehouse system with decision support querying capabilities, or an information system for an application domain (e.g., hospital data warehouse information system, student information system, airline data warehouse information system, etc.).

 

Research-Based

 

Big Data Mining Problems and Techniques for varying domains

Big Data Integration Problems and Techniques for varying domains

Object-oriented mining

Collaborative Mining Approaches for emerging unsolved issues

Web Content Mining and automatic wrapper generation

New Research issues and applications of Sequential Pattern Mining.

Web Recommendation Systems

High Utility Pattern Mining Problems and applications.

Social Web Mining and Analysis

Mining Social Web Communities

Implementation of Popular sequential pattern Mining algorithms like PrefixSpan, etc.

A Comparative Analysis of recent Existing Warehouse View Materialization Techniques.

New Research issues in Data Warehouses.

Web Warehousing Research and Systems.

Approaches for Improving Warehouse Performance through Query Optimization.

Appropriate Index and Access Methods for Warehouses.

Statistical Analysis of Organizations Using Existing Warehouses and Data Mining Systems: Use and Robustness.

Use of Neural Network Techniques in Data Mining

Statistical Approaches to Data Mining

Text Mining Approaches

Stream mining

Sensor Data Mining approaches and Applications.

Semantic Web Mining.

New Research Problems in Data Mining

Any other, propose for approval

 

****

Application-Based

This includes implementing data mining and warehousing information systems and querying, as well as implementing algorithms that you may have read about in your research paper, or applying them to a problem. You can also learn a mining or warehousing tool and implement some process with the tool.

 

1. Implementation of Mining and Warehousing algorithms (not just downloaded from the web) for solving a simple problem or to be used in good thesis work.  Examples are Sequential and Web Content Mining techniques, Clustering, and others.

Algorithms to be implemented and tested include:

 

A.1. Sequential Mining algorithms like GSP, PrefixSpan and other recent ones of importance.

 

A.2. Association Rule mining algorithms like those we do not have.

 

A.3. Clustering algorithms like K-Means, Online K-Means, etc.

 

A.4. Outlier Mining algorithms like LOF, LSC, etc.

 

A.5. Web Content Mining Algorithms: e.g., algorithms for dynamically extracting heterogeneous data types (e.g., text, images, video, list, etc.) from a set of B2C (E-Commerce) web sites, extending our WebOMiner and WebOMiner_S systems.

 

A.6. Decision Support GUI front end recommendation system for comparative analysis of B2C web sites based on the WebOMiner_S system (similar to, say, Google shop).

 

A.7. Web Recommendation Systems for E-Commerce sites: building their data sets, algorithms, querying systems, etc. (similar to established web recommendation systems like GroupLens, MovieLens).

 

A.8. NoSQL Databases – Exploring a specific one (its storage structure, querying system, handling of data mining and OLAP querying, growing data or big data, etc.)

 

B. Exploring use and application of Data Mining Systems like:

B.1. WEKA Data miner

B.2. SAS data miner

B.3. MineSet data miner

B.4. SPSS

B.5. Any other data mining system in use and available

The project involves thoroughly learning the data mining system and exploring how mining algorithms and techniques such as association rules, decision tree classification and others are implemented and used to solve real-life problems. Show what the advantages and limitations of the system are. How can the system be extended to handle the limitations?

 

2. Implementing warehouse systems using PHP, PL/SQL, Pro*C, SQLJ, or others, e.g.:

  2.1. Student Information System

 

3. Building various components of warehouses on Unix/Linux

  - that integrate PC databases

  - that integrate web databases

  - that integrate video or image databases

  - that integrate knowledge based systems

 

4. Building a database generator that produces test data for warehouse/mining components such as cleaning algorithms.

 

5. Building OLAP systems for querying warehouses

6. Building warehouse metadata

7. Extending Oracle SQL by implementing Data Cube operator

8. Implementing any good algorithms you find in seminar papers or in my work that I can make use of.

9. Implementing Data Mining algorithms and Mining huge data.

10. Building a web warehouse for any two online retail stores, similar to WebOMiner

 

11. Web yellow page classifieds for structured web links of topics

12. Automating a web log cleaner

13. CS student database system

14. Any other topic you propose that is approved by me.

 

As announced in class, work on projects should proceed gradually, starting now, as project reports and complete implementation results are due on the last day of classes. Project presentations will occur during the last 2 classes. You are encouraged to show me the outline of how you wish to proceed with your project once you have it ready.

 

Tentative Seminar Grading Groups and Papers so far Picked:

 

Seminar Class (for Monday, 2:30pm – 5:20pm in ER Classroom)

(GRADING GROUP A)                         SEMINAR PAPER NUMBER

Ahluwalia,Gurpartap Singh   

Ahmed,Rabea

Ambani,Yash Atul

Amin,Affan 

Baghbanzadeh,Amin

Bhate,Aditya Kunal    

Bhojak,Rishit Devang  

Cherry,Nathan Mackenzie     

Dhar,Sudipta    

Fatema,Kausar   

Garg,Ashma 

Ghiasi,Behrad   

Jahandideh,Younes

Jain,Divyam

     

 

Seminar Class (for Monday, 2:30pm – 5:20pm in ER Classroom)

(GRADING GROUP B)                         SEMINAR PAPER NUMBER

Joshi,Yash Niravbhai  

Kaur,Parneet    

Kumar,Atul 

Madu,Esther Mercy

Mahajan,Abhishek Pradeep    

Mohammed,Talha Haseeb 

Momin,Mobin Ali 

Pandya,Kedar Rajeshkumar 

Patel,Aniket Darpan   

Prabhakar,Asmita      

Shams,Haseeb    

Tyagi,Harshil   

Vishwakarma,Pooja

Zeeshan,Muhammad Zohaib

 

I have used the updated UWinsite class list to form the groups.

 

It is your responsibility to provide copies of your own paper to other

student graders of your research seminar. But provide copies only after

your picked topic has been confirmed.

 

Everyone needs to give me a copy of their seminar paper.

 

***************************

Dr. Christie Ezeife

Professor

School of Computer Science

University of Windsor

Ont N9B 3P4.

 

Phone: (519) 253-3000 ext. 3012

email: cezeife@uwindsor.ca