198858 VU Data Analysis II: Data Mining

summer semester 2021 | Last update: 26.02.2021
VU 3
5
weekly
not applicable
English

Upon successful completion of the course, students will have:

  • Acquired an overview of the methodologies and approaches to data mining;

  • Gained insight into the challenges and limitations of different data mining techniques;

  • Practiced applying data mining solutions using a common data mining software tool;

  • Enhanced their communication and problem-solving skills.

Data mining, the intelligent analysis of information stored in data sets, has recently attracted substantial interest among practitioners in a variety of fields and industries. This course introduces the process of knowledge discovery and the basic theory of automatically extracting models from data and validating those models. It addresses the problem of how to extract valid, useful, and previously unknown patterns from a source (a database or the web) that contains an overwhelming amount of information. Students will be introduced to various models (decision trees, association rules, linear models, clustering, Bayesian networks, neural networks) and learn how to apply them in practice. The techniques applied include pattern search, machine learning, and artificial intelligence methods. Students will learn how to implement several relevant algorithms and use existing tools to mine real-world data.

Presentation of lecture slides by the lecturer; hands-on exercises using analysis examples; and presentation of research papers by participants.

Graded work for the course will consist of written homework assignments, two exams, a research paper presentation and a group project.

  • J. Han, M. Kamber, and J. Pei, Data Mining: Concepts and Techniques. Morgan Kaufmann, 3rd edition. 

  • J. Leskovec, A. Rajaraman, and J. Ullman, Mining of Massive Datasets. 2nd edition.

Programming skills in Python or R are required for this course. Ideally, this course should be taken after module 1 (198801, 198803). In your booking request, please add a remark on your programming experience. Basic statistical knowledge, as covered in module 3a (198841), may be helpful but is not necessary to complete this course. Participants are required to use their own laptop computers (Windows, Linux, or Mac) for the hands-on exercises.

The acceptance procedure is based on prioritised randomisation. Students who are further advanced in completing the Digital Science minor are given precedence.

see dates
Group 0
Date            Time           Location
Tue 2021-03-02  08.15 - 11.00  eLecture - online
Tue 2021-03-09  08.15 - 11.00  eLecture - online
Tue 2021-03-16  08.15 - 11.00  eLecture - online
Tue 2021-03-23  08.15 - 11.00  eLecture - online
Tue 2021-04-13  08.15 - 11.00  eLecture - online
Tue 2021-04-20  08.15 - 11.00  eLecture - online
Tue 2021-04-27  08.15 - 11.00  eLecture - online
Tue 2021-05-04  08.15 - 11.00  eLecture - online
Tue 2021-05-11  08.15 - 11.00  eLecture - online
Tue 2021-05-18  08.15 - 11.00  eLecture - online
Tue 2021-05-25  08.15 - 11.00  eLecture - online
Tue 2021-06-01  08.15 - 11.00  eLecture - online
Tue 2021-06-08  08.15 - 11.00  eLecture - online
Tue 2021-06-15  08.15 - 11.00  eLecture - online
Tue 2021-06-22  08.15 - 11.00  eLecture - online