
2019/2020  KAN-CDASO2020U  Data Mining, Machine Learning, and Deep Learning

English Title
Data Mining, Machine Learning, and Deep Learning

Course information

Language English
Course ECTS 7.5 ECTS
Type Mandatory offered as elective
Level Full Degree Master
Duration One Semester
Start time of the course Spring
Timetable Course schedule will be posted at calendar.cbs.dk
Study board
Study Board for BSc/MSc in Business Administration and Information Systems, MSc
Course coordinator
  • Raghava Rao Mukkamala - Department of Digitalisation
Main academic disciplines
  • Information technology
  • Statistics and quantitative methods
Teaching methods
  • Face-to-face teaching
Last updated on 04-11-2020

Learning objectives
  • Assess the fundamental challenges of machine learning, such as model selection and model complexity
  • Understand the underlying mathematical relationships within and across machine learning algorithms
  • Characterize the strengths and weaknesses of various machine learning approaches and algorithms
  • Design, implement, analyse and apply different data mining, machine learning techniques and deep learning techniques for big/business datasets in organizational contexts and for real-world applications
  • Summarize the application areas, trends, and challenges in data mining and machine learning
  • Critically assess the ethical and legal issues in applying machine learning algorithms
  • Exhibit deeper knowledge and understanding of the topics through the project; the report should reflect critical awareness of the methodological choices and be written to accepted academic standards.
Course prerequisites
This course requires a fundamental understanding of programming in Python, as achieved in, or comparable to, the course Foundations of Business Data Analytics: Architectures, Statistics and Programming from the 1st semester of Cand.Merc.IT (Data Science).
Prerequisites for registering for the exam (activities during the teaching period)
Number of compulsory activities which must be approved: 2
Compulsory home assignments
Each assignment is 1-3 pages and is written in groups of 1-4 students.
Students must have 2 out of 4 assignments approved in order to be eligible for the exam.

There will not be any extra attempts provided to the students before the ordinary exam.
If a student cannot hand in due to documented illness, or if a student does not get the activity approved in spite of making a real attempt, then the student will be given one extra attempt before the re-exam. Before the re-exam, there will be one home assignment (max. 10 pages) which will cover 2 mandatory assignments.
Examination
Data Mining, Machine Learning and Deep Learning:
Exam ECTS 7.5
Examination form Home assignment - written product
Individual or group exam Group exam
Please note the rules in the Programme Regulations about identification of individual contributions.
Number of people in the group 2-4
Size of written product Max. 15 pages
PLEASE NOTE: Different exam form in summer 2021; details below.
Assignment type Project
Duration Written product to be submitted on specified date and time.
Grading scale 7-point grading scale
Examiner(s) Internal examiner and second internal examiner
Exam period Summer
Make-up exam/re-exam
Same examination form as the ordinary exam
Regarding the group project: Students can submit the same project or they can choose to submit a revised project.
Description of the exam procedure

Summer 2021:
Individual oral exam based on a group project. The oral exam lasts 20 minutes per student, including the examiners’ discussion and the announcement of the grade. The group project is max. 15 pages. It is also possible to write the project individually; an individual project is likewise max. 15 pages.

To participate in the oral exam, the written product must be handed in by the set deadline, before the oral exam. The grade is based on an overall assessment of the written product and the individual oral performance.

 

Make-up exam/re-exam:
Same examination form as the ordinary exam. Students can submit the same project or they can choose to submit a revised project.

 

Summer 2020:

The exam consists of two elements: a group project and an individual home assignment.

a)   The group project is max. 15 pages. The students must individualize their group project by showing their individual contributions in a way that makes individual assessment possible. See ‘Individualisation of group papers etc.’ in the study administrative rules (SAR).

b)   The individual home assignment has no specified number of pages and is completed at a set time within a 2-hour slot.

Course content, structure and pedagogical approach

The course provides knowledge of various concepts, techniques and methods related to data mining, machine learning and deep learning approaches. Furthermore, it introduces

 

  • Basics of Data mining and machine learning
  • Strengths and weaknesses of dimensionality reduction algorithms: variance thresholds, correlation thresholds, Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA)
  • Linear models for regression such as maximum likelihood, sequential learning, regularized least squares
  • Linear models for classification such as linear classification, logistic regression, support vector machines
  • Classification models such as probabilistic generative models, probabilistic discriminative models
  • Unsupervised learning: clustering, probabilistic clustering, Expectation-Maximization Algorithm.
  • Neural Networks: feed-forward neural networks, network training, backpropagation, convolutional neural networks
  • Deep Learning: deep feed-forward networks, regularization for deep learning, optimization for training deep models, application of deep learning
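As an illustration of one listed topic, regularized least squares, the following is a minimal sketch and not course material. It assumes NumPy is available; the helper name `ridge_fit`, the synthetic data, and the penalty values are hypothetical choices made only for this example.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form regularized least squares: w = (X^T X + lam * I)^-1 X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(A, X.T @ y)


# Synthetic data (assumed for illustration): a linear signal plus small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

w_ols = ridge_fit(X, y, lam=0.0)     # no penalty: ordinary least squares
w_ridge = ridge_fit(X, y, lam=50.0)  # penalty shrinks the weights toward zero

print("unregularized weights:", w_ols)
print("regularized weights:  ", w_ridge)
```

Comparing the two weight vectors shows the trade-off behind model complexity: a larger penalty gives smaller weights at the cost of a slightly worse fit to the training data.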

 

In addition, the course gives the students practical hands-on experience with data mining and machine learning using open source machine learning libraries such as scikit-learn in the Python programming language. After completing the course, the students will be able to apply various data mining and machine learning techniques to real-world big/business datasets.
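To give a sense of what hands-on work with scikit-learn can look like, here is a minimal sketch that combines two of the listed topics, PCA and logistic regression. The dataset (the Iris data bundled with scikit-learn), the number of components, and the split parameters are assumptions made for this example and are not taken from the course.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small example dataset shipped with scikit-learn (no download needed).
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples to estimate generalization performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Standardize features, project onto 2 principal components, then classify.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2f}")
```

Wrapping the steps in a pipeline means the scaler and PCA are fitted on the training split only, which avoids leaking information from the held-out data into the model.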

Description of the teaching methods
The course consists of lectures, exercises, and assignments. Each lecture is followed by an exercise session, and there will be a teaching assistant providing technical support for assignments and course projects.

The presented theories, concepts and methods are to be applied in practice during the exercise sessions. The students work throughout the semester on a mini project that demonstrates their understanding of the concepts presented in the lectures and exercises. CBS Learn is used for sharing documents, slides, exercises etc. as well as for interactive lessons if applicable.
Feedback during the teaching period
General feedback on the mandatory assignments will be provided.
Student workload
Lectures 24 hours
Exercises 24 hours
Preparation for class 48 hours
Project work & report 100 hours
Exam and preparation 10 hours
Total 206 hours
Expected literature

The literature can be changed before the semester starts. Students are advised to find the final literature on Canvas before they buy the books.

 

Text Books:

Each entry lists the reference key, author(s), title, and publisher/ISBN/DOI where given.

  • [AIAMA] Russell, Stuart J., and Peter Norvig. Artificial intelligence: a modern approach. Malaysia; Pearson Education Limited, 2016.
  • [CIMI] Kruse, Rudolf, Christian Borgelt, Christian Braune, Sanaz Mostaghim, and Matthias Steinbrecher. Computational intelligence: a methodological introduction. Springer, 2016.
  • [CML] Hal Daumé III. A Course in Machine Learning.
  • [DMCT] Jiawei Han, Micheline Kamber, Jian Pei. Data Mining: Concepts and Techniques. Morgan Kaufmann; 3 edition (July 6, 2011). ISBN-13: 978-9380931913
  • [ESL] Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer; 2nd edition (2016). ISBN-13: 978-0387848570
  • [HML] Aurélien Géron. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems.
  • [IDM] Pang-Ning Tan, Michael Steinbach, Vipin Kumar. Introduction to Data Mining. Pearson; 1 edition (May 12, 2005). ISBN-13: 978-0321321367
  • [ISL] Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani. An Introduction to Statistical Learning. Springer. ISBN 978-1-4614-7137-0
  • [MLAPP] Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. The MIT Press.
  • [MMD] Leskovec, Jure, Anand Rajaraman, and Jeffrey David Ullman. Mining of massive datasets. Cambridge university press, 2014.
  • [PDSH] Jake VanderPlas. Python Data Science Handbook: Essential Tools for Working with Data. O'Reilly. ISBN-13: 978-1491912058. Online book link: https://jakevdp.github.io/PythonDataScienceHandbook/

Notes, articles, chapters and webpages will be handed out/made available during the course.
