Learning objectives |
- Assess the fundamental challenges of machine learning such as
model selection, model complexity, etc.
- Understand the underlying mathematical relationships within and
across machine learning algorithms
- Characterize the strengths and weaknesses of various machine
learning approaches and algorithms
- Design, implement, analyse and apply different data mining,
machine learning and deep learning techniques for
big/business datasets in organizational contexts and for real-world
applications
- Summarize the application areas, trends, and challenges in data
mining and machine learning
- Critically assess the ethical and legal issues in applying
machine learning algorithms
- Exhibit deeper knowledge and understanding of the topics as
part of the project; the report should reflect critical
awareness of the methodological choices and be written to
accepted academic standards.
|
Course prerequisites |
This course requires a fundamental understanding
of programming in Python, as achieved in, or comparable to, the
course Foundations of Business Data Analytics: Architectures,
Statistics and Programming from the 1st semester of Cand.Merc.IT
(Data Science). |
Prerequisites for registering for the exam
(activities during the teaching period) |
Number of compulsory
activities which must be approved: 2
Compulsory home
assignments
Each assignment is 1-3 pages and is written in groups of 1-4 students.
The students must have 2 out of 4 assignments approved in order
to be admitted to the exam.
There will not be any extra attempts provided to the students
before the ordinary exam.
If a student cannot hand in due to documented illness, or if a
student does not get the activity approved in spite of making a
real attempt, then the student will be given one extra attempt
before the re-exam. Before the re-exam, there will be one home
assignment (max. 10 pages) which will cover 2 mandatory
assignments.
|
Examination |
Data Mining,
Machine Learning and Deep Learning:
|
Exam
ECTS |
7,5 |
Examination form |
Home assignment - written product |
Individual or group exam |
Group exam
Please note the rules in the Programme Regulations about
identification of individual contributions. |
Number of people in the group |
2-4 |
Size of written product |
Max. 15 pages |
|
PLEASE NOTE: Different exam form in summer 2021;
details below. |
Assignment type |
Project |
Duration |
Written product to be submitted on specified date
and time. |
Grading scale |
7-point grading scale |
Examiner(s) |
Internal examiner and second internal
examiner |
Exam period |
Summer |
Make-up exam/re-exam |
Same examination form as the ordinary exam
Regarding the group project:
Students can submit the same project or they can choose to submit a
revised project.
|
Description of the exam
procedure
Summer 2021:
Individual oral exam based on a group project. The oral exam will
be 20 minutes per student, including the examiners’ discussion and
the announcement of the grade. The group project is max. 15 pages. It
will also be possible to write the project individually. The page
limit for an individual project is also max. 15 pages.
In order to participate in the oral exam, the written product
must be handed in before the oral exam, by the set deadline. The
grade is based on an overall assessment of the written product and
the individual oral performance.
Make-up exam/re-exam:
Same examination form as the ordinary exam. Students can submit the
same project or they can choose to submit a revised project.
Summer 2020:
The exam consists of two elements: a group project and an
individual home assignment.
a) The group project is max. 15 pages. The students
have to individualize their group project: they must show what
their individual contributions are, in such a way that individual
assessment is possible. See ‘Individualisation of group papers
etc.’ in the study administrative rules (SAR).
b) The individual home assignment has no specified page limit
and is completed at a set time within a 2-hour slot.
|
|
Course content, structure and pedagogical
approach |
The course provides knowledge of various concepts, techniques
and methods related to data mining, machine learning and deep
learning approaches. It covers the following topics:
- Basics of Data mining and machine learning
- Strengths and weaknesses of Dimensionality Reduction
Algorithms: variance thresholds, correlation thresholds, Principal
Component Analysis (PCA), Linear Discriminant Analysis (LDA)
- Linear models for regression such as maximum likelihood,
sequential learning, regularized least squares
- Linear models for classification such as linear classification,
logistic regression, support vector machines
- Classification models such as probabilistic generative models,
probabilistic discriminative models
- Unsupervised learning: clustering, probabilistic clustering,
Expectation-Maximization Algorithm.
- Neural Networks: feed-forward neural networks, network
training, backpropagation, convolutional neural networks
- Deep Learning: deep feed-forward networks, regularization for
deep learning, optimization for training deep models, application
of deep learning
Furthermore, the course provides the students with practical
hands-on experience of data mining and machine learning using open
source machine learning libraries such as scikit-learn in the Python
programming language. After completing the course, the students
will be able to apply various data mining and
machine-learning techniques to real-world big/business
datasets.
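As a rough illustration of the kind of hands-on work described above, the sketch below chains two of the listed topics (PCA for dimensionality reduction and logistic regression for classification) using scikit-learn. It is not taken from the course material: the Iris dataset, the 70/30 split, the choice of two components and all variable names are assumptions made purely for illustration.

```python
# Minimal sketch (illustrative assumptions throughout, not course material):
# standardize features, reduce them with PCA, then classify with logistic regression.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small dataset bundled with scikit-learn (assumed here only as a stand-in
# for a real business dataset).
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows for evaluation; stratify to keep class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# A pipeline fits each step on the training data only, which avoids
# leaking information from the test set into the scaler or the PCA.
model = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=2)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

# Mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)

# Apply only the preprocessing steps (everything except the classifier)
# to inspect the 2-dimensional representation of the test data.
X_test_reduced = model[:-1].transform(X_test)

print(f"Test accuracy: {accuracy:.3f}")
print(f"Reduced test data shape: {X_test_reduced.shape}")
```

Wrapping the steps in a `Pipeline` is one common design choice: the same object can be passed to cross-validation or model selection utilities, which relates to the model selection challenges named in the learning objectives.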
|
Description of the teaching methods |
The course consists of lectures, exercises, and
assignments. Each lecture is followed by an exercise session, and
there will be a teaching assistant providing technical support for
assignments and course projects.
The presented theories, concepts and methods should be applied in
practice and exercise sessions. The students work throughout the
semester on a mini project that demonstrates their understanding of
the concepts presented in the lectures and exercises. CBS Learn is used
for sharing documents, slides, exercises etc. as well as for
interactive lessons if applicable. |
Feedback during the teaching period |
Feedback on mandatory assignments will be provided
in general |
Student workload |
Lectures |
24 hours |
Exercises |
24 hours |
Preparation for class |
48 hours |
Project work & report |
100 hours |
Exam and preparation |
10 hours |
Total |
206 hours |
|
Expected literature |
The literature can be changed before the semester starts.
Students are advised to find the final literature on Canvas
before they buy the books.
Textbooks:
| Author(s) | Title | Publisher/ISBN/DOI |
[AIAMA] | Russell, Stuart J., and Peter Norvig. | Artificial intelligence: a modern approach. | Malaysia; Pearson Education Limited, 2016. |
[CIMI] | Kruse, Rudolf, Christian Borgelt, Christian Braune, Sanaz Mostaghim, and Matthias Steinbrecher. | Computational intelligence: a methodological introduction. | Springer, 2016. |
[CML] | Hal Daumé III | A Course in Machine Learning | - |
[DMCT] | Jiawei Han, Micheline Kamber, Jian Pei | Data Mining: Concepts and Techniques | Morgan Kaufmann; 3rd edition (July 6, 2011), ISBN-13: 978-9380931913 |
[ESL] | Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. | The Elements of Statistical Learning: Data Mining, Inference, and Prediction | Springer; 2nd edition (2016), ISBN-13: 978-0387848570 |
[HML] | Aurélien Géron | Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems | |
[IDM] | Pang-Ning Tan, Michael Steinbach, Vipin Kumar | Introduction to Data Mining | Pearson; 1st edition (May 12, 2005), ISBN-13: 978-0321321367 |
[ISL] | Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani | An Introduction to Statistical Learning | Springer, ISBN 978-1-4614-7137-0 |
[MLAPP] | Kevin P. Murphy | Machine Learning: A Probabilistic Perspective | The MIT Press |
[MMD] | Leskovec, Jure, Anand Rajaraman, and Jeffrey David Ullman. | Mining of massive datasets. | Cambridge University Press, 2014. |
[PDSH] | Jake VanderPlas | Python Data Science Handbook: Essential Tools for Working with Data | O'Reilly, ISBN-13: 978-1491912058. Online Book Link: https://jakevdp.github.io/PythonDataScienceHandbook/ |
Notes, articles, chapters and webpages will be handed out/made
available during the
course
|