
2020/2021  BA-BIBAV1010U  Introduction to Data Science for Business and Social Applications

English Title
Introduction to Data Science for Business and Social Applications

Course information

Language English
Course ECTS 7.5 ECTS
Type Elective
Level Bachelor
Duration One Semester
Start time of the course Autumn
Timetable Course schedule will be posted at calendar.cbs.dk
Max. participants 65
Study board
Study Board for BSc in Business, Asian Language and Culture
Course coordinator
  • Zoltan Fazekas - Department of International Economics, Government and Business (EGB)
Main academic disciplines
  • Statistics and quantitative methods
Teaching methods
  • Online teaching
Last updated on 16-06-2020

Learning objectives
  • Understand the process of data-driven decision-making and its potential limitations within the business, social, and political realms.
  • Be able to load, process, and transform data stemming from different sources and structured in different ways.
  • Summarize data and apply visualization techniques to explore and present data.
  • Critically evaluate experimental approaches and interventions in business and social data analysis.
  • Identify and apply machine learning techniques for prediction, classification, and dimensionality reduction using business and social data.
  • Evaluate and compare the performance of various machine learning techniques.
  • Formulate data-driven conclusions for social and business applications and questions.
Course prerequisites
Note on prerequisites and software use:
This is an introductory course, hence there are no formal prerequisites for taking this course.
Nevertheless, prior coursework in statistics, quantitative research methods, or computer science/programming will facilitate progress in the course.
Students are expected to spend a substantial amount of time working with software. The course software will be R, but no prior knowledge of R is expected. The first part of the course will introduce the software step by step through various applications.
Prerequisites for registering for the exam (activities during the teaching period)
Number of compulsory activities which must be approved (see s. 13): 2
Compulsory home assignments
There will be a total of three compulsory activities consisting of short written exercises. For each mandatory activity, each student must hand in an individually written home assignment of up to 5 pages on a given set of tasks.

Two out of three activities must be approved to qualify for the exam. Feedback on the assignments will be offered through video supporting materials and live online meetings.
No further attempts to pass the mandatory activities will be provided before the ordinary exam. A student who has not had the required number of activities approved will not be able to attend the ordinary exam. Should the student fail the ordinary exam, no further activities are required to qualify for the retake.

If the student fails to qualify for the ordinary exam:
In order to qualify for an extra mandatory activity before the retake, the student must have 1) attempted all three activities without succeeding in having them all approved; and/or 2) provided relevant documentation of illness or other extenuating circumstances.

In such cases, the student must, before the retake, submit a 10-page paper covering the substance of the required number of mandatory activities. Specific requirements are provided by the course coordinator. When the paper is approved by the course coordinator, the student may be registered for the retake.

Examination
Introduction to Data Science for Business and Social Applications:
Exam ECTS 7.5
Examination form Home assignment - written product
Individual or group exam Group exam
Please note the rules in the Programme Regulations about identification of individual contributions.
Number of people in the group 2-3
Size of written product Max. 10 pages
Assignment type Written assignment
Duration 7 days to prepare
Grading scale 7-point grading scale
Examiner(s) One internal examiner
Exam period Winter
Make-up exam/re-exam
Same examination form as the ordinary exam
Description of the exam procedure

Students will be required to work on a data analysis project on data provided in the course. These projects will have to cover the tools and methods discussed in the course and answer a particular question picked by the students.

Feedback on project topics will be offered through a project workshop prior to the examination period.

Course content, structure and pedagogical approach

Data science, which bridges statistics, computer science, and substantive area expertise, has become an integral part of decision-making as the availability and diversity of data sources have increased at an unprecedented pace. The course provides students with applied knowledge about working with data to understand and inform various business and social decisions. Upon completing the course, students should be able to understand the techniques introduced in the course and apply them to new data and specific problems. The course content covers two broader areas:

1. Working with data: data wrangling, transformations, summary, text-as-data, and visualization.
2. Machine learning techniques and principles, such as A/B testing, regression, classification, regularization, and cross-validation.

The teaching format is “particular-general-particular”. We will introduce a particular question, then discuss how such a question is handled in general by reviewing core concepts from the literature, and then return to the particular application by focusing on implementation and extensions.

Throughout the course we will follow an applied, hands-on approach that also emphasizes implementation-related aspects. Hence, students will spend a substantial amount of time working with software, including in the mandatory activities and the final assignment. The course software will be R.

All hands-on activities will be based on data (or types of data) often used in private and public organizations.

Among others, we will look at examples related to labor market discrimination, analysis of (social) media coverage, investment decisions, and policy interventions.

Description of the teaching methods
Teaching will be carried out fully online, relying on pre-recorded videos, live discussion sessions on the video content, forum activities, and live online exercise sessions where we will be coding together.
Feedback during the teaching period
Feedback on the mandatory activities will be offered during the course through solution videos and a review of the most common recommendations for future work. Furthermore, student projects will be discussed in a live online project presentation workshop, with feedback offered during the workshop (live) and after the workshop through the forums. Feedback regarding specific inquiries will be given during ‘virtual office hours’ offered by full-time staff members, although these can never be a substitute for participation in the regular teaching activities. Generally, we encourage you to ask questions or make comments on the online forums and during our live sessions.
Student workload
Lectures, exercises and workshops 48 hours
Exam 56 hours
Course preparation (readings for lectures and exercises, work on homework activities) 104 hours
Expected literature

Main resources

• Taddy, M. (2019). Business Data Science: Combining Machine Learning and Economics to Optimize, Automate, and Accelerate Business Decisions. McGraw-Hill Education

• Wickham, H., & Grolemund, G. (2017). R for data science. O’Reilly Media

Additional resources

• Healy, K. (2018). Data visualization: a practical introduction. Princeton University Press

• Benoit, K. (2019). Text as data: An overview. In L. Curini & R. Franzese (Eds.), Handbook of Research Methods in Political Science and International Relations. Thousand Oaks: Sage

• Salganik, M. (2019). Bit by bit: Social research in the digital age. Princeton University Press

• James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning. Springer

 
