DataScience with R




This workshop is about accessing, manipulating, visualizing, and analyzing data with a statistical software called R and its RStudio interface.

Working with different kinds of data, you will learn some essential statistical concepts along the way, building up from exploratory data analysis to statistical modeling.

You will be using your computer for most of the course. The goal is for you to turn into a data scientist for 2 hours each week!

Chapters

| Session | Description | Date | Prepare | Slides | Exercises | Exam |
|---|---|---|---|---|---|---|
| 1. DataScience & Software | Introduction to DataScience and R Software setup | Jan 30, 2024 | | 🖥️ | | |
| 2. Workflow | Utilizing `dplyr` functions: Importing data, exploring structure, selecting variables... | Feb 6, 2024 | 📖 | 🖥️ | 1️⃣ 2️⃣ | |
| 3. Data 1 | `dplyr` for tidying data: Renaming, aggregating... | Feb 13, 2024 | 📖 | 🖥️ | 1️⃣ | |
| 4. Data 2 | More on `dplyr` for tidying data: aggregating, merging... | Feb 27, 2024 | 📖 | 🖥️ | 1️⃣ 2️⃣ | 🎓 |
| 5. Datavisualization | Exploration of data visualization's significance, featuring examples, and introducing `ggplot2` | Mar 5, 2024 | 📖 | 🖥️ | 1️⃣ 2️⃣ | |
| 6. Univariate Exploratory analysis | Exploratory Analysis for continuous and qualitative variables | Mar 12, 2024 | | 🖥️ | 1️⃣ | |
| 7. Bivariate Exploratory analysis | Correlation and causality | Mar 19, 2024 | 📖 | 🖥️ | 1️⃣ | |
| 8. Statistical inference | Distributions, confidence intervals, and test statistics | Mar 26, 2024 | 📖 | 🖥️ | 1️⃣ | |
| 9. Linear Regression | Linear Regression Models | Apr 2, 2024 | 📖 | 🖥️ | 1️⃣ | |
| 10. Logistic Regression | Generalized Linear Regression Models | Apr 16, 2024 | 📖 | 🖥️ | 1️⃣ | 🎓 |
| 11. Spatial 1 | Spatial analysis and Cartography | Apr 17, 2024 | | 🖥️ | 1️⃣ | |
| 12. Spatial 2 | End of Spatial and notions of classification methods | Apr 23, 2024 | | 🖥️ | 1️⃣ 2️⃣ | 🎓 |

Teacher

Kim ANTUNEZ

Learning Outcomes

  1. Proficiency in exploratory data analysis
  2. Knowledge of statistical inference and modeling
  3. Knowledge of the R programming language
  4. Knowledge of the RStudio software
  5. Exposure to current data science trends

Professional Skills

Quantitative methods, R and RStudio software, data science skills. After this course, students will be better able to interact with scientific professionals such as data scientists.

Language of tuition

English

Workload

  • Attendance: 2 hours a week / 24 hours a semester
  • Online learning activities: 2 hours a week / 24 hours a semester
  • Reading and Preparation for Class: 1 hour a week / 12 hours a semester
  • Research and Preparation for Group Work: 2 hours a week / 24 hours a semester

Pre-requisite

Students need strong curiosity about quantitative methods and the motivation to learn how to code for data analysis. Minimal computing skills (e.g. unzipping files) and some notions of introductory descriptive statistics would be appreciated. Each student needs a laptop running a recent version of Windows, macOS or Linux, with full admin privileges, the latest versions of R (r-project.org) and RStudio (posit.co) installed, and the ability to install new libraries over the internet.

Semester

Autumn and Spring 2023-2024

Course validation

There will be exercises to be completed between workshop sessions, and possibly group projects to be developed throughout the semester.

Pedagogical format

All classes are structured around a slides-based presentation and a ‘demo’ session on the statistical software, followed by a ‘debrief’ email that includes readings and other homework, with feedback given during the next class.