2  Workflow

Session 2

Author

François Briatte
(small modifications by Kim Antunez & ChatGPT)

session date

February 6, 2024

By the end of this session, you will understand the nature of R code, which lays out a sequence of instructions, by working through basic examples.

In a nutshell

  • Importing a CSV dataset into R using the read.csv function (and doing the same for Excel files, .xls or .xlsx, with readxl::read_excel).
  • Describing a dataset using the View, str and nrow functions to understand its structure.
  • Selecting specific variables and values from a dataset.
  • Using R pipes (%>%) to chain operations together, which makes code more readable and concise.
  • Creating basic point and line plots with ggplot2.
  • Counting the values of a categorical variable with table and describing continuous ones with summary.
  • Getting to know dplyr functions such as select and group_by + summarise.
  • Understanding that the output in R includes not only results but also messages, warnings, and errors.
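As a rough preview of the describing and plotting steps listed above, here is a minimal sketch. It assumes the ggplot2 package is installed and uses the built-in mtcars and economics datasets as stand-ins, since the course dataset is not reproduced here; the object names are chosen for illustration only.

```r
library(ggplot2)

# Categorical variable: count how many cars have 4, 6 or 8 cylinders
cyl_counts <- table(mtcars$cyl)
cyl_counts

# Continuous variable: minimum, quartiles, mean and maximum of miles per gallon
mpg_summary <- summary(mtcars$mpg)
mpg_summary

# Point plot: one point per car, weight against fuel consumption
p_points <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Line plot: the economics dataset ships with ggplot2 (illustration only)
p_line <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line()

# Typing the name of a plot object (e.g. p_points) displays it in RStudio
```

The two plots are stored in objects rather than printed, so nothing is drawn until you type p_points or p_line in the Console.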

2.1 Essential R syntax to do things

2.1.1 Working directory

By setting the working directory, you specify the location where RStudio will look for files and save outputs.

Using R:

getwd() # show the current working directory
setwd("C:/Desktop/DSR/Session 2") # set the working directory

This can be helpful for organizing your projects and ensuring that R scripts can access the necessary files.
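To see why the working directory matters, here is a minimal sketch of how relative paths are resolved against it. The folder session2_demo and the file notes.txt are hypothetical names created in a temporary location purely for this illustration.

```r
# Remember the current working directory so that it can be restored afterwards
old_wd <- getwd()

# Create a temporary folder that stands in for a project folder (illustration only)
project_dir <- file.path(tempdir(), "session2_demo")
dir.create(project_dir, showWarnings = FALSE)

# Move into that folder
setwd(project_dir)

# A relative path ("notes.txt") is now resolved inside project_dir
writeLines("hello", "notes.txt")
found <- file.exists(file.path(project_dir, "notes.txt"))
found

# Go back to the original working directory
setwd(old_wd)
```

Restoring the previous directory at the end keeps the rest of your session unaffected by the demonstration.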

To set the working directory in RStudio using the menu, follow these steps:

  1. Go to the “Session” menu located at the top of the RStudio window.
  2. Click on “Set Working Directory.”
  3. From the dropdown menu that appears, select “Choose Directory.” A file dialog will open, allowing you to browse and select the directory you want to set as your working directory.
  4. Click the “Open” button.

You’ll see the path to the chosen directory displayed in the Console panel, at the bottom left corner of the RStudio interface, or at the top right, as in the image.

2.1.2 Exercises

Let’s start with Exercise 1 and Exercise 2!

Source : https://www.esri.com/arcgis-blog/

This is Dr. John Snow’s famous map of the 1854 cholera outbreak in Soho, London, which he drew and published to document the data collected during the epidemic. Each mark (bold lines stacked along the street) indicates a cholera-related death at that address. The source article examines the true story behind the map and, overall, sheds light on the actual role the map played in understanding cholera transmission.

2.1.3 Recap

# import dataset.csv into object data
# [!] set the working directory first
data <- read.csv("data/dataset.csv")

# Describe the dataset
str(data)

# select a variable in a data frame
data$variable

# select values 2 to 5 in a variable
data$variable[2:5]

# select positive values
data$variable[data$variable > 0]
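The recap above refers to a file data/dataset.csv that is not included here. The following self-contained sketch runs the same steps on a small made-up table, written to a temporary CSV file so that read.csv has something to import; the columns id and variable and their values are invented for illustration.

```r
# Build a tiny example table and save it as a CSV file in a temporary folder
example <- data.frame(
  id       = 1:6,
  variable = c(-2, 5, 0, 3, -1, 8)
)
csv_path <- file.path(tempdir(), "dataset.csv")
write.csv(example, csv_path, row.names = FALSE)

# Import the CSV file into the object data
data <- read.csv(csv_path)

# Describe the dataset: structure and number of rows
str(data)
n_rows <- nrow(data)
n_rows

# Select values 2 to 5 of the variable
values_2_to_5 <- data$variable[2:5]
values_2_to_5

# Select positive values only
positive <- data$variable[data$variable > 0]
positive
```

The condition data$variable > 0 produces a vector of TRUE and FALSE values, and the square brackets keep only the positions where it is TRUE.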

Syntactic sugar: R pipes (see Irizarry ch. 4.5)

# group_by, summarise and %>% come from the dplyr package
library(dplyr)

# do what the first line says, and then do
# what the second line says to that result
group_by(data, variable) %>%
  summarise(mu_x = mean(x, na.rm = TRUE))

# alternative pipe symbol (base R, version 4.1 or later)
group_by(data, variable) |>
  summarise(mu_x = mean(x, na.rm = TRUE))
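To make the pipe concrete, here is a minimal sketch that computes the same grouped mean three ways. It assumes the dplyr package is installed and R 4.1 or later (for the native pipe), and it swaps in the built-in mtcars dataset for the recap's data object; the column name mu_mpg is chosen for illustration.

```r
library(dplyr)

# Nested call: read from the inside out
nested <- summarise(group_by(mtcars, cyl), mu_mpg = mean(mpg, na.rm = TRUE))

# magrittr pipe: read from top to bottom
piped <- mtcars %>%
  group_by(cyl) %>%
  summarise(mu_mpg = mean(mpg, na.rm = TRUE))

# Native pipe: same idea, built into base R since version 4.1
native <- mtcars |>
  group_by(cyl) |>
  summarise(mu_mpg = mean(mpg, na.rm = TRUE))

# All three approaches give the same table: one row per number of cylinders
piped
same_result <- isTRUE(all.equal(as.data.frame(nested), as.data.frame(piped))) &&
  isTRUE(all.equal(as.data.frame(piped), as.data.frame(native)))
same_result
```

The pipe does not change what is computed; it only changes the order in which the steps are written, so that the code reads in the order it runs.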

2.2 Go further

2.2.1 Important principles

  • Code is like a cooking recipe: it contains instructions and comments that let you replicate the results of an analysis
    • R code is imperative: “do this, then do that, then that, …”
    • Run code from top to bottom: respect the order of execution
  • Some { blocks of code } or functions span multiple lines
  • Output = results, but also messages, warnings and errors
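The four kinds of output mentioned above can be sketched in a few lines. This is an illustrative example only: the message text and object names are invented, and tryCatch is used here simply so that the deliberate error does not stop the script.

```r
# A result: the value is printed in the Console
sqrt(c(1, 4, 9))

# A message: information only, the code ran normally
message("Import finished")

# A warning: the code ran, but R flags something to check
# ("three" cannot be converted to a number, so it becomes NA)
converted <- as.numeric(c("1", "2", "three"))
converted

# An error: the instruction failed and produced no result
# tryCatch() intercepts the error so that the rest of the script keeps running
error_text <- tryCatch(
  log("a"),
  error = function(e) conditionMessage(e)
)
error_text
```

A warning leaves you with a result to inspect, whereas an error leaves you with none, which is why errors must be fixed before moving on to the next line.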

2.2.2 Useful keyboard shortcuts

  • Execute / run selected code: Ctrl/Cmd-Enter
  • Select multiple lines of code: Shift + arrows
  • Clear (erase) the Console: Ctrl-L
  • Insert <- in your code: Alt-dash (Alt + the dash “-” key)
  • Insert %>% in your code: Ctrl/Cmd-Shift-M

Ctrl is for Windows and Mac, Cmd for Mac only

2.2.3 Cheatsheets

Source : https://raw.githubusercontent.com/rstudio

Homework for next week