Workflow

Session 2

François Briatte
(minor modifications by Kim Antunez & ChatGPT)

2024-02-06

Essential R syntax to do things

Working directory

By setting the working directory, you specify the location where RStudio will look for files and save outputs.

Using R:

getwd() # show the current working directory
setwd("C:/Desktop/DSR/Session 2") # set the working directory

This can be helpful for organizing your projects and ensuring that R scripts can access the necessary files.
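As a minimal sketch of how the working directory affects file access, the example below switches to a temporary folder, writes a file using a relative path, and then restores the previous directory. The folder name `dsr_session_2` and the file name `example.csv` are hypothetical and only serve as illustration.

```r
# remember the current working directory so it can be restored later
old_wd <- getwd()

# hypothetical demo folder, created inside R's temporary directory
demo_dir <- file.path(tempdir(), "dsr_session_2")
dir.create(demo_dir, showWarnings = FALSE)

# from now on, relative paths are resolved from demo_dir
setwd(demo_dir)

# "example.csv" is a relative path: the file is written inside demo_dir
write.csv(data.frame(x = 1:3), "example.csv", row.names = FALSE)
list.files()

# go back to where we started
setwd(old_wd)
```

Restoring the previous directory at the end keeps the rest of the session unaffected by the demonstration.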

To set the working directory in RStudio using the menu, follow these steps:

  1. Go to the “Session” menu located at the top of the RStudio window.
  2. Click on “Set Working Directory.”
  3. From the dropdown menu that appears, select “Choose Directory.” A file dialog will open, allowing you to browse and select the directory you want to set as your working directory.
  4. Click the “Open” button.

The path to the chosen directory is then displayed in the Console panel, at the bottom left of the RStudio interface, or at the top right, as in the image.

Exercises

Let’s start with Exercise 1 and Exercise 2!

Source : https://www.esri.com/arcgis-blog/


This is Dr. John Snow’s famous map of the 1854 cholera outbreak in Soho, London, which he drew and published to document the data collected during the epidemic. Each mark (bold lines stacked along the street) indicates a cholera-related death at that address. The linked article examines the true story behind the map and sheds light on the role the map actually played in understanding cholera transmission.

Recap

# import dataset.csv into object data
# [!] set the working directory first
data <- read.csv("data/dataset.csv")

# Describe the dataset
str(data)

# select a variable in a data frame
data$variable

# select values 2 to 5 in a variable
data$variable[2:5]

# select positive values
data$variable[data$variable > 0]
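The recap can be tried without any file on disk. The sketch below builds a small made-up data frame (the values and the column `group` are invented for illustration) and applies the same selections. It also shows one point the recap does not cover: a logical test keeps missing values as `NA`, and wrapping the test in `which()` is one way to drop them.

```r
# made-up data standing in for data/dataset.csv
data <- data.frame(
  variable = c(3, -1, 0, 7, NA, 2),
  group    = c("a", "b", "a", "b", "a", "b")
)

# describe the dataset: 6 observations of 2 variables
str(data)

# values 2 to 5 of the variable
values_2_to_5 <- data$variable[2:5]

# positive values: the missing value is kept as NA
positive <- data$variable[data$variable > 0]

# positive values without the NA: which() returns only the TRUE positions
positive_no_na <- data$variable[which(data$variable > 0)]
```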

Syntactic sugar: R pipes (see Irizarry ch. 4.5)

# dplyr provides group_by(), summarise() and the %>% pipe
library(dplyr)

# do what the first line says, and then do
# what the second line says to that result
group_by(data, variable) %>%
  summarise(mu_x = mean(x, na.rm = TRUE))

# alternative: native pipe symbol (base R, version 4.1 or later)
group_by(data, variable) |>
  summarise(mu_x = mean(x, na.rm = TRUE))
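To see why pipes are only syntactic sugar, the sketch below computes the same grouped mean three ways: with nested function calls, with `%>%`, and with `|>`. It assumes the dplyr package is installed and R is version 4.1 or later; the toy data frame is made up for illustration.

```r
library(dplyr)

# made-up data: two groups, one missing value
data <- data.frame(
  variable = c("a", "a", "b", "b"),
  x        = c(1, 3, 10, NA)
)

# without a pipe: read from the inside out
nested <- summarise(group_by(data, variable), mu_x = mean(x, na.rm = TRUE))

# with the magrittr pipe: read from top to bottom
piped <- data %>%
  group_by(variable) %>%
  summarise(mu_x = mean(x, na.rm = TRUE))

# with the native pipe
native <- data |>
  group_by(variable) |>
  summarise(mu_x = mean(x, na.rm = TRUE))
```

All three objects hold the same result; the pipe only changes how the code reads, passing the left-hand side as the first argument of the function on the right.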

Go further

Important principles

  • Code is like a cooking recipe: it contains instructions and comments to replicate the results of an analysis
    • R code is imperative: ‘do this, then do that, then that, …’
    • Run code from top to bottom: respect the order of execution
  • Some { blocks of code } or functions span multiple lines
  • Output = results, but also messages, warnings and errors
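These principles can be illustrated with a short sketch. The object names and message texts below are made up; `tryCatch()` is used here only so that the warning and the error can be captured and inspected instead of interrupting the script.

```r
# order of execution: 'total' can only be used after it has been created
x <- c(1, 2, 3)
total <- sum(x)

# a { block of code } spanning multiple lines
if (total > 5) {
  message("total is greater than 5")  # a message: information only
}

# a warning: R returns a result but signals that something looks off
# (here the text of the warning is captured instead of being printed)
warn <- tryCatch(
  as.numeric("abc"),
  warning = function(w) conditionMessage(w)
)

# an error: R stops and returns no result
# (here the text of the error is captured instead of halting the script)
err <- tryCatch(
  stop("something went wrong"),
  error = function(e) conditionMessage(e)
)
```

Run outside `tryCatch()`, the warning would be printed while execution continues, whereas the error would stop the code at that line.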

Useful keyboard shortcuts

Execute / run selected code      Ctrl/Cmd-Enter
Select multiple lines of code    Shift + arrows
Clear (erase) the Console        Ctrl-L
Insert <- in your code           Alt-dash (Alt + dash ‘-’ key)
Insert %>% in your code          Ctrl/Cmd-Shift-M

Ctrl works on Windows and Mac; Cmd works on Mac only.

Cheatsheets

Source : https://raw.githubusercontent.com/rstudio


Homework for next week