General Information

Research in ecology and evolution involves diverse and complicated datasets. With the widespread adoption of R for data management and analysis, there is huge potential for processing data more efficiently. However, R has a significant learning curve that can keep us from learning faster methods. Repetitive tasks in spreadsheets, or even in R itself, can often be revised to run faster, use less code, and produce simpler output. Copy-paste strategies can lead to errors and can be computationally intensive relative to other methods.

This workshop will explore writing functions, vectorization via the apply family (e.g. apply, lapply, vapply), “for” loops, and parallel computing. We will also touch upon some of the tools in the tidyverse collection of packages for large-scale data manipulation. Although these tools may appear intimidating on the surface, they can be learned quickly and offer an exceptional payoff in time saved. Using a combination of lecture and hands-on activities, this workshop will familiarize you with the tools necessary for improving your relationship with R and saving you time.

A basic understanding of R is recommended because it will make the content more relevant and understandable. No prior knowledge of parallel computing or programming is necessary. Participants should bring a laptop with R already installed, running a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.).
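To give a flavour of the techniques named in this description, here is a minimal sketch that performs the same unit conversion three ways: with a “for” loop, with vapply from the apply family, and with plain vectorized arithmetic. The data values and the names mass_g and g_to_kg are made up for illustration and are not taken from the workshop materials.

```r
# Hypothetical data: body masses (in grams) for five individuals
mass_g <- c(3750, 3800, 3250, 3450, 3650)

# 1. "for" loop: pre-allocate the result, then fill it one element at a time
mass_kg_loop <- numeric(length(mass_g))
for (i in seq_along(mass_g)) {
  mass_kg_loop[i] <- mass_g[i] / 1000
}

# 2. apply family: wrap the step in a function and let vapply() do the looping;
#    numeric(1) tells vapply() that each call must return a single number
g_to_kg <- function(x) x / 1000
mass_kg_vapply <- vapply(mass_g, g_to_kg, numeric(1))

# 3. Vectorized arithmetic: R applies the division to every element at once
mass_kg_vec <- mass_g / 1000

# All three approaches should give the same answer
all.equal(mass_kg_loop, mass_kg_vapply)
all.equal(mass_kg_loop, mass_kg_vec)
```

For a task this simple the vectorized form is the shortest; the loop and vapply versions become more useful when each step is too complex to express as a single vectorized operation.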

Who: The course is aimed at R beginners or experienced analysts.

When: August 20, 2021 @ 11:30 EDT / 8:30 PDT / 13:00 NDT

Where: Virtual. https://utoronto.zoom.us/j/85097717914

Requirements: Participants should use a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) and administrative privileges. Ideally, use two screens so that you can see the video and your workstation at the same time. However, active participation is not required; you can simply follow along as we demonstrate.

Contact: Please contact for more information.

Notes: Live Notepad

Schedule

Time (PDT / EDT)    Goal
8:30 / 11:30        Introduction and set-up
8:45 / 11:45        for Loops
9:15 / 12:15        Vectorization
10:00 / 13:00       Break
10:15 / 13:15       Functions
11:00 / 14:00       Parallelization

Past events: We have previously run this workshop at SORTEE 2021 and CSEE 2021.

Software

R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.

Windows: Install R by downloading and running this .exe file from CRAN. Please also install the RStudio IDE.

Mac OS X: Install R by downloading and running this .pkg file from CRAN. Please also install the RStudio IDE.

Linux: You can download the binary files for your distribution from CRAN. Please also install the RStudio IDE.

Packages we will be using: We recommend you install these ahead of time and ensure they load correctly to reduce troubleshooting in the workshop.

install.packages(c("here","microbenchmark", "tidyr","dplyr","magrittr","broom","foreach","doParallel","palmerpenguins"))
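One possible way to confirm that everything loads correctly is sketched below. It uses base R's requireNamespace, which returns TRUE or FALSE rather than stopping with an error; the variable names pkgs and loaded are just illustrative.

```r
# Package names copied from the install.packages() call above
pkgs <- c("here", "microbenchmark", "tidyr", "dplyr", "magrittr",
          "broom", "foreach", "doParallel", "palmerpenguins")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise;
# vapply() runs the check for each package and names the results
loaded <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of any packages that failed to load (empty if everything is fine)
names(loaded)[!loaded]
```

If any names are printed, try reinstalling those packages before the workshop.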

Other workshops

If you enjoyed this workshop and are interested in learning more, I have also run workshops on Logistic Regression, an Introduction to Ecological Analyses, and an Introduction to Functions.

You can find workshops in a similar style, usually longer and more detailed, through Software Carpentry. They have teachers available globally and cover many forms of programming beyond R.


Thank You!


Center for Urban Environments University of Toronto

Copyright © Alessandro Filazzola and Sophie Breitbart 2021