Research in ecology and evolution involves diverse and complicated datasets. With the widespread adoption of R for data management and analysis, there is huge potential to process data more efficiently. However, R has a significant learning curve that can keep us from learning faster methods. Repetitive tasks in spreadsheets, or even in R itself, can often be revised to run faster, use less code, and produce simpler output. Copy-paste strategies can lead to errors and can be computationally intensive in R relative to other methods. This workshop will explore writing functions, vectorization via the apply family (e.g. apply, lapply, vapply), “for” loops, and parallel computing. We will also touch upon some of the tools in the tidyverse for large-scale data manipulation. While these tools may appear intimidating on the surface, they can be learned quickly and offer an exceptional payoff in time saved. Using a combination of lecture and hands-on activities, this workshop will familiarize you with the tools necessary for improving your relationship with R and saving you time.

A basic understanding of R is recommended because it will make the content more relevant and understandable. No prior knowledge of parallel computing or programming is necessary. Participants should bring a laptop with R already installed, running a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.).
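As a small taste of the workshop topics, the sketch below performs the same unit conversion three ways: with a “for” loop, with `vapply`, and with plain vectorized arithmetic. The data and the helper `g_to_kg` are hypothetical and exist only for illustration; this is not the workshop's own material.

```r
# Hypothetical data: body mass in grams for 1,000 animals
set.seed(42)
mass_g <- rnorm(1000, mean = 4200, sd = 800)

# A small user-written function (hypothetical helper for this example)
g_to_kg <- function(x) {
  x / 1000
}

# 1. A "for" loop that fills a pre-allocated vector one element at a time
mass_kg_loop <- numeric(length(mass_g))
for (i in seq_along(mass_g)) {
  mass_kg_loop[i] <- g_to_kg(mass_g[i])
}

# 2. The apply family: vapply() calls the function on each element and
#    checks that every result is a single number
mass_kg_vapply <- vapply(mass_g, g_to_kg, numeric(1))

# 3. Vectorized arithmetic: the function works on the whole vector at once
mass_kg_vec <- g_to_kg(mass_g)

# All three approaches give the same answer
all.equal(mass_kg_loop, mass_kg_vapply)
all.equal(mass_kg_loop, mass_kg_vec)
```

All three versions return the same values; they differ in how much code they need and how long they take to run, which is the kind of trade-off the workshop examines.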
Who: The course is aimed at both R beginners and experienced analysts.
When: August 20, 2021 @ 11:30 EDT / 8:30 PDT / 13:00 NDT
Where: Virtual. https://utoronto.zoom.us/j/85097717914
Requirements: Participants should use a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) and administrative privileges. Two screens are ideal so that you can see the video and your workstation at the same time. However, hands-on participation is not necessary and you can simply follow along as we demonstrate.
Contact: Please contact alex.filazzola@outlook.com for more information.
Notes: Live Notepad
(time in PDT/EDT)
Time | Goal |
---|---|
8:30 / 11:30 | Introduction and set-up |
8:45 / 11:45 | for Loops |
9:15 / 12:15 | Vectorization |
10:00 / 13:00 | Break |
10:15 / 13:15 | Functions |
11:00 / 14:00 | Parallelization |
Past events: We have previously run this workshop at SORTEE 2021 and CSEE 2021.
R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.
Windows | Mac OS X | Linux |
---|---|---|
Install R by downloading and running this .exe file from CRAN. Please also install the RStudio IDE. | Install R by downloading and running this .pkg file from CRAN. Please also install the RStudio IDE. | You can download the binary files for your distribution from CRAN. Please also install the RStudio IDE. |
Packages we will be using: We recommend you install these ahead of time and ensure they load correctly to reduce troubleshooting in the workshop.
install.packages(c("here","microbenchmark", "tidyr","dplyr","magrittr","broom","foreach","doParallel","palmerpenguins"))
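One possible way to confirm that the packages load correctly after installing them is sketched below. It uses only base R; the variable names `pkgs` and `loaded` are illustrative and not part of the workshop materials.

```r
# Same package names as the install.packages() call above
pkgs <- c("here", "microbenchmark", "tidyr", "dplyr", "magrittr",
          "broom", "foreach", "doParallel", "palmerpenguins")

# requireNamespace() returns TRUE if a package can be loaded and FALSE
# otherwise, without stopping with an error
loaded <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of any packages that still need installing or troubleshooting
names(loaded)[!loaded]
```

If the last line returns an empty character vector, every package was found and loaded.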
If you enjoyed this workshop and are interested in learning more, I have also run workshops on Logistic Regression, an Introduction to Ecological Analyses, and an Introduction to Functions.
Software Carpentry offers workshops in a similar style that are usually longer and go into more detail. They have instructors available globally and cover many kinds of programming beyond R.
Center for Urban Environments | University of Toronto |
---|---|
Copyright © Alessandro Filazzola and Sophie Breitbart 2021