Syllabus

Course Description

This course is a gentle introduction to data wrangling and visualization using the tidyverse in R. It focuses on practical data science skills in R (loading data, data wrangling, visualization, automation, machine learning, and running statistical models) that you’ll use almost every day in your work. It is meant for both beginners and students who want to brush up on their R skills.

Credit Hours

3 credit hours.

Learning Objectives

  1. Understand and utilize R/RStudio.
  2. Understand basic data types and data structures in R.
  3. Become familiar with data files (Excel, comma-separated values) and load them into R/RStudio, with tips on formatting.
  4. Visualize datasets using ggplot2 and understand how to build basic plots using ggplot2 syntax.
  5. Filter and format data in R for use with various routines.
  6. Execute and interpret some basic statistics in R.
  7. Automate repetitive tasks in R, such as loading a folder of files.
  8. Execute basic machine learning workflows using tidymodels.
  9. Learn about Bioconductor data structures and conduct simple analyses with these structures.
  10. Your Choice: shiny, leaflet or tidytext.

Course Website

For access to the RStudio.cloud notebooks, please subscribe here: https://ready4r.netlify.app/mailing/

Answers to assignments will be provided when you subscribe.

Code of Conduct

This class is governed by the BioData Club Code of Conduct: https://biodata-club.github.io/code_of_conduct/

This class is meant to be a psychologically safe space where it’s ok to ask questions. We want to normalize your own curiosity and fuel your desire to learn more.

If you are disruptive to class learning or disparaging to other students, I may mute you for the day. I am very serious about this.

Required Texts or Readings

We will be drawing on the following online textbooks during class and labs. These books are online and free, though you can order them as textbooks if you prefer that format.

Getting Used to R, RStudio, and RMarkdown. Chester Ismay. https://ismayc.github.io/rbasics-book/

Introduction to Data Science. Tiffany Timbers, Trevor Campbell, Melissa Lee. https://ubc-dsci.github.io/introduction-to-datascience/

RMarkdown for Scientists. Nick Tierney. https://rmd4sci.njtierney.com/

R for Data Science. Garrett Grolemund and Hadley Wickham. https://r4ds.had.co.nz/

Words of Encouragement

This was adapted from Andrew Heiss. Thanks!

I promise you can succeed in this class.

Learning R can be difficult at first. It’s like learning a new language, just like Spanish, French, or Chinese. Hadley Wickham, the chief data scientist at RStudio and the author of some amazing R packages you’ll be using, like ggplot2, made this wise observation:

It’s easy when you start out programming to get really frustrated and think, “Oh it’s me, I’m really stupid,” or, “I’m not made out to program.” But, that is absolutely not the case. Everyone gets frustrated. I still get frustrated occasionally when writing R code. It’s just a natural part of programming. So, it happens to everyone and gets less and less over time. Don’t blame yourself. Just take a break, do something fun, and then come back and try again later.

Even experienced programmers find themselves bashing their heads against seemingly intractable errors. If you find yourself spending way too long hitting your head against a wall without making progress, take a break, talk to classmates, or e-mail me.

Allison Horst: Gator error

LeaRning is Social

The students who have a bad time in my workshops and courses are the ones who don’t work with each other to learn. We are a learning community, and we should help each other to learn.

If you understand something and someone is struggling with it, try to help them. If you are struggling, take a breath, and try to pinpoint what you are struggling with.

Our goal is to be better programmers each day, not to be the perfect programmer. There’s no such thing as a perfect programmer. I’ve been learning new things almost every day.