This course is a gentle introduction to data wrangling and visualization using the tidyverse in R. It focuses on practical data science skills (loading data, data wrangling, visualization, automation, machine learning, and running statistical models) that you’ll use almost every day in your work. It is meant for both beginners and students wanting to brush up on their R skills.
3 credit hours.
- Understand and utilize R/RStudio.
- Understand basic data types and data structures in R.
- Become familiar with common data file formats (Excel, comma-separated value files) and load them into R/RStudio, with tips on formatting.
- Visualize datasets using ggplot2 and understand how to build basic plots using ggplot2 syntax.
- Filter and format data in R for use with various routines.
- Execute and interpret some basic statistics in R.
- Automate repetitive tasks in R, such as loading a folder of files.
- Execute basic machine learning workflows using
- Learn about Bioconductor Data Structures and conduct simple analysis with these structures.
- Your Choice:
For access to the RStudio.cloud notebooks, please subscribe here: https://ready4r.netlify.app/mailing/
Answers to assignments will be provided when you subscribe.
Code of Conduct
This class is governed by the BioData Club Code of Conduct: https://biodata-club.github.io/code_of_conduct/
This class is meant to be a psychologically safe space where it’s ok to ask questions. We want to normalize your own curiosity and fuel your desire to learn more.
If you are disruptive to class learning or disparaging to other students, I may mute you for the day. I am very serious about this.
Required Texts or Readings
We will be drawing on the following online textbooks during class and labs. These books are online and free, though you can order them as textbooks if you prefer that format.
Getting Used to R, RStudio, and RMarkdown. Chester Ismay. https://ismayc.github.io/rbasics-book/
Introduction to Data Science. Tiffany Timbers, Trevor Campbell, Melissa Lee. https://ubc-dsci.github.io/introduction-to-datascience/
RMarkdown for Scientists. Nick Tierney. https://rmd4sci.njtierney.com/
R for Data Science. Garrett Grolemund and Hadley Wickham. https://r4ds.had.co.nz/
Words of Encouragement
This was adapted from Andrew Heiss. Thanks!
I promise you can succeed in this class.
Learning R can be difficult at first. It’s like learning a new language, just like Spanish, French, or Chinese. Hadley Wickham, the chief data scientist at RStudio and the author of some amazing R packages you’ll be using, like ggplot2, made this wise observation:
It’s easy when you start out programming to get really frustrated and think, “Oh it’s me, I’m really stupid,” or, “I’m not made out to program.” But, that is absolutely not the case. Everyone gets frustrated. I still get frustrated occasionally when writing R code. It’s just a natural part of programming. So, it happens to everyone and gets less and less over time. Don’t blame yourself. Just take a break, do something fun, and then come back and try again later.
Even experienced programmers find themselves bashing their heads against seemingly intractable errors. If you find yourself spending too long hitting your head against a wall without making progress, take a break, talk to classmates, or e-mail me.
Every reasonable effort has been made to protect the copyright requirements of materials used in this course. Class participants are warned not to copy, audiotape, or videotape course materials in violation of copyright laws.
Journal articles will be kept on reserve at the library or online for student access. Copyright law does allow for making one personal copy of each article from the original article. This limit also applies to electronic sources.
To comply with the fair use doctrine of US copyright law, Sakai course sites close three weeks after grades are posted with the Registrar. Please be sure to download all course material you wish to keep before this time, as you will have no further access to your courses.