Machine Learning in R

Machine Learning in R @ Cold Spring Harbor Laboratory, Summer 2018

Introductory words

This is an introduction to machine learning in R, in which you’ll learn the basics of unsupervised learning for pattern recognition and supervised learning for prediction. At the end of this workshop, we hope that you’ll

These pages do not cover the mathematical foundations of each section; the instructor will explain and elaborate on them at the whiteboard.

Acknowledgements

This material grew out of many conversations, workshops and online courses over the years, most notably my work at DataCamp. Some of it is similar to material that I developed for DataCamp’s Supervised Learning with scikit-learn course, on which I collaborated with Andreas Müller and Yashas Roy, and to community articles that I have written, such as “Kaggle Tutorial: EDA & Machine Learning” and “Experts’ Favorite Data Science Techniques”. Finally, I found the time to develop this material thanks to the 20% community time that I have at DataCamp, and I am indebted to them for this.

Prerequisites

Basic command of R

Schedule

       Setup                                     Download files required for the lesson
00:00  1. Loading and exploring data             What is Exploratory Data Analysis (EDA) and why is it useful?
                                                 How can I do EDA in R?
00:30  2. Unsupervised Learning                  What is principal component analysis (PCA)?
                                                 How can I perform PCA in R?
                                                 What is clustering?
01:30  3. Supervised Learning I: classification  How can I apply supervised learning to a data set?
02:00  4. Supervised Learning II: regression     What if the target variable is numerical rather than categorical?
02:40  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.