Launch Your Career in Data Science

A nine-course introduction to data science, developed and taught by leading professors.


About This Specialization

Ask the right questions, manipulate data sets, and create visualizations to communicate results.

This Specialization covers the concepts and tools you'll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you’ll apply the skills learned by building a data product using real-world data. At completion, students will have a portfolio demonstrating their mastery of the material.
10 courses
Follow the suggested order or choose your own.
Projects
Designed to help you practice and apply the skills you learn.
Certificates
Highlight your new skills on your resume or LinkedIn.
Courses
Beginner Specialization.
No prior experience required.


  1. COURSE 1

    The Data Scientist’s Toolbox

    Upcoming session: Oct 2
    Commitment
    1-4 hours/week
    Subtitles
    English, French, Chinese (Simplified), Greek, Italian, Portuguese (Brazilian), Vietnamese, Russian, Turkish, Hebrew

    About the Course

    In this course you will get an introduction to the main tools and ideas in the data scientist's toolbox. The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools that will be used in the program, such as version control, Markdown, Git, GitHub, R, and RStudio.

    Week 1
    During Week 1, you'll learn about the goals and objectives of the Data Science Specialization and each of its components. You'll also get an overview of the field as well as instructions on how to install R.
    Reading · Welcome to the Data Scientist's Toolbox
    Reading · Pre-Course Survey
    Reading · Syllabus
    Reading · Specialization Textbooks
    Video · Specialization Motivation
    Reading · The Elements of Data Analytic Style
    Video · The Data Scientist's Toolbox
    Video · Getting Help
    Video · Finding Answers
    Video · R Programming Overview
    Video · Getting Data Overview
    Video · Exploratory Data Analysis Overview
    Video · Reproducible Research Overview
    Video · Statistical Inference Overview
    Video · Regression Models Overview
    Video · Practical Machine Learning Overview
    Video · Building Data Products Overview
    Video · Installing R on Windows {Roger Peng}
    Video · Install R on a Mac {Roger Peng}
    Video · Installing RStudio {Roger Peng}
    Video · Installing Outside Software on Mac (OS X Mavericks)
    Quiz · Week 1 Quiz

    Week 2: Installing the Toolbox
    This is the most lecture-intensive week of the course. The primary goal is to get you set up with R, RStudio, GitHub, and the other tools we will use throughout the Data Science Specialization and your ongoing work as a data scientist.
    Video · Tips from Coursera Users - Optional Video
    Video · Command Line Interface
    Video · Introduction to Git
    Video · Introduction to GitHub
    Video · Creating a GitHub Repository
    Video · Basic Git Commands
    Video · Basic Markdown
    Video · Installing R Packages
    Video · Installing Rtools
    Quiz · Week 2 Quiz

    Week 3: Conceptual Issues
    The Week 3 lectures focus on conceptual issues behind study design and turning data into knowledge. If you have trouble or want to explore issues in more depth, please seek out answers on the forums. They are a great resource! If you happen to be a superstar who already gets it, please take the time to help your classmates by answering their questions as well. This is one of the best ways to practice using and explaining your skills to others. These are two of the key characteristics of excellent data scientists.
    Video · Types of Questions
    Video · What is Data?
    Video · What About Big Data?
    Video · Experimental Design
    Quiz · Week 3 Quiz

    Week 4: Course Project Submission & Evaluation
    In Week 4, we'll focus on the Course Project. This is your opportunity to install the tools and set up the accounts that you'll need for the rest of the specialization and for work in data science.
    Peer Review · Course Project
    Reading · Post-Course Survey

  2. COURSE 2

    R Programming

    Upcoming session: Oct 2
    Subtitles
    English, French, Japanese, Chinese (Simplified)

    About the Course

    In this course you will learn how to program in R and how to use R for effective data analysis. You will learn how to install and configure software necessary for a statistical programming environment and describe generic programming language concepts as they are implemented in a high-level statistical language. The course covers practical issues in statistical computing, including programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code. Topics in statistical data analysis will provide working examples.

    Week 1: Background, Getting Started, and Nuts & Bolts
    This week covers the basics to get you started with R. The Background Materials lesson contains information about course mechanics and some videos on installing R. The Week 1 videos cover the history of R and S, go over the basic data types in R, and describe the functions for reading and writing data. I recommend that you watch the videos in the listed order, but watching the videos out of order isn't going to ruin the story.
    Reading · Welcome to R Programming
    Reading · About the Instructor
    Reading · Pre-Course Survey
    Reading · Syllabus
    Reading · Course Textbook
    Reading · Course Supplement: The Art of Data Science
    Reading · Data Science Podcast: Not So Standard Deviations
    Video · Installing R on a Mac
    Video · Installing R on Windows
    Video · Installing RStudio (Mac)
    Video · Writing Code / Setting Your Working Directory (Windows)
    Video · Writing Code / Setting Your Working Directory (Mac)
    Reading · Getting Started and R Nuts and Bolts
    Video · Introduction
    Video · Overview and History of R
    Video · Getting Help
    Video · R Console Input and Evaluation
    Video · Data Types - R Objects and Attributes
    Video · Data Types - Vectors and Lists
    Video · Data Types - Matrices
    Video · Data Types - Factors
    Video · Data Types - Missing Values
    Video · Data Types - Data Frames
    Video · Data Types - Names Attribute
    Video · Data Types - Summary
    Video · Reading Tabular Data
    Video · Reading Large Tables
    Video · Textual Data Formats
    Video · Connections: Interfaces to the Outside World
    Video · Subsetting - Basics
    Video · Subsetting - Lists
    Video · Subsetting - Matrices
    Video · Subsetting - Partial Matching
    Video · Subsetting - Removing Missing Values
    Video · Vectorized Operations
    Quiz · Week 1 Quiz
    Video · Introduction to swirl
    Reading · Practical R Exercises in swirl Part 1
    Practice Programming Assignment · swirl Lesson 1: Basic Building Blocks
    Practice Programming Assignment · swirl Lesson 2: Workspace and Files
    Practice Programming Assignment · swirl Lesson 3: Sequences of Numbers
    Practice Programming Assignment · swirl Lesson 4: Vectors
    Practice Programming Assignment · swirl Lesson 5: Missing Values
    Practice Programming Assignment · swirl Lesson 6: Subsetting Vectors
    Practice Programming Assignment · swirl Lesson 7: Matrices and Data Frames

    Week 2: Programming with R
    Welcome to Week 2 of R Programming. This week, we take the gloves off, and the lectures cover key topics like control structures and functions. We also introduce the first programming assignment for the course, which is due at the end of the week.
    Reading · Week 2: Programming with R
    Video · Control Structures - Introduction
    Video · Control Structures - If-else
    Video · Control Structures - For loops
    Video · Control Structures - While loops
    Video · Control Structures - Repeat, Next, Break
    Video · Your First R Function
    Video · Functions (part 1)
    Video · Functions (part 2)
    Video · Scoping Rules - Symbol Binding
    Video · Scoping Rules - R Scoping Rules
    Video · Scoping Rules - Optimization Example (OPTIONAL)
    Video · Coding Standards
    Video · Dates and Times
    Reading · Practical R Exercises in swirl Part 2
    Practice Programming Assignment · swirl Lesson 1: Logic
    Practice Programming Assignment · swirl Lesson 2: Functions
    Practice Programming Assignment · swirl Lesson 3: Dates and Times
    Quiz · Week 2 Quiz
    Reading · Programming Assignment 1 INSTRUCTIONS: Air Pollution
    Quiz · Programming Assignment 1: Quiz

    Week 3: Loop Functions and Debugging
    We have now entered the third week of R Programming, which also marks the halfway point. The lectures this week cover loop functions and the debugging tools in R. These aspects of R make R useful for both interactive work and writing longer code, and so they are commonly used in practice.
    Reading · Week 3: Loop Functions and Debugging
    Video · Loop Functions - lapply
    Video · Loop Functions - apply
    Video · Loop Functions - mapply
    Video · Loop Functions - tapply
    Video · Loop Functions - split
    Video · Debugging Tools - Diagnosing the Problem
    Video · Debugging Tools - Basic Tools
    Video · Debugging Tools - Using the Tools
    Reading · Practical R Exercises in swirl Part 3
    Practice Programming Assignment · swirl Lesson 1: lapply and sapply
    Practice Programming Assignment · swirl Lesson 2: vapply and tapply
    Quiz · Week 3 Quiz
    Peer Review · Programming Assignment 2: Lexical Scoping

    Week 4: Simulation & Profiling
    This week covers how to simulate data in R, which serves as the basis for doing simulation studies. We also cover the profiler in R, which lets you collect detailed information on how your R functions are running and identify bottlenecks that can be addressed. The profiler is a key tool in helping you optimize your programs. Finally, we cover the str function, which I personally believe is the most useful function in R.
    Reading · Week 4: Simulation & Profiling
    Video · The str Function
    Video · Simulation - Generating Random Numbers
    Video · Simulation - Simulating a Linear Model
    Video · Simulation - Random Sampling
    Video · R Profiler (part 1)
    Video · R Profiler (part 2)
    Quiz · Week 4 Quiz
    Reading · Practical R Exercises in swirl Part 4
    Practice Programming Assignment · swirl Lesson 1: Looking at Data
    Practice Programming Assignment · swirl Lesson 2: Simulation
    Practice Programming Assignment · swirl Lesson 3: Base Graphics
    Reading · Programming Assignment 3 INSTRUCTIONS: Hospital Quality
    Quiz · Programming Assignment 3: Quiz
    Reading · Post-Course Survey

  3. COURSE 3

    Getting and Cleaning Data

    Upcoming session: Oct 2
    Subtitles
    English, Russian, French, Chinese (Simplified)

    About the Course

    Before you can work with data, you have to get some. This course covers the basic ways to obtain data: from the web, from APIs, from databases, and from colleagues in various formats. It also covers the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course also covers the components of a complete data set, including raw data, processing instructions, codebooks, and processed data. In short, it covers the basics needed for collecting, cleaning, and sharing data.

    Week 1
    In this first week of the course, we look at finding data and reading different file types.
    Reading · Welcome to Week 1
    Reading · Syllabus
    Reading · Pre-Course Survey
    Video · Obtaining Data Motivation
    Video · Raw and Processed Data
    Video · Components of Tidy Data
    Video · Downloading Files
    Video · Reading Local Files
    Video · Reading Excel Files
    Video · Reading XML
    Video · Reading JSON
    Video · The data.table Package
    Reading · Practical R Exercises in swirl Part 1
    Quiz · Week 1 Quiz

    Week 2
    Welcome to Week 2 of Getting and Cleaning Data! The primary goal is to introduce you to the most common data storage systems and the appropriate tools to extract data from the web or from databases like MySQL.
    Video · Reading from MySQL
    Video · Reading from HDF5
    Video · Reading from The Web
    Video · Reading From APIs
    Video · Reading From Other Sources
    Quiz · Week 2 Quiz

    Week 3
    Welcome to Week 3 of Getting and Cleaning Data! This week the lectures will focus on organizing, merging and managing the data you have collected using the lectures from Weeks 1 and 2.
    Video · Subsetting and Sorting
    Video · Summarizing Data
    Video · Creating New Variables
    Video · Reshaping Data
    Video · Managing Data Frames with dplyr - Introduction
    Video · Managing Data Frames with dplyr - Basic Tools
    Video · Merging Data
    Reading · Practical R Exercises in swirl Part 2
    Practice Programming Assignment · swirl Lesson 1: Manipulating Data with dplyr
    Practice Programming Assignment · swirl Lesson 2: Grouping and Chaining with dplyr
    Practice Programming Assignment · swirl Lesson 3: Tidying Data with tidyr
    Quiz · Week 3 Quiz

    Week 4
    Welcome to Week 4 of Getting and Cleaning Data! This week we finish up with lectures on text and date manipulation in R. In this final week we will also focus on peer grading of Course Projects.
    Video · Editing Text Variables
    Video · Regular Expressions I
    Video · Regular Expressions II
    Video · Working with Dates
    Video · Data Resources
    Reading · Practical R Exercises in swirl Part 4
    Practice Programming Assignment · swirl Lesson 1: Dates and Times with lubridate
    Quiz · Week 4 Quiz
    Peer Review · Getting and Cleaning Data Course Project
    Reading · Post-Course Survey

  4. COURSE 4

    Exploratory Data Analysis

    Upcoming session: Oct 2
    Subtitles
    English, Chinese (Simplified)

    About the Course

    This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.

    Week 1
    This week covers the basics of analytic graphics and the base plotting system in R. We've also included some background material to help you install R if you haven't done so already.
    Reading · Welcome to Exploratory Data Analysis
    Reading · Syllabus
    Reading · Pre-Course Survey
    Video · Introduction
    Reading · Exploratory Data Analysis with R Book
    Reading · The Art of Data Science
    Video · Installing R on Windows (3.2.1)
    Video · Installing R on a Mac (3.2.1)
    Video · Installing RStudio (Mac)
    Video · Setting Your Working Directory (Windows)
    Video · Setting Your Working Directory (Mac)
    Video · Principles of Analytic Graphics
    Video · Exploratory Graphs (part 1)
    Video · Exploratory Graphs (part 2)
    Video · Plotting Systems in R
    Video · Base Plotting System (part 1)
    Video · Base Plotting System (part 2)
    Video · Base Plotting Demonstration
    Video · Graphics Devices in R (part 1)
    Video · Graphics Devices in R (part 2)
    Reading · Practical R Exercises in swirl Part 1
    Practice Programming Assignment · swirl Lesson 1: Principles of Analytic Graphs
    Practice Programming Assignment · swirl Lesson 2: Exploratory Graphs
    Practice Programming Assignment · swirl Lesson 3: Graphics Devices in R
    Practice Programming Assignment · swirl Lesson 4: Plotting Systems
    Practice Programming Assignment · swirl Lesson 5: Base Plotting System
    Quiz · Week 1 Quiz
    Peer Review · Course Project 1

    Week 2
    Welcome to Week 2 of Exploratory Data Analysis. This week covers some of the more advanced graphing systems available in R: the Lattice system and the ggplot2 system. While the base graphics system provides many important tools for visualizing data, it was part of the original R system and lacks many features that may be desirable in a plotting system, particularly when visualizing high-dimensional data. The Lattice and ggplot2 systems also simplify the layout of plots, making it a much less tedious process.
    Video · Lattice Plotting System (part 1)
    Video · Lattice Plotting System (part 2)
    Video · ggplot2 (part 1)
    Video · ggplot2 (part 2)
    Video · ggplot2 (part 3)
    Video · ggplot2 (part 4)
    Video · ggplot2 (part 5)
    Reading · Practical R Exercises in swirl Part 2
    Practice Programming Assignment · swirl Lesson 1: Lattice Plotting System
    Practice Programming Assignment · swirl Lesson 2: Working with Colors
    Practice Programming Assignment · swirl Lesson 3: GGPlot2 Part1
    Practice Programming Assignment · swirl Lesson 4: GGPlot2 Part2
    Practice Programming Assignment · swirl Lesson 5: GGPlot2 Extras
    Quiz · Week 2 Quiz

    Week 3
    Welcome to Week 3 of Exploratory Data Analysis. This week covers some of the workhorse statistical methods for exploratory analysis. These methods include clustering and dimension reduction techniques that allow you to make graphical displays of very high-dimensional data (many, many variables). We also cover novel ways to specify colors in R so that you can use color as an important and useful dimension when making data graphics. All of this material is covered in chapters 9-12 of my book Exploratory Data Analysis with R.
    Video · Hierarchical Clustering (part 1)
    Video · Hierarchical Clustering (part 2)
    Video · Hierarchical Clustering (part 3)
    Video · K-Means Clustering (part 1)
    Video · K-Means Clustering (part 2)
    Video · Dimension Reduction (part 1)
    Video · Dimension Reduction (part 2)
    Video · Dimension Reduction (part 3)
    Video · Working with Color in R Plots (part 1)
    Video · Working with Color in R Plots (part 2)
    Video · Working with Color in R Plots (part 3)
    Video · Working with Color in R Plots (part 4)
    Reading · Practical R Exercises in swirl Part 3
    Practice Programming Assignment · swirl Lesson 1: Hierarchical Clustering
    Practice Programming Assignment · swirl Lesson 2: K Means Clustering
    Practice Programming Assignment · swirl Lesson 3: Dimension Reduction
    Practice Programming Assignment · swirl Lesson 4: Clustering Example

    Week 4
    This week, we'll look at two case studies in exploratory data analysis. The first involves the use of cluster analysis techniques, and the second is a more involved analysis of some air pollution data. How one goes about doing EDA is often personal, but I'm providing these videos to give you a sense of how you might proceed with a specific type of dataset.
    Video · Clustering Case Study
    Video · Air Pollution Case Study
    Reading · Practical R Exercises in swirl Part 4
    Practice Programming Assignment · swirl Lesson 1: CaseStudy
    Peer Review · Course Project 2
    Reading · Post-Course Survey

  5. COURSE 5

    Reproducible Research

    Upcoming session: Oct 2
    Commitment
    4-9 hours/week
    Subtitles
    English

    About the Course

    This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary. In addition, reproducibility makes an analysis more useful to others because the data and code that actually conducted the analysis are available. This course will focus on literate statistical analysis tools, which allow one to publish data analyses in a single document that others can easily execute to obtain the same results.

    Week 1: Concepts, Ideas, & Structure
    This week will cover the basic ideas of reproducible research since they may be unfamiliar to some of you. We also cover structuring and organizing a data analysis to help make it more reproducible. I recommend that you watch the videos in the order that they are listed on the web page, but watching the videos out of order isn't going to ruin the story.
    Video · Introduction
    Reading · Syllabus
    Reading · Pre-course survey
    Reading · Course Book: Report Writing for Data Science in R
    Video · What is Reproducible Research About?
    Video · Reproducible Research: Concepts and Ideas (part 1)
    Video · Reproducible Research: Concepts and Ideas (part 2)
    Video · Reproducible Research: Concepts and Ideas (part 3)
    Video · Scripting Your Analysis
    Video · Structure of a Data Analysis (part 1)
    Video · Structure of a Data Analysis (part 2)
    Video · Organizing Your Analysis
    Quiz · Week 1 Quiz

    Week 2: Markdown & knitr
    This week we cover some of the core tools for developing reproducible documents. We cover the literate programming tool knitr and show how to integrate it with Markdown to publish reproducible web documents. We also introduce the first peer assessment which will require you to write up a reproducible data analysis using knitr.
    Video · Coding Standards in R
    Video · Markdown
    Video · R Markdown
    Video · R Markdown Demonstration
    Video · knitr (part 1)
    Video · knitr (part 2)
    Video · knitr (part 3)
    Video · knitr (part 4)
    Quiz · Week 2 Quiz
    Video · Introduction to Course Project 1
    Peer Review · Course Project 1

    Week 3: Reproducible Research Checklist & Evidence-based Data Analysis
    This week covers what one could call a basic checklist for ensuring that a data analysis is reproducible. While it's not absolutely sufficient to follow the checklist, it provides a necessary minimum standard that would be applicable to almost any area of analysis.
    Video · Communicating Results
    Video · RPubs
    Video · Reproducible Research Checklist (part 1)
    Video · Reproducible Research Checklist (part 2)
    Video · Reproducible Research Checklist (part 3)
    Video · Evidence-based Data Analysis (part 1)
    Video · Evidence-based Data Analysis (part 2)
    Video · Evidence-based Data Analysis (part 3)
    Video · Evidence-based Data Analysis (part 4)
    Video · Evidence-based Data Analysis (part 5)

    Week 4: Case Studies & Commentaries
    This week there are two case studies involving the importance of reproducibility in science for you to watch.
    Video · Caching Computations
    Video · Case Study: Air Pollution
    Video · Case Study: High Throughput Biology
    Video · Commentaries on Data Analysis
    Video · Introduction to Peer Assessment 2
    Peer Review · Course Project 2
    Reading · Post-Course Survey

  6. COURSE 6

    Statistical Inference

    Upcoming session: Oct 2
    Subtitles
    English

    About the Course

    Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference, including statistical modeling, data-oriented strategies, and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentist, Bayesian, likelihood, design-based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference. A practitioner can often be left in a debilitating maze of techniques, philosophies, and nuance. This course presents the fundamentals of inference in a practical approach for getting things done. After taking this course, students will understand the broad directions of statistical inference and use this information for making informed choices in analyzing data.

    Week 1: Probability & Expected Values
    This week, we'll focus on the fundamentals including probability, random variables, expectations and more.
    Video · Introductory video
    Reading · Welcome to Statistical Inference
    Reading · Some introductory comments
    Reading · Pre-Course Survey
    Reading · Syllabus
    Reading · Course Book: Statistical Inference for Data Science
    Reading · Data Science Specialization Community Site
    Reading · Homework Problems
    Reading · Probability
    Video · 02 01 Introduction to probability
    Video · 02 02 Probability mass functions
    Video · 02 03 Probability density functions
    Reading · Conditional probability
    Video · 03 01 Conditional Probability
    Video · 03 02 Bayes' rule
    Video · 03 03 Independence
    Reading · Expected values
    Video · 04 01 Expected values
    Video · 04 02 Expected values, simple examples
    Video · 04 03 Expected values for PDFs
    Reading · Practical R Exercises in swirl 1
    Practice Programming Assignment · swirl Lesson 1: Introduction
    Practice Programming Assignment · swirl Lesson 2: Probability1
    Practice Programming Assignment · swirl Lesson 3: Probability2
    Practice Programming Assignment · swirl Lesson 4: ConditionalProbability
    Practice Programming Assignment · swirl Lesson 5: Expectations
    Quiz · Quiz 1

    Week 2: Variability, Distribution, & Asymptotics
    We're going to tackle variability, distributions, limits, and confidence intervals.
    Reading · Variability
    Video · 05 01 Introduction to variability
    Video · 05 02 Variance simulation examples
    Video · 05 03 Standard error of the mean
    Video · 05 04 Variance data example
    Reading · Distributions
    Video · 06 01 Binomial distribution
    Video · 06 02 Normal distribution
    Video · 06 03 Poisson
    Reading · Asymptotics
    Video · 07 01 Asymptotics and LLN
    Video · 07 02 Asymptotics and the CLT
    Video · 07 03 Asymptotics and confidence intervals
    Reading · Practical R Exercises in swirl Part 2
    Practice Programming Assignment · swirl Lesson 1: Variance
    Practice Programming Assignment · swirl Lesson 2: CommonDistros
    Practice Programming Assignment · swirl Lesson 3: Asymptotics
    Quiz · Quiz 2

    Week 3: Intervals, Testing, & P-values
    We will be taking a look at intervals, testing, and p-values in this lesson.
    Reading · Confidence intervals
    Video · 08 01 T confidence intervals
    Video · 08 02 T confidence intervals example
    Video · 08 03 Independent group T intervals
    Video · 08 04 A note on unequal variance
    Reading · Hypothesis testing
    Video · 09 01 Hypothesis testing
    Video · 09 02 Example of choosing a rejection region
    Video · 09 03 T tests
    Video · 09 04 Two group testing
    Reading · P-values
    Video · 10 01 P-values
    Video · 10 02 P-value further examples
    Reading · Knitr
    Video · Just enough knitr to do the project
    Reading · Practical R Exercises in swirl Part 3
    Practice Programming Assignment · swirl Lesson 1: T Confidence Intervals
    Practice Programming Assignment · swirl Lesson 2: Hypothesis Testing
    Practice Programming Assignment · swirl Lesson 3: P Values
    Quiz · Quiz 3

    Week 4: Power, Bootstrapping, & Permutation Tests
    We will begin looking into power, bootstrapping, and permutation tests.
    Reading · Power
    Video · 11 01 Power
    Video · 11 02 Calculating Power
    Video · 11 03 Notes on power
    Video · 11 04 T test power
    Video · 12 01 Multiple Comparisons
    Reading · Resampling
    Video · 13 01 Bootstrapping
    Video · 13 02 Bootstrapping example
    Video · 13 03 Notes on the bootstrap
    Video · 13 04 Permutation tests
    Quiz · Quiz 4
    Peer Review · Statistical Inference Course Project
    Reading · Practical R Exercises in swirl Part 4
    Practice Programming Assignment · swirl Lesson 1: Power
    Practice Programming Assignment · swirl Lesson 2: Multiple Testing
    Practice Programming Assignment · swirl Lesson 3: Resampling
    Reading · Post-Course Survey

  7. COURSE 7

    Regression Models

    Upcoming session: Oct 2
    Subtitles
    English

    About the Course

    Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares, and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA, will be covered as well. Analysis of residuals and variability will be investigated. The course will cover modern thinking on model selection and novel uses of regression models, including scatterplot smoothing.

    Week 1: Least Squares and Linear Regression
    This week, we focus on least squares and linear regression.
    Reading · Welcome to Regression Models
    Reading · Book: Regression Models for Data Science in R
    Reading · Syllabus
    Reading · Pre-Course Survey
    Reading · Data Science Specialization Community Site
    Reading · Where to get more advanced material
    Reading · Regression
    Video · Introduction to Regression
    Video · Introduction: Basic Least Squares
    Reading · Technical details
    Video · Technical Details (Skip if you'd like)
    Video · Introductory Data Example
    Reading · Least squares
    Video · Notation and Background
    Video · Linear Least Squares
    Video · Linear Least Squares Coding Example
    Video · Technical Details (Skip if you'd like)
    Reading · Regression to the mean
    Video · Regression to the Mean
    Reading · Practical R Exercises in swirl Part 1
    Practice Programming Assignment · swirl Lesson 1: Introduction
    Practice Programming Assignment · swirl Lesson 2: Residuals
    Practice Programming Assignment · swirl Lesson 3: Least Squares Estimation
    Quiz · Quiz 1

    Week 2: Linear Regression & Multivariable Regression
    This week, we will work through the remainder of linear regression and then turn to the first part of multivariable regression.
    Reading · *Statistical* linear regression models
    Video · Statistical Linear Regression Models
    Video · Interpreting Coefficients
    Video · Linear Regression for Prediction
    Reading · Residuals
    Video · Residuals
    Video · Residuals, Coding Example
    Video · Residual Variance
    Reading · Inference in regression
    Video · Inference in Regression
    Video · Coding Example
    Video · Prediction
    Reading · Looking ahead to the project
    Video · Really, really quick intro to knitr
    Reading · Practical R Exercises in swirl Part 2
    Practice Programming Assignment · swirl Lesson 1: Residual Variation
    Practice Programming Assignment · swirl Lesson 2: Introduction to Multivariable Regression
    Practice Programming Assignment · swirl Lesson 3: MultiVar Examples
    Quiz · Quiz 2

    WEEK 3
    Week 3: Multivariable Regression, Residuals, & Diagnostics
    This week, we'll build on last week's introduction to multivariable regression with some examples and then cover residuals, diagnostics, variance inflation, and model comparison.
    Reading · Multivariable regression
    Video · Multivariable Regression part I
    Video · Multivariable Regression part II
    Video · Multivariable Regression Continued
    Video · Multivariable Regression Examples part I
    Video · Multivariable Regression Examples part II
    Video · Multivariable Regression Examples part III
    Video · Multivariable Regression Examples part IV
    Reading · Adjustment
    Video · Adjustment Examples
    Reading · Residuals
    Video · Residuals and Diagnostics part I
    Video · Residuals and Diagnostics part II
    Video · Residuals and Diagnostics part III
    Reading · Model selection
    Video · Model Selection part I
    Video · Model Selection part II
    Video · Model Selection part III
    Reading · Practical R Exercises in swirl Part 3
    Practice Programming Assignment · swirl Lesson 1: MultiVar Examples2
    Practice Programming Assignment · swirl Lesson 2: MultiVar Examples3
    Practice Programming Assignment · swirl Lesson 3: Residuals Diagnostics and Variation
    Quiz · Quiz 3
    Practice Quiz · (OPTIONAL) Regression practice

    WEEK 4
    Week 4: Logistic Regression and Poisson Regression
    This week, we will work on generalized linear models, including binary outcomes and Poisson regression.
    Reading · GLMs
    Video · GLMs
    Reading · Logistic regression
    Video · Logistic Regression part I
    Video · Logistic Regression part II
    Video · Logistic Regression part III
    Reading · Count Data
    Video · Poisson Regression part I
    Video · Poisson Regression part II
    Reading · Mishmash
    Video · Hodgepodge
    Reading · Practical R Exercises in swirl Part 4
    Practice Programming Assignment · swirl Lesson 1: Variance Inflation Factors
    Practice Programming Assignment · swirl Lesson 2: Overfitting and Underfitting
    Practice Programming Assignment · swirl Lesson 3: Binary Outcomes
    Practice Programming Assignment · swirl Lesson 4: Count Outcomes
    Quiz · Quiz 4
    Peer Review · Regression Models Course Project
    Reading · Post-Course Survey

  8. COURSE 8

    Practical Machine Learning

    Upcoming session: Oct 2
    Subtitles
    English

    About the Course

    One of the most common tasks performed by data scientists and data analysts is prediction and machine learning. This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will provide basic grounding in concepts such as training and test sets, overfitting, and error rates. The course will also introduce a range of model-based and algorithmic machine learning methods including regression, classification trees, Naive Bayes, and random forests. The course will cover the complete process of building prediction functions including data collection, feature creation, algorithms, and evaluation.

    WEEK 1
    Week 1: Prediction, Errors, and Cross Validation
    This week will cover prediction, relative importance of steps, errors, and cross validation.
    Reading · Welcome to Practical Machine Learning
    Reading · Syllabus
    Reading · Pre-Course Survey
    Video · Prediction motivation
    Video · What is prediction?
    Video · Relative importance of steps
    Video · In and out of sample errors
    Video · Prediction study design
    Video · Types of errors
    Video · Receiver Operating Characteristic
    Video · Cross validation
    Video · What data should you use?
    Quiz · Quiz 1

    WEEK 2
    Week 2: The Caret Package
    This week will introduce the caret package and its tools for creating features and preprocessing.
    Video · Caret package
    Video · Data slicing
    Video · Training options
    Video · Plotting predictors
    Video · Basic preprocessing
    Video · Covariate creation
    Video · Preprocessing with principal components analysis
    Video · Predicting with Regression
    Video · Predicting with Regression Multiple Covariates
    Quiz · Quiz 2

    WEEK 3
    Week 3: Predicting with trees, Random Forests, & Model Based Predictions
    This week we introduce a number of machine learning algorithms you can use to complete your course project.
    Video · Predicting with trees
    Video · Bagging
    Video · Random Forests
    Video · Boosting
    Video · Model Based Prediction
    Quiz · Quiz 3

    WEEK 4
    Week 4: Regularized Regression and Combining Predictors
    This week, we will cover regularized regression and combining predictors.
    Video · Regularized regression
    Video · Combining predictors
    Video · Forecasting
    Video · Unsupervised Prediction
    Quiz · Quiz 4
    Reading · Course Project Instructions (READ FIRST)
    Peer Review · Prediction Assignment Writeup
    Quiz · Course Project Prediction Quiz
    Reading · Post-Course Survey

  9. COURSE 9

    Developing Data Products

    Upcoming session: Oct 2
    Subtitles
    English

    About the Course

    A data product is the production output from a statistical analysis. Data products automate complex analysis tasks or use technology to expand the utility of a data-informed model, algorithm, or inference. This course covers the basics of creating data products using Shiny, R packages, and interactive graphics. The course will focus on the statistical fundamentals of creating a data product that can be used to tell a story about data to a mass audience.

    WEEK 1
    Course Overview
    In this overview module, we'll go over some information and resources to help you get started and succeed in the course.
    Video · Welcome to Developing Data Products
    Reading · Syllabus
    Reading · Welcome
    Reading · Book: Developing Data Products in R
    Reading · Community Site
    Reading · R and RStudio Links & Tutorials

    Shiny, GoogleVis, and Plotly
    Now we can turn to the first substantive lessons. In this module, you'll learn how to develop basic applications and interactive graphics in Shiny, compose interactive HTML graphics with GoogleVis, and prepare data visualizations with Plotly.
    Reading · Shiny
    Reading · Shinyapps.io Project
    Video · Shiny 1.1
    Video · Shiny 1.2
    Video · Shiny 1.3
    Video · Shiny 1.4
    Video · Shiny 1.5
    Video · Shiny 2.1
    Video · Shiny 2.2
    Video · Shiny 2.3
    Video · Shiny 2.4
    Video · Shiny 2.5
    Video · Shiny 2.6
    Video · Shiny Gadgets 1.1
    Video · Shiny Gadgets 1.2
    Video · Shiny Gadgets 1.3
    Video · GoogleVis 1.1
    Video · GoogleVis 1.2
    Video · Plotly 1.1
    Video · Plotly 1.2
    Video · Plotly 1.3
    Video · Plotly 1.4
    Video · Plotly 1.5
    Video · Plotly 1.6
    Video · Plotly 1.7
    Video · Plotly 1.8
    Quiz · Quiz 1

    WEEK 2
    R Markdown and Leaflet
    During this module, we'll learn how to create R Markdown files and embed R code in an Rmd. We'll also explore Leaflet and use it to create interactive annotated maps.
    Video · R Markdown 1.1
    Video · R Markdown 1.2
    Video · R Markdown 1.3
    Video · R Markdown 1.4
    Video · R Markdown 1.5
    Video · R Markdown 1.6
    Reading · Three Ways to Share R Markdown Products
    Video · Leaflet 1.1
    Video · Leaflet 1.2
    Video · Leaflet 1.3
    Video · Leaflet 1.4
    Video · Leaflet 1.5
    Video · Leaflet 1.6
    Quiz · Quiz 2
    Peer Review · R Markdown and Leaflet

    WEEK 3
    R Packages
    In this module, we'll dive into the world of creating R packages and practice developing an R Markdown presentation that includes a data visualization built using Plotly.
    Reading · R Packages
    Video · R Packages (Part 1)
    Video · R Packages (Part 2)
    Video · Building R Packages Demo
    Video · R Classes and Methods (Part 1)
    Video · R Classes and Methods (Part 2)
    Quiz · Quiz 3
    Peer Review · R Markdown Presentation & Plotly

    WEEK 4
    Swirl and Course Project
    Week 4 is all about the Course Project, producing a Shiny Application and reproducible pitch.
    Video · Swirl 1.1
    Video · Swirl 1.2
    Video · Swirl 1.3
    Peer Review · Course Project: Shiny Application and Reproducible Pitch
    Reading · Post-Course Survey

  10. COURSE 10

    Data Science Capstone

    Upcoming session: Oct 16
    Commitment
    4-9 hours/week
    Subtitles
    English

    About the Capstone Project

    The capstone project class will allow students to create a usable, public data product that they can use to show their skills to potential employers. Projects will be drawn from real-world problems and will be conducted with industry, government, and academic partners.

    WEEK 1
    Overview, Understanding the Problem, and Getting the Data
    This week, we introduce the project so you can get a clear grip on the problem at hand and begin working with the dataset.
    Video · Welcome to the Capstone Project
    Reading · Project Overview
    Video · Welcome from SwiftKey
    Video · You Are a Data Scientist Now
    Reading · Syllabus
    Video · Introduction to Task 0: Understanding the Problem
    Reading · Task 0 - Understanding the problem
    Reading · About the Corpora
    Video · Introduction to Task 1: Getting and Cleaning the Data
    Reading · Task 1 - Getting and cleaning the data
    Video · Regular Expressions: Part 1 (Optional)
    Video · Regular Expressions: Part 2 (Optional)
    Quiz · Quiz 1: Getting Started

    WEEK 2
    Exploratory Data Analysis and Modeling
    This week, we move on to the next tasks, exploratory data analysis and modeling. You'll also submit your milestone report and review submissions from your classmates.
    Video · Introduction to Task 2: Exploratory Data Analysis
    Reading · Task 2 - Exploratory Data Analysis
    Video · Introduction to Task 3: Modeling
    Reading · Task 3 - Modeling
    Peer Review · Milestone Report

    WEEK 3
    Prediction Model
    This week, you'll build and evaluate your prediction model. The goal is to make your model efficient and accurate.
    Video · Introduction to Task 4: Prediction Model
    Reading · Task 4 - Prediction Model
    Quiz · Quiz 2: Natural language processing I

    WEEK 4
    Creative Exploration
    This week's goal is to improve the predictive accuracy while reducing computational runtime and model complexity.
    Video · Introduction to Task 5: Creative Exploration
    Reading · Task 5 - Creative Exploration
    Quiz · Quiz 3: Natural language processing II

    WEEK 5
    Data Product
    This week, you'll work on developing the first component of your final project, your data product.
    Video · Introduction to Task 6: Data Product
    Reading · Task 6 - Data Product

    WEEK 6
    Slide Deck
    This week, you'll work on developing the second component of your final project, a slide deck to accompany your data product.
    Video · Introduction to Task 7: Slide Deck
    Reading · Task 7 - Slide Deck

    WEEK 7
    Final Project Submission and Evaluation
    This week, you'll submit your final project and review the work of your classmates.
    Peer Review · Final Project Submission
    Video · Congratulations!

Creators

Johns Hopkins University is recognized as a destination for excellent, ambitious scholars and a world leader in teaching and research. The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for life-long learning, to foster independent and original research, and to bring the benefits of discovery to the world.

