O.P. Jindal Global University
Statistical Methods and Data Analysis

Instructor: Subhasish Ray

Included with Coursera Plus

Gain insight into a topic and learn the fundamentals.
2 weeks to complete at 10 hours a week
Flexible schedule: learn at your own pace
Build toward a degree

What you'll learn

  • Explain the basic statistical reasoning involved in data analysis.

  • Explain the applications of data analysis with examples from published research.

  • Execute data analysis projects using R.

Details to know

Shareable certificate

Add to your LinkedIn profile

Recently updated!

September 2025

Assessments

29 assignments

Taught in English

There are 8 modules in this course

Statistical methods, by definition, are tools for identifying patterns in large datasets. This module takes the first step towards statistical analysis by exploring strategies for visualizing data, an increasingly important skill in today’s era of big data. It explains the different forms of data and the types of plots and charts used to depict each form. It also covers visualization techniques appropriate for big data.

What's included

10 videos, 3 readings, 4 assignments, 1 plugin

While data visualization gives us a ‘first cut’ at the empirical world, knowing what the data ‘looks like’ will not take us far towards identifying relationships between variables, the focal point of policymaking. At a minimum, identification requires that the researcher be able to summarize large amounts of information in the form of descriptive statistics. This module explains the measures of central tendency and dispersion for ungrouped and grouped data. For ungrouped data, these include the mean, median, mode, standard deviation, skewness, and kurtosis. For grouped data, they include the grouped mean, grouped median, grouped mode, and grouped standard deviation.

What's included

8 videos, 1 reading, 3 assignments
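The descriptive statistics named in this module can be illustrated with a minimal sketch. The course itself works in R; Python is used here purely for illustration, and the sample values, the class intervals, and the helper name `grouped_mean` are hypothetical, not course material.

```python
import statistics

# Hypothetical ungrouped sample (illustrative values, not course data).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)        # arithmetic mean
median = statistics.median(scores)    # middle value of the sorted data
mode = statistics.mode(scores)        # most frequent value
pop_sd = statistics.pstdev(scores)    # population standard deviation

# Hypothetical grouped data: (class lower bound, class upper bound, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 7)]


def grouped_mean(groups):
    """Approximate the mean by weighting each class midpoint by its frequency."""
    total = sum(freq for _, _, freq in groups)
    weighted = sum(((lo + hi) / 2) * freq for lo, hi, freq in groups)
    return weighted / total


print(mean, median, mode, pop_sd, grouped_mean(classes))
```

The grouped mean is only an approximation, because it treats every observation in a class as if it sat at the class midpoint.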

Except in the rarest of cases when data on the entire population is available for all attributes of interest to the researcher, social scientists must draw inferences about a population from a sample drawn from that population. This module focuses on the statistical reasoning involved in studying the uncertainty attached to sample statistics. To support inference about a population from a sample, the module explains the fundamentals of probability theory. It also explains the concepts of random variables and functions of random variables. Finally, the module covers the concepts and applications of the binomial and normal distributions.

What's included

9 videos, 1 reading, 5 assignments

This module discusses the strategies available to researchers for drawing samples from a population and the first principles involved in determining sample size. It also explains how to measure the accuracy of sample estimates. Finally, the module focuses on statistical inference, whose goal is to make a statement about something that is not observed based on something that is observed, within a certain level of uncertainty. The module discusses the Central Limit Theorem (CLT) and the concept of the confidence interval, which allow us to make such statements.

What's included

6 videos, 1 reading, 4 assignments
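As a rough sketch of how the CLT leads to a confidence interval for a population mean, the example below uses the normal approximation. It is written in Python for illustration (the course uses R); the function name `mean_confidence_interval`, the sample values, and the choice of z = 1.96 for an approximately 95% interval are assumptions, not taken from the course.

```python
import math
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Normal-approximation confidence interval for the population mean.

    z=1.96 gives roughly 95% coverage when the CLT applies; for small
    samples a t critical value would usually be more appropriate.
    """
    n = len(sample)
    sample_mean = statistics.mean(sample)
    # Standard error: sample standard deviation divided by sqrt(n).
    std_error = statistics.stdev(sample) / math.sqrt(n)
    margin = z * std_error
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample of 36 observations (illustrative values only).
sample = [10, 12, 14] * 12
low, high = mean_confidence_interval(sample)
print(low, high)
```

The interval is centered on the sample mean and narrows as the sample size grows, since the standard error shrinks in proportion to the square root of n.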

This module introduces the critical distinction between experimental data and observational data. It also explores statistical inference in the context of experimental data using tests of significance. You will learn about observational data and the problem of confounding, controlled experiments, and natural experiments. The module focuses on the concepts and methods for analyzing statistical significance, including the analytical framework, the one-sample t-test, the two-sample t-test, and ANOVA.

What's included

8 videos, 4 readings, 3 assignments
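To make the two-sample t-test concrete, here is a minimal sketch of the test statistic. It uses Welch's unequal-variance form, which is an assumption on our part; the course may instead present the pooled-variance version. Python is used for illustration only (the course uses R), and the function name `welch_t_statistic` and both groups of values are hypothetical.

```python
import math
import statistics


def welch_t_statistic(a, b):
    """Two-sample t statistic without assuming equal variances (Welch)."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Sample variances (n - 1 in the denominator).
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Standard error of the difference in means.
    std_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean_a - mean_b) / std_error


# Hypothetical outcomes for a treated and a control group.
treated = [5.1, 5.8, 6.2, 5.9, 6.4, 6.0]
control = [4.8, 5.0, 5.3, 4.9, 5.2, 5.1]

t_stat = welch_t_statistic(treated, control)
print(t_stat)
```

The statistic only measures how many standard errors separate the two sample means; turning it into a p-value additionally requires the t distribution with the appropriate degrees of freedom.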

This module introduces the foundational model for statistical inference with observational data, namely ordinary least squares (OLS) regression, paying particular attention to the conditions under which the OLS estimator is the best linear unbiased estimator (BLUE). You will learn about the concept of association, which describes the relationship between two variables, and about the measures of association appropriate for each variable type: the lambda coefficient for nominal variables, the gamma coefficient for ordinal variables, and the correlation coefficient for interval-ratio variables. Finally, the module covers regression analysis, explaining bivariate and multivariate OLS.

What's included

8 videos, 2 readings, 4 assignments
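Bivariate OLS can be sketched in a few lines using the closed-form estimates: the slope is the sum of cross-deviations of x and y divided by the sum of squared deviations of x, and the intercept follows from the two means. The example is in Python for illustration (the course uses R); the function name `fit_bivariate_ols` and the data are hypothetical.

```python
def fit_bivariate_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying close to a straight line.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_bivariate_ols(x, y)
print(intercept, slope)
```

Multivariate OLS generalizes the same idea to several explanatory variables and is usually solved with matrix methods rather than these scalar formulas.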

This module focuses on advanced modeling strategies for settings where the best linear unbiased estimator (BLUE) assumptions are violated. You will learn how to obtain valid ordinary least squares (OLS) estimates when a key assumption on the regression errors, required for the OLS estimator to be BLUE, is violated. In particular, you will learn how to detect and correct for reverse causality, heteroscedasticity, and serial correlation. Next, for violations of the BLUE assumptions on model and variable specification, you will learn how to model nominal and ordinal dependent variables.

What's included

5 videos, 4 readings, 3 assignments

Running regression models on large-scale datasets with millions of observations and thousands of variables can be a daunting task. This module examines strategies for building regression models on such datasets. For big data regression analysis with nominal dependent variables, you will learn the concepts of decision trees, pruning, cross-validation, and random forests. You will also learn about the penalized regression approach, which is useful for big data regressions when the dependent variable is an interval-ratio variable.

What's included

5 videos, 2 readings, 3 assignments

Build toward a degree

This course is part of the following degree program(s) offered by O.P. Jindal Global University. If you are admitted and enroll, your completed coursework may count toward your degree learning and your progress can transfer with you.¹

 

Instructor

Subhasish Ray
O.P. Jindal Global University
2 courses, 224 learners
