For additional tutorial content by statistics topic, please visit the Brainfuse tutoring platform and select SkillSurfer, as pictured on the left. Resources include:
-Statistics flashcards
-Data collection
-Descriptive statistics
-Displaying data in graphs and charts
-Probability
-Data distributions
-Analyzing data
Access Statistics tutorials in SkillSurfer.
eBooks
Videos
Statistics for Data Science and Business Analysis - video tutorial
Missing Data Problems and Prospects
For questions about Library resources for Inferential Statistics, chat with a librarian!
One of my instructors recommended STATDISK. It's software that comes with the textbook, and it's basically life changing. I've recommended it to so many other students when I hear they're taking Inferential Stats.
I don't know if I would have survived my Inferential Statistics class without the Live Help tutors. I'm not really a math guy, but they helped me break it all down until I got it.
Two words: live lectures. My instructor was amazing, and it was the best place to ask questions because she knew exactly what assignments I was working on.
Sometimes I felt like I needed to see more examples or have things explained to me in different ways. I had luck with the tutorials in SkillSurfer, and I wish I had known about it from the beginning.
Click on the links below to access the full listing of statistics formulas and descriptions pictured in the image below:
Classifying Data
Levels of Measure
Measures of Center
Measures of Variation
Normal Distributions
A bell curve distribution whose shape is determined by a mean and a standard deviation
Standard Normal Distribution and z-scores
When the mean is 0 and the standard deviation is 1, the distribution is called a standard normal distribution. Measurements on this scale are identified by the variable z.
Find a z-score: z = (x − mean) / (standard deviation)
Find a z-score from a probability with =NORM.S.INV(probability)
Area and probability
Find a probability from a z-score with =NORM.S.DIST(z, TRUE)
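The two Excel formulas above have close counterparts in Python's standard library. The sketch below is illustrative only: the helper names (`z_score`, `z_from_probability`, `probability_from_z`) and the example numbers are assumptions chosen for this example, not part of the course materials.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def z_score(x, mean, std_dev):
    """Convert a raw measurement x to a z-score."""
    return (x - mean) / std_dev


def z_from_probability(probability):
    """Counterpart of =NORM.S.INV(probability): the z with this area to its left."""
    return standard_normal.inv_cdf(probability)


def probability_from_z(z):
    """Counterpart of =NORM.S.DIST(z, TRUE): the area to the left of z."""
    return standard_normal.cdf(z)


# Hypothetical example: a value of 130 on a scale with mean 100 and
# standard deviation 15.
z = z_score(130, 100, 15)
print("z-score:", round(z, 2))
print("area to the left of z = 1.96:", round(probability_from_z(1.96), 4))
print("z with area 0.975 to its left:", round(z_from_probability(0.975), 2))
```

Both helpers work with the area to the left of z, which matches the cumulative (TRUE) form of the Excel function.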
Parameters and Statistics
A parameter is a measurement (usually a proportion, a mean, or a standard deviation) of an entire population. Usually these values are either impossible or unrealistic to find.
A statistic is a measurement used to estimate a parameter that is based on a sample taken from a population.
Confidence Intervals Overview
A confidence interval is a range of values used to estimate a population parameter. It is made by using a point estimate obtained from a sample and then calculating a margin of error about that point estimate.
The point estimate is a single value used to estimate a population parameter. Each parameter type has its own best point estimate statistic.
Margins of error are computed differently depending on the population parameter being estimated.
A confidence level (1 − α) is the probability that the procedure used to construct the confidence interval produces an interval that contains the population parameter. The Wikipedia article on confidence intervals has a good summary of interpretations, including some common misinterpretations.
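As one concrete case, the sketch below builds a confidence interval for a population proportion. It assumes the common normal-approximation margin of error, z × sqrt(p̂(1 − p̂)/n), which may differ from the formulas in the course attachments. The function name and the sample counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence_level=0.95):
    """Return (point estimate, margin of error, (lower, upper)) for a proportion.

    Assumes the normal approximation is appropriate for this sample.
    """
    p_hat = successes / n                          # point estimate
    alpha = 1 - confidence_level                   # area left for both tails
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical z
    margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)


# Hypothetical sample: 520 "yes" responses out of 1000.
p_hat, margin, (lower, upper) = proportion_confidence_interval(520, 1000)
print("point estimate:", round(p_hat, 3))
print("margin of error:", round(margin, 3))
print("interval:", (round(lower, 3), round(upper, 3)))
```

A confidence interval for a mean or a standard deviation would use a different margin-of-error formula, so this function should not be reused for those parameters.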
Formulas for constructing confidence intervals and minimum sample size can be found in the attachments in the links below:
Both the goodness-of-fit test and the test of independence are always right-tailed tests and use a χ² statistic for the test statistic and critical value.
Test statistic: χ² = Σ (O − E)² / E
O is the observed value
E is the expected value; it must be computed separately, and the computation varies by the type of test
Benford’s Law:
Leading Digit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Percentage | 30.1 | 17.6 | 12.5 | 9.7 | 7.9 | 6.7 | 5.8 | 5.1 | 4.6
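A goodness-of-fit test against Benford's Law can be sketched as follows. The sketch assumes each expected count is the sample size times the Benford percentage and that the test statistic is the usual sum of (observed − expected)² / expected. The observed counts are made up for illustration, and the critical value 15.507 is the standard table value for a 0.05 significance level with 8 degrees of freedom.

```python
# Benford's Law percentages for leading digits 1 through 9 (from the table above).
BENFORD_PERCENTAGES = [30.1, 17.6, 12.5, 9.7, 7.9, 6.7, 5.8, 5.1, 4.6]


def benford_expected_counts(n):
    """Expected count for each leading digit in a sample of size n."""
    return [n * pct / 100 for pct in BENFORD_PERCENTAGES]


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical leading-digit counts from 1000 values (digits 1 through 9).
observed = [280, 190, 130, 100, 80, 70, 60, 50, 40]
expected = benford_expected_counts(sum(observed))
statistic = chi_square_statistic(observed, expected)

# Table value for alpha = 0.05 and 8 degrees of freedom (right-tailed test).
CRITICAL_VALUE = 15.507

print("test statistic:", round(statistic, 2))
if statistic > CRITICAL_VALUE:
    print("Reject the null hypothesis: the counts do not fit Benford's Law.")
else:
    print("Fail to reject the null hypothesis.")
```

Because the test is right-tailed, only a statistic larger than the critical value leads to rejecting the null hypothesis.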
Tests of Independence: E = (row total × column total) / grand total
Critical Value: The critical value is set up the same way for either type of test
=CHISQ.INV.RT(alpha, d.o.f.)
Benford’s Law: d.o.f. = number of categories − 1
The d.o.f. for Benford’s Law is always 8
Tests of Independence: d.o.f. = (number of rows − 1)(number of columns − 1)
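For a test of independence, the expected counts, test statistic, and degrees of freedom can be computed from a contingency table. This sketch assumes the standard rules: expected count = (row total × column total) / grand total, and d.o.f. = (rows − 1)(columns − 1). The function names and the 2×2 table are hypothetical.

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (test statistic, degrees of freedom) for a contingency table."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical 2x2 table of observed counts (rows: groups, columns: outcomes).
observed_table = [
    [20, 30],
    [30, 20],
]
statistic, dof = chi_square_independence(observed_table)
print("expected counts:", expected_counts(observed_table))
print("test statistic:", statistic)
print("degrees of freedom:", dof)
```

The resulting statistic would then be compared with the right-tailed critical value for the chosen significance level and these degrees of freedom.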
Open the attachments located at the bottom of this box for notes on the following topics:
Steps of Hypothesis Test
Null Hypothesis
Alternative Hypothesis
Critical Value Excel Formulas
P-Value Excel Formulas
Test Statistic and Critical Value Types
Test Statistic Formulas
Dependent Samples
Linear Correlation
A correlation exists if the values of one variable are somehow associated with the values of the other variable
Correlation does not imply causation
Correlation Coefficient
We use the variable r to measure the strength of linear correlation.
In Excel use =CORREL(x data, y data)
Whenever |r| ≤ critical value, we conclude that there is no significant linear correlation.
The critical value is determined by the significance level and the sample size. See Appendix A in the textbook for a table of critical values of correlation coefficients.
If r > critical value, we say there is a positive linear correlation.
If r < −critical value, we say there is a negative linear correlation.
Coefficient of determination
r² is the proportion of variance in the dependent variable that is predictable from the independent variable.
The coefficient of determination is simply the square of the correlation coefficient.
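The correlation coefficient and the coefficient of determination can also be computed directly from paired data. The sketch below uses the standard Pearson formula, which is what =CORREL computes; the function name and the data values are illustrative assumptions.

```python
from math import sqrt


def correlation_coefficient(x, y):
    """Pearson linear correlation coefficient r for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired sample data.
x_data = [1, 2, 3, 4, 5]
y_data = [2, 4, 5, 4, 5]

r = correlation_coefficient(x_data, y_data)
r_squared = r ** 2  # coefficient of determination

print("r:", round(r, 4))
print("r squared:", round(r_squared, 4))
```

Whether r indicates a significant linear correlation still depends on comparing it with the critical value for the sample size and significance level.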
Regression Equation
ŷ = b₀ + b₁x
x – independent variable
ŷ – best predicted value
b₀ – y-intercept
=INTERCEPT(y data, x data)
b₁ – slope
=SLOPE(y data, x data)
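The slope and y-intercept returned by =SLOPE and =INTERCEPT are the least-squares values, and they can be reproduced by hand. In the sketch below, the names `b0` and `b1` for the intercept and slope, the function names, and the data are assumptions made for illustration.

```python
def slope_and_intercept(x, y):
    """Least-squares slope (b1) and y-intercept (b0) for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b1 = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    b0 = mean_y - b1 * mean_x  # the line passes through (mean_x, mean_y)
    return b1, b0


def predict(x_value, b0, b1):
    """Best predicted value of y for a given x."""
    return b0 + b1 * x_value


# Hypothetical paired sample data.
x_data = [1, 2, 3, 4, 5]
y_data = [2, 4, 5, 4, 5]

b1, b0 = slope_and_intercept(x_data, y_data)
print("slope:", round(b1, 4))
print("y-intercept:", round(b0, 4))
print("predicted y at x = 6:", round(predict(6, b0, b1), 4))
```

Note that the Excel functions take the y data first and the x data second, while this helper takes x first.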
Outliers and Influential Points
An outlier is a point that is far away from other data points
An influential point is a point that strongly affects the regression line