Rasmussen University Online Library

# STA3215 Inferential Statistics and Analytics Course Guide

## Welcome

This course guide has been customized to assist you with both the CBE and traditional online versions of Inferential Statistics and Analytics. Did we miss something? Let us know by submitting your question in the Questions box at the bottom of the page for Library and Learning Services to review.

## Inferential Statistics and Analytics Resources - Quick Reference

One of my instructors recommended STATDISK. It's software that comes with the textbook, and it's basically life changing. I've recommended it to so many other students when I hear they're taking Inferential Stats.

I don't know if I would have survived my Inferential Statistics class without the Live Help tutors. I'm not really a math guy, but they helped me break it all down until I got it.

Two words: live lectures. My instructor was amazing, and it was the best place to ask questions because she knew exactly what assignments I was working on.

Sometimes I felt like I needed to see more examples or have things explained to me in different ways. I had luck with the tutorials in SkillSurfer, and I wish I had known about it from the beginning.

For additional tutorial content by statistics topic, please visit the Brainfuse tutoring platform and select SkillSurfer, as pictured on the left. Resources include:

- Statistics flashcards
- Data collection
- Descriptive statistics
- Displaying data in graphs and charts
- Probability
- Data distributions
- Analyzing data

Access Statistics tutorials in SkillSurfer HERE.

eBooks

## Supplemental Content by Topic

Click on the links below to access the full listing of statistics formulas and descriptions pictured in the image below:

Classifying Data

| Data Type | Description |
| --- | --- |
| Qualitative Data | Consists of names or labels, categorical data |
| Quantitative Data | Consists of numeric values, includes units of measurement |
| Quantitative, Discrete | Only takes on countable values |
| Quantitative, Continuous | Can take on any value within an interval |

Levels of Measure

| Level | Description |
| --- | --- |
| Nominal | Categories only, data cannot be arranged in order |
| Ordinal | Data can be arranged in order, but differences either can’t be found or are meaningless |
| Interval | Differences are meaningful but there is no natural zero starting point and ratios are meaningless |
| Ratio | There is a natural zero starting point and ratios make sense |

Measures of Center

| Measure | Description | Excel |
| --- | --- | --- |
| Mean | The arithmetic average of all data points | =AVERAGE(data range) |
| Median | The middle of an ordered data set | =MEDIAN(data range) |
| Mode | The most frequently appearing data | First highlight several cells in a row, then enter =MODE.MULT(data range) and then press Ctrl+Shift+Enter |
| Mid-Range | The half-way point between the lowest and highest values | First find the maximum and minimum with =MAX(data range) and =MIN(data range) and then enter =(Maximum+Minimum)/2 |
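If you would like to double-check your Excel results another way, the measures of center above can be sketched in Python using only the built-in `statistics` module. The data set here is hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical sample data, used only for illustration.
data = [2, 3, 3, 5, 7, 7, 9]

mean = statistics.mean(data)            # comparable to =AVERAGE(data range)
median = statistics.median(data)        # comparable to =MEDIAN(data range)
modes = statistics.multimode(data)      # comparable to =MODE.MULT(data range); lists every mode
midrange = (max(data) + min(data)) / 2  # comparable to =(Maximum+Minimum)/2

print("Mean:", mean)
print("Median:", median)
print("Mode(s):", modes)
print("Mid-range:", midrange)
```

Like =MODE.MULT, `statistics.multimode` returns all of the most frequent values, which matters when a data set has more than one mode.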

Measures of Variation

| Measure | Description | Excel |
| --- | --- | --- |
| Range | The distance between the lowest and highest value | First find the maximum and minimum with =MAX(data range) and =MIN(data range) and then enter =Maximum-Minimum |
| Variance | Equal to the square of the standard deviation | =VAR.S(data range) |
| Standard Deviation | A measurement of how far the data deviates from the mean | =STDEV.S(data range) |
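The measures of variation can be checked the same way. This is a minimal Python sketch with a hypothetical data set; `statistics.variance` and `statistics.stdev` compute the sample versions, which is what =VAR.S and =STDEV.S compute in Excel.

```python
import statistics

# Hypothetical sample data, used only for illustration.
data = [2, 3, 3, 5, 7, 7, 9]

data_range = max(data) - min(data)            # comparable to =Maximum-Minimum
sample_variance = statistics.variance(data)   # comparable to =VAR.S(data range)
sample_sd = statistics.stdev(data)            # comparable to =STDEV.S(data range)

print("Range:", data_range)
print("Sample variance:", round(sample_variance, 4))
print("Sample standard deviation:", round(sample_sd, 4))
```

Squaring `sample_sd` gives back `sample_variance`, up to rounding, which matches the description of variance in the table.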

Normal Distributions

• A bell curve distribution whose shape is determined by a mean and a standard deviation

Standard Normal Distributions and z-scores

• When the mean is 0 and the standard deviation is 1, it is called a standard normal distribution. Measurements on this scale are identified by the variable $z$.

• Find a z-score: $z = \frac{x - \mu}{\sigma}$

• Find a z-score from a probability with =NORM.S.INV(probability)

• Note: z-scores are not percentages, so do not change the decimal place when computing a z-score.

Area and probability

• Find a probability from a z-score with =NORM.S.DIST(z, TRUE)
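For comparison with the Excel functions, here is a minimal Python sketch of the same two lookups using `statistics.NormalDist` from the standard library. The helper `z_score` and the example numbers are hypothetical; the helper assumes the usual formula of subtracting the mean and dividing by the standard deviation.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def z_score(x, mu, sigma):
    """Convert a raw value to a z-score (hypothetical helper for illustration)."""
    return (x - mu) / sigma


# Comparable to =NORM.S.INV(0.975): z-score with 97.5% of the area to its left.
z_from_probability = standard_normal.inv_cdf(0.975)

# Comparable to =NORM.S.DIST(1.96, TRUE): area to the left of z = 1.96.
probability_from_z = standard_normal.cdf(1.96)

# Hypothetical example: a value of 85 when the mean is 70 and the standard deviation is 10.
z = z_score(85, 70, 10)

print(round(z_from_probability, 4), round(probability_from_z, 4), z)
```

Both `cdf` and =NORM.S.DIST(z, TRUE) give the area to the left of the z-score, so subtract from 1 when you need the area to the right.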

Parameters and Statistics

A parameter is a measurement (usually a proportion, a mean, or a standard deviation) of an entire population.  Usually these values are either impossible or unrealistic to find.

A statistic is a measurement used to estimate a parameter that is based on a sample taken from a population.

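| Measurement | Parameter | Statistic |
| --- | --- | --- |
| Mean (average) | Mu ($\mu$) | x-bar ($\bar{x}$) |
| Proportion | $p$ | p-bar ($\bar{p}$) |
| Standard Deviation | Sigma ($\sigma$) | $s$ |
| Variance | Sigma squared ($\sigma^2$) | $s^2$ |

Confidence Intervals Overview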

• A confidence interval is a range of values used to estimate a population parameter. It is made by using a point estimate obtained from a sample and then calculating a margin of error about that point estimate.

• The point estimate is a single value used to estimate a population parameter. Each parameter type has its own best point estimate statistic.

• Margins of error are computed differently depending on the population parameter being estimated.

• A confidence level ($1 - \alpha$) is the probability that the procedure used to construct the confidence interval produces an interval that contains the population parameter. This Wikipedia article has a good summary of interpretations, along with some common misinterpretations.
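To make the point estimate and margin of error concrete, here is a minimal Python sketch of one common case: a confidence interval for a population proportion based on the normal approximation. The formula used, the function name `proportion_confidence_interval`, and the sample numbers are assumptions for illustration only; check the formulas in your course materials for the parameter you are estimating.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence_level=0.95):
    """Sketch of a normal-approximation interval for a proportion (illustrative only)."""
    point_estimate = successes / n                       # sample proportion
    alpha = 1 - confidence_level
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)     # comparable to =NORM.S.INV(1 - alpha/2)
    margin_of_error = z_critical * sqrt(point_estimate * (1 - point_estimate) / n)
    return point_estimate - margin_of_error, point_estimate + margin_of_error


# Hypothetical sample: 520 successes out of 1000 trials, 95% confidence level.
lower, upper = proportion_confidence_interval(successes=520, n=1000)
print(round(lower, 4), round(upper, 4))
```

The interval is centered on the point estimate, and its half-width is the margin of error. Raising the confidence level widens the interval, while a larger sample narrows it.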

Formulas for constructing confidence intervals and minimum sample size can be found in the attachments in the links below:

Both a goodness-of-fit test and a test of independence will always be a right-tailed test and use a $\chi^2$ statistic for the test statistic and critical value.

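Test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$

• $O$ is the observed value

• $E$ is the expected value. It must be computed independently, and the type of computation varies by the type of test

Benford’s Law:

| Leading Digit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Percentage | 30.1 | 17.6 | 12.5 | 9.7 | 7.9 | 6.7 | 5.8 | 5.1 | 4.6 |

Tests of Independence: $E = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$

Critical Value: The critical value is set up the same way for either type of test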

• =CHISQ.INV.RT(alpha, d.o.f.)

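• Benford’s Law: d.o.f. $= k - 1$, where $k$ is the number of categories. With nine possible leading digits, the d.o.f. for Benford’s Law is always 8

• Tests of Independence: d.o.f. $= (r - 1)(c - 1)$, where $r$ is the number of rows and $c$ is the number of columns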
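As a worked illustration, here is a minimal Python sketch of a goodness-of-fit test against Benford's Law. It assumes the usual chi-square statistic (the sum of squared differences between observed and expected counts, each divided by the expected count). The observed counts are hypothetical, and the critical value 15.507 is the standard table value for a significance level of 0.05 with 8 degrees of freedom.

```python
# Benford's Law percentages for leading digits 1 through 9 (from the table in this guide).
BENFORD_PERCENTAGES = [30.1, 17.6, 12.5, 9.7, 7.9, 6.7, 5.8, 5.1, 4.6]

# Table value for alpha = 0.05 and 8 degrees of freedom.
CRITICAL_VALUE = 15.507


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts of leading digits 1..9 in a sample of 200 values.
observed = [61, 35, 25, 19, 16, 13, 12, 10, 9]
n = sum(observed)

# Expected counts: sample size times each Benford percentage.
expected = [n * pct / 100 for pct in BENFORD_PERCENTAGES]

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1

# Right-tailed test: reject the null hypothesis only if the statistic exceeds the critical value.
reject_null = statistic > CRITICAL_VALUE

print("Chi-square statistic:", round(statistic, 4))
print("Degrees of freedom:", degrees_of_freedom)
print("Reject null hypothesis:", reject_null)
```

Because the hypothetical counts sit close to the expected counts, the statistic is small and the test does not reject the claim that the data follow Benford's Law.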

Open the attachments located at the bottom of this box for notes on the following topics:

Steps of Hypothesis Test

Null Hypothesis

Alternative Hypothesis

Critical Value Excel Formulas

P-Value Excel Formulas

Test Statistic and Critical Value Types

Test Statistic Formulas

Dependent Samples

Linear Correlation

A correlation exists if the values of one variable are somehow associated with the values of the other variable

Correlation does not imply causation

Correlation Coefficient

We use the variable $r$ to measure the strength of linear correlation.

In Excel use =CORREL(x data, y data)

Whenever $|r|$ is less than or equal to the critical value, we conclude that there is no significant linear correlation.

Critical value is determined by the significance level and the sample size.  See Appendix A in textbook for table of critical values of correlation coefficients.

If the correlation is significant and $r > 0$, we say there is a positive linear correlation.

If the correlation is significant and $r < 0$, we say there is a negative linear correlation.

Coefficient of determination

The proportion of variance in the dependent variable that is predictable from the independent variable. The coefficient of determination is also simply the square of the correlation coefficient.
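To see how the correlation coefficient and the coefficient of determination relate, here is a minimal Python sketch that computes both by hand. The function name `correlation_coefficient` and the paired data are hypothetical; the calculation is the standard Pearson formula, which is what =CORREL reports in Excel.

```python
from math import sqrt


def correlation_coefficient(x, y):
    """Pearson linear correlation coefficient for paired data (illustrative helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired data, used only for illustration.
x_data = [1, 2, 3, 4, 5]
y_data = [2, 4, 5, 4, 5]

r = correlation_coefficient(x_data, y_data)   # comparable to =CORREL(x data, y data)
r_squared = r ** 2                            # coefficient of determination

print("r:", round(r, 4))
print("r squared:", round(r_squared, 4))
```

For this hypothetical data the coefficient of determination is about 0.6, meaning roughly 60% of the variation in the y values is predictable from the x values.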

Regression Equation: $\hat{y} = b_0 + b_1 x$

• $x$: independent variable

• $\hat{y}$: best predicted value

• $b_0$: y-intercept, found with =INTERCEPT(y data, x data)

• $b_1$: slope, found with =SLOPE(y data, x data)
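The slope and intercept can also be sketched in Python with the standard least-squares formulas. The function name `regression_line` and the paired data are hypothetical, chosen only for illustration; the results should match what =SLOPE and =INTERCEPT return for the same data.

```python
def regression_line(x, y):
    """Least-squares slope and y-intercept for paired data (illustrative helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx                       # comparable to =SLOPE(y data, x data)
    intercept = mean_y - slope * mean_x     # comparable to =INTERCEPT(y data, x data)
    return slope, intercept


# Hypothetical paired data, used only for illustration.
x_data = [1, 2, 3, 4, 5]
y_data = [2, 4, 5, 4, 5]

slope, intercept = regression_line(x_data, y_data)

# Best predicted value of y when x = 6, using the fitted line.
predicted_y = intercept + slope * 6

print("Slope:", round(slope, 4))
print("Intercept:", round(intercept, 4))
print("Predicted y at x = 6:", round(predicted_y, 4))
```

Note that Excel expects the y data first in both =SLOPE and =INTERCEPT, while this helper takes the x data first.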

Outliers and Influential Points

An outlier is a point that is far away from other data points

An influential point is a point that strongly affects the regression line