Eco309 Economic Statistics Dr. Robert Jantzen

A Brief Guide to Statistical Inference



 
Univariate Tests: t-test for the mean
z-test for the proportion
Bivariate Tests: Correlation t-test
Separate variance t-test for the difference in two means
t-test for the mean difference between related (paired) samples
One-Way Analysis of Variance (ANOVA)
Z test for the difference between two proportions
Chi-squared (χ²) test for differences in two or more proportions
Wilcoxon Rank Sum test for 2 medians
Kruskal-Wallis Rank test for 2 or more medians
Multiple Regression

 
This guide provides a brief description of some "popular" statistical inference tests.  Bear in mind
that the purpose of statistical inference is to test hypotheses about population parameters using sample
data.

Univariate Tests:

Univariate tests involve only one variable and can fall into the following types:
 
  • t-test for the mean (population standard deviation is unknown).  Use this test to assess whether a given sample supports a hypothesis that the mean value of a numerical variable in the population is =, ≠, > or < a particular number.
     Excel:  <PHStat><OneSampleTests><t test for the mean (sigma unknown)>.  Then type in the null hypothesis mean value and the significance level for the test.  Then either highlight the data values or type in the sample statistics (the sample mean, standard deviation and sample size).  The specification of the alternative hypothesis determines the "tails" option:  if it includes ≠ use the two-tailed option, if it includes > use the upper-tail option, and if it includes < use the lower-tail option.
  • z-test for the proportion.  Use this test to assess whether a given sample supports a hypothesis that the population proportion is =, ≠, > or < a particular number.
     Excel:  <PHStat><OneSampleTests><z test for the proportion>.  Then type in the null hypothesis proportion value, the significance level, the sample size and the number of successes (= the number in the sample that has the characteristic described by the proportion).  The specification of the alternative hypothesis determines the "tails" option:  if it includes ≠ use the two-tailed option, if it includes > use the upper-tail option, and if it includes < use the lower-tail option.
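
For readers working outside Excel, the z-test for the proportion can be sketched in a few lines of Python.  This is a minimal illustration and not part of PHStat: the function name z_test_proportion, its tail argument and the sample figures (60 successes out of 100, null proportion 0.5) are assumptions made up for the example.  It uses the normal approximation, with the standard error computed under the null hypothesis.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion(successes, n, p0, tail="two"):
    """One-sample z-test for a proportion (hypothetical helper).

    tail: "two" for a != alternative, "upper" for >, "lower" for <.
    Returns the z statistic and its p-value.
    """
    p_hat = successes / n                 # sample proportion
    se = sqrt(p0 * (1 - p0) / n)          # standard error under the null
    z = (p_hat - p0) / se
    cdf = NormalDist().cdf(z)             # standard normal P(Z <= z)
    if tail == "upper":
        p_value = 1 - cdf
    elif tail == "lower":
        p_value = cdf
    else:
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value


# Made-up sample: 60 successes in 100 trials, H0: p = 0.5, H1: p > 0.5
z, p_value = z_test_proportion(successes=60, n=100, p0=0.5, tail="upper")
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

The tail argument plays the role of the "tails" option described above: reject the null hypothesis when the p-value is below the chosen significance level.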

 

Bivariate Tests:

Bivariate tests involve two variables and include the following types:
 
  • Correlation t-test.  Use this test to assess whether two numerical variables in the population are related (in a linear way) to each other.  Specifically, this test assesses whether the population correlation coefficient is =, ≠, > or < a particular number (usually zero).  The test statistic is computed from a sample containing information on two numerical variables.
     Excel:  First generate the sample correlation coefficient (r) using <Tools><Data Analysis><Correlation>.  Then open the correlationttest.xls spreadsheet (at http://www.sba.uconn.edu/users/rjantzen/exceladd-ins/correlationttest.xls) and enter the hypothesized population correlation value, the sample r, the sample size and the significance level.
  • Separate variance t-test for the difference in two means (population standard deviations are unknown).  Use this test to assess whether the difference between two population group means is =, ≠, > or < a specified value.  If the specified difference is set to zero, the test can be used to assess whether one group's mean is =, ≠, > or < the other group's mean.  The test statistic is computed from data for a numerical variable generated by sampling two independent groups.
     Excel:  If you have the actual data values, use <Tools><Data Analysis><t test: two samples assuming unequal variances> and then type in the hypothesized difference between the two population means and the significance level of the test (=alpha).  If you have the means, standard deviations (or variances) and sample sizes for two sampled groups, open the 2samplettest.xls spreadsheet (at http://www.sba.uconn.edu/users/rjantzen/exceladd-ins/2samplettest.xls) and then enter the statistics for each sampled group, the hypothesized difference between the population means and the significance level of the test.
  • t-test for the mean difference between related (paired) samples (population standard deviation of the difference is unknown).  Use this test to assess whether the mean difference, in the population, between two numerical variables is =, ≠, > or < a specified value (usually zero).  The test statistic is computed from data on two numerical variables generated by a sample that matches (pairs) the two variables to each other (i.e., the two variables are not drawn from samples that are independent of each other).  Equivalent results can also be generated using the t-test for the mean (population standard deviation is unknown) if that test is applied to the difference between each pair's numerical values.
     Excel:  <Tools><Data Analysis><t test: paired sample for means> and then highlight the data and enter the hypothesized mean difference and significance level.
  • One-Way Analysis of Variance (ANOVA).  Use this test to assess whether two or more independent population groups have means that are = or ≠ to each other.  The test statistic is computed from data for a numerical variable generated by sampling two or more independent groups.
     Excel:  If you have the actual data values, use <Tools><Data Analysis><Anova: single factor> and then highlight the data and type in the significance level of the test (=alpha).  If you have the means, standard deviations (or variances) and sample sizes for two or more sampled groups, open the anovatest.xls spreadsheet (at http://www.sba.uconn.edu/users/rjantzen/exceladd-ins/anovatest.xls) and then enter the statistics for each sampled group and the significance level of the test.
  • Tukey-Kramer Multiple Comparisons.  If (and only if) the ANOVA test indicates that the population means are not all equal to each other, then this test can be used to assess which population group means differ from each other.  To conduct the test you must also obtain the studentized range Q statistic from a critical values table.
     Excel:  <PHStat><Multiple Sample Tests><Tukey-Kramer Procedure>, highlight the data and enter the appropriate Q statistic (with c and n-c degrees of freedom, where c is the number of groups and n is the total sample size).
  • Z test for the difference between two proportions.  Use this test to assess whether the difference between the population proportions of two groups is =, ≠, > or < a specified value (usually zero).  The test statistic is computed from data for a categorical variable generated by sampling two independent groups.
     Excel:  <PHStat><Two Sample Tests><z test for difference in two proportions> and then enter the hypothesized difference, significance level, two sample sizes and the number of successes (= the number in each sampled group that has the characteristic described by the proportions).  The specification of the alternative hypothesis determines the "tails" option:  if it includes ≠ use the two-tailed option, if it includes > use the upper-tail option, and if it includes < use the lower-tail option.
  • Chi-squared (χ²) test for differences in two or more proportions.  Use this test to assess whether the population proportions of two or more groups are the same.  The test statistic is computed from data generated by sampling two or more independent groups organized into a contingency table.
     Excel:  <PHStat><Multiple Sample Tests><Chi Squared Test> and then enter the number of rows and columns for the contingency table that describes the two categorical variables being contrasted.
 
  • Marascuilo Procedure.  If (and only if) the Chi-squared test indicates that the population proportions across groups are not the same, then the Marascuilo Procedure can be used to identify which group proportions differ.
     Excel:  When doing the Chi-squared test, click on the Marascuilo Procedure "button."
  • Wilcoxon Rank Sum test for 2 medians.  Use this test to assess whether one group's median is =, ≠, > or < another group's median.  The test statistic is computed from data for a numerical variable generated by sampling two independent groups.
     Excel:  <PHStat><Two Sample Tests><Wilcoxon Rank Sum>, then highlight the data and enter the desired significance level.  The specification of the alternative hypothesis determines the "tails" option:  if it includes ≠ use the two-tailed option, if it includes > use the upper-tail option, and if it includes < use the lower-tail option.
  • Kruskal-Wallis Rank test for 2 or more medians.  Use this test to assess whether two or more independent population groups have medians that are = or ≠ to each other.  The test statistic is computed from data for a numerical variable generated by sampling two or more independent groups.
     Excel:  <PHStat><Multiple Sample Tests><Kruskal-Wallis Rank test>, then highlight the data and enter the desired significance level.
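
As an illustration of one of the bivariate tests above, the separate variance t-test for the difference in two means can be sketched from summary statistics (means, standard deviations and sample sizes).  This is a rough sketch and not a reproduction of the 2samplettest.xls spreadsheet: the function name welch_t_from_summary and the sample figures are assumptions invented for the example.  It returns the t statistic and the approximate (Welch-Satterthwaite) degrees of freedom, which can then be compared against a critical value from a t table.

```python
from math import sqrt


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2, hyp_diff=0.0):
    """Separate variance (Welch) t statistic and degrees of freedom.

    Hypothetical helper: inputs are the summary statistics of two
    independent samples; hyp_diff is the hypothesized difference
    between the two population means.
    """
    v1 = sd1 ** 2 / n1                    # variance of the first sample mean
    v2 = sd2 ** 2 / n2                    # variance of the second sample mean
    t_stat = (mean1 - mean2 - hyp_diff) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Made-up summary statistics for two independent groups
t_stat, df = welch_t_from_summary(mean1=10.0, sd1=2.0, n1=16,
                                  mean2=8.0, sd2=3.0, n2=9)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

The degrees of freedom are generally not a whole number; when reading a printed t table, rounding down to the nearest integer is the conservative choice.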

 

Multivariate Analysis:
 
 
  • Multiple Regression.  When the behavior of a particular dependent variable is influenced by more than one explanatory factor, multiple regression is a more appropriate analysis method than bivariate methods.  Using sample data, regression analysis estimates the degree of influence, if any, each explanatory variable has on the dependent variable in the general population, holding the influence of the other explanatory variables constant.
        Excel:  <PHStat><Regression><Multiple Regression> and then highlight the data for the dependent variable (Y) and the explanatory variables (Xs).  Click on <ANOVA & coefficients table> to generate the F test on the overall regression and estimated coefficients and accompanying sample t values.
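
To make the idea of estimated coefficients concrete, the sketch below fits a multiple regression by ordinary least squares using only the Python standard library.  It is an illustration under stated assumptions, not the PHStat procedure: the function names solve_linear_system and ols_coefficients are hypothetical, and the data are constructed so that Y = 1 + 2*X1 + 3*X2 exactly.  It solves the normal equations (X'X)b = X'Y and reports only the coefficients, not the F test or t values.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("explanatory variables are perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_coefficients(xs, y):
    """Return [intercept, b1, b2, ...] for rows of explanatory values xs."""
    design = [[1.0] + list(row) for row in xs]       # add the constant term
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data: two explanatory variables, Y built without any noise
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
coefs = ols_coefficients(xs, y)
print([round(c, 6) for c in coefs])
```

Because the example data contain no noise, the fitted intercept and slopes should match 1, 2 and 3 up to floating-point rounding.  Each slope is the estimated change in Y for a one-unit change in that X, holding the other X constant.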
