Introduction to Inferential Statistics

Oliver C. Ibe, in Fundamentals of Applied Probability and Random Processes (Second Edition), 2014

9.4.3 One-Tailed and Two-Tailed Tests

Hypothesis tests are classified as either one-tailed (also called one-sided) tests or two-tailed (also called two-sided) tests. One-tailed tests are concerned with one side of a statistic, such as "the mean is greater than 10" or "the mean is less than 10." Thus, one-tailed tests deal with only one tail of the distribution, and the z-score is on only one side of the statistic.

Two-tailed tests deal with both tails of the distribution, and the z-score is on both sides of the statistic. For example, Figure 9.4 illustrates a two-tailed test. A hypothesis like "the mean is not equal to 10" involves a two-tailed test because the claim is that the mean can be less than 10 or it can be greater than 10. Table 9.3 shows the critical values, $z_\alpha$, for both the one-tailed test and the two-tailed test in tests involving the normal distribution.

Table 9.3. Critical Points for Different Levels of Significance

Level of Significance (α)   0.10   0.05   0.01   0.005   0.002
$z_\alpha$ for One-Tailed Tests   −1.28 or 1.28   −1.645 or 1.645   −2.33 or 2.33   −2.58 or 2.58   −2.88 or 2.88
$z_\alpha$ for Two-Tailed Tests   −1.645 and 1.645   −1.96 and 1.96   −2.58 and 2.58   −2.81 and 2.81   −3.08 and 3.08
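The critical points in the table can be reproduced from the standard normal quantile function. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `critical_values` is hypothetical, and the computed values may differ from the printed table in the last digit because of rounding.

```python
from statistics import NormalDist


def critical_values(alpha):
    """Return the positive (one_tailed, two_tailed) critical z-values for level alpha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    one_tailed = std_normal.inv_cdf(1 - alpha)      # all of alpha sits in one tail
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 sits in each tail
    return one_tailed, two_tailed


# Levels of significance listed in the table
for alpha in (0.10, 0.05, 0.01, 0.005, 0.002):
    one, two = critical_values(alpha)
    print(f"alpha={alpha}: one-tailed +/-{one:.3f}, two-tailed +/-{two:.3f}")
```

For a left-tail test the one-tailed value is used with a negative sign; for a two-tailed test both signs are used.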

In a one-tailed test, the area under the rejection region is equal to the level of significance, α. Also, the rejection region can be below (i.e., to the left of) the acceptance region or above (i.e., to the right of) the acceptance region, depending on how $H_1$ is formulated. When the rejection region is below the acceptance region, we say that it is a left-tail test. Similarly, when the rejection region is above the acceptance region, we say that it is a right-tail test.

In the two-tailed test, there are two critical regions, and the area under each region is α/2. As stated earlier, the two-tailed test is illustrated in Figure 9.4. Figure 9.5 illustrates the rejection region that is above the acceptance region for the one-tailed test, or more specifically the right-tail test.

Figure 9.5. Critical Region for One-Tailed Tests

Note that in a one-tailed test, when $H_1$ involves values that are greater than $\mu_X$, we have a right-tail test. Similarly, when $H_1$ involves values that are less than $\mu_X$, we have a left-tail test. For example, an alternative hypothesis of the type $H_1: \mu_X > 100$ is a right-tail test while an alternative hypothesis of the type $H_1: \mu_X < 100$ is a left-tail test. Figure 9.6 is a summary of the different types of tests. In the figure, $\mu_0$ is the current value of the parameter.

Figure 9.6. Summary of the Different Tests

Example 9.9

The mean lifetime E[X] of the light bulbs produced by Lighting Systems Corporation is 1570 hours with a standard deviation of 120 hours. The president of the company claims that a new production process has led to an increase in the mean lifetimes of the light bulbs. If Joe tested 100 light bulbs made from the new production process and found that their mean lifetime is 1600 hours, test the hypothesis that E[X] is not equal to 1570 hours using a level of significance of (a) 0.05 and (b) 0.01.

Solution:

The null hypothesis is

$$H_0: \mu_X = 1570 \text{ hours}$$

Similarly, the alternative hypothesis is

$$H_1: \mu_X \neq 1570 \text{ hours}$$

Since $\mu_X \neq 1570$ includes numbers that are both greater than and less than 1570, this is a two-tailed test. From the available data, the normalized value of the sample mean is

$$z = \frac{\bar{X} - \mu_X}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} = \frac{1600 - 1570}{120/\sqrt{100}} = \frac{30}{12} = 2.50$$

a.

At a level of significance of 0.05, $z_\alpha = -1.96$ and $z_\alpha = 1.96$ for a two-tailed test. Thus, our acceptance region is [−1.96, 1.96] of the standard normal distribution. The rejection and acceptance regions are illustrated in Figure 9.7.

Figure 9.7. Critical Region for Problem 9.9(a)

Since z = 2.50 lies outside the range [−1.96, 1.96] (that is, it is in a rejection region), we reject $H_0$ at the 0.05 level of significance and accept $H_1$, which means that the difference in mean lifetimes is statistically significant.

b.

At the 0.01 level of significance, $z_\alpha = -2.58$ and $z_\alpha = 2.58$. The acceptance and rejection regions are shown in Figure 9.8. Since z = 2.50 lies within the range [−2.58, 2.58], which is the acceptance region, we accept $H_0$ at the 0.01 level of significance, which means that the difference in mean lifetimes is not statistically significant.

Figure 9.8. Critical Region for Problem 9.9(b)

Example 9.10

For Example 9.9, test the hypothesis that the new mean lifetime is greater than 1570 hours using a level of significance of (a) 0.05 and (b) 0.01.

Solution:

Here we define the null hypothesis and alternative hypothesis as follows:

$$H_0: \mu_X = 1570 \text{ hours} \qquad H_1: \mu_X > 1570 \text{ hours}$$

This is a one-tailed test. Since the z-score is the same as in Example 9.9, we only need to find the confidence limits for the two cases.

a.

Because $H_1$ is concerned with values that are greater than 1570, we have a right-tail test, which means that we choose the rejection region that is above the acceptance region. Therefore, we choose $z_\alpha = 1.645$ for the 0.05 level of significance in Table 9.3. Since z = 2.50 lies in the rejection region (i.e., 2.50 > 1.645), as illustrated in Figure 9.9, we reject $H_0$ at the 0.05 level of significance and thus accept $H_1$. This implies that the difference in mean lifetimes is statistically significant.

Figure 9.9. Critical Region for Problem 9.10(a)

b.

From Table 9.3, $z_\alpha = 2.33$ at the 0.01 level of significance, which is less than z = 2.50. Thus, we also reject $H_0$ at the 0.01 level of significance and accept $H_1$.

Note that we had earlier accepted $H_0$ under the two-tailed test scheme at the 0.01 level of significance in Example 9.9. This means that decisions made under a one-tailed test do not necessarily agree with those made under a two-tailed test.
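The disagreement between the one-tailed and two-tailed decisions in Examples 9.9 and 9.10 can be checked numerically. The sketch below is an illustration only: the function name `z_test_mean` and its `tail` argument are hypothetical, and the critical values come from the standard-library `statistics.NormalDist` rather than from Table 9.3.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sigma, n, alpha, tail):
    """Return (z, reject_H0) for a z-test on a mean; tail is 'two', 'right', or 'left'."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))  # normalized sample mean
    std_normal = NormalDist()
    if tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "left":
        reject = z < std_normal.inv_cdf(alpha)
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return z, reject


# Light bulb data: sample mean 1600, hypothesized mean 1570, sigma 120, n = 100
for tail in ("two", "right"):
    for alpha in (0.05, 0.01):
        z, reject = z_test_mean(1600, 1570, 120, 100, alpha, tail)
        print(f"{tail}-tailed, alpha={alpha}: z={z:.2f}, reject H0: {reject}")
```

With these inputs the two-tailed test at the 0.01 level is the only case that does not reject the null hypothesis.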

Example 9.11

A manufacturer of a migraine headache drug claimed that the drug is 90% effective in relieving migraines for a period of 24 hours. In a sample of 200 people who have migraine headache, the drug provided relief for 160 people for a period of 24 hours. Determine whether the manufacturer's claim is legitimate at the 0.05 level of significance.

Solution:

Since the success probability of the drug is p = 0.9, the null hypothesis is

$$H_0: p = 0.9$$

Also, since the drug is either effective or not, testing the drug on any individual is essentially a Bernoulli trial with claimed success probability of 0.9. Thus, the variance of the trial is

$$\sigma_p^2 = p(1 - p) = 0.09$$

Because the drug provided relief for only 160 of the 200 people tested, the observed success probability is

$$\bar{p} = \frac{160}{200} = 0.8$$

We are interested in determining whether the proportion of people whose migraines the drug relieved is too low. Since $\bar{p} < 0.9$, we choose the alternative hypothesis as follows:

$$H_1: p < 0.9$$

Thus, we have a left-tail test. Now, the standard normal score of the observed proportion is given by

$$z = \frac{\bar{p} - p}{\sigma_{\bar{p}}} = \frac{\bar{p} - p}{\sigma_p/\sqrt{n}} = \frac{0.8 - 0.9}{\sqrt{0.09/200}} = \frac{-0.1}{0.0212} = -4.72$$

For a left-tail test at the 0.05 level of significance, the critical value from Table 9.3 is $z_\alpha = -1.645$. Since z = −4.72 falls within the rejection region, we reject $H_0$ and accept $H_1$; that is, the company's claim is false.
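The proportion test of Example 9.11 follows the same pattern as the tests on the mean. The sketch below is a hedged illustration: `left_tail_proportion_test` is a hypothetical helper, and it uses the unrounded standard error, so the statistic comes out near −4.71 rather than the −4.72 obtained with the rounded value 0.0212.

```python
from math import sqrt
from statistics import NormalDist


def left_tail_proportion_test(successes, n, p0, alpha):
    """Return (z, critical_value, reject_H0) for H0: p = p0 against H1: p < p0."""
    p_hat = successes / n                   # observed success proportion
    std_err = sqrt(p0 * (1 - p0) / n)       # standard error under H0
    z = (p_hat - p0) / std_err
    critical = NormalDist().inv_cdf(alpha)  # negative value for a left-tail test
    return z, critical, z < critical


# Drug data: 160 successes out of 200, claimed success probability 0.9
z, critical, reject = left_tail_proportion_test(160, 200, 0.9, 0.05)
print(f"z = {z:.2f}, critical value = {critical:.3f}, reject H0: {reject}")
```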

URL:

https://www.sciencedirect.com/science/article/pii/B9780128008522000092

Multiple Regression

Rudolf J. Freund, ... Donna L. Mohr, in Statistical Methods (Third Edition), 2010

8.3.5 The Equivalent t Statistic for Individual Coefficients

We noted in Chapter 7 that the F test for the hypothesis that the coefficient is zero can be performed by an equivalent t test. The same relationship holds for the individual partial coefficients in the multiple regression model. The t statistic for testing $H_0: \beta_j = 0$ is

$$t = \frac{\hat{\beta}_j}{\sqrt{c_{jj}\,\text{MSE}}},$$

where $c_{jj}$ is the jth diagonal element of C, and the degrees of freedom are $(n - m - 1)$. It is easily verified that these statistics are the square roots of the F values obtained earlier and they will not be reproduced here. As in simple linear regression, the denominator of this expression is the standard error (or square root of the variance) of the estimated coefficient, which can be used to construct confidence intervals for the coefficients.

In Chapter 7 we noted that the use of the t statistic allowed us to test for specific (nonzero) values of the parameters, and allowed the use of one-tailed tests and the calculation of confidence intervals. For these reasons, most computer programs provide the standard errors and t tests. A typical computer output for Example 8.2 is shown in Table 8.6. We can use this output to compute the confidence intervals for the coefficients in the regression equation as follows:

age: Std. error = $\sqrt{(0.0001293)(306.09)} = 0.199$; 0.95 confidence interval: $-0.3498 \pm (2.0141)(0.199)$: from −0.7506 to 0.051

bed: Std. error = $\sqrt{(0.064025)(306.09)} = 4.427$; 0.95 confidence interval: $-11.2382 \pm (2.0141)(4.427)$: from −20.1546 to −2.3218

bath: Std. error = $\sqrt{(0.131435)(306.09)} = 6.343$; 0.95 confidence interval: $-4.5401 \pm (2.0141)(6.343)$: from −17.3155 to 8.2353

size: Std. error = $\sqrt{(0.132834)(306.09)} = 6.376$; 0.95 confidence interval: $65.9465 \pm (2.0141)(6.376)$: from 53.1045 to 78.7884

lot: Std. error = $\sqrt{(8.234189\mathrm{E}{-6})(306.09)} = 0.0502$; 0.95 confidence interval: $0.06205 \pm (2.0141)(0.0502)$: from −0.0391 to 0.1632.

As expected, the confidence intervals of those coefficients deemed statistically significant at the 0.05 level do not include zero.
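The interval arithmetic above can be reproduced in a few lines. This is a minimal sketch, not the output of any statistical package: the signs of the estimates and the value 0.064025 for the bed coefficient's $c_{jj}$ are assumptions chosen to be consistent with the quoted standard errors and interval endpoints, and the critical value 2.0141 is taken from the text as given.

```python
from math import sqrt

MSE = 306.09     # error mean square quoted in the text
T_CRIT = 2.0141  # two-tailed 0.05 critical t-value quoted in the text

# name: (estimated coefficient, diagonal element c_jj)
# Assumption: signs and the bed c_jj are inferred from the quoted intervals.
coefficients = {
    "age": (-0.3498, 0.0001293),
    "bed": (-11.2382, 0.064025),
    "bath": (-4.5401, 0.131435),
    "size": (65.9465, 0.132834),
    "lot": (0.06205, 8.234189e-6),
}


def confidence_interval(estimate, c_jj, mse=MSE, t_crit=T_CRIT):
    """Return (std_error, lower, upper) for one regression coefficient."""
    std_error = sqrt(c_jj * mse)     # standard error of the estimate
    half_width = t_crit * std_error
    return std_error, estimate - half_width, estimate + half_width


for name, (estimate, c_jj) in coefficients.items():
    se, low, high = confidence_interval(estimate, c_jj)
    excludes_zero = not (low <= 0.0 <= high)
    print(f"{name}: se={se:.4f}, CI=({low:.4f}, {high:.4f}), excludes zero: {excludes_zero}")
```

An interval that excludes zero corresponds to a coefficient that is significant in the two-tailed t test at the same level.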

Finally, note that the tests we have presented are special cases of tests for any linear function of parameters. For example, we may wish to test

$$H_0: \beta_4 - 10\beta_5 = 0,$$

which for the home price data tests the hypothesis that the size coefficient is ten times larger than the lot coefficient. The methodology for these more general hypothesis tests is presented in Section 11.7.

URL:

https://www.sciencedirect.com/science/article/pii/B9780123749703000081

Multiple Regression

Donna L. Mohr, ... Rudolf J. Freund, in Statistical Methods (Fourth Edition), 2022

8.3.5 The Equivalent t Statistic for Individual Coefficients

We noted in Chapter 7 that the F test for the hypothesis that the coefficient is zero can be performed by an equivalent t test. The same relationship holds for the individual partial coefficients in the multiple regression model. The t statistic for testing $H_0: \beta_j = 0$ is

$$t = \frac{\hat{\beta}_j}{\sqrt{c_{jj}\,\text{MSE}}},$$

where $c_{jj}$ is the jth diagonal element of C, and the degrees of freedom are $(n - m - 1)$. It is easily verified that these statistics are the square roots of the F values obtained earlier and they will not be reproduced here. As in simple linear regression, the denominator of this expression is the standard error (or square root of the variance) of the estimated coefficient, which can be used to construct confidence intervals for the coefficients.

In Chapter 7 we noted that the use of the t statistic allowed us to test for specific (nonzero) values of the parameters, and allowed the use of one-tailed tests and the calculation of confidence intervals. For these reasons, most computer programs provide the standard errors and t tests. A typical computer output for Example 8.2 is shown in Table 8.6. We can use this output to compute the confidence intervals for the coefficients in the regression equation as follows:

age: Std. error = $\sqrt{(0.0001293)(306.09)} = 0.199$; 0.95 confidence interval: $-0.3498 \pm (2.0141)(0.199)$: from −0.7506 to 0.051,

bed: Std. error = $\sqrt{(0.064025)(306.09)} = 4.427$; 0.95 confidence interval: $-11.2382 \pm (2.0141)(4.427)$: from −20.1546 to −2.3218,

bath: Std. error = $\sqrt{(0.131435)(306.09)} = 6.343$; 0.95 confidence interval: $-4.5401 \pm (2.0141)(6.343)$: from −17.3155 to 8.2353,

size: Std. error = $\sqrt{(0.132834)(306.09)} = 6.376$; 0.95 confidence interval: $65.9465 \pm (2.0141)(6.376)$: from 53.1045 to 78.7884, and

lot: Std. error = $\sqrt{(8.234189\mathrm{E}{-6})(306.09)} = 0.0502$; 0.95 confidence interval: $0.06205 \pm (2.0141)(0.0502)$: from −0.0391 to 0.1632.

As expected, the confidence intervals of those coefficients deemed statistically significant at the 0.05 level do not include zero.

Finally, note that the tests we have presented are special cases of tests for any linear function of parameters. For example, we may wish to test

$$H_0: \beta_4 - 10\beta_5 = 0,$$

which for the home price data tests the hypothesis that the size coefficient is ten times larger than the lot coefficient. The methodology for these more general hypothesis tests is presented in Section 11.7.

Example 8.3

Snow Geese Departure Times Revisited

Example 7.3 provided a regression model to explain how the departure times (TIME) of lesser snow geese were affected by temperature (TEMP). Although the results were reasonably satisfactory, it is logical to expect that other environmental factors affect departure times.

Solution

Since information on other factors was also collected, we can propose a multiple regression model with the following additional environmental variables:

HUM, the relative humidity,

LIGHT, light intensity, and

CLOUD, percent cloud cover.

The data are given in Table 8.4.

Table 8.4. Snow goose departure times data.

DATE TIME TEMP HUM LIGHT CLOUD
11/10/87 11 11 78 12.6 100
11/13/87 2 11 88 10.8 80
11/14/87 −2 11 100 9.7 30
11/15/87 −11 20 83 12.2 50
11/17/87 −5 8 100 14.2 0
11/18/87 2 12 90 10.5 90
11/21/87 −6 6 87 12.5 30
11/22/87 22 18 82 12.9 20
11/23/87 22 19 91 12.3 80
11/25/87 21 21 92 9.4 100
11/30/87 8 10 90 11.7 60
12/05/87 25 18 85 11.8 40
12/14/87 9 20 93 11.1 95
12/18/87 7 14 92 8.3 90
12/24/87 8 19 96 12.0 40
12/26/87 18 13 100 11.3 100
12/27/87 −14 3 96 4.8 100
12/28/87 −21 4 86 6.9 100
12/30/87 −26 3 89 7.1 40
12/31/87 −7 15 93 8.1 95
01/02/88 −15 15 43 6.9 100
01/03/88 −6 6 60 7.6 100
01/04/88 −23 5 . 8.8 100
01/05/88 −14 2 92 9.0 60
01/06/88 −6 10 90 . 100
01/07/88 −8 2 96 7.1 100
01/08/88 −19 0 83 3.9 100
01/10/88 −23 −4 88 8.1 20
01/11/88 −11 −2 80 10.3 10
01/12/88 5 5 80 9.0 95
01/14/88 −23 5 61 5.1 95
01/15/88 −7 8 81 7.4 100
01/16/88 9 15 100 7.9 100
01/20/88 −27 5 51 3.8 0
01/21/88 −24 −1 74 6.3 0
01/22/88 −29 −2 69 6.3 0
01/23/88 −19 3 65 7.8 30
01/24/88 −9 6 73 9.5 30

An inspection of the data shows that two observations have missing values (denoted by ".") for a variable. This means that these observations cannot be used for the regression analysis. Fortunately, most computer programs recognize missing values and will automatically ignore such observations. Therefore all calculations in this example will be based on the remaining 36 observations.

The first step is to compute X′X and X′Y. We then compute the inverse and the estimated coefficients. As before, we will let the computer do this with the results given in Table 8.5 in the same format as that of Table 8.3.

Table 8.5. Regression matrices for snow goose departure times.

Model Crossproducts X′X X′Y Y′Y
X′X INTERCEP TEMP HUM
INTERCEP 36 319 3007
TEMP 319 4645 27519
HUM 3007 27519 257927
LIGHT 326.2 3270.3 27822
CLOUD 2280 23175 193085
TIME −157 1623 −9662
X′X LIGHT CLOUD TIME
INTERCEP 326.2 2280 −157
TEMP 3270.3 23175 1623
HUM 27822 193085 −9662
LIGHT 3211.9 20079.5 −402.8
CLOUD 20079.5 194100 −3730
TIME −402.8 −3730 9097
X′X Inverse, Parameter Estimates, and SSE
INTERCEP TEMP HUM
INTERCEP 1.1793413621 0.0085749149 −0.010464297
TEMP 0.0085749149 0.0010691752 0.0000605688
HUM −0.010464297 0.0000605688 0.0001977643
LIGHT −0.028115838 −0.00192403 −0.000581237
CLOUD −0.001558842 −0.000089595 −0.000020914
TIME −52.99392938 0.9129810924 0.1425316971
LIGHT CLOUD TIME
INTERCEP −0.028115838 −0.001558842 −52.99392938
TEMP −0.00192403 −0.000089595 0.9129810924
HUM −0.000581237 −0.000020914 0.1425316971
LIGHT 0.0086195605 0.0002464973 2.5160019069
CLOUD 0.0002464973 0.0000294652 0.0922051991
TIME 2.5160019069 0.0922051991 2029.6969929

The five elements in the last column, labeled TIME, of the inverse portion contain the estimated coefficients, providing the equation:

$$\widehat{\text{TIME}} = -52.994 + 0.9130(\text{TEMP}) + 0.1425(\text{HUM}) + 2.5160(\text{LIGHT}) + 0.0922(\text{CLOUD}).$$

Unlike the case of the regression involving only TEMP, the intercept now has no real meaning since zero values for HUM and LIGHT cannot exist. The remainder of the coefficients are positive, indicating later departure times for increased values of TEMP, HUM, LIGHT, and CLOUD. Because of the different scales of the independent variables, the relative magnitudes of these coefficients have little meaning and also are not indicators of relative statistical significance.

Note that the coefficient for TEMP is 0.9130 in the multiple regression model, while it was 1.681 for the simple linear regression involving only the TEMP variable. In this case, the so-called total coefficient for the simple linear regression model includes the indirect effect of other variables, while in the multiple regression model, the coefficient measures only the effect of TEMP by holding constant the effects of other variables.

For the second step we compute the partitioning of the sums of squares. The residual sum of squares is

$$\text{SSE} = \sum y^2 - \hat{B}'X'Y = 9097 - [(-52.994)(-157) + (0.9130)(1623) + (0.1425)(-9662) + (2.5160)(-402.8) + (0.09221)(-3730)],$$

which is available in the computer output as the last element of the inverse portion and is 2029.70. The estimated variance is $\text{MSE} = 2029.70/(36 - 5) = 65.474$, and the estimated standard deviation is 8.092. This value is somewhat smaller than the 9.96 obtained for the simple linear regression involving only TEMP.

The model sum of squares is

$$\text{SSR}(\text{regression model}) = \hat{B}'X'Y - \left(\sum y\right)^2/n = 7067.30 - 684.69 = 6382.61.$$

The degrees of freedom for this sum of squares is 4; hence the model mean square is $6382.61/4 = 1595.65$. The resulting F statistic is $1595.65/65.474 = 24.371$, which clearly leads to the rejection of the null hypothesis of no regression. These results are summarized in an analysis of variance table shown in Table 8.7 in Section 8.5.
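The partitioning of the sums of squares can be checked directly from three numbers in Table 8.5: the sum of the TIME values (−157), the sum of their squares (9097), and the SSE (2029.6969929). The following is a minimal sketch under those inputs; the function name `anova_partition` and the dictionary keys are hypothetical.

```python
from math import sqrt


def anova_partition(n_obs, n_params, sum_y, sum_y_sq, sse):
    """Return model/error sums of squares, mean squares, and the F statistic."""
    correction = sum_y ** 2 / n_obs          # (sum of y)^2 / n
    ssr = (sum_y_sq - sse) - correction      # B'X'Y minus the correction term
    df_model = n_params - 1                  # slopes only, intercept excluded
    df_error = n_obs - n_params
    msr = ssr / df_model
    mse = sse / df_error
    return {
        "SSR": ssr,
        "MSR": msr,
        "MSE": mse,
        "root_MSE": sqrt(mse),
        "F": msr / mse,
    }


# 36 usable observations; 5 parameters (intercept, TEMP, HUM, LIGHT, CLOUD)
result = anova_partition(n_obs=36, n_params=5, sum_y=-157.0,
                         sum_y_sq=9097.0, sse=2029.6969929)
for key, value in result.items():
    print(f"{key}: {value:.3f}")
```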

In the final step we use the standard errors and t statistics for inferences on the coefficients. For the TEMP coefficient, the estimated variance of the estimated coefficient is

$$\widehat{\operatorname{var}}(\hat{\beta}_{\text{TEMP}}) = c_{\text{TEMP},\text{TEMP}}\,\text{MSE} = (0.001069)(65.474) = 0.0700,$$

which results in an estimated standard error of 0.2646. The t statistic for the null hypothesis that this coefficient is zero is

$$t = \frac{0.9130}{0.2646} = 3.451.$$

Assuming a desired significance level of 0.05, the hypothesis of no temperature effect is clearly rejected. Similarly, the t statistics for HUM, LIGHT, and CLOUD are 1.253, 3.349, and 2.099, respectively. When compared with the tabulated two-tailed 0.05 value for the t distribution with 31 degrees of freedom of 2.040, the coefficient for HUM is not significant, while LIGHT and CLOUD are. The p values are shown later in Table 8.7, which presents computer output for this problem. Basically this means that departure times appear to be affected by increasing levels of temperature, light, and cloud cover, but there is insufficient evidence to state that adding humidity to this list would improve the prediction of departure times.
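All four t statistics follow from the estimated coefficients and the matching diagonal elements of the inverse in Table 8.5. The sketch below is an illustration under those tabulated values; the names `t_statistic` and `T_CRIT` are hypothetical, and the critical value 2.040 is the one quoted in the text.

```python
from math import sqrt

MSE = 65.474    # estimated variance from the partitioning step
T_CRIT = 2.040  # two-tailed 0.05 value for 31 degrees of freedom, as quoted

# variable: (estimated coefficient, diagonal element c_jj of the inverse)
estimates = {
    "TEMP": (0.9129810924, 0.0010691752),
    "HUM": (0.1425316971, 0.0001977643),
    "LIGHT": (2.5160019069, 0.0086195605),
    "CLOUD": (0.0922051991, 0.0000294652),
}


def t_statistic(coefficient, c_jj, mse=MSE):
    """t statistic for H0: beta_j = 0, i.e. estimate divided by its standard error."""
    return coefficient / sqrt(c_jj * mse)


for name, (coefficient, c_jj) in estimates.items():
    t = t_statistic(coefficient, c_jj)
    print(f"{name}: t = {t:.3f}, significant at 0.05: {abs(t) > T_CRIT}")
```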

We have presented the calculations in detail so the reader can see that the answers are not "magic" but are in fact the result of the normal equations and their solutions. Fortunately, statistical software performs these calculations for us, as shown in Section 8.5.

URL:

https://www.sciencedirect.com/science/article/pii/B9780128230435000084

Nonparametric Statistics

Kandethody M. Ramachandran, Chris P. Tsokos, in Mathematical Statistics with Applications in R (Third Edition), 2021

12.4.1 Median test

Let $m_1$ and $m_2$ be the medians of two populations 1 and 2, respectively, both with continuous distributions. Assume that we have a random sample of size $n_1$ from population 1 and a random sample of size $n_2$ from population 2. The median test can be summarized as follows.

Hypothesis-Testing Procedure Using Median Test

We test

$$H_0: m_1 = m_2 \quad \text{versus} \quad H_a: \begin{cases} m_1 > m_2, & \text{upper tailed test} \\ m_1 < m_2, & \text{lower tailed test} \\ m_1 \neq m_2, & \text{two-tailed test.} \end{cases}$$

1.

Combine the two samples into a single sample of size $n = n_1 + n_2$, keeping track of each observation's original population. Arrange the $n_1 + n_2$ observations in increasing order and find the median of this combined sample. If the median is one of the sample values, discard those observations and adjust the sample size accordingly.

2.

Define $N_{1b}$ to be the number of observations in the sample from population 1 that fall below the combined median.

3.

Decision: If $H_0$ is true, then we would expect $N_{1b}$ to be equal to some number around $n_1/2$. For $H_a: m_1 > m_2$, the rejection region is $N_{1b} \le c$, where $P(N_{1b} \le c) = \alpha$; for $H_a: m_1 < m_2$, the rejection region is $N_{1b} \ge c$, where $P(N_{1b} \ge c) = \alpha$; and for $H_a: m_1 \neq m_2$, the rejection region is $N_{1b} \ge c_1$ or $N_{1b} \le c_2$, where

$$P(N_{1b} \ge c_1) = \frac{\alpha}{2} \quad \text{and} \quad P(N_{1b} \le c_2) = \frac{\alpha}{2}.$$

Assumptions: (1) Population distribution is continuous. (2) Samples are independent.

Note that since some observations can be equal to the overall median, and those values will be discarded, $N_{1b}$ need not be equal to $n_1$. Let $n_1 + n_2 = 2k$. Under $H_0$, $N_{1b}$ has a hypergeometric distribution given by

$$P(N_{1b} = n_{1b}) = \frac{\binom{n_1}{n_{1b}} \binom{n_2}{k - n_{1b}}}{\binom{n_1 + n_2}{k}}, \quad n_{1b} = 0, 1, 2, \ldots, n_1,$$

with the assumption that $\binom{i}{j} = 0$ if $j > i$. Note that the hypergeometric distribution is a discrete distribution that describes the number of "successes" in a sequence of n draws from a finite population without replacement. Thus, we can find the values of $c$, $c_1$, and $c_2$ required earlier. This calculation can be tedious. To overcome this, we can use the following large sample approximation valid for $n_1 > 5$ and $n_2 > 5$. First classify each observation as above or below the sample median as shown in Table 12.4.
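The exact hypergeometric null distribution is easy to tabulate by machine. The following is a minimal sketch using the standard-library `math.comb`; the function names are hypothetical, and a negative lower index is treated as giving a zero term, which goes slightly beyond the convention stated in the text.

```python
from math import comb


def median_test_pmf(n1, n2):
    """Exact null distribution of N_1b, assuming n1 + n2 = 2k is even."""
    total = n1 + n2
    if total % 2 != 0:
        raise ValueError("n1 + n2 must be even (discard values equal to the median)")
    k = total // 2
    denominator = comb(total, k)
    pmf = {}
    for n1b in range(n1 + 1):
        rest = k - n1b  # observations below the median drawn from sample 2
        # comb returns 0 when rest > n2; a negative rest is impossible, so use 0
        ways = comb(n1, n1b) * comb(n2, rest) if rest >= 0 else 0
        pmf[n1b] = ways / denominator
    return pmf


def lower_critical_value(pmf, alpha):
    """Largest c with P(N_1b <= c) <= alpha, or None if no such c exists."""
    cumulative, c = 0.0, None
    for value in sorted(pmf):
        cumulative += pmf[value]
        if cumulative > alpha:
            break
        c = value
    return c


pmf = median_test_pmf(14, 12)
print(f"total probability: {sum(pmf.values()):.6f}")
print(f"lower critical value at alpha = 0.05: {lower_critical_value(pmf, 0.05)}")
```

Because the distribution is discrete, the attained level is generally below the nominal α rather than equal to it.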

Table 12.4. Data Classification With Respect to Median.

Below Above Totals
Sample 1 $N_{1b}$ $N_{1a}$ $n_1$
Sample 2 $N_{2b}$ $N_{2a}$ $n_2$
Total $N_b$ $N_a$ $n_1 + n_2 = n$

It can be verified that the expected value and variance of $N_{1a}$ (similarly for $N_{1b}$) are given by

$$E(N_{1a}) = \frac{N_a n_1}{n}, \quad \text{and} \quad \operatorname{Var}(N_{1a}) = \frac{N_a n_1 n_2 N_b}{n^2 (n - 1)}.$$

Thus, for a large sample we can write

$$z = \frac{N_{1a} - E(N_{1a})}{\sqrt{\operatorname{Var}(N_{1a})}} \sim N(0, 1).$$

Hence, we can follow the usual large sample rejection region procedure, which is summarized next.

Summary of large sample median sum test ($n_1 > 5$ and $n_2 > 5$)

We test

$$H_0: m_1 = m_2 \quad \text{versus} \quad H_a: \begin{cases} m_1 > m_2, & \text{upper tailed test} \\ m_1 < m_2, & \text{lower tailed test} \\ m_1 \neq m_2, & \text{two-tailed test.} \end{cases}$$

The test statistic:

$$z = \frac{N_{1a} - E(N_{1a})}{\sqrt{\operatorname{Var}(N_{1a})}},$$

where

$$E(N_{1a}) = \frac{N_a n_1}{n}$$

and

$$\operatorname{Var}(N_{1a}) = \frac{N_a n_1 n_2 N_b}{n^2 (n - 1)}.$$

Rejection region:

$$\begin{cases} z > z_\alpha, & \text{upper tail RR} \\ z < -z_\alpha, & \text{lower tail RR} \\ |z| > z_{\alpha/2}, & \text{two tail RR} \end{cases}$$

Decision: Reject $H_0$ if the test statistic falls in the RR, and conclude that $H_a$ is true with $(1 - \alpha)100\%$ confidence. Otherwise, do not reject $H_0$, because there is not enough evidence to conclude that $H_a$ is true for a given α and more data are needed.

Assumptions: (1) Population distributions are continuous. (2) $n_1 > 5$ and $n_2 > 5$.

We illustrate this process with the following example.

Example 12.4.1

Given below are the mileages (in thousands of miles) of two samples of automobile tires of two different brands, say I and II, before they wear out.

Tire I: 34 32 37 35 42 43 47 58 59 62 69 71 78 84
Tire II: 39 48 54 65 70 76 87 90 111 118 126 127

Use the median test to see whether tire II gives more median mileage than tire I. Use α = 0.05.

Solution

We will test

$$H_0: m_1 = m_2 \quad \text{versus} \quad H_a: m_1 < m_2.$$

Because the sample size assumption is satisfied, we will use the large sample normal approximation. The results of steps 1 and 2, using the notation A for above the median and B for below the median, are given in Table 12.5.

Table 12.5. Mileage Data Classification.

Sample values Population Above/below the median
32 I B
34 I B
35 I B
37 I B
39 II B
42 I B
43 I B
47 I B
48 II B
54 II B
58 I B
59 I B
62 I B
65 II A
69 I A
70 II A
71 I A
76 II A
78 I A
84 I A
87 II A
90 II A
111 II A
118 II A
126 II A
127 II A

The median is 63.5. Thus, we obtain Table 12.6.

Table 12.6. Summary of Mileage Data for Automobile Tires.

Below Above Totals
Sample 1 $N_{1b} = 10$ $N_{1a} = 4$ $n_1 = 14$
Sample 2 $N_{2b} = 3$ $N_{2a} = 9$ $n_2 = 12$
Total $N_b = 13$ $N_a = 13$ $n_1 + n_2 = n = 26$

Also,

$$E(N_{1a}) = \frac{N_a n_1}{n} = \frac{(13)(14)}{26} = 7,$$

and

$$\operatorname{Var}(N_{1a}) = \frac{N_a n_1 n_2 N_b}{n^2 (n - 1)} = \frac{(13)(13)(14)(12)}{16{,}900} = 1.68.$$

Hence, the test statistic is

$$z = \frac{N_{1a} - E(N_{1a})}{\sqrt{\operatorname{Var}(N_{1a})}} = \frac{4 - 7}{\sqrt{1.68}} = -2.31.$$

For α = 0.05, $z_{0.05} = 1.645$. Hence, the rejection region is {z < −1.645}. Because the observed value of z does fall in the rejection region, we reject $H_0$ and conclude that there is enough evidence that the median mileage of tire II is greater than that of tire I.
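The large-sample calculation for the tire data can be reproduced as follows. This is a sketch under the formulas given above, using only the Python standard library; the function name `median_test_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, median


def median_test_z(sample1, sample2):
    """Return (combined_median, z) for the large-sample median test."""
    m = median(sample1 + sample2)
    # Discard observations equal to the combined median, as in step 1
    s1 = [x for x in sample1 if x != m]
    s2 = [x for x in sample2 if x != m]
    n1, n2 = len(s1), len(s2)
    n = n1 + n2
    n1a = sum(x > m for x in s1)        # sample 1 values above the median
    na = n1a + sum(x > m for x in s2)   # all values above the median
    nb = n - na                         # all values below the median
    expected = na * n1 / n
    variance = na * n1 * n2 * nb / (n ** 2 * (n - 1))
    return m, (n1a - expected) / sqrt(variance)


tire_1 = [34, 32, 37, 35, 42, 43, 47, 58, 59, 62, 69, 71, 78, 84]
tire_2 = [39, 48, 54, 65, 70, 76, 87, 90, 111, 118, 126, 127]

m, z = median_test_z(tire_1, tire_2)
critical = NormalDist().inv_cdf(0.05)  # lower-tail critical value, about -1.645
print(f"median = {m}, z = {z:.2f}, reject H0: {z < critical}")
```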

URL:

https://www.sciencedirect.com/science/article/pii/B9780128178157000129

Inferential Statistics II: Parametric Hypothesis Testing

Andrew P. King, Robert J. Eckersley, in Statistics for Biomedical Engineers and Scientists, 2019

5.8 1-tailed vs. 2-tailed Tests

All of the hypothesis test examples that we have seen so far have tested for any difference between the samples (or the sample and an expected value). We chose not to investigate which of the two was greater than the other. Sometimes we may be interested in this, and hypothesis tests can be applied in two different ways to reflect this need.

Hypothesis tests can be either 1-tailed or 2-tailed. Put simply, if we are interested in any difference between our two samples (or our one sample and an expected value), then we use a 2-tailed test. If we are interested in determining if a particular sample is either only greater than or only less than the other, then we use a 1-tailed test. Fig. 5.9 illustrates why these names are used. The curves represent a t-distribution for 10 degrees of freedom, and the shaded area in both cases corresponds to 5% of the total area under the curve. The left-hand plot shows the 2-tailed case, and the right-hand plot shows the 1-tailed case. The critical t-value for 10 degrees of freedom and α = 0.05 is shown as 2.228 for a 2-tailed t-test (we can look this value up in Table A.1). Therefore any computed t-value inside the shaded area will result in rejection of the null hypothesis. The right-hand plot shows how the critical t-value changes when we are only interested in one tail. We still need to have 5% of the total area outside of the critical t-value, so the critical value must be smaller in magnitude. For this reason, it is easier to show statistical significance when using a 1-tailed test than a 2-tailed test. The critical t-values given in Table A.1 are for 2-tailed t-tests. To use these same values for a 1-tailed test, we should double the significance level (e.g. if we want a significance level of 0.05, then we look up the critical value from the 0.1 column).

Figure 5.9. An illustration of 1-tailed and 2-tailed t-tests. The curve shown in both figures is a t-distribution for 10 degrees of freedom. (A) A 2-tailed test: the shaded area corresponds to the range of calculated t-values that would result in rejection of the null hypothesis. In this case, we are interested in finding any difference between our two samples. (B) A 1-tailed test, in which we are only interested in finding if a particular one of our samples is greater than the other. Note that the total shaded area is the same in both cases and corresponds to 5% of the total area (i.e. 95% confidence). However, for a 1-tailed t-test, a lower critical t-value results, making it easier to show significance.

To illustrate the application of a 1-tailed test, we return to Professor A's original one-sample Student's t-test from Section 5.5. Recall that the absolute t-value was computed as 2.053, and we could not reject the null hypothesis because this was not greater than the (2-tailed) critical t-value of 2.776. Looking again at Table A.1, we find the 1-tailed critical t-value under the column for 0.1 significance level (this is double the actual significance level of 0.05). This is equal to 2.132, so in this example, it does not change the result of the test.
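The "double the significance level" rule for reusing a 2-tailed table can be written as a small lookup. This is only a sketch: the dictionary holds the two critical values quoted in the text for Professor A's test, keyed by the 2-tailed column heading, and the names `TABLE_ROW` and `critical_t` are hypothetical.

```python
# Two-tailed critical t-values quoted in the text for Professor A's test,
# keyed by the significance level printed at the head of each table column.
TABLE_ROW = {0.05: 2.776, 0.10: 2.132}


def critical_t(alpha, tails, table_row=TABLE_ROW):
    """Look up a critical t-value in a table that lists 2-tailed values."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    column = alpha if tails == 2 else 2 * alpha  # double alpha for a 1-tailed test
    return table_row[column]


t_value = 2.053  # absolute t-value computed in the text
for tails in (2, 1):
    crit = critical_t(0.05, tails)
    print(f"{tails}-tailed: critical t = {crit}, reject H0: {t_value > crit}")
```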

URL:

https://www.sciencedirect.com/science/article/pii/B9780081029398000141

Hypothesis Testing II

B.R. Martin , in Statistics for Physical Science, 2012

11.3.1 Sign Exam

In Section 10.five we posed the question of whether it is possible to test hypotheses about the average of a population when its distribution is unknown. One simple test that can do this is the sign test, and as an example of its use nosotros volition test the naught hypothesis H 0 : μ = μ 0 confronting some culling, such every bit H a : μ = μ a or H a : μ > μ 0 using a random sample of size n in the example where n is small, so that the sampling distribution may not be normal. In general if we make no assumption about the class of the population distribution, and so in the sign examination and those that follow, μ refers to the median, simply if we know that the population distribution is symmetric, so μ is the arithmetic mean. For simplicity the note μ will be used for both cases.

We first by assigning a plus sign to all data values that exceed μ 0 and a minus sign to all those that are less than μ 0 . We would expect the plus and minus signs to be approximately equal and any deviation would lead to rejection of the null hypothesis at some significance level. In principle, considering we are dealing with a continuous distribution, no ascertainment can in principle exist exactly equal to μ 0 , but in practice approximate equality volition occur depending on the precision with which the measurements are fabricated. In these cases the points of 'practical equality' are removed from the data set and the value of north reduced accordingly. The test statistic X is the number of plus signs in the sample (or as we could use the number of minus signs). If H 0 is true, the probabilities of obtaining a plus or minus sign are equal to ½ and then 10 has a binomial distribution with p = p 0 = 1 / ii . Significance levels can thus be obtained from the binomial distribution for one-sided and two-sided tests at whatsoever given level α .

For example, if the alternative hypothesis is $H_a: \mu > \mu_0$, then the largest critical region of size not exceeding $\alpha$ is obtained from the inequality $x \ge k_\alpha$, where

$$\sum_{x=k_\alpha}^{n} B(x : n, p_0) \le \alpha, \qquad (11.22a)$$

and $B$ is the binomial probability with $p_0 = p = ½$ if $H_0$ is true. Similarly, if $H_a: \mu < \mu_0$, we form the inequality $x \le k'_\alpha$, where $k'_\alpha$ is defined by

$$\sum_{x=0}^{k'_\alpha} B(x : n, p_0) \le \alpha. \qquad (11.22b)$$

Finally, if $H_a: \mu \ne \mu_0$, i.e., we have a two-tailed test, then the largest critical region is defined by

$$x \ge k_{\alpha/2} \quad \text{and} \quad x \le k'_{\alpha/2}. \qquad (11.22c)$$

For sample sizes greater than about 10, the normal approximation to the binomial may be used with mean $\mu = np$ and $\sigma^2 = np(1-p)$.

Example 11.7

A mobile phone battery needs to be regularly recharged even if no calls are made. Over 12 periods when charging was required, it was found that the intervals in hours between chargings were:

50 35 45 65 39 38 47 52 43 37 44 40

Use the sign test to test, at a 10% significance level, the hypothesis that the battery needs recharging on average every 45 hours.

We are testing the null hypothesis $H_0: \mu_0 = 45$ against the alternative $H_a: \mu_0 \ne 45$. First we remove the data point with value 45, reducing $n$ to 11, and then assign a plus sign to those measurements greater than 45 and a minus sign to those less than 45. This gives $x = 4$ as the number of plus signs. As this is a two-tailed test, we need to find the values of $k'_{0.05}$ and $k_{0.05}$ for $n = 11$. From Table C.2, these are $k'_{0.05} = 3$ and $k_{0.05} = 9$. Since $x = 4$ lies in the acceptance region, we accept the null hypothesis at this significance level.
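The counting in Example 11.7 can be checked with a short Python sketch. It does not reproduce the lookup in Table C.2; instead it assumes a common alternative convention, an exact two-sided p-value obtained by doubling the smaller binomial tail. The function name `sign_test_two_sided` is hypothetical and introduced only for illustration.

```python
from math import comb

def sign_test_two_sided(data, mu0):
    """Sign test of H0: mu = mu0 (assumed convention: p = 2 * smaller tail)."""
    # Drop points of 'practical equality' and reduce n accordingly.
    kept = [v for v in data if v != mu0]
    n = len(kept)
    plus = sum(v > mu0 for v in kept)          # test statistic X
    tail = min(plus, n - plus)
    # Under H0, X ~ Binomial(n, 1/2): P(X <= tail) = sum C(n, x) / 2^n.
    lower = sum(comb(n, x) for x in range(tail + 1)) / 2 ** n
    return n, plus, min(1.0, 2 * lower)

# Charging intervals (hours) from Example 11.7.
intervals = [50, 35, 45, 65, 39, 38, 47, 52, 43, 37, 44, 40]
n, plus, p_value = sign_test_two_sided(intervals, 45)
print(n, plus, round(p_value, 3))  # 11 4 0.549
```

A p-value of about 0.55 is well above 0.10, which agrees with the decision in the example to accept the null hypothesis.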

The sign test can be extended in a straightforward way to two-sample cases, for example, to test the hypothesis that $\mu_1 = \mu_2$ using samples of size $n$ drawn from two non-normal distributions. In this case the difference $d_i$ ($i = 1, 2, \ldots, n$) of each pair of observations is replaced by a plus or minus sign depending on whether $d_i$ is greater than or less than zero, respectively. If the null hypothesis is $\mu_1 - \mu_2 = d$ rather than $\mu_1 - \mu_2 = 0$, then the procedure is the same, but the quantity $d$ is subtracted from each $d_i$ before the test is made.

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780123877604000111

Principles of Inference

Donna L. Mohr , ... Rudolf J. Freund , in Statistical Methods (Fourth Edition), 2022

Concept Questions

This section consists of some true/false questions regarding concepts of statistical inference. Indicate whether a statement is true or false and, if false, indicate what is required to make the statement true.

1.

______ In a hypothesis test, the p value is 0.043. This means that the null hypothesis would be rejected at α = 0.05.

2.

______ If the null hypothesis is rejected by a one-tailed hypothesis test, then it will also be rejected by a two-tailed test.

3.

______ If a null hypothesis is rejected at the 0.01 level of significance, it will also be rejected at the 0.05 level of significance.

4.

______ If the test statistic falls in the rejection region, the null hypothesis has been proven to be true.

5.

______ The chance of a type II error is directly controlled in a hypothesis test by establishing a specific significance level.

6.

______ If the null hypothesis is true, increasing only the sample size will increase the probability of rejecting the null hypothesis.

7.

______ If the null hypothesis is false, increasing the level of significance ( α ) for a specified sample size will increase the probability of rejecting the null hypothesis.

8.

______ If we decrease the confidence coefficient for a fixed n , we decrease the width of the confidence interval.

9.

______ If a 95% confidence interval on μ was from 50.5 to 60.6, we would reject the null hypothesis that μ = 60 at the 0.05 level of significance.

10.

______ If the sample size is increased and the level of confidence is decreased, the width of the confidence interval will increase.

11.

______ A research article reports that a 95% confidence interval for mean reaction time is from 0.25 to 0.29 seconds. About 95% of individuals will have reaction times in this interval.

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B9780128230435000035

Hypothesis testing

Kandethody M. Ramachandran , Chris P. Tsokos , in Mathematical Statistics with Applications in R (Third Edition), 2021

6.5.1.1 Equal variances

Given next is the procedure we follow to compare the true means from two independent normal populations when $n_1$ and $n_2$ are small ($n_1 < 30$ or $n_2 < 30$) and we can assume homogeneity in the population variances, that is, $\sigma_1^2 = \sigma_2^2$. In this case, we pool the sample variances to obtain a point estimate of the common variance.

Comparison of two population means, small sample case (pooled t-test)

We want to test:

$$H_0: \mu_1 - \mu_2 = D_0$$

versus

$$H_a: \begin{cases} \mu_1 - \mu_2 > D_0, & \text{upper-tailed test} \\ \mu_1 - \mu_2 < D_0, & \text{lower-tailed test} \\ \mu_1 - \mu_2 \ne D_0, & \text{two-tailed test.} \end{cases}$$

The TS is:

$$T = \frac{\bar{X}_1 - \bar{X}_2 - D_0}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}.$$

Here the pooled sample variance is:

$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}.$$

Then the RR is:

$$RR: \begin{cases} t > t_\alpha, & \text{upper-tailed test} \\ t < -t_\alpha, & \text{lower-tailed test} \\ |t| > t_{\alpha/2}, & \text{two-tailed test} \end{cases}$$

where $t$ is the observed TS and $t_\alpha$ is based on $(n_1 + n_2 - 2)$ degrees of freedom, and such that $P(T > t_\alpha) = \alpha$.

Decision: Reject $H_0$ if the TS falls in the RR, and conclude that $H_a$ is true with $(1 - \alpha)100\%$ confidence. Otherwise, do not reject $H_0$, because there is not enough evidence to conclude that $H_a$ is true for a given $\alpha$.

Assumptions: The samples are independent and come from normal populations with means $\mu_1$ and $\mu_2$, and with (unknown) equal variances, that is, $\sigma_1^2 = \sigma_2^2$.

Read full chapter
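As a minimal sketch of the procedure above, the following Python function computes the pooled variance, the test statistic, and the degrees of freedom using only the standard library. The function name `pooled_t` and the sample values are hypothetical, chosen for illustration; the critical value $t_\alpha$ is not computed here and would still be read from a t table with $n_1 + n_2 - 2$ degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x1, x2, d0=0.0):
    """Return (t, degrees of freedom, pooled variance) for H0: mu1 - mu2 = d0.

    Assumes independent samples from normal populations with equal variances.
    """
    n1, n2 = len(x1), len(x2)
    # statistics.variance is the sample variance (divisor n - 1).
    sp2 = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    t = (mean(x1) - mean(x2) - d0) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2, sp2

# Hypothetical measurements, for illustration only.
sample_1 = [12.1, 11.8, 12.6, 12.3, 11.9]
sample_2 = [11.2, 11.7, 11.5, 12.0]
t_obs, df, sp2 = pooled_t(sample_1, sample_2)
print(f"t = {t_obs:.3f} on {df} degrees of freedom, pooled variance {sp2:.4f}")
```

For a two-tailed test, the observed `t_obs` would then be compared in absolute value with $t_{\alpha/2}$ for `df` degrees of freedom.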

URL:

https://www.sciencedirect.com/science/article/pii/B9780128178157000063

Nonparametric Methods

Donna L. Mohr , ... Rudolf J. Freund , in Statistical Methods (Fourth Edition), 2022

14.3 Two Independent Samples

The Mann–Whitney test (also called the Wilcoxon rank sum or Wilcoxon two-sample test) is a rank-based nonparametric test for comparing the location of two populations using independent samples. Note that this test does not specify an inference to any particular parameter of location. Using independent samples of sizes $n_1$ and $n_2$, respectively, the test is conducted as follows:

1.

Rank all $(n_1 + n_2)$ observations as if they came from one sample, adjusting for ties.

2.

Compute T , the sum of ranks for the smaller sample.

3.

Compute $T' = \dfrac{(n_1 + n_2)(n_1 + n_2 + 1)}{2} - T$, the sum of ranks for the larger sample. This is necessary to assure a two-tailed test.

4.

For small samples ($n_1 + n_2 \le 30$), compare the smaller of $T$ and $T'$ with the rejection region consisting of values less than or equal to the critical values given in Appendix Table A.9. If either $T$ or $T'$ falls in the rejection region, we reject the null hypothesis. Note that even though this is a two-tailed test, we only use the lower quantiles of the tabled distribution.

5.

For large samples, the statistic $T$ or $T'$ (whichever is smaller) has an approximately normal distribution with

$$\mu = \frac{n_1 (n_1 + n_2 + 1)}{2} \quad \text{and} \quad \sigma^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}.$$

The sample size $n_1$ should be taken to correspond to whichever value, $T$ or $T'$, has been selected as the test statistic.

These parameter values are used to compute a test statistic having a standard normal distribution. We then reject the null hypothesis if the value of the test statistic is smaller than $-z_{\alpha/2}$. Modifications are available when there are a large number of ties (for example, Conover, 1999).

The procedure for a one-sided alternative hypothesis depends on the direction of the hypothesis. For example, if the alternative hypothesis is that the location of population 1 has a smaller value than that of population 2 (a one-sided hypothesis), then we would sum the ranks from sample 1 and use that sum as the test statistic. We would reject the null hypothesis of equal distributions if this sum is less than the $\alpha/2$ quantile of the table. If the one-sided alternative hypothesis is in the other direction, we would use the sum of ranks from sample 2 with the same rejection criteria.
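The ranking steps and the large-sample approximation can be sketched in Python as follows. The helper names `midranks` and `rank_sum_z` and the data are hypothetical; ties receive their average rank, and no tie correction is applied to the variance, so this is an illustration of the formulas above and not of the modified procedure for heavily tied data. The data set is far too small for the normal approximation to be reliable and is used only to make the arithmetic easy to follow.

```python
from math import sqrt

def midranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_z(smaller, larger):
    """Return (T, T', z), where z uses the smaller of T and T'."""
    n_s, n_l = len(smaller), len(larger)
    ranks = midranks(list(smaller) + list(larger))
    t = sum(ranks[:n_s])                               # rank sum, smaller sample
    t_prime = (n_s + n_l) * (n_s + n_l + 1) / 2 - t    # rank sum, larger sample
    # n_a is the size of the sample whose rank sum was selected.
    stat, n_a, n_b = (t, n_s, n_l) if t <= t_prime else (t_prime, n_l, n_s)
    mu = n_a * (n_a + n_b + 1) / 2
    sigma = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)     # no tie correction
    return t, t_prime, (stat - mu) / sigma

# Hypothetical data with one three-way tie at 2.4.
t, t_prime, z = rank_sum_z([1.1, 2.4, 2.4, 3.0], [2.4, 3.5, 4.1, 5.0, 6.2])
print(t, t_prime, round(z, 3))  # 12.0 33.0 -1.96
```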

Example 14.4

Tasting Scores

Because the taste of food is impossible to quantify, results of tasting experiments are often given in ordinal form, usually expressed as ranks or scores. In this experiment two types of hamburger substitutes were tested for quality of taste. Five sample hamburgers of type A and five of type B were scored from best (1) to worst (10). Although these responses may appear to be ratio variables (and are often analyzed using this definition), they are more appropriately classified as being in the ordinal scale. The results of the taste test are given in Table 14.4. The hypotheses of interest are

$H_0$: the types of hamburgers have the same quality of taste, and $H_1$: they have different quality of taste.

Table 14.4. Hamburger taste test.

Type of Burger   Score
A   1
A   2
A   3
B   4
A   5
A   6
B   7
B   8
B   9
B   10

Solution

Because the responses are ordinal, we use the Mann–Whitney test. Using these data we compute

$$T = 1 + 2 + 3 + 5 + 6 = 17 \quad \text{and} \quad T' = \frac{10(11)}{2} - 17 = 38.$$

Choosing $\alpha = 0.05$ and using Appendix Table A.9, we reject $H_0$ if the smaller of $T$ or $T'$ is less than or equal to 17. The computed value of the test statistic is 17; hence we reject the null hypothesis at $\alpha = 0.05$, and conclude that the two types differ in quality of taste. If we had to choose one or the other, we would choose burger type A based on the fact that it has the smaller rank sum.

Randomization Approach to Example 14.4

Since this data set does not contain any ties, Appendix Table A.9 is accurate. If we wished a p value, we could enumerate all the $10!/(5!\,5!) = 252$ ways the ranks 1 through 10 could be split into two groups of five each. Listing the corresponding pseudo-values of the test statistic (the smaller of $T$ and $T'$) would show that 3.17% of them are at or less than 17. Hence, the exact p value is 0.0317, which agrees with the value from SAS System's PROC NPAR1WAY. Using the normal asymptotic approximation gives $z = -2.193$, with a p value of 0.028, which is surprisingly close given the small sample size.
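The enumeration described above is small enough to run directly. The following Python sketch lists every split of the ranks 1 through 10 into two groups of five, counts the splits whose smaller rank sum is at or below the observed 17, and then evaluates the normal approximation. The variable names are illustrative, and the normal CDF is written with `math.erf` to keep the sketch within the standard library.

```python
from itertools import combinations
from math import erf, sqrt

ranks = range(1, 11)
total = sum(ranks)            # 55
observed = 17                 # smaller of T and T' in Example 14.4

n_splits = 0
n_extreme = 0
for group in combinations(ranks, 5):
    t = sum(group)
    n_splits += 1
    # The test statistic is the smaller of the two rank sums.
    if min(t, total - t) <= observed:
        n_extreme += 1

p_exact = n_extreme / n_splits
print(n_extreme, n_splits, round(p_exact, 4))  # 8 252 0.0317

# Normal approximation with n1 = n2 = 5.
mu = 5 * (5 + 5 + 1) / 2
sigma = sqrt(5 * 5 * (5 + 5 + 1) / 12)
z = (observed - mu) / sigma
p_normal = 2 * 0.5 * (1 + erf(z / sqrt(2)))    # two-sided: 2 * Phi(z), z < 0
print(round(z, 3), round(p_normal, 3))         # -2.193 0.028
```

Four of the eight extreme splits have a rank sum of 17 or less for the first group and four have it for the second, which is why the two-sided proportion is 8/252.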

Read full chapter

URL:

https://www.sciencedirect.com/science/article/pii/B978012823043500014X