Tuesday, May 7, 2013

Sampling Distribution Quiz 2 - No21

In the last post, we talked about how to work out sampling probabilities from the population distribution. This time, we will look at how to estimate a population probability using the normal sampling distribution.
In most cases, it is impossible to know the population mean exactly, so we want to estimate it from the sample we have.

I think a quiz like this might come up in your daily life.

Quiz) You sampled 50 students from the 3,000 students in your community's high schools.
        Their average height is 150 cm, with a sample standard deviation of 20 cm.
        Calculate the probability that the average height of all students is
        between 145 and 155.



As the quiz shows, we don't actually know the population mean. But we do know that the sampling distribution of the mean for samples of size 50 is approximately normal, and that its mean equals the population mean.
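A quick simulation can make this concrete. The sketch below invents a hypothetical population of 3,000 heights (the mean of 152 and standard deviation of 20 are assumptions chosen only for illustration, not values from the quiz), draws many samples of size 50, and compares the mean of the sample means with the population mean.

```r
set.seed(21)  # make the simulation reproducible

# Hypothetical population of 3,000 heights (illustrative values, not from the quiz)
population <- rnorm(3000, mean = 152, sd = 20)

# Draw 10,000 samples of size 50 (without replacement) and record each sample mean
sample_means <- replicate(10000, mean(sample(population, size = 50)))

pop_mean <- mean(population)
mean_of_means <- mean(sample_means)

# The two values should be very close to each other
print(c(population = pop_mean, sampling_distribution = mean_of_means))

# Uncomment to see the roughly bell-shaped sampling distribution:
# hist(sample_means)
```

The individual sample means vary from draw to draw, but their average sits almost exactly on the population mean, which is the property the rest of the quiz relies on.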




Back to our question: we want the probability that the average height of all students is within 5 of the sample mean. In other words, the probability that the sample mean is within 5 of the population mean.
Because the population mean equals the mean of the sampling distribution,
our question can be rewritten like this:
The probability that the sample mean is within 5 of the mean of the sampling distribution.
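Written as an equation, the restatement looks like this. The symbol \mu for the population mean is introduced here for convenience and is not part of the original quiz; \bar{x} is the sample mean, 150.

```latex
P(145 \le \mu \le 155)
  = P(\bar{x} - 5 \le \mu \le \bar{x} + 5)
  = P(|\bar{x} - \mu| \le 5), \qquad \bar{x} = 150
```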


To find the z-score, we need the standard deviation of the sampling distribution, which is given by the formula below.

SD_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

However, we run into a big problem: we do not know the population standard deviation. So what are we going to do?
The best option is to use an estimate of the population standard deviation.
In this case, the sample standard deviation (20) is our best estimate of the population standard deviation.


SD_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \approx \frac{20}{\sqrt{50}} \approx 2.83
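The same arithmetic in R, as a small sketch. The variable names `s`, `n`, and `se` are chosen here for illustration.

```r
s <- 20   # sample standard deviation, used as the estimate of sigma
n <- 50   # sample size

# Standard deviation of the sampling distribution of the mean
se <- s / sqrt(n)
print(se)
```

The result is roughly 2.83, so a typical sample mean from samples of 50 students lands within a few units of the population mean.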

Once you have the standard deviation of the sampling distribution, you can also get the z-score.

As you know, a z-score tells you how many standard deviations a value is away from the mean. In this case,
> (0+5)/(20/sqrt(50))
[1] 1.767767
> (0-5)/(20/sqrt(50))
[1] -1.767767
The normal distribution is symmetric, so the two z-scores have the same magnitude but opposite signs.
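The z-scores can also be computed on the original height scale instead of as offsets from 0. This is a minimal sketch: `z_score` is a hypothetical helper written for this post, not a built-in R function.

```r
# Hypothetical helper: distance from the center, measured in standard errors
z_score <- function(x, center, se) (x - center) / se

se <- 20 / sqrt(50)  # standard deviation of the sampling distribution

# Bounds 145 and 155 around the sample mean of 150
z <- z_score(c(145, 155), center = 150, se = se)
print(z)
```

Both bounds sit about 1.77 standard deviations from the center, one below and one above.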

Back to our question again, it can be rewritten like this:
The probability that the sample mean is within about 1.77 standard deviations of the mean of the sampling distribution.

Because the area we are interested in lies within about 1.77 standard deviations on either side of the mean of the sampling distribution, we have to be careful when calculating it.

1) Get the cumulative probability up to about 1.77 standard deviations above the mean.
> pnorm(1.7675, 0,1)
[1] 0.9614277
2) Subtract 50% (the area below the mean) to get half of the area we are interested in.
> pnorm(1.7675, 0,1) - 0.5
[1] 0.4614277
3) The normal distribution is symmetric, so you can get the whole area by multiplying by 2.
> (pnorm(1.7675, 0,1) - 0.5 )  * 2
[1] 0.9228555
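As a cross-check, the same area can be obtained in one step by subtracting the lower tail from the cumulative probability. The sketch below uses the unrounded z-score, so its result may differ from the steps above in the later decimal places; the variable names are chosen here for illustration.

```r
z <- 5 / (20 / sqrt(50))  # unrounded z-score, about 1.767767

# Area between -z and +z under the standard normal curve
p_direct <- pnorm(z) - pnorm(-z)

# Same area via the three steps: (cumulative area - 0.5) * 2
p_steps <- (pnorm(z) - 0.5) * 2

print(c(direct = p_direct, steps = p_steps))
```

Both approaches give about 0.923, because `pnorm(-z)` equals `1 - pnorm(z)` for a symmetric distribution.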


As a result, the probability that the average height of all students is
between 145 and 155 is about 92.3%.

















