ADMN 210

Review for Midterm #1

As mentioned earlier, the mid-term will have conceptual and quantitative multiple-choice questions. You need to read all 4 chapters and you need to be able to solve problems in all 4 chapters in order to do well in this test.

The following are for review and learning purposes only. I am not indicating that identical or similar problems will be in the test. As I have indicated many times, all the exams in this course will have multiple-choice questions and problems.

Suggestion: treat this review set as you would an actual test. Sit down with your one page of notes and your calculator, and give it a try. That way you will know what areas you still need to study.

1) Classify each of the following as nominal, ordinal, interval, or ratio data.

a. The time required to produce each tire on an assembly line

b. The number of quarts of milk a family drinks in a month

c. The ranking of four machines in your plant after they have been designated as excellent, good, satisfactory, and poor

d. The telephone area code of clients in the United States

e. The age of each of your employees

f. The dollar sales at the local pizza house each month

g. An employee’s identification number

h. The response time of an emergency unit

2) True or False: The highest level of data measurement is the ratio-level measurement.

3) True or False: Interval- and ratio-level data are also referred to as categorical data.

4) A small portion or a subset of the population on which data is collected for conducting statistical analysis is called __________.

5) One of the advantages for taking a sample instead of conducting a census is this:

a sample is more accurate than census

a sample is difficult to take

a sample cannot be trusted

a sample can save money when data collection process is destructive

6) Selection of the winning numbers in a lottery is an example of __________.

convenience sampling

random sampling

nonrandom sampling

regulatory sampling

7) A type of random sampling in which the population is divided into non-overlapping subpopulations is called __________.

stratified random sampling

cluster sampling

systematic random sampling

regulatory sampling

8) A type of random sampling in which every kth item (where k is some number) in the population is selected for inclusion in the sample is called __________.

stratified random sampling

cluster sampling

systematic sampling

regulatory sampling

9) Judgment sampling is an example of __________.

convenience sampling

random sampling

nonrandom (non-probabilistic) sampling

justice department sampling

10) For the following data, construct a frequency distribution with six classes.

57 23 35 18 21

26 51 47 29 21

46 43 29 23 39

50 41 19 36 28

31 42 52 29 18

28 46 33 28 20

11) What type of graph would be most appropriate for the frequency distribution above?

Pie chart

Bar chart

Pareto diagram

Histogram

12) For the following frequency distribution, determine the relative frequency, percent, and the cumulative frequency.

*Round your answer to 3 decimal places, the tolerance is +/-0.001.

Class Interval Frequency

20–under 25 17

25–under 30 20

30–under 35 16

35–under 40 15

40–under 45 8

45–under 50 6

TOTAL 82

13) True or False: Frequency distribution is a summary of data presented in the form of class intervals and frequencies.

14) True or False: The range of a data set is defined as the difference between the mean and the median.

15) True or False: The sum of the relative frequencies of a grouped data set is always equal to one.

16) The U.S. Department of the Interior releases figures on mineral production. Following are the values (in billions of dollars) of the 15 leading states in nonfuel mineral production in the United States in 2008.

1.68, 1.81, 1.85, 1.89, 2.05, 2.05, 2.08, 2.74, 3.21, 3.30, 4.00, 4.17, 4.20, 6.48, 7.84

a. Calculate the mean, median, and mode.

b. Calculate the range, interquartile range, sample variance, and sample standard deviation.

c. Compute the coefficient of skewness for these data and interpret.

17) The following graphic of residential housing data (selling price and size in square feet) indicates:

a correlation close to -1

a correlation close to 0 (no relation between the two variables)

a correlation close to 1

a negative relationship between the two variables

18) The Polk Company reported that the average age of a car on U.S. roads in a recent year was 7.5 years.

a) Suppose the distribution of ages of cars on U.S. roads is approximately bell-shaped. If 99.7% of the ages are between 1 year and 14 years, what is the standard deviation of car age?

b) Suppose the standard deviation is 1.7 years and the mean is 7.5 years. Between what two values would 95% of the car ages fall?

19) A large manufacturing firm tests job applicants who recently graduated from college. The test scores are bell shaped with a mean of 500 and a standard deviation of 50.

a) What proportion of people get scores between 400 and 600?

b) What proportion of people get scores higher than 450?

c) Management is considering placing a new hire in an upper level management position if the person scores in the upper 0.15% of the distribution. What is the lowest score a college graduate can earn to qualify for the position?

20) According to the Bureau of Labor Statistics, the average annual salary of a worker in Detroit, Michigan, is $35,748. Suppose the median annual salary for a worker in this group is $31,369 and the mode is $29,500.

a) Is the distribution of salaries for this group skewed? If so, how and why?

b) Which of these measures of central tendency would you use to describe these data? Why?

21) True or False: The median is the most frequently occurring value in a set of data.

22) True or False: A disadvantage of the mean as the measure of central tendency is that it is affected by extremely large or extremely small values in the data set.

23) True or False: The variance is the average of the squared deviations about the arithmetic mean for a set of numbers.

24) What is the median for the following five numbers? 223, 264, 216, 218, 229

25) The second quartile of a data set is always equal to its ________.

26) The sum of deviations from the mean for a data set is equal to __________.

27) Scores obtained by students in an advanced placement test have a symmetric mound shaped (bell shaped) distribution with a mean of 70 and a standard deviation of 10. What is the proportion of students who received between 60 and 80 points?

28) For the previous problem, what is the proportion of students who received less than 50 points?

29) The following joint probability table contains a breakdown on the age and gender of U.S. physicians in a recent year, as reported by the American Medical Association.

Age of U.S. Physicians

< 35 35 - 44 45 - 54 55 - 64 > 65 TOTAL

Male 0.11 0.20 0.19 0.12 0.16 0.78

Female 0.07 0.08 0.04 0.02 0.01 0.22

TOTAL 0.18 0.28 0.23 0.14 0.17 1.00

a) What is the probability that one randomly selected physician is 35–44 years old?

b) What is the probability that one randomly selected physician is both a woman and 45–54 years old?

c) What is the probability that one randomly selected physician is a man or is 35–44 years old?

d) What is the probability that one randomly selected physician is less than 35 years old or 55–64 years old?

e) What is the probability that one randomly selected physician is a woman if she is 45–54 years old?

f) What is the probability that a randomly selected physician is neither a woman nor 55–64 years old?

30) Purchasing Survey asked purchasing professionals what sales traits impressed them most in a sales representative. Seventy-eight percent selected “thoroughness.” Forty percent responded “knowledge of your own product.” The purchasing professionals were allowed to list more than one trait. Suppose 27% of the purchasing professionals listed both “thoroughness” and “knowledge of your own product” as sales traits that impressed them most. A purchasing professional is randomly sampled.

a) Make a probability table including the above information.

b) What is the probability that the professional selected “thoroughness” or “knowledge of your own product”?

c) What is the probability that the professional selected neither “thoroughness” nor “knowledge of your own product”?

d) If it is known that the professional selected “thoroughness,” what is the probability that the professional selected “knowledge of your own product”?

e) What is the probability that the professional did not select “thoroughness” and did select “knowledge of your own product”?

31) From a previous midterm: The table below contains data from a sample of 200 people regarding opinion about the latest congressional plan to eliminate anti-trust exemptions for professional baseball (broken down by gender).

OPINION ABOUT THE PLAN

For Neutral Against Totals

Female 38 54 12 104

Male 12 36 48 96

Totals 50 90 60 200

Please show your work for parts “a” through “e” or no credit will be given!

a) What is the probability that a person selected at random is for the plan?

b) If we know that the person is a female, what is the probability that the person is for the plan?

c) What is the probability that the person is male and is against the plan?

d) What is the probability that the person is male or is neutral about the plan?

e) Is opinion about the plan related to gender, or are opinion and gender independent? Please use statistical concepts and numerical calculations in your answer.

32) True or False: If two events are independent, the joint probability of the two events is always equal to the product of the marginal probabilities of two events.

33) True or False: If the conditional probability of an event A given another event B is same as the marginal probability of the event A, then events A and B are mutually exclusive.

34) If the occurrence or non-occurrence of one event does not affect the occurrence or non-occurrence of another event, the two events are ________________________.

35) A listing of all elementary outcomes (i.e. the outcomes which cannot be broken down into other events) of an experiment (i.e. a decision making situation under uncertainty) is called a __________.

36) How many different combinations of a 3-member debating team can be formed from a group of 16 qualified students?

ADMN 210

Answers to Review for Midterm #1

1) Classify each of the following as nominal, ordinal, interval, or ratio data.

a. The time required to produce each tire on an assembly line – ratio since it is numeric with a valid 0 point meaning “lack of”

b. The number of quarts of milk a family drinks in a month – ratio since it is numeric with a valid 0 point meaning “lack of”

c. The ranking of four machines in your plant after they have been designated as excellent, good, satisfactory, and poor – ordinal since it is ranking data only

d. The telephone area code of clients in the United States – nominal since it is a label

e. The age of each of your employees – ratio since it is numeric with a valid 0 point meaning “lack of”

f. The dollar sales at the local pizza house each month – ratio since it is numeric with a valid 0 point meaning “lack of”

g. An employee’s identification number – nominal since it is a label

h. The response time of an emergency unit – ratio since it is numeric with a valid 0 point meaning “lack of”

2) True or False: The highest level of data measurement is the ratio-level measurement.

True (you can do the most powerful analysis with this kind of data)

3) True or False: Interval- and ratio-level data are also referred to as categorical data.

False (Interval and ratio level data are numeric and therefore quantitative, NOT qualitative….Nominal is qualitative)

4) A small portion or a subset of the population on which data is collected for conducting statistical analysis is called __________.

A sample! A population is the total group, a census collects data from every member of the population, and a data set can be either a sample or a population.

5) One of the advantages for taking a sample instead of conducting a census is this:

a sample is more accurate than census

a sample is difficult to take

a sample cannot be trusted

a sample can save money when data collection process is destructive (correct answer)

6) Selection of the winning numbers in a lottery is an example of __________.

convenience sampling

random sampling (correct answer)

nonrandom sampling

regulatory sampling

7) A type of random sampling in which the population is divided into non-overlapping subpopulations is called __________.

stratified random sampling (correct answer)

cluster sampling

systematic random sampling

regulatory sampling

8) A type of random sampling in which every kth item (where k is some number) in the population is selected for inclusion in the sample is called __________.

stratified random sampling

cluster sampling

systematic sampling (correct answer)

regulatory sampling

9) Judgment sampling is an example of __________.

convenience sampling

random sampling

nonrandom (non-probabilistic) sampling (correct answer)

justice department sampling

10) For the following data, construct a frequency distribution with six classes.

57 23 35 18 21

26 51 47 29 21

46 43 29 23 39

50 41 19 36 28

31 42 52 29 18

28 46 33 28 20

Class width = (high – low)/6 = (57 – 18)/6 = 6.5. Let’s round up to 7 for convenience. NOTE: each student will have something slightly different!

Class Interval Frequency

18 – under 25 8 (just count up how many observations are 18 through 24)

25 – under 32 8

32 – under 39 3

39 – under 46 4

46 – under 53 6

53 – under 60 1

TOTAL 30
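If you want to check your tallies with software, the counting step can be sketched in Python. Python is only my choice for illustration here, and the helper name `frequency_distribution` is hypothetical. The starting point of 18 and the class width of 7 are the same choices made above, so your own table may differ if you picked different classes.

```python
# Data from problem 10
data = [57, 23, 35, 18, 21, 26, 51, 47, 29, 21,
        46, 43, 29, 23, 39, 50, 41, 19, 36, 28,
        31, 42, 52, 29, 18, 28, 46, 33, 28, 20]

# Assumed choices: start at the minimum (18), width 6.5 rounded up to 7
start, width, n_classes = 18, 7, 6

def frequency_distribution(values, start, width, n_classes):
    """Count the values that fall in each class 'lower - under upper'."""
    table = []
    for i in range(n_classes):
        lower = start + i * width
        upper = lower + width
        count = sum(1 for v in values if lower <= v < upper)
        table.append((lower, upper, count))
    return table

freq_table = frequency_distribution(data, start, width, n_classes)
for lower, upper, count in freq_table:
    print(f"{lower} - under {upper}: {count}")
```

The `lower <= v < upper` test is what "under" means in the class labels: a value of 25 goes in the second class, not the first.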

11) What type of graph would be most appropriate for the frequency distribution above?

Pie chart

Bar chart

Pareto diagram

Histogram (correct answer)

12) For the following frequency distribution, determine the relative frequency, percent, and the cumulative frequency.

*Round your answer to 3 decimal places, the tolerance is +/-0.001.

Class Interval Frequency Relative Frequency Percent Cumulative Frequency

20–under 25 17 17/82 = .207* 20.7% 17

25–under 30 20 20/82 = .244* 24.4% 17 + 20 = 37

30–under 35 16 .195* 19.5% 37 + 16 = 53

35–under 40 15 .183* 18.3% 53 + 15 = 68

40–under 45 8 .098* 9.8% 68 + 8 = 76

45–under 50 6 .073* 7.3% 76 + 6 = 82

TOTAL 82 1.000 100.0%
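The three columns above follow mechanically from the frequencies, so they are easy to double-check. Here is a minimal Python sketch (the variable names are mine, not from the textbook): relative frequency is frequency divided by the total, percent is that times 100, and cumulative frequency is a running total.

```python
from itertools import accumulate

frequencies = [17, 20, 16, 15, 8, 6]   # class frequencies from problem 12
total = sum(frequencies)               # 82

relative = [round(f / total, 3) for f in frequencies]       # frequency / total
percent = [round(100 * f / total, 1) for f in frequencies]  # relative frequency as a percent
cumulative = list(accumulate(frequencies))                  # running total of frequencies

for f, r, p, c in zip(frequencies, relative, percent, cumulative):
    print(f, r, f"{p}%", c)
```

The last cumulative frequency should always equal the total number of observations, which is a quick sanity check on your own work.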

13) True or False: Frequency distribution is a summary of data presented in the form of class intervals and frequencies.

True – that’s the definition of a frequency distribution!

14) True or False: The range of a data set is defined as the difference between the mean and the median.

False – Range is the difference between the highest and lowest numbers in the data!

15) True or False: The sum of the relative frequencies of a grouped data set is always equal to one.

True – don’t forget, relative frequencies are just decimal versions of percentages, and percentages have to add up to 100%.

16) The U.S. Department of the Interior releases figures on mineral production. Following are the values (in billions of dollars) of the 15 leading states in nonfuel mineral production in the United States in 2008.

1.68, 1.81, 1.85, 1.89, 2.05, 2.05, 2.08, 2.74, 3.21, 3.30, 4.00, 4.17, 4.20, 6.48, 7.84

a. Calculate the mean, median, and mode.

Mean = sum of all data/15 = $3.29 billion

Median: located at position 2*(15+1)/4 = 8, so the median is the 8th value = $2.74 billion

Mode: 2.05 since it is the only value that appears more than once

b. Calculate the range, interquartile range, sample variance, and sample standard deviation.

Range = 7.84 – 1.68 = 6.16

Interquartile range = Q3 – Q1.

Q1 is at the following location: (15+1)/4 = 4th location = $1.89 billion

Q3 is at the following location: 3*(15+1)/4 = 12th location = $4.17 billion

So Interquartile range = 4.17 – 1.89 = 2.28

NOTE: make sure you understand what quartiles mean!

Sample variance = 3.3321 (See below)

Sample standard deviation = 1.8254 (see below)

Value ($ billions) X – mean (X – mean) squared

1.68 -1.61 2.5921

1.81 -1.48 2.1904

1.85 -1.44 2.0736

1.89 -1.40 1.9600

2.05 -1.24 1.5376

2.05 -1.24 1.5376

2.08 -1.21 1.4641

2.74 -0.55 0.3025

3.21 -0.08 0.0064

3.30 0.01 0.0001

4.00 0.71 0.5041

4.17 0.88 0.7744

4.20 0.91 0.8281

6.48 3.19 10.1761

7.84 4.55 20.7025

TOTAL 49.35 46.6496

MEAN 3.29 Variance 3.3321 =46.6496/(15-1)

SD 1.8254 =sqrt(3.3321)
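You will not have a computer in the exam, but for checking your hand calculations, the same numbers can be reproduced with Python's standard `statistics` module. This is only a sketch: the quartile lines use the location rule i*(n+1)/4 from this handout, which lands on whole positions here because n = 15. Other software may use a different quartile rule and give slightly different values.

```python
import statistics

# Values in $ billions from problem 16
values = [1.68, 1.81, 1.85, 1.89, 2.05, 2.05, 2.08, 2.74,
          3.21, 3.30, 4.00, 4.17, 4.20, 6.48, 7.84]
ordered = sorted(values)
n = len(ordered)

mean = statistics.mean(ordered)
median = statistics.median(ordered)
mode = statistics.mode(ordered)

data_range = ordered[-1] - ordered[0]

# Quartile locations i*(n+1)/4; subtract 1 because Python lists start at 0
q1 = ordered[(n + 1) // 4 - 1]        # 4th value
q3 = ordered[3 * (n + 1) // 4 - 1]    # 12th value
iqr = q3 - q1

variance = statistics.variance(ordered)   # sample variance, divides by n - 1
std_dev = statistics.stdev(ordered)       # square root of the sample variance

print(round(mean, 2), median, mode)
print(round(data_range, 2), round(iqr, 2), round(variance, 4), round(std_dev, 4))
```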

c. Compute the coefficient of skewness for these data and interpret. [Ignore]

Just use the Data Analysis portion of Excel and interpret. It is 1.48, so there is a right skew of the data (slightly long right hand tail)

17) The following graphic of residential housing data (selling price and size in square feet) indicates:

a correlation close to -1

a correlation close to 0 (no relation between the two variables)

a correlation close to 1

a negative relationship between the two variables

18) The Polk Company reported that the average age of a car on U.S. roads in a recent year was 7.5 years.

a) Suppose the distribution of ages of cars on U.S. roads is approximately bell-shaped. If 99.7% of the ages are between 1 year and 14 years, what is the standard deviation of car age?

We know that 99.7% of the data are within 3 standard deviations of the mean. Both endpoints are 6.5 years from the mean (14 – 7.5 = 6.5 and 7.5 – 1 = 6.5), so 3 standard deviations = 6.5 years and the standard deviation is 6.5/3 = 2.167 years.

b) Suppose the standard deviation is 1.7 years and the mean is 7.5 years. Between what two values would 95% of the car ages fall?

95% of the data falls within 2 standard deviations of the mean.

So 7.5 + 2 * 1.7 = 10.9, and 7.5 – 2 * 1.7 = 4.1.
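Both parts of this problem are the empirical rule run in opposite directions: part a) goes from an interval back to the standard deviation, and part b) goes from the standard deviation out to an interval. A short Python sketch of the arithmetic (variable names are mine):

```python
mean = 7.5

# a) 99.7% of bell-shaped data lies within 3 SDs of the mean,
#    so the span from 1 to 14 years covers 6 standard deviations
low, high = 1, 14
sd_a = (high - low) / 6          # same as (14 - 7.5) / 3

# b) 95% of bell-shaped data lies within 2 SDs of the mean
sd_b = 1.7
interval_95 = (mean - 2 * sd_b, mean + 2 * sd_b)

print(round(sd_a, 3))
print(tuple(round(x, 1) for x in interval_95))
```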

19) A large manufacturing firm tests job applicants who recently graduated from college. The test scores are bell shaped with a mean of 500 and a standard deviation of 50.

a) What proportion of people get scores between 400 and 600?

Points are 2 standard deviations away, so 95%

b) What proportion of people get scores higher than 450?

Point is 1 standard deviation away, so 68/2 + 50 = 84%

c) Management is considering placing a new hire in an upper level management position if the person scores in the upper 0.15% of the distribution. What is the lowest score a college graduate can earn to qualify for the position?

(X – 500)/50 = 3 SDs, so X = 500 + 3 * 50 = 650
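The three answers to problem 19 all come from the same small table of empirical-rule percentages plus symmetry. The sketch below assumes the rounded values 68%, 95%, and 99.7% used in this course (not exact normal-table values), and `upper_tail` is a hypothetical helper name.

```python
# Empirical rule: share of bell-shaped data within k standard deviations of the mean
WITHIN = {1: 0.68, 2: 0.95, 3: 0.997}

def upper_tail(k):
    """Share of the data more than k SDs above the mean (half of what is left over)."""
    return (1 - WITHIN[k]) / 2

mean, sd = 500, 50

# a) 400 and 600 are z = -2 and z = +2
z_400 = (400 - mean) / sd
z_600 = (600 - mean) / sd
p_a = WITHIN[2]

# b) 450 is z = -1; by symmetry the area below -1 equals the area above +1
p_b = 1 - upper_tail(1)          # same as 68/2 + 50 = 84%

# c) the upper 0.15% is the tail beyond +3 SDs
p_top = upper_tail(3)
cutoff = mean + 3 * sd

print(p_a, round(p_b, 2), round(p_top, 4), cutoff)
```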

20) According to the Bureau of Labor Statistics, the average annual salary of a worker in Detroit, Michigan, is $35,748. Suppose the median annual salary for a worker in this group is $31,369 and the mode is $29,500.

a) Is the distribution of salaries for this group skewed? If so, how and why?

Since these three measures are not equal, the distribution is skewed. The distribution is skewed to the right because the mean is greater than the median.

b) Which of these measures of central tendency would you use to describe these data? Why?

Often, the median is preferred in reporting income data because it yields information about the middle of the data while ignoring extremes.

21) True or False: The median is the most frequently occurring value in a set of data.

False – the MODE is the most frequently occurring, not the median

22) True or False: A disadvantage of the mean as the measure of central tendency is that it is affected by extremely large or extremely small values in the data set.

True – that’s why you use the median for data sets with outliers!

23) True or False: The variance is the average of the squared deviations about the arithmetic mean for a set of numbers.

True

24) What is the median for the following five numbers? 223, 264, 216, 218, 229

Put the data in order: 216, 218, 223, 229, 264

The center number is the median = 223

25) The second quartile of a data set is always equal to its ________.

Median (by definition)

26) The sum of deviations from the mean for a data set is equal to __________.

Zero…that’s why we have to square the deviations to find the variance and standard deviation!

27) Scores obtained by students in an advanced placement test have a symmetric mound shaped (bell shaped) distribution with a mean of 70 and a standard deviation of 10. What is the proportion of students who received between 60 and 80 points?

60 is 1 standard deviation to the left of center and 80 is 1 standard deviation to the right, so by the empirical rule the answer is about 68%

28) For the previous problem, what is the proportion of students who received less than 50 points?

Find the Z point for 50: (50 – 70)/10 = -2. The area between -2 and +2 is 95%, so the area “less than 50” is (100% – 95%)/2 = 2.5%

29) The following joint probability table contains a breakdown on the age and gender of U.S. physicians in a recent year, as reported by the American Medical Association.

Age of U.S. Physicians

< 35 35 - 44 45 - 54 55 - 64 > 65 TOTAL

Male 0.11 0.20 0.19 0.12 0.16 0.78

Female 0.07 0.08 0.04 0.02 0.01 0.22

TOTAL 0.18 0.28 0.23 0.14 0.17 1.00

a) What is the probability that one randomly selected physician is 35–44 years old?

P(35 – 44) = .28/1.00 = .28

NOTE: in a probability table (as opposed to a frequency table like the one in example #31), you don’t really have to be dividing by the total since the total is 1.00. I write it in to remind you that you MUST divide by something when you are finding probabilities!

b) What is the probability that one randomly selected physician is both a woman and 45–54 years old?

P(woman and 45 – 54) = intersection = 0.04/1.00 = .04

c) What is the probability that one randomly selected physician is a man or is 35–44 years old?

P(man or 35 – 44) = .78 + .28 – .20 = .86/1.00 = .86

d) What is the probability that one randomly selected physician is less than 35 years old or 55–64 years old?

P(< 35 or 55 – 64) = .18/1.00 + .14/1.00 = .32 (NOTE: no need to subtract anything since there are no “common points”…that is, those two categories are mutually exclusive)

e) What is the probability that one randomly selected physician is a woman if she is 45–54 years old?

P(woman | 45 – 54) = .04/.23 = 0.1739

f) What is the probability that a randomly selected physician is neither a woman nor 55–64 years old?

P(not woman and not 55 – 64) = P(man and any age group other than 55 – 64)

= (.11 + .20 + .19 + .16)/1.00 = .66
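To see how marginal, joint, union, and conditional probabilities all come out of the same table, here is a small Python sketch for problem 29. The dictionary layout and the helper names (`p_gender`, `p_age`, `p_joint`) are my own illustration, not anything from the textbook.

```python
ages = ["<35", "35-44", "45-54", "55-64", ">65"]

# Joint probabilities from the table (the inside cells only)
joint = {
    "Male":   [0.11, 0.20, 0.19, 0.12, 0.16],
    "Female": [0.07, 0.08, 0.04, 0.02, 0.01],
}

def p_gender(gender):
    """Marginal probability: add across the row."""
    return sum(joint[gender])

def p_age(age):
    """Marginal probability: add down the column."""
    i = ages.index(age)
    return sum(row[i] for row in joint.values())

def p_joint(gender, age):
    """Joint probability: a single inside cell."""
    return joint[gender][ages.index(age)]

p_a = p_age("35-44")
p_b = p_joint("Female", "45-54")
p_c = p_gender("Male") + p_age("35-44") - p_joint("Male", "35-44")   # addition rule
p_d = p_age("<35") + p_age("55-64")                                  # mutually exclusive
p_e = p_joint("Female", "45-54") / p_age("45-54")                    # conditional
p_f = sum(p_joint("Male", a) for a in ages if a != "55-64")          # neither/nor

for label, p in zip("abcdef", [p_a, p_b, p_c, p_d, p_e, p_f]):
    print(label, round(p, 4))
```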

30) Purchasing Survey asked purchasing professionals what sales traits impressed them most in a sales representative. Seventy-eight percent selected “thoroughness.” Forty percent responded “knowledge of your own product.” The purchasing professionals were allowed to list more than one trait. Suppose 27% of the purchasing professionals listed both “thoroughness” and “knowledge of your own product” as sales traits that impressed them most. A purchasing professional is randomly sampled.

a) Make a probability table including the above information.

Mentioned knowledge Didn’t mention knowledge TOTAL

Mentioned thoroughness .27 .78 – .27 = .51 .78

Didn’t mention thoroughness .40 – .27 = .13 .60 – .51 = .09 1 – .78 = .22

TOTAL .40 1 – .40 = .60 1.00

b) What is the probability that the professional selected “thoroughness” or “knowledge of your own product”?

P(thorough or knowledge) = (.78 + .40 – .27)/1.00 = .91

c) What is the probability that the professional selected neither “thoroughness” nor “knowledge of your own product”?

P(neither thorough nor knowledge) = P(not thorough and not knowledge)

= intersection = 0.09/1.00 = 0.09

d) If it is known that the professional selected “thoroughness,” what is the probability that the professional selected “knowledge of your own product”?

P(knowledge | thorough) = .27/.78 = 0.346

e) What is the probability that the professional did not select “thoroughness” and did select “knowledge of your own product”?

P(didn’t mention thoroughness and did mention knowledge) = intersection

= 0.13/1.00 = 0.13
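Notice that the whole table in problem 30 is built from just three given numbers: P(thorough) = .78, P(knowledge) = .40, and P(both) = .27. Everything else is subtraction. A minimal Python sketch of that fill-in process (the cell labels are my own shorthand):

```python
# The three probabilities given in the problem
p_thorough, p_knowledge, p_both = 0.78, 0.40, 0.27

# Fill in the remaining inside cells by subtraction
p_thorough_only = p_thorough - p_both        # thorough, not knowledge
p_knowledge_only = p_knowledge - p_both      # knowledge, not thorough
p_union = p_thorough + p_knowledge - p_both  # b) addition rule
p_neither = 1 - p_union                      # c) complement of the union

p_knowledge_given_thorough = p_both / p_thorough   # d) conditional probability

cells = [p_both, p_thorough_only, p_knowledge_only, p_neither]
print([round(c, 2) for c in cells], round(sum(cells), 2))
print(round(p_union, 2), round(p_knowledge_given_thorough, 3))
```

The four inside cells must add to 1.00; if yours do not, one of the subtractions is off.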

31) The table below contains data from a sample of 200 people regarding opinion about the latest congressional plan to eliminate anti-trust exemptions for professional baseball (broken down by gender).

OPINION ABOUT THE PLAN

For Neutral Against Totals

Female 38 54 12 104

Male 12 36 48 96

Totals 50 90 60 200

Please show your work for parts “a” through “e” or no credit will be given!

a) What is the probability that a person selected at random is for the plan?

P(for) = 50/200 = .25

b) If we know that the person is a female, what is the probability that the person is for the plan?

P(for | female) = 38/104 = .365

c) What is the probability that the person is male and against the plan?

P(male and against) = 48/200 = .24

d) What is the probability that the person is male or is neutral about the plan?

P(male or neutral) = (96+90-36)/200 = .75

e) Is opinion about the plan related to gender, or are opinion and gender independent? Please use statistical concepts and numerical calculations in your answer, or no credit will be given.

Check to see if P(A) = P(A|B) = P(A|C) etc.

Is P(for the plan) = P(for | female)? .25 ≠ .365 so NOT independent
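The independence check can be run on every cell of the table, not just one. The Python sketch below (my own illustration, with hypothetical helper names) compares each marginal probability P(opinion) with the conditional probability P(opinion | gender). Independence would require all of them to match; on the exam, showing one pair that does not match is enough.

```python
# Frequency table from problem 31
counts = {
    "Female": {"For": 38, "Neutral": 54, "Against": 12},
    "Male":   {"For": 12, "Neutral": 36, "Against": 48},
}
total = sum(sum(row.values()) for row in counts.values())   # 200

def p_opinion(opinion):
    """Marginal probability of an opinion: column total / grand total."""
    return sum(row[opinion] for row in counts.values()) / total

def p_opinion_given(opinion, gender):
    """Conditional probability: cell count / row total."""
    return counts[gender][opinion] / sum(counts[gender].values())

p_for = p_opinion("For")
p_for_given_female = p_opinion_given("For", "Female")

# Independent only if P(opinion) = P(opinion | gender) for every cell
independent = all(
    abs(p_opinion(o) - p_opinion_given(o, g)) < 1e-9
    for g in counts for o in counts[g]
)

print(p_for, round(p_for_given_female, 3), independent)
```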

32) True or False: If two events are independent, the joint probability of the two events is always equal to the product of the marginal probabilities of two events.

True – Think about it…P(A and B) = P(A) * P(B | A). But if A and B are independent, then P(B | A) is the same as P(B). In other words, if A and B are independent, then P(A and B) = P(A) * P(B). We will use that in chapter 5 and more!

33) True or False: If the conditional probability of an event A given another event B is same as the marginal probability of the event A, then events A and B are mutually exclusive.

False – as I just said, if P(A | B) = P(A), that means that A and B are independent…that doesn’t mean that A and B are mutually exclusive. Remember: if A and B are mutually exclusive, then if one happens, the other can’t…in other words, P(A and B) = 0.

34) If the occurrence or non-occurrence of one event does not affect the occurrence or non-occurrence of another event, the two events are ________________________.

Independent (by definition)

35) A listing of all elementary outcomes (i.e. the outcomes which cannot be broken down into other events) of an experiment (i.e. a decision making situation under uncertainty) is called a __________.

sample space

36) How many different combinations of a 3-member debating team can be formed from a group of 16 qualified students?

16C3 = 16!/(3!(16-3)!) = 16 * 15 * 14/(3 * 2 * 1) = 560
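If you want to confirm a combinations answer, Python's standard `math` module has this built in (Python 3.8 or later). The sketch also spells out the factorial formula and, for contrast, the count if the order of team members mattered, which it does not in this problem.

```python
import math

n, r = 16, 3

teams = math.comb(n, r)   # combinations: order does not matter

# Same thing written out with the factorial formula n! / (r! (n - r)!)
by_formula = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# For contrast only: permutations, where order would matter
ordered_teams = math.perm(n, r)

print(teams, by_formula, ordered_teams)
```

The permutation count is 3! = 6 times larger because each team of 3 can be listed in 6 different orders.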