Date: June 23, 2016

Question 1

It is believed that the age at which a person finishes their first marathon depends on gender. To check this, a sample of 180 marathon runners was drawn. For every runner, the gender, the age (in years) at which they finished their first marathon, and whether they finished it with tendinitis were recorded. The data are summarized in the table below.

| Age | Males: Finished | Males: With tendinitis | Females: Finished | Females: With tendinitis |
|---|---|---|---|---|
| (10,20] | 7 | 2 | 3 | 1 |
| (20,30] | 35 | 12 | 22 | 5 |
| (30,40] | 30 | 6 | 29 | 4 |
| (40,50] | 15 | 2 | 22 | 3 |
| (50,60] | 9 | 1 | 3 | 0 |
| (60,70] | 4 | 0 | 1 | 0 |

1. Calculate the mean age at which the first marathon is finished, for both males and females. Which mean is more representative? Justify the answer.

2. Calculate the interquartile range of the age for the joint distribution (joining males and females) and interpret it.

3. Which age distribution is more asymmetric, that of males or that of females? Justify the answer.

4. Taking into account the relative spread in each group, who finished their first marathon relatively earlier: a man who finished his first marathon at the age of 48 or a woman who finished hers at the age of 47? Justify the answer.

5. Using frequencies to approximate probabilities, compute the following probabilities:

• Probability that a runner finishes their first marathon with tendinitis.
• Probability that a man aged 40 or younger finishes his first marathon with tendinitis.
• Probability that a woman who finishes her first marathon with tendinitis is between 20 and 30 years old.
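The frequency-based probabilities in item 5 can be sketched directly from the table counts. This is only an illustrative sketch: it assumes that the "With tendinitis" counts are a subset of the "Finished" counts, and it reads the second and third bullets as conditional probabilities. All variable names are made up for the example.

```python
# Counts per age class (10,20], (20,30], ..., (60,70], copied from the table.
# Assumption: "With tendinitis" runners are a subset of the "Finished" runners.
males_finished = [7, 35, 30, 15, 9, 4]
males_tendinitis = [2, 12, 6, 2, 1, 0]
females_finished = [3, 22, 29, 22, 3, 1]
females_tendinitis = [1, 5, 4, 3, 0, 0]

total_runners = sum(males_finished) + sum(females_finished)
total_tendinitis = sum(males_tendinitis) + sum(females_tendinitis)

# P(tendinitis) over all runners in the sample.
p_tendinitis = total_tendinitis / total_runners

# P(tendinitis | male, age <= 40): first three age classes of the male columns.
p_tendinitis_given_male_le_40 = sum(males_tendinitis[:3]) / sum(males_finished[:3])

# P(age in (20,30] | female, tendinitis): second class among females with tendinitis.
p_20_30_given_female_tendinitis = females_tendinitis[1] / sum(females_tendinitis)

print(p_tendinitis)
print(p_tendinitis_given_male_le_40)
print(p_20_30_given_female_tendinitis)
```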

Use the following sums for the calculations:

• Males: $\sum n_i = 100$, $\sum x_i n_i = 3460$, $\sum x_i^2 n_i= 134700$, $\sum(x_i-\bar x)^3 n_i =121987$, $\sum(x_i-\bar x)^4 n_i =6480792$
• Females: $\sum n_i = 80$, $\sum x_i n_i = 2830$, $\sum x_i^2 n_i= 107800$, $\sum(x_i-\bar x)^3 n_i =18346$, $\sum(x_i-\bar x)^4 n_i =2175992$
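As a rough sketch of how these sums feed items 1, 3 and 4, the snippet below derives the mean, standard deviation, coefficient of variation and skewness coefficient for each group, and then the standard scores of the two runners in item 4. It assumes population (divide by $n$) formulas and the moment-based skewness coefficient $g_1$; the helper names `summarize` and `standard_score` are hypothetical.

```python
import math


def summarize(n, sum_x, sum_x2, sum_dev3):
    """Descriptive statistics from grouped-data sums (population formulas)."""
    mean = sum_x / n
    variance = sum_x2 / n - mean ** 2
    std = math.sqrt(variance)
    return {
        "mean": mean,
        "std": std,
        "cv": std / mean,                        # coefficient of variation
        "skewness": (sum_dev3 / n) / std ** 3,   # g1 = third central moment / s^3
    }


def standard_score(x, stats):
    """Number of standard deviations that x lies above the group mean."""
    return (x - stats["mean"]) / stats["std"]


males = summarize(100, 3460, 134700, 121987)
females = summarize(80, 2830, 107800, 18346)

z_man = standard_score(48, males)
z_woman = standard_score(47, females)

for label, stats in (("males", males), ("females", females)):
    print(label, {key: round(value, 4) for key, value in stats.items()})
print("z (man, 48):", round(z_man, 4), "z (woman, 47):", round(z_woman, 4))
```

A smaller coefficient of variation points to a more representative mean, a larger absolute skewness coefficient to a more asymmetric distribution, and a smaller standard score to a relatively earlier finish within the group.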

Question 2

A study tries to determine if the number of muscular injuries of professional athletes depends on stress. The study lasted four years and measured the average level of stress and the number of muscular injuries suffered by a group of athletes. The collected data is shown in the table below.

| Stress ($X$) | 2.3 | 3.8 | 5.1 | 1.4 | 6.9 | 7.2 | 3.2 | 8.3 |
|---|---|---|---|---|---|---|---|---|
| Injuries ($Y$) | 3 | 6 | 7 | 2 | 6 | 8 | 4 | 8 |

1. Calculate the linear regression model of the number of injuries on stress.

2. According to the most appropriate linear model, what stress level is expected for an athlete who suffered 4 injuries in that period?

3. Calculate the logarithmic regression model of the number of injuries on stress.

4. Which regression model is better, the linear or the logarithmic? Justify the answer.

Use the following sums for the calculations: $\sum x_i = 38.2$, $\sum y_j=44$, $\sum \log(x_i)=11.3186$, $\sum \log(y_j)=12.8664$, $\sum x_i^2 = 226.28$, $\sum y_j^2=278$, $\sum \log^2(x_i)=18.7028$, $\sum \log^2(y_j)=22.4647$, $\sum x_iy_j = 246.4$, $\sum x_i\log(y_j)=69.2607$, $\sum \log(x_i)y_j=71.5508$, $\sum \log(x_i)\log(y_j)=20.2895$.
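The regression fits can be sketched from the sums alone. The snippet below is a minimal illustration, assuming population (divide by $n$) variances and covariances, natural logarithms, and a logarithmic model of the form $y = a + b\log(x)$. It also assumes that item 2 calls for the regression line of stress on injuries. The helper `fit_line` is a hypothetical name.

```python
def fit_line(n, sum_u, sum_v, sum_u2, sum_v2, sum_uv):
    """Least-squares line v = intercept + slope * u, computed from sums.

    Returns (intercept, slope, r_squared).
    """
    mean_u = sum_u / n
    mean_v = sum_v / n
    var_u = sum_u2 / n - mean_u ** 2
    var_v = sum_v2 / n - mean_v ** 2
    cov_uv = sum_uv / n - mean_u * mean_v
    slope = cov_uv / var_u
    intercept = mean_v - slope * mean_u
    r_squared = cov_uv ** 2 / (var_u * var_v)
    return intercept, slope, r_squared


n = 8

# Linear model of injuries on stress: y = a + b x.
linear = fit_line(n, 38.2, 44, 226.28, 278, 246.4)

# Logarithmic model of injuries on stress: y = a + b log(x).
logarithmic = fit_line(n, 11.3186, 44, 18.7028, 278, 71.5508)

# Linear model of stress on injuries (roles swapped): x = a + b y.
stress_on_injuries = fit_line(n, 44, 38.2, 278, 226.28, 246.4)
expected_stress_for_4_injuries = stress_on_injuries[0] + stress_on_injuries[1] * 4

print("linear (a, b, r^2):", linear)
print("logarithmic (a, b, r^2):", logarithmic)
print("expected stress for 4 injuries:", expected_stress_for_4_injuries)
```

Comparing the two coefficients of determination is one common way to decide which model fits better.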

Question 3

A diagnostic test with a sensitivity of 96% and a specificity of 93% is used to determine a disease with a prevalence of 10%.

1. What are the positive and negative predictive values of the test?

2. If the test is applied to 15 persons, what is the probability of having more than one positive outcome?

3. If the test is applied to 50 persons, what is the probability that more than two persons receive a wrong diagnosis?
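One possible way to set up these calculations is sketched below. It applies Bayes' theorem for the predictive values and an exact binomial model for the counts, assuming that the persons tested are independent and drawn from a population with the stated prevalence. A Poisson approximation could be used instead for item 3. The helper `prob_more_than` is a hypothetical name.

```python
from math import comb

sensitivity = 0.96   # P(positive | disease)
specificity = 0.93   # P(negative | no disease)
prevalence = 0.10    # P(disease)

# Joint probabilities of the four possible outcomes for one person.
p_true_pos = prevalence * sensitivity
p_false_neg = prevalence * (1 - sensitivity)
p_false_pos = (1 - prevalence) * (1 - specificity)
p_true_neg = (1 - prevalence) * specificity

# Predictive values via Bayes' theorem.
ppv = p_true_pos / (p_true_pos + p_false_pos)
npv = p_true_neg / (p_true_neg + p_false_neg)

# Probability that a single test is positive, and that it is wrong.
p_positive = p_true_pos + p_false_pos
p_wrong = p_false_pos + p_false_neg


def prob_more_than(k, n, p):
    """P(X > k) for X ~ Binomial(n, p), computed exactly."""
    return 1 - sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


# Assumption: persons are tested independently of each other.
p_more_than_one_positive = prob_more_than(1, 15, p_positive)
p_more_than_two_wrong = prob_more_than(2, 50, p_wrong)

print(ppv, npv)
print(p_more_than_one_positive, p_more_than_two_wrong)
```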

Question 4

It is known from previous studies that the number of hours that students who pass Statistics spend studying the subject follows a normal distribution with mean 50 hours and unknown standard deviation, while for students who fail it follows a normal distribution with unknown mean and standard deviation 10 hours. If 20% of the students who pass study more than 70 hours and 30% of the students who fail study less than 25 hours,

1. Calculate the standard deviation of the distribution of hours of study for students who pass, and the mean of the distribution for students who fail.

2. If in a given year 200 students are enrolled in the subject and 150 of them pass, how many of the 200 students are expected to have studied more than 55 hours?
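The two unknown parameters can be recovered by standardizing each condition, and the expected count then follows from the two fitted distributions. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). It assumes that the 50 students who do not pass are exactly those who fail, and all variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal N(0, 1)

# Pass: X ~ N(50, sigma) with P(X > 70) = 0.20, so (70 - 50) / sigma = z_0.80.
sigma_pass = (70 - 50) / std_normal.inv_cdf(0.80)

# Fail: Y ~ N(mu, 10) with P(Y < 25) = 0.30, so (25 - mu) / 10 = z_0.30.
mu_fail = 25 - 10 * std_normal.inv_cdf(0.30)

pass_hours = NormalDist(mu=50, sigma=sigma_pass)
fail_hours = NormalDist(mu=mu_fail, sigma=10)

# Assumption: the 200 enrolled students split into 150 who pass and 50 who fail.
n_pass, n_fail = 150, 200 - 150
expected_over_55 = (
    n_pass * (1 - pass_hours.cdf(55)) + n_fail * (1 - fail_hours.cdf(55))
)

print(sigma_pass, mu_fail, expected_over_55)
```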