Physiotherapy exam 2019-05-27
Degrees: Physiotherapy
Date: May 27, 2019
Descriptive Statistics and Regression
Question 1
A study aims to determine the effect of smoking during pregnancy on the weight of newborns.
The table below shows the daily number of cigarettes smoked by mothers (
- Give the equation of the regression line of the weight of newborns on the daily number of cigarettes and interpret the slope.
- Which regression model is better for predicting the weight of newborns, the logarithmic or the exponential?
- Use the better of the two previous regression models to predict the weight of a newborn whose mother smokes 12 cigarettes a day. Is this prediction reliable?
Use the following sums for the computations:
- cigarettes, cigarettes . kg, kg . cigarettes kg Regression line: . The slope of the regression line is $-0.0724$ kg per cigarette. This means that the weight of the newborn decreases by 0.0724 kg for each additional cigarette smoked daily by the mother.
- log(cigarettes), log(cigarettes) . log(kg), log(kg) . cigarettes log(kg), log(cigarettes) kg Logarithmic coef. determination: Exponential coef. determination: Therefore, the logarithmic model fits the data better and is the better choice for predicting the weight.
- Logarithmic regression model: . Prediction: kg. The coefficient of determination is high but the sample size is small, so the prediction is not entirely reliable.
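The comparison between the linear, logarithmic and exponential models can be sketched in Python. This is a minimal illustration, not the exam's official solution: the function names `fit_line`, `fit_logarithmic` and `fit_exponential` are hypothetical, and since the data table did not survive in this text, no values from the study are used. The fit is computed from sums, in the same spirit as the exercise.

```python
import math


def fit_line(x, y):
    """Least-squares line y = a + b*x computed from sums.

    Returns (a, b, r2), where r2 is the coefficient of determination.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum(v * v for v in x) / n - mean_x ** 2
    var_y = sum(v * v for v in y) / n - mean_y ** 2
    cov_xy = sum(u * v for u, v in zip(x, y)) / n - mean_x * mean_y
    b = cov_xy / var_x           # slope
    a = mean_y - b * mean_x      # intercept
    r2 = cov_xy ** 2 / (var_x * var_y)
    return a, b, r2


def fit_logarithmic(x, y):
    """Logarithmic model y = a + b*log(x): regress y on log(x)."""
    return fit_line([math.log(v) for v in x], y)


def fit_exponential(x, y):
    """Exponential model y = exp(a + b*x): regress log(y) on x."""
    return fit_line(x, [math.log(v) for v in y])
```

With the cigarette counts in `x` and the weights in `y`, the model with the larger `r2` is the one that explains more of the variability, which is the criterion used in the second part of the question.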
Question 2
The table below summarizes the times that runners took to reach the finish in a long-distance race in Madrid:
In another race in Paris, the mean time was 40 minutes, the standard deviation 5 minutes and the coefficient of skewness
- What percentage of runners took less than 42 minutes to reach the finish in Madrid?
- Compute and interpret the interquartile range of the time for the Madrid race.
- In which race is the mean time more representative?
- In which race does the time have a more symmetric distribution?
- In which race is a time of 39 minutes to reach the finish relatively smaller?
Use the following sums for the computations:
- , thus approximately of runners finished before 42 minutes.
- min, min and min. The central 50% of times fall within a range of minutes.
- Madrid statistics: min, min, min and . Paris statistics: $cv = 5/40 = 0.125$. Thus, the mean time in Madrid is slightly more representative, since its coefficient of variation is smaller.
- , which is closer to 0 than the coefficient of skewness of the times in Paris, so the distribution of times in Madrid is more symmetric.
- The standard score of the Madrid sample is and the standard score of the Paris one is $z = (39 - 40)/5 = -0.2$, thus a time of 39 min is relatively smaller in the sample of Paris.
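The two relative measures used in the last parts of this solution can be sketched in Python for the Paris race, whose mean (40 min) and standard deviation (5 min) are given in the statement. The helper names `coefficient_of_variation` and `standard_score` are hypothetical. The Madrid figures are not reproduced here because the frequency table is missing from this text.

```python
def coefficient_of_variation(mean, sd):
    """Relative dispersion: standard deviation divided by |mean|."""
    return sd / abs(mean)


def standard_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd


paris_cv = coefficient_of_variation(40, 5)  # 5 / 40
paris_z = standard_score(39, 40, 5)         # (39 - 40) / 5
print(paris_cv, paris_z)                    # prints 0.125 -0.2
```

The same two functions applied to the Madrid mean and standard deviation give the values to compare against: the smaller coefficient of variation marks the more representative mean, and the smaller standard score marks the race where 39 minutes is relatively smaller.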
Probability and Random Variables
Question 1
It has been observed that the concentration of a metabolite in urine can be used as a diagnostic test for a disease. The concentration (in mg/dl) in healthy individuals follows a normal distribution with mean 90 and standard deviation 8, while in sick individuals it follows a normal distribution with mean 120 and standard deviation 10.
- If the cut-off point is set at 105 mg/dl (positive above and negative below), what are the sensitivity and the specificity of the test?
- If the cut-off point is set at 105 mg/dl and we assume a prevalence of 10%, what is the probability of a correct diagnosis?
- If we want a sensitivity of 95%, where must we set the cut-off point? What would the specificity of the test be?
Let
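- Sensitivity: $P(Z > \frac{105-120}{10}) = P(Z > -1.5) \approx 0.933$. Specificity: $P(Z < \frac{105-90}{8}) = P(Z < 1.875) \approx 0.970$, where $Z$ is a standard normal variable.
- Probability of a correct diagnosis: $0.1 \cdot 0.933 + 0.9 \cdot 0.970 \approx 0.966$.
- Cut-off point: $120 - 1.645 \cdot 10 \approx 103.55$ mg/dl. Specificity: $P(Z < \frac{103.55-90}{8}) \approx 0.955$.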
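The three parts of this question can be checked numerically with a short Python sketch based on the standard library's `statistics.NormalDist`. The two distributions come from the statement. The function names (`sensitivity`, `specificity`, `prob_correct`, `cutoff_for_sensitivity`) are illustrative choices, not part of the exam.

```python
from statistics import NormalDist

# Concentration (mg/dl) in each population, as given in the statement.
healthy = NormalDist(mu=90, sigma=8)
sick = NormalDist(mu=120, sigma=10)


def sensitivity(cutoff):
    """P(positive | sick): sick individuals above the cut-off."""
    return 1 - sick.cdf(cutoff)


def specificity(cutoff):
    """P(negative | healthy): healthy individuals below the cut-off."""
    return healthy.cdf(cutoff)


def prob_correct(cutoff, prevalence):
    """Total probability of a correct diagnosis."""
    return (prevalence * sensitivity(cutoff)
            + (1 - prevalence) * specificity(cutoff))


def cutoff_for_sensitivity(target):
    """Cut-off that leaves a proportion `target` of sick people above it."""
    return sick.inv_cdf(1 - target)


print(round(sensitivity(105), 3), round(specificity(105), 3))
print(round(prob_correct(105, 0.10), 3))
new_cutoff = cutoff_for_sensitivity(0.95)
print(round(new_cutoff, 2), round(specificity(new_cutoff), 3))
```

Lowering the cut-off raises the sensitivity at the cost of specificity, which is the trade-off the third part of the question asks about.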
Question 2
Let
- Compute and .
- Compute and .
- Compute and .
- Compute and .
- Are and independent?
- and .
- and .
- and .
- and .
- No, they are dependent since .
Question 3
The employees of a courier company send an average of
- Compute the probability that a randomly chosen employee of the company sends 5 messages in a period of half an hour.
- If we randomly select 10 women from this company, what is the probability that at least 3 of them send more than one message in a period of one hour?
- If we randomly select 100 men from this company, what is the probability that none of them sends fewer than 2 messages in a period of a quarter of an hour?
- Let be the number of messages sent in 1 hour. Then and .
- Let be the number of women in a sample of 10 who sent more than 1 message in 1 hour. Then and .
- Let be the number of men in a sample of 100 who sent fewer than 2 messages in a quarter of an hour. Then and .
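The structure of this solution, a Poisson count rescaled to each time window and then fed into a binomial model, can be sketched in Python. The average number of messages is missing from this text, so it is left as the parameter `rate_per_hour`, and treating it as an hourly rate is an assumption. The function names are hypothetical, and the time windows follow the wording of the three questions.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


def binomial_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def courier_answers(rate_per_hour):
    """Return the three probabilities for an ASSUMED hourly rate."""
    # (a) exactly 5 messages in half an hour: the mean scales with time
    p_a = poisson_pmf(5, rate_per_hour / 2)
    # (b) at least 3 of 10 people send more than one message in an hour
    p_more_than_one = 1 - poisson_cdf(1, rate_per_hour)
    p_b = 1 - sum(binomial_pmf(k, 10, p_more_than_one) for k in range(3))
    # (c) none of 100 people sends fewer than 2 messages in 15 minutes
    p_fewer_than_two = poisson_cdf(1, rate_per_hour / 4)
    p_c = binomial_pmf(0, 100, p_fewer_than_two)
    return p_a, p_b, p_c
```

Calling `courier_answers` with the average stated in the original exam returns the three requested probabilities in order.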