Physiotherapy exam 2020-05-25

Degrees: Physiotherapy
Date: May 25, 2020

Descriptive Statistics and Regression

Question 1

In a course there are 150 students, of which 50 are working students and the other 100 non-working students. The table below shows the frequency distribution of the grade in an exam of these two groups:

GradeNum non-working studentsNum working students028224159462519683811810149

  1. Compute the percentage of students that passed the exam (a grade 5 or above) in both groups, working and non-working students.

  2. In which group is there a higher relative dispersion of the grade with respect to the mean?

  3. Which grade distribution is more asymmetric, the distribution of working students, or the non-working students one?

  4. To apply for a scholarship to go abroad, the grade must be transformed applying the linear transformation Y=0.5+X1.45. Compute the mean of Y for the two groups. How changes the asymmetry of the two groups?

  5. Which grade is relatively higher, 6 in the working students group, or 7 in the non-working students group?

Use the following sums for the computations:
Non-working students: xini=570, xi2ni=3764, (xix¯)3ni=547.8 and (xix¯)4ni=6475.73.
Working students: yini=282, yi2ni=1826, (yiy¯)3ni=1.31 and (yiy¯)4ni=2552.14.

  1. 35.5% of non-working students passed and 41% of working students passed.
  2. Non-working students: x¯=5.7, s2=5.15, s=2.2694 and cv=0.3981. Working students: y¯=5.64, s2=4.7104, s=2.1703 and cv=0.3848. The sample of non-working students has a slightly higher relative dispersion with respect to the mean as the coefficient of variation is greater.
  3. Non-working students: g1=0.4687. Working students: g1=0.0026. Thus, the sample of non-working students is more asymmetric as the coefficient os skewness is further from 0.
  4. Non-working students: y¯=8.765. Working students: x¯=8.678. The coefficient of skewness does not change as the slope of the linear transformation is positive.
  5. Non-working students: z(7)=0.5728. Working students: z(6)=0.1659. Thus, a 7 in the sample of non-working students is relatively higher than than a 6 in the sample of working students, as its standard score is greater.

Question 2

The effect of a doping substance on the response time to a given stimulus was analyzed in a group of patients. The same amount of substance was administered in successive doses, from 10 to 80 mg, to all the patients. The table below shows the average response time to the stimulus, expressed in hundredths of a second:

Dose (mg)1020304050607080Response time (102 s)28466281100132195302

  1. According to the linear regression model, how much will the response time increase or decrease for each mg we increase the dose?

  2. Based on the exponential model, what will be the expected response time for a 75 mg dose?

  3. If a response time greater than one second is considered dangerous for health, from what level should the administration of the doping substance be regulated, or even prohibited, according to the logarithmic model?

Use the following sums for the computations:
xi=360 mg, log(xi)=29.0253 log(mg), yj=946 102 s, log(yj)=36.1538 log(102 s),
xi2=20400 mg2, log(xi)2=108.7717 log(mg)2, yj2=169958 102 s2, log(yj)2=167.5694 log(102 s)2,
xiyj=57030 mg102 s, xilog(yj)=1758.6576 mglog(102 s), log(xi)yj=3795.4339 log(mg)102 s, log(xi)log(yj)=134.823 log(mg)log(102 s).

  1. x¯=45 mg, sx2=525 mg2. y¯=118.25 102 s, sy2=7261.6875 104 s2. sxy=1807.5 mg102 s. byx=3.4429 102 s/mg. Therefore, the response time increases 3.4429 hundredths of a second for each mg the dose is increased.
  2. log(y)=4.5192 log(102 s), slog(y)2=0.5227 log(102 s)2. sxlog(y)=16.4669 mglog(102 s). Exponential regression model: y=e3.1078+0.0314x. Prediction: y(75)=235.1434 102 s. Exponential coefficient of determination: r2=0.988 Thus, the exponential model fits almost perfectly to the cloud of points of the scatter plot, but the sample is too small to get reliable predictions.
  3. Logarithmic regression model: x=97.3603+31.501ln(y). Prediction: x(100)=47.7072 mg.

Probability and Random Variables

Question 3

A hospital orders a DNA compatibility test to three labs A, B and C. Lab A performs 40 test a day, lab B 50, and lab C 60. It is known that the probability of a wrong diagnose is 20% in lab A, 18% in lab B and 22% in lab C. If we select a random test of the hospital,

  1. Compute the probability of wrong diagnose in that test.

  2. If the test is wrong, what is the probability that it has been performed by lab B?

  3. If the test is right, which lab is more likely to have performed the test?

Let A, B and C be the events of performing the test in labs A, B and C respectively, and R the event of getting a right diagnose. According to the statement P(A)=0.2667, P(B)=0.3333, P(C)=0.4, P(R|A)=0.8, P(R|B)=0.82 and P(R|C)=0.78.

  1. P(R)=0.2013.
  2. P(B|R)=0.298.
  3. P(A|R)=0.2671, P(B|R)=0.3422 and P(C|R)=0.3907, thus, it is more likely that it has been performed in lab C.

Question 4

An epidemiological study tries to determine the effectiveness of face masks to prevent the COVID19. In a sample 4000 persons without the virus and 1000 persons with it were selected. I was observed that in the group of infected people 120 had used face masks in the two previous weeks, while in the non-infected group, 1250 had used face masks in the two previous weeks.

  1. Compute the relative risk of been infected with face masks.

  2. Compute the odds ratio of been infected with face masks.

  3. Which association measure is more reliable?

Let D be the event of being infected.

  1. RR(D)=0.3613. Thus, the risk of being infected with face mask is almost one third of the likelihood of been infected without face mask.
  2. OR(D)=0.3. Thus, the odds of being infected with face mask is less than one third of the likelihood of been infected without face mask.
  3. As we can not compute the prevalence of D, the odds ratio is more reliable.

Question 5

During the COVID19 quarantine a telephone exchange with 4 telephone operators received an average of 12 calls per day. Assuming that the calls are equally distributed among the operators,

  1. Compute the probability that an operator received more than 3 calls a day.

  2. Compute the probability that all the the operators received some call a day.

  1. Let X be the number of calls that arrive to one operator, then XP(3), and P(X>3)=0.3528.
  2. Let Y be the number of operators that receive some call, then YB(4,0.9502), and P(Y=4)=0.8152.

Question 6

In a course with 200 students the score of a test to measure the intelligence quotient follows a normal distribution. After applying the test to the students 10 of them got a score above 130 and 30 of them a score below 60.

  1. Compute the mean and the standard deviation of the score.

  2. How many students will have a score between 90 and 95?

  3. Compute the limits of the interval centered at the mean that accumulates 95% of the scores.

  1. Let X be the score of the test then XN(87.058,26.1069)
  2. P(90X95)=0.0747, that is, around 14.9309 students.
  3. Interval with 95% of probability (35.8895,138.2265).
Previous
Next