Pharmacy exam 2016-11-28 Degrees: Pharmacy, Biotechnology Date: November 28, 2016 Question 1 The table below gives the distribution of points obtained by students in the MIR exam last year. $$ \begin{array}{|c|r|r|r|r|r|} \hline x & n_i & x_in_i & x_i^2n_i & (x_i-\bar x)^3n_i & (x_i-\bar x)^4n_i\newline \hline (0,40] & 84 & 1680 & 33600 & -12155062.50 & 638140781.25\newline (40,80] & 185 & 11100 & 666000 & -361328.13 & 4516601.56\newline (80,120] & 72 & 7200 & 720000 & 1497375.00 & 41177812.50\newline (120,160] & 40 & 5600 & 784000 & 12301875.00 & 830376562.50\newline (160,200] & 19 & 3420 & 615600 & 23603640.63 & 2537391367.19\newline \hline \sum & 400 & 29000 & 2819200 & 24886500.00 & 4051603125.00\newline \hline \end{array} $$ Compute the interquartile range and explain your result. Are there outliers in the sample? The minimum number of points to pass the exam is 150; what percentage of students passed the exam? Study the representativity of the mean. According to the values of skewness and kurtosis, can we assume that the sample has been taken from a normally distributed population? Compute the standardized points of a student that got 150 points in the MIR. Solution $Q_1=43.48$ points, $Q_3=97.78$ points and $IQR=54.3$ points. Fences: $F_1=-37.97$ points and $F_2=179.23$ points. Thus, there are outliers. $F_{150}=0.925$, so the percentage of students that passed the exam is $7.5%$. $\bar x=72.5$ points, $s^2=1791.75$ points², $s=42.3291$ points, $cv=0.5838$. As the coefficient of variation is greater than 0.5 but not too much there is a moderate variability and the mean is moderately representative. $g_1=0.8203$, so the distribution is right-skewed. $g_2=0.1551$, so the distribution is a little bit more peaked than a bell curve (leptokurtic). As $g_1$ and $g_2$ are between -2 and 2 we can assume that the sample has been taken from a normaly distributed population. $z(150)=1.83$. Question 2 The table show the data of the GDP (Gross Domestic Product) per capita (thousands of euros) and infant mortality (children per thousand) from 1993 till 2000. Year GDP Mortality 1993 17 6.0 1994 17 5.6 1995 18 5.2 1996 18 4.9 1997 19 4.6 1998 20 4.3 1999 21 4.1 2000 22 4.0 Estimate the value of the GDP for an infant mortality of 3.8 children per thousand using the linear regression model. Which regression model explains better the GDP as a function of the infant mortality, a linear model or an exponential one? If we assume that the GPD per capita in year 2001 was 23 thousand €, what will be the expected infant mortality, according to the exponential regression model? Consider the linear models of GDP on infant mortality, and infant mortality on GDP; which of the two is more reliable? Use the following sums for the computations ($X$=GDP and $Y$=Infant mortality): $\sum x_i=152$, $\sum \log(x_i)=23.5229$, $\sum y_j=38.7$, $\sum \log(y_j)=12.5344$, $\sum x_i^2=2912$, $\sum \log(x_i)^2=69.2305$, $\sum y_j^2=190.87$, $\sum \log(y_j)^2=19.7912$, $\sum x_iy_j=726.5$, $\sum x_i\log(y_j)=236.3256$, $\sum \log(x_i)y_j=113.3308$, $\sum \log(x_i)\log(y_j)=36.76$. Solution Linear model of GDF on infant mortality: $\bar x=19$ 10³€, $s_x^2=3$ 10⁶€. $\bar y=4.8375$ children per thousand, $s_y^2=0.4573$ (children per thousan)². $s_{xy}=-1.1$ 10³€⋅children per thousand. Regression line of GDP on infant mortality: $x=30.6351 + -2.4052y$. $x(3.8) =21.4954$. $\overline{\log(x)}=2.9404$ log(10³€), $s_{\log(x)}^2=0.0081$ log(10³€)². $s_{\log(x)y}=-0.0577$ log(10³€)•children per thousand. Linear coefficient of determination of GDP on infant mortality $r^2=0.8819$. Exponential coefficient of determination of GDP on infant mortality $r^2=0.9002$. Thus, the exponential model explains a little bit better the relation between the GDP and the infant mortality. $\overline{\log(y)}=1.5668$ log(children per thousand), $s_{\log(y)}^2=0.019$ log(children per thousand)². $s_{x\log(y)}=-0.2284$ 10³€⋅log(children per thousand). Exponential model of infant mortality on GDP: $y=e^{3.0135 + -0.0761x}$. $y(23)=3.5332$. The reliability of both models is the same as they have the same coefficient of determination. Question 3 Consider two variables $X$ and $Y$. Assume that the regression lines of the linear models intersect at the point $(2,3)$, and that, according to the appropriate linear model, the expected value of $Y$ for $x=3$ is $y=1$. How much will $Y$ change, according to the linear model, when $X$ increases by one unit? If the coefficient of linear correlation is $-0.8$, how much will $X$ change, according to the linear model, when $Y$ increases by one unit? Solution $\bar x=2$ and $\bar y=3$. $b_{yx}=-2$, so $Y$ decreases 2 units when $X$ increases by one unit. $b_{xy}=-0.32$, so $X$ decreases 0.32 units when $Y$ increases by one unit. Exam Statistics Biostatistics Previous Pharmacy exam 2017-01-10