Problems of Nonlinear Regression

Exercise 1

A dietary center is testing a new diet on a sample of 12 people. The data below give, for each person, the number of days on the diet and the weight lost (in kg) up to that point.

(33,3.9)  (51,5.9)  (30,3.2)  (55,6)  (38,4.9)  (62,6.2)  (35,4.5)  (60,6.1)  (44,5.6)  (69,6.2)  (47,5.8)  (40,5.3) 
  1. Draw the scatter plot. According to the point cloud, which type of regression model better explains the relation between weight loss and days of diet?
  2. Construct the linear regression model and the logarithmic regression model of weight loss on the number of days of diet.
  3. Use the best model to predict the weight a person will lose after 40 and 100 days of diet. Are these predictions reliable?

Use the following sums (X=days of diet and Y=weight loss): ∑xᵢ=564 days, ∑log(xᵢ)=45.8086 log(days), ∑yⱼ=63.6 kg, ∑xᵢ²=28234 days², ∑log(xᵢ)²=175.6603 log²(days), ∑yⱼ²=347.7 kg², ∑xᵢyⱼ=3108.5 days⋅kg, ∑log(xᵢ)yⱼ=245.4738 log(days)⋅kg.

1. [Figure: scatter plot of weight loss against days of diet.]
2. Linear model
x̄=47 days, s_x²=143.8333 days².
ȳ=5.3 kg, s_y²=0.885 kg².
s_xy=9.9417 days⋅kg.
Regression line of weight loss on days of diet: y=2.0514+0.0691x.
r²=0.7765.

Logarithmic model
Mean of log(x): 3.8174 log(days), s_{log(x)}²=0.0659 log²(days).
s_{log(x),y}=0.224 log(days)⋅kg.
Logarithmic model of weight loss on days of diet: y=-7.6678+3.397log(x).
r²=0.8599.
3. Using the logarithmic model, which has the higher coefficient of determination, y(40)=4.8635 kg and y(100)=7.9761 kg. The predictions are fairly reliable because the coefficient of determination is close to 1, but the second one is less reliable because 100 days lies well outside the observed range of the sample (30 to 69 days).
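
The calculation above can be reproduced directly from the raw data. The following is a minimal sketch, assuming natural logarithms and variances computed with divisor n (both consistent with the sums given in the exercise); the helper name `fit_line` is an illustrative choice, not part of the original exercise.

```python
import math

# Data from Exercise 1: (days of diet, weight loss in kg).
data = [(33, 3.9), (51, 5.9), (30, 3.2), (55, 6.0), (38, 4.9), (62, 6.2),
        (35, 4.5), (60, 6.1), (44, 5.6), (69, 6.2), (47, 5.8), (40, 5.3)]


def fit_line(xs, ys):
    """Least-squares line y = a + b*x; returns (a, b, r2).

    Variances and covariance use divisor n (assumption matching the exercise).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    var_y = sum(y * y for y in ys) / n - mean_y ** 2
    cov_xy = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    slope = cov_xy / var_x
    return mean_y - slope * mean_x, slope, cov_xy ** 2 / (var_x * var_y)


days = [d for d, _ in data]
loss = [w for _, w in data]

# Linear model: loss = a + b * days.
a_lin, b_lin, r2_lin = fit_line(days, loss)
# Logarithmic model: loss = a + b * log(days), fitted as a line on log(days).
a_log, b_log, r2_log = fit_line([math.log(d) for d in days], loss)


def predict_log(d):
    """Predicted weight loss (kg) after d days under the logarithmic model."""
    return a_log + b_log * math.log(d)


print(f"linear:      a={a_lin:.4f}, b={b_lin:.4f}, r2={r2_lin:.4f}")
print(f"logarithmic: a={a_log:.4f}, b={b_log:.4f}, r2={r2_log:.4f}")
print(f"y(40)={predict_log(40):.4f} kg, y(100)={predict_log(100):.4f} kg")
```

Comparing the two coefficients of determination is what justifies choosing the logarithmic model for the predictions.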

Exercise 2

The concentration of a drug in blood, in mg/dl, depends on time, in hours, according to the data below.

| Time (h) | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|
| Drug concentration (mg/dl) | 25 | 36 | 48 | 64 | 86 | 114 | 168 |

  1. Construct the linear regression model of drug concentration on time.
  2. Construct the exponential regression model of drug concentration on time.
  3. Use the best regression model to predict the drug concentration after 4.8 hours. Is this prediction reliable? Justify your answer.

Use the following sums (C=Drug concentration and T=time): ∑tᵢ=35 h, ∑log(tᵢ)=10.6046 log(h), ∑cⱼ=541 mg/dl, ∑log(cⱼ)=29.147 log(mg/dl), ∑tᵢ²=203 h², ∑log(tᵢ)²=17.5206 log²(h), ∑cⱼ²=56937 (mg/dl)², ∑log(cⱼ)²=124.0131 log²(mg/dl), ∑tᵢcⱼ=3328 h⋅mg/dl, ∑tᵢlog(cⱼ)=154.3387 h⋅log(mg/dl), ∑log(tᵢ)cⱼ=951.6961 log(h)⋅mg/dl, ∑log(tᵢ)log(cⱼ)=46.08046 log(h)⋅log(mg/dl).

  1. x̄=5 hours, s_x²=4 hours².
    ȳ=77.2857 mg/dl, s_y²=2160.7755 (mg/dl)².
    s_xy=89 hours⋅mg/dl.
    Regression line of drug concentration on time: y=-33.9643+22.25x.
    r²=0.9165.

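  2. Mean of log(y): 4.1639 log(mg/dl), s_{log(y)}²=0.3785 log²(mg/dl).
    s_{x,log(y)}=1.2291 hours⋅log(mg/dl).
    Exponential model of drug concentration on time: y=e^(2.6275+0.3073x).
    r²=0.9979.
  3. Using the exponential model, which has the higher coefficient of determination, y(4.8)=60.4853 mg/dl. The prediction is reliable because the coefficient of determination is close to 1 and 4.8 hours lies within the observed range of times.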
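
The exponential model is obtained by fitting a straight line to log(y) against x and then undoing the logarithm. A minimal sketch of that step, assuming natural logarithms and divisor-n variances as in the sums given for the exercise (the variable names are illustrative):

```python
import math

# Data from Exercise 2.
time_h = [2, 3, 4, 5, 6, 7, 8]              # hours
conc = [25, 36, 48, 64, 86, 114, 168]       # mg/dl

n = len(time_h)
log_conc = [math.log(c) for c in conc]      # transformed response z = log(y)

mean_x = sum(time_h) / n
mean_z = sum(log_conc) / n
var_x = sum(x * x for x in time_h) / n - mean_x ** 2
var_z = sum(z * z for z in log_conc) / n - mean_z ** 2
cov_xz = sum(x * z for x, z in zip(time_h, log_conc)) / n - mean_x * mean_z

# Line log(y) = a + b*x, equivalent to y = exp(a + b*x).
b = cov_xz / var_x
a = mean_z - b * mean_x
r2 = cov_xz ** 2 / (var_x * var_z)


def predict(t):
    """Predicted drug concentration (mg/dl) after t hours."""
    return math.exp(a + b * t)


print(f"log(y) = {a:.4f} + {b:.4f} x, r2 = {r2:.4f}")
print(f"y(4.8) = {predict(4.8):.4f} mg/dl")
```

Note that r² here measures the linear fit of log(y) on x, which is the quantity compared with the linear model's r² in the solution.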

Exercise 3

A researcher is studying the relation between obesity and the response to pain. Obesity is measured as the percentage over the ideal weight, and the response to pain as the nociceptive flexion pain threshold. The results of the study appear in the table below.

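| Obesity | 89 | 90 | 77 | 30 | 51 | 75 | 62 | 45 | 90 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|
| Pain threshold | 10 | 12 | 11.5 | 4.5 | 5.5 | 7 | 9 | 8 | 15 | 3 |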

  1. According to the scatter plot, which model better explains the relation of the response to pain on obesity?
  2. According to the best regression model, what is the expected response to pain for a person with an obesity of 50%? Is this prediction reliable?
  3. According to the best regression model, what is the expected obesity for a person with a pain threshold of 10? Is this prediction reliable?

Use the following sums (X=Obesity and Y=Pain threshold): ∑xᵢ=629, ∑log(xᵢ)=40.4121, ∑yⱼ=92.2, ∑log(yⱼ)=21.339, ∑xᵢ²=45445, ∑log(xᵢ)²=165.6795, ∑yⱼ²=960.14, ∑log(yⱼ)²=47.6231, ∑xᵢyⱼ=6537.7, ∑xᵢlog(yⱼ)=1443.1275, ∑log(xᵢ)yⱼ=387.5728, ∑log(xᵢ)log(yⱼ)=88.3696.

1. [Figure: scatter plot of pain threshold against obesity.]
2. Linear model
x̄=62.9, s_x²=588.09.
ȳ=9.22, s_y²=11.0056.
s_xy=73.832.
Regression line of pain threshold on obesity: y=1.3232+0.1255x.
r²=0.8422.

Logarithmic model
Mean of log(x): 4.0412, s_{log(x)}²=0.2366.
s_{log(x),y}=1.4973.
Logarithmic model of pain threshold on obesity: y=-16.3578+6.3293log(x).
r²=0.8611.
Using the logarithmic model, y(50)=8.4023.
3. Exponential model of obesity on pain threshold: x=e^(2.7868+0.1361y).
x(10)=63.2648.
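
Both models in this solution can be computed from the sums supplied in the exercise, without going back to the raw data. The sketch below assumes n=10 observations, natural logarithms and divisor-n variances; the variable names are illustrative. It shows that the logarithmic model of Y on X and the exponential model of X on Y share the same covariance between log(x) and y, and differ only in which variance divides it.

```python
import math

# Sums given in Exercise 3 (X = obesity, Y = pain threshold), n = 10 assumed.
n = 10
sum_logx = 40.4121        # sum of log(x_i)
sum_y = 92.2              # sum of y_j
sum_logx2 = 165.6795      # sum of log(x_i)^2
sum_y2 = 960.14           # sum of y_j^2
sum_logx_y = 387.5728     # sum of log(x_i) * y_j

mean_u = sum_logx / n                         # mean of log(x)
mean_y = sum_y / n
var_u = sum_logx2 / n - mean_u ** 2           # variance of log(x)
var_y = sum_y2 / n - mean_y ** 2
cov_uy = sum_logx_y / n - mean_u * mean_y     # covariance of log(x) and y

# Logarithmic model of Y on X: y = a + b*log(x).
b = cov_uy / var_u
a = mean_y - b * mean_u

# Exponential model of X on Y: log(x) = c + d*y, that is x = exp(c + d*y).
d = cov_uy / var_y
c = mean_u - d * mean_y

# Both models share the same coefficient of determination.
r2 = cov_uy ** 2 / (var_u * var_y)

pain_at_50 = a + b * math.log(50)
obesity_at_10 = math.exp(c + d * 10)

print(f"y = {a:.4f} + {b:.4f} log(x), r2 = {r2:.4f}")
print(f"x = exp({c:.4f} + {d:.4f} y)")
print(f"y(50) = {pain_at_50:.4f}, x(10) = {obesity_at_10:.4f}")
```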

Exercise 4

A blood bank keeps plasma at a temperature of 0ºF. When it is required for a blood transfusion, it is heated in an oven at a constant temperature of 120ºF. In an experiment, the temperature of the plasma was measured at different times during heating. The results are in the table below.

| Time (min) | 5 | 8 | 15 | 25 | 30 | 37 | 45 | 60 |
|---|---|---|---|---|---|---|---|---|
| Temperature (ºF) | 25 | 50 | 86 | 102 | 110 | 114 | 118 | 120 |

  1. Draw the scatter plot. Which type of regression model do you think better explains the relationship between temperature and time?
  2. Which transformation should we apply to the variables to obtain a linear relationship?
  3. Compute the logarithmic regression of temperature on time.
  4. According to the logarithmic model, what will the temperature of the plasma be after 15 minutes of heating? Is this prediction reliable? Justify your answer.

Use the following sums (X=Time and Y=Temperature): ∑xᵢ=225 min, ∑log(xᵢ)=24.5289 log(min), ∑yⱼ=725 ºF, ∑log(yⱼ)=35.2051 log(ºF), ∑xᵢ²=8833 min², ∑log(xᵢ)²=80.4703 log²(min), ∑yⱼ²=74345 ºF², ∑log(yⱼ)²=157.1023 log²(ºF), ∑xᵢyⱼ=24393 min⋅ºF, ∑xᵢlog(yⱼ)=1048.0142 min⋅log(ºF), ∑log(xᵢ)yⱼ=2431.7096 log(min)⋅ºF, ∑log(xᵢ)log(yⱼ)=111.1165 log(min)⋅log(ºF).

  1. [Figure: scatter plot of temperature against time.] A logarithmic model.
  2. Apply a logarithmic transformation to time: z=log(x).
  3. z̄=3.0661 log(min), s_z²=0.6577 log²(min).
    ȳ=90.625 ºF, s_y²=1080.2344 ºF².
    s_zy=26.0969 log(min)⋅ºF.
    Logarithmic model of temperature on time: y=-31.0325+39.6781log(x).
  4. y(15)=76.4176 ºF.
    r²=0.9586, which is close to 1, so the prediction is reliable.
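
To see why the logarithmic transformation of time is the natural choice here, one can compare the coefficient of determination before and after transforming. The following sketch assumes natural logarithms and divisor-n variances; the helper `moments` and the variable names are illustrative, not part of the original exercise.

```python
import math

# Data from Exercise 4.
time_min = [5, 8, 15, 25, 30, 37, 45, 60]
temp_f = [25, 50, 86, 102, 110, 114, 118, 120]


def moments(xs, ys):
    """Return (mean_x, mean_y, var_x, var_y, cov_xy) with divisor n."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    var_y = sum(y * y for y in ys) / n - mean_y ** 2
    cov_xy = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    return mean_x, mean_y, var_x, var_y, cov_xy


# Untransformed time: r2 of the plain linear model.
_, _, var_x, var_y, cov_xy = moments(time_min, temp_f)
r2_linear = cov_xy ** 2 / (var_x * var_y)

# Transformed time z = log(x): logarithmic model y = a + b*log(x).
z = [math.log(t) for t in time_min]
mean_z, mean_y, var_z, var_y, cov_zy = moments(z, temp_f)
slope = cov_zy / var_z
intercept = mean_y - slope * mean_z
r2_log = cov_zy ** 2 / (var_z * var_y)

temp_at_15 = intercept + slope * math.log(15)

print(f"r2 linear = {r2_linear:.4f}, r2 logarithmic = {r2_log:.4f}")
print(f"y = {intercept:.4f} + {slope:.4f} log(x)")
print(f"y(15) = {temp_at_15:.4f} F")
```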

Exercise 5

The activity of a radioactive substance depends on time according to the data in the table below.

| t (hours) | 0 | 10 | 20 | 30 | 40 | 50 | 60 | 70 |
|---|---|---|---|---|---|---|---|---|
| A (10⁷ disintegrations/s) | 25.9 | 8.16 | 2.57 | 0.81 | 0.25 | 0.08 | 0.03 | 0.01 |

  1. Plot the radioactivity as a function of time. Which type of regression model better explains the relationship between radioactivity and time?
  2. Plot the radioactivity as a function of time on semi-logarithmic paper.
  3. Compute the regression line of the logarithm of radioactivity on time.
  4. Taking into account that radioactive decay follows the formula \[ A(t) = A_0 e^{-\lambda t} \] where A_0 is the number of disintegrations at the beginning and λ is a disintegration constant, different for each radioactive substance, use the slope of the previous regression line to compute the disintegration constant of the substance.

Use the following sums (X=Time and Y=Radioactivity): ∑xᵢ=280 hours, ∑yⱼ=37.81 10⁷ disintegrations/s, ∑log(yⱼ)=-5.9371 log(10⁷ disintegrations/s), ∑xᵢ²=14000 hours², ∑yⱼ²=744.7265 (10⁷ disintegrations/s)², ∑log(yⱼ)²=57.7369 log²(10⁷ disintegrations/s), ∑xᵢyⱼ=173.8 hours⋅10⁷ disintegrations/s, ∑xᵢlog(yⱼ)=-680.9447 hours⋅log(10⁷ disintegrations/s).

  1. [Figure: scatter plot of radioactivity against time.]
  2. [Figure: scatter plot of the logarithm of radioactivity against time.]
  3. Let z=log(y). x̄=35 hours, s_x²=525 hours².
    z̄=-0.7421 log(10⁷ disintegrations/s), s_z²=6.6664 log²(10⁷ disintegrations/s).
    s_xz=-59.1434 hours⋅log(10⁷ disintegrations/s).
    Regression line of logarithm of radioactivity on time: z=3.2008-0.1127x.
  4. The slope of the regression line is -λ, so λ=0.1127 hours⁻¹.
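
Taking logarithms of the decay formula gives log A(t) = log A_0 - λt, so the slope of the regression line of log(A) on t estimates -λ and the intercept estimates log A_0. A minimal sketch of that calculation, assuming natural logarithms and divisor-n variances (the names `decay_constant` and `initial_activity` are illustrative, and the estimate of A_0 is a by-product not requested in the exercise):

```python
import math

# Data from Exercise 5.
hours = [0, 10, 20, 30, 40, 50, 60, 70]
activity = [25.9, 8.16, 2.57, 0.81, 0.25, 0.08, 0.03, 0.01]  # 10^7 disintegrations/s

n = len(hours)
z = [math.log(a) for a in activity]          # z = log(A), the semi-log scale

mean_x = sum(hours) / n
mean_z = sum(z) / n
var_x = sum(x * x for x in hours) / n - mean_x ** 2
cov_xz = sum(x * v for x, v in zip(hours, z)) / n - mean_x * mean_z

# Regression line z = intercept + slope * x.
slope = cov_xz / var_x
intercept = mean_z - slope * mean_x

# log A(t) = log A0 - lambda * t, so lambda = -slope and A0 = exp(intercept).
decay_constant = -slope                      # per hour
initial_activity = math.exp(intercept)       # in 10^7 disintegrations/s

print(f"z = {intercept:.4f} + ({slope:.4f}) x")
print(f"lambda = {decay_constant:.4f} per hour")
print(f"estimated A0 = {initial_activity:.2f} x 10^7 disintegrations/s")
```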

Exercise 6

For oscillations of small amplitude, the oscillation period T of a pendulum is given by the formula \[ T = 2\pi\sqrt{\frac{L}{g}} \] where L is the length of the pendulum and g is the acceleration due to gravity. To check whether this formula holds, an experiment was conducted in which the oscillation period was measured for different lengths of the pendulum. The measurements are shown in the table below.

| Length L (cm) | 52.5 | 68.0 | 99.0 | 116.0 | 146.0 |
|---|---|---|---|---|---|
| Period T (s) | 1.449 | 1.639 | 1.999 | 2.153 | 2.408 |

  1. Plot the period against the length of the pendulum.
    Does a linear model fit the point cloud well?
  2. Plot the period against the length on logarithmic paper. Which type of model fits the point cloud better?
  3. Compute the regression line of the logarithm of the period on the logarithm of the length.
  4. Taking into account the independent term of the previous regression line, compute the value of g.

1. [Figure: scatter plot of period against length.] The linear model fits the point cloud well.
2. [Figure: scatter plot of the logarithm of period against the logarithm of length.] The model that best fits the point cloud on logarithmic paper is linear, which corresponds to a power model in the original variables.
3. Let X be the logarithm of the length and Y the logarithm of the period.
x̄=4.5025 log(cm), s_x²=0.1353 log²(cm).
ȳ=0.6407 log(s), s_y²=0.0339 log²(s).
s_xy=0.0677 log(cm)⋅log(s).
Regression line of Y on X: y=-1.6132+0.5006x.
4. g=994.4579 cm/s².
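
The step from the independent term of the regression line to g can be written out as follows. This is a sketch of the derivation, assuming natural logarithms and writing the fitted line as y = a + bx with x = log L and y = log T; the symbols a and b are introduced here for illustration. The fitted intercept is negative, about -1.6132.

```latex
\[
\begin{aligned}
T &= 2\pi\sqrt{\frac{L}{g}}
  \;\Longrightarrow\;
  \log T = \log\frac{2\pi}{\sqrt{g}} + \frac{1}{2}\log L, \\
a &= \log\frac{2\pi}{\sqrt{g}}, \qquad b = \frac{1}{2}, \\
g &= \frac{4\pi^{2}}{e^{2a}} = 4\pi^{2}e^{-2a}
  \approx 4\pi^{2}e^{3.2264} \approx 994\ \text{cm/s}^{2}
  \quad (a \approx -1.6132).
\end{aligned}
\]
```

The fitted slope, 0.5006, is very close to the theoretical value 1/2, which supports the formula, and the resulting g is close to the usual value of about 981 cm/s².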

Exercise 7

A study tries to determine the relationship between two substances X and Y in blood. The concentrations of these substances have been measured in seven individuals (in μg/dl) and the results are shown in the table below.

| X | 2.1 | 4.9 | 9.8 | 11.7 | 5.9 | 8.4 | 9.2 |
|---|---|---|---|---|---|---|---|
| Y | 1.3 | 1.5 | 1.7 | 1.8 | 1.5 | 1.7 | 1.7 |

  1. Are Y and X linearly related?
  2. Are Y and X related by a power (potential) model?
  3. Use the better of the previous regression models to predict the concentration in blood of Y for x=8 μg/dl. Is this prediction reliable? Justify your answer.

Use the following sums: ∑xᵢ=52 μg/dl, ∑log(xᵢ)=13.1955 log(μg/dl), ∑yⱼ=11.2 μg/dl, ∑log(yⱼ)=3.253 log(μg/dl), ∑xᵢ²=451.36 (μg/dl)², ∑log(xᵢ)²=26.9397 log²(μg/dl), ∑yⱼ²=18.1 (μg/dl)², ∑log(yⱼ)²=1.5878 log²(μg/dl), ∑xᵢyⱼ=86.57 (μg/dl)², ∑xᵢlog(yⱼ)=26.3463 μg/dl⋅log(μg/dl), ∑log(xᵢ)yⱼ=21.7087 log(μg/dl)⋅μg/dl, ∑log(xᵢ)log(yⱼ)=6.5224 log²(μg/dl).

1. x̄=7.4286 μg/dl, s_x²=9.2963 (μg/dl)².
ȳ=1.6 μg/dl, s_y²=0.0257 (μg/dl)².
s_xy=0.4814 (μg/dl)².
Linear relation: r²=0.9696, which is close to 1, so there is a strong linear relation.
2. Let u=log(x) and v=log(y).

ū=1.8851 log(μg/dl), s_u²=0.295 log²(μg/dl).
v̄=0.4647 log(μg/dl), s_v²=0.0109 log²(μg/dl).
s_uv=0.0558 log²(μg/dl).
Power (potential) relation: r²=0.9688, which is close to 1, so there is a strong power relation, although the linear relation is slightly stronger.
3. Regression line of Y on X: y=1.2153+0.0518x.
y(8)=1.6296 μg/dl. The prediction is reliable since the linear coefficient of determination is close to 1.
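
The comparison between the linear and the power model can be checked from the raw data: the power model is a straight line between log(x) and log(y), so both fits reduce to the same moment calculation. A minimal sketch, assuming natural logarithms and divisor-n variances (the helper `moments` is an illustrative name):

```python
import math

# Data from Exercise 7 (concentrations in μg/dl).
x = [2.1, 4.9, 9.8, 11.7, 5.9, 8.4, 9.2]
y = [1.3, 1.5, 1.7, 1.8, 1.5, 1.7, 1.7]


def moments(xs, ys):
    """Return (mean_x, mean_y, var_x, var_y, cov_xy) with divisor n."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum(a * a for a in xs) / n - mean_x ** 2
    var_y = sum(b * b for b in ys) / n - mean_y ** 2
    cov_xy = sum(a * b for a, b in zip(xs, ys)) / n - mean_x * mean_y
    return mean_x, mean_y, var_x, var_y, cov_xy


# Linear model: y = a + b*x.
mean_x, mean_y, var_x, var_y, cov_xy = moments(x, y)
r2_linear = cov_xy ** 2 / (var_x * var_y)
slope = cov_xy / var_x
intercept = mean_y - slope * mean_x

# Power model: log(y) = a' + b'*log(x), fitted on u = log(x), v = log(y).
u = [math.log(a) for a in x]
v = [math.log(b) for b in y]
_, _, var_u, var_v, cov_uv = moments(u, v)
r2_power = cov_uv ** 2 / (var_u * var_v)

# Prediction with the model that has the higher r2 (the linear one here).
y_at_8 = intercept + slope * 8

print(f"r2 linear = {r2_linear:.4f}, r2 power = {r2_power:.4f}")
print(f"y = {intercept:.4f} + {slope:.4f} x, y(8) = {y_at_8:.4f}")
```

The two coefficients of determination differ only in the third decimal place, so with seven observations the choice between the models is a close call.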
