Problems of Descriptive Statistics

Exercise 1

The numbers of injuries suffered by the members of a soccer team during a league were:

0 1 2 1 3 0 1 0 1 2 0 1 1 1 2 0 1 3 2 1 2 1 0 1

Calculate the following statistics and interpret them.

  1. Mean.
  2. Median.
  3. Mode.
  4. Quartiles.
  5. 32nd percentile.

  1. $\bar{x} = 1.125$ injuries.
  2. $Me = 1$ injury.
  3. $Mo = 1$ injury.
  4. $Q_1 = 1$ injury, $Q_2 = 1$ injury and $Q_3 = 2$ injuries.
  5. $P_{32} = 1$ injury.
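The first three answers can be checked with a minimal sketch using Python's standard `statistics` module. The variable names are illustrative only. Quartiles and percentiles are left out on purpose, because their values depend on the interpolation convention, which the exercise does not state.

```python
from statistics import mean, median, mode

# Injuries per player, copied from the exercise statement.
injuries = [0, 1, 2, 1, 3, 0, 1, 0, 1, 2, 0, 1,
            1, 1, 2, 0, 1, 3, 2, 1, 2, 1, 0, 1]

x_bar = mean(injuries)    # arithmetic mean
me = median(injuries)     # middle value of the sorted sample
mo = mode(injuries)       # most frequent value

print(f"mean={x_bar}, median={me}, mode={mo}")
```

The three values agree with answers 1 to 3 above.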

Exercise 2

The chart below shows the cumulative distribution of the time (in min) required by 66 students to do an exam.

[Figure: cumulative distribution of the time (in min) required to finish the exam]
  1. By what time have half of the students finished? And 90% of the students?
  2. What percentage of students have finished after 100 minutes?
  3. Which value best represents the time required by the students in the sample to finish the exam? Is this value representative or not?

  1. $Me = 94.62$ min. $P_{90} = 132$ min.
  2. 57.08% of students.
  3. $\bar{x} = 85.9091$ min, $s = 37.5268$ min and $cv = 0.4368$.

Exercise 3

In a study about children’s growth, two samples were drawn, one of newborn babies and the other of one-year-old infants. The heights in cm of the children in each sample were:

Newborn children: 51 50 51 53 49 50 53 50 47 50 
One year old children: 62 65 69 71 65 66 68 69

In which group is the mean more representative? Justify your answer.

Newborn children: $\bar{x} = 50.4$ cm, $s_x = 1.6852$ cm and $cv_x = 0.0334$.
One year old children: $\bar{y} = 66.875$ cm, $s_y = 2.7128$ cm and $cv_y = 0.0406$.
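A short sketch of how these figures can be reproduced, assuming the standard deviation divides by n (the population formula), which is the convention that matches the values above. The helper name `summary` is hypothetical.

```python
from statistics import mean, pstdev

def summary(sample):
    """Return (mean, population standard deviation, coefficient of variation)."""
    m = mean(sample)
    s = pstdev(sample)           # divides by n, not n - 1
    return m, s, s / abs(m)

# Heights in cm, copied from the exercise statement.
newborn = [51, 50, 51, 53, 49, 50, 53, 50, 47, 50]
one_year = [62, 65, 69, 71, 65, 66, 68, 69]

for label, sample in [("Newborn", newborn), ("One year old", one_year)]:
    m, s, cv = summary(sample)
    print(f"{label}: mean={m:.4f} cm, s={s:.4f} cm, cv={cv:.4f}")
```

The group with the smaller coefficient of variation is the one whose mean is more representative.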

Exercise 4

To determine the accuracy of a method for measuring hematocrit in blood, the measurement was repeated 8 times on the same blood sample. The results of hematocrit in plasma, in percentage, were

42.2 42.1 41.9 41.8 42 42.1 41.9 42

What do you think about the accuracy of the method?

$\bar{x} = 42$ %, $s = 0.1225$ % and $cv = 0.0029$.

Exercise 5

The histogram below shows the frequency distribution of the body mass index (BMI) of a group of people by gender.

[Figure: histograms of the body mass index (BMI) by gender]
  1. Draw the pie chart for the gender.
  2. In which group is the mean BMI more representative?
  3. Calculate the mean for the whole sample.

Use the following sums.
Females: $\sum x_i = 1160$ kg/m², $\sum x_i^2 = 29050$ kg²/m⁴.
Males: $\sum x_i = 1002.5$ kg/m², $\sum x_i^2 = 22781.25$ kg²/m⁴.

  1. [Figure: pie chart of gender]
  2. Females: $\bar{x} = 24.1667$ kg/m², $s_x = 4.6022$ kg/m² and $cv_x = 0.1904$.
    Males: $\bar{y} = 22.2778$ kg/m², $s_y = 3.1545$ kg/m² and $cv_y = 0.1416$.
  3. $\bar{z} = 23.2527$ kg/m².
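The computation from the given sums can be sketched as follows. The group sizes (48 females, 45 males) are not in the text: they are an assumption inferred from the reported means, since 1160 / 24.1667 ≈ 48 and 1002.5 / 22.2778 ≈ 45. In the exercise they would be read from the histogram. The function name is hypothetical.

```python
from math import sqrt

def stats_from_sums(n, sum_x, sum_x2):
    """Mean, population standard deviation and cv from n, sum(x) and sum(x^2)."""
    mean = sum_x / n
    variance = sum_x2 / n - mean ** 2   # mean of squares minus squared mean
    sd = sqrt(variance)
    return mean, sd, sd / abs(mean)

# ASSUMPTION: sample sizes inferred from the reported means, not given in the text.
n_f, n_m = 48, 45

females = stats_from_sums(n_f, 1160, 29050)
males = stats_from_sums(n_m, 1002.5, 22781.25)

# Mean of the whole sample: total sum over total size.
overall_mean = (1160 + 1002.5) / (n_f + n_m)

print(females, males, overall_mean)
```

Note that the overall mean is a size-weighted average of the two group means, not their simple average.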

Exercise 6

The following table represents the frequency distribution of ages at which a group of people suffered a heart attack.

Age (years)   Persons
[40,50)       6
[50,60)       12
[60,70)       23
[70,80)       19
[80,90)       5

Could we assume that the sample comes from a normal population?

Use the following sums: $\sum x_i = 4275$ years, $\sum (x_i - \bar{x})^2 = 7461.5385$ years², $\sum (x_i - \bar{x})^3 = -18248.5207$ years³, $\sum (x_i - \bar{x})^4 = 2099635.8671$ years⁴.

$g_1 = -0.2283$ and $g_2 = -0.5487$.
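The coefficients of skewness and kurtosis can be recomputed from the table with a minimal sketch. It assumes each interval is represented by its midpoint and that moments divide by n, with $g_1 = m_3 / s^3$ and $g_2 = m_4 / s^4 - 3$. These are the conventions that reproduce the sums given in the exercise. The function name `central_moment` is illustrative.

```python
from math import sqrt

# Class midpoints stand in for the ages inside each interval.
midpoints = [45, 55, 65, 75, 85]
freqs = [6, 12, 23, 19, 5]

n = sum(freqs)
mean = sum(x * f for x, f in zip(midpoints, freqs)) / n

def central_moment(k):
    """k-th central moment of the grouped data (divides by n)."""
    return sum(f * (x - mean) ** k for x, f in zip(midpoints, freqs)) / n

s = sqrt(central_moment(2))
g1 = central_moment(3) / s ** 3        # coefficient of skewness
g2 = central_moment(4) / s ** 4 - 3    # coefficient of (excess) kurtosis

print(f"n={n}, mean={mean:.4f}, s={s:.4f}, g1={g1:.4f}, g2={g2:.4f}")
```

Both coefficients come out negative: the distribution is slightly skewed to the left and somewhat flatter than a normal curve.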

Exercise 7

To compare two rehabilitation treatments A and B for an injury, every treatment was applied to a different group of people. The number of days required to cure the injury in each group is shown in the following table:

Days     A    B
20-40    5    8
40-60    20   15
60-80    18   20
80-100   7    7

  1. In which treatment is the mean more representative?
  2. In which treatment is the distribution of days more skewed?
  3. In which treatment is the distribution more peaked?

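Use the following sums.
A: $\sum x_i = 3040$ days, $\sum (x_i - \bar{x})^2 = 14568$ days², $\sum (x_i - \bar{x})^3 = 17011.2$ days³, $\sum (x_i - \bar{x})^4 = 9989602.56$ days⁴.
B: $\sum y_j = 3020$ days, $\sum (y_j - \bar{y})^2 = 16992$ days², $\sum (y_j - \bar{y})^3 = -42393.6$ days³, $\sum (y_j - \bar{y})^4 = 12551516.16$ days⁴.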

  1. A: $\bar{a} = 60.8$ days, $s_a = 17.0693$ days and $cv_a = 0.2807$.
    B: $\bar{b} = 60.4$ days, $s_b = 18.4347$ days and $cv_b = 0.3052$.
  2. $g_{1a} = 0.0684$ and $g_{1b} = -0.1353$.
  3. $g_{2a} = -0.6465$ and $g_{2b} = -0.8264$, so the distribution for treatment A is more peaked than that for treatment B, since $g_{2a} > g_{2b}$.
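As a worked check of part 3, the kurtosis coefficients follow directly from the given sums. This assumes $g_2 = m_4 / s^4 - 3$ with moments that divide by $n$, and $n = 50$ patients per treatment (the column totals of the table).

```latex
\begin{aligned}
g_{2a} &= \frac{\frac{1}{n}\sum (x_i - \bar{x})^4}{\left(\frac{1}{n}\sum (x_i - \bar{x})^2\right)^{2}} - 3
        = \frac{9989602.56 / 50}{(14568 / 50)^2} - 3
        = \frac{199792.0512}{84890.6496} - 3 \approx -0.6465, \\
g_{2b} &= \frac{\frac{1}{n}\sum (y_j - \bar{y})^4}{\left(\frac{1}{n}\sum (y_j - \bar{y})^2\right)^{2}} - 3
        = \frac{12551516.16 / 50}{(16992 / 50)^2} - 3
        = \frac{251030.3232}{115491.2256} - 3 \approx -0.8264.
\end{aligned}
```

Both values are negative, so both distributions are flatter than a normal curve. Treatment A is the less flat of the two.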

Exercise 8

The systolic blood pressures (in mmHg) of a sample of people are:

135 128 137 110 154 142 121 127 114 103
  1. Calculate the central tendency statistics.
  2. What is the relative dispersion with respect to the mean?
  3. What is the skewness of the sample distribution?
  4. What is the kurtosis of the sample distribution?
  5. If we know that the method used for measuring the blood pressure is biased and that, in order to get the right values, we have to apply the linear transformation $y = 1.2x - 5$, what are the values of the statistics in parts 1 to 4 for the new, corrected distribution?

Use the following sums: $\sum x_i = 1271$ mmHg, $\sum (x_i - \bar{x})^2 = 2188.9$ mmHg², $\sum (x_i - \bar{x})^3 = 2764.32$ mmHg³, $\sum (x_i - \bar{x})^4 = 1040079.937$ mmHg⁴.

  1. $\bar{x} = 127.1$ mmHg, $Me = 127.5$ mmHg, $Mo$: all the values.
  2. $s = 14.7949$ mmHg and $cv = 0.1164$.
  3. $g_1 = 0.0854$.
  4. $g_2 = -0.8292$.
  5. $\bar{y} = 147.52$ mmHg, $Me = 148$ mmHg, $Mo$: all the values, $s = 17.7539$ mmHg, $cv = 0.1203$, $g_1 = 0.0854$ and $g_2 = -0.8292$.
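The effect of the linear transformation can be verified numerically with a small sketch. It assumes moments that divide by n, and the helper name `shape` is hypothetical. Under $y = 1.2x - 5$ the mean and median are transformed like the data, the standard deviation is multiplied by 1.2, and the skewness and kurtosis coefficients are unchanged because the slope is positive.

```python
from statistics import mean, median, pstdev

def shape(sample):
    """Return (mean, median, sd, cv, g1, g2) using divide-by-n moments."""
    n = len(sample)
    m = mean(sample)
    s = pstdev(sample)
    g1 = sum((v - m) ** 3 for v in sample) / n / s ** 3
    g2 = sum((v - m) ** 4 for v in sample) / n / s ** 4 - 3
    return m, median(sample), s, s / abs(m), g1, g2

# Measured pressures in mmHg, copied from the exercise statement.
x = [135, 128, 137, 110, 154, 142, 121, 127, 114, 103]
y = [1.2 * v - 5 for v in x]    # corrected pressures

stats_x = shape(x)
stats_y = shape(y)
print(stats_x)
print(stats_y)
```

The coefficient of variation does change, because the shift of 5 mmHg moves the mean without affecting the spread.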

Exercise 9

The table below shows the frequency distributions of the number of pregnancies, abortions and births in a sample of 999 women in a city.

Num   Pregnancies   Abortions   Births
0     61            751         67
1     64            183         80
2     328           51          400
3     301           10          300
4     122           2           90
5     81            2           62
6     29            0           0
7     11            0           0
8     2             0           0

  1. How many outliers are there in the number of births?
  2. Which variable has the lowest spread relative to its mean?
  3. Which value is relatively higher, 7 pregnancies or 4 abortions? Justify your answer.

Use the following sums.
Pregnancies: $\sum x_i = 2783$, $\sum x_i^2 = 9773$.
Abortions: $\sum y_j = 333$, $\sum y_j^2 = 559$.
Births: $\sum z_k = 2450$, $\sum z_k^2 = 7370$.

  1. 129 outliers.
  2. Pregnancies: $\bar{x} = 2.7858$, $s_x = 1.422$ and $cv_x = 0.5105$.
    Abortions: $\bar{y} = 0.3333$, $s_y = 0.6697$ and $cv_y = 2.009$.
    Births: $\bar{z} = 2.4525$, $s_z = 1.1674$ and $cv_z = 0.476$.
  3. Standard score of 7 pregnancies is 2.9635, and standard score of 4 abortions is 5.4754.
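The outlier count and the standard scores can be reproduced with the sketch below. Two conventions are assumptions, since the exercise does not state them: outliers are values outside the fences $Q_1 - 1.5\,IQR$ and $Q_3 + 1.5\,IQR$, and a quantile is the smallest value whose cumulative frequency reaches the requested proportion. The function names are illustrative.

```python
from math import sqrt

values = range(9)   # 0 to 8 events per woman
pregnancies = [61, 64, 328, 301, 122, 81, 29, 11, 2]
abortions = [751, 183, 51, 10, 2, 2, 0, 0, 0]
births = [67, 80, 400, 300, 90, 62, 0, 0, 0]

def mean_sd(freqs):
    """Mean and population standard deviation of a frequency distribution."""
    n = sum(freqs)
    m = sum(v * f for v, f in zip(values, freqs)) / n
    var = sum(v * v * f for v, f in zip(values, freqs)) / n - m ** 2
    return m, sqrt(var)

def quantile(freqs, p):
    """Smallest value whose cumulative frequency reaches p * n (assumed convention)."""
    n, cum = sum(freqs), 0
    for v, f in zip(values, freqs):
        cum += f
        if cum >= p * n:
            return v

# Outliers in births: values outside the 1.5 * IQR fences (assumed rule).
q1, q3 = quantile(births, 0.25), quantile(births, 0.75)
low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
outliers = sum(f for v, f in zip(values, births) if v < low or v > high)

# Standard scores make values from different variables comparable.
m_p, s_p = mean_sd(pregnancies)
m_a, s_a = mean_sd(abortions)
z_preg = (7 - m_p) / s_p
z_abort = (4 - m_a) / s_a

print(outliers, round(z_preg, 4), round(z_abort, 4))
```

The larger standard score identifies the value that is relatively higher within its own distribution.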