Paper 1

Question 1

1a)i) Hint 1: draw a tree diagram with surviving/not surviving the first month as its first branches

1a)i) Hint 2: continue tree diagram with branches for reaching adulthood/not reaching adulthood

1a)i) Hint 3: continue tree diagram with branches for returning to wild/not returning to wild

1a)i) Hint 4: for required probability, it will be calculated along three branches of your tree diagram

1a)ii) Hint 5: for required probability, it will also be calculated along three branches of your tree diagram

1b) Hint 6: recognise that this is a conditional probability, conditional upon them not reaching adulthood
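
The branch multiplication and the conditional probability can be sketched in Python. This is only an illustration: the three branch probabilities are made-up placeholders rather than the values from the question, and the conditioned event ("did not survive the first month") is an assumed example.

```python
from fractions import Fraction

# Placeholder branch probabilities (NOT the values from the question).
p_month = Fraction(1, 2)   # survives the first month
p_adult = Fraction(2, 5)   # reaches adulthood, given it survived the first month
p_wild = Fraction(4, 5)    # returns to the wild, given it reached adulthood

# Multiply along the three branches of one path through the tree.
p_returned = p_month * p_adult * p_wild

# "Not reaching adulthood" covers two paths: not surviving the first month,
# or surviving the first month but then not reaching adulthood.
p_not_adult = (1 - p_month) + p_month * (1 - p_adult)

# Conditional probability: P(A | B) = P(A and B) / P(B).
# A = "did not survive the first month" is an illustrative choice of event.
p_early_given_not_adult = (1 - p_month) / p_not_adult
```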

Question 2

2a) Hint 1: draw out a table for the five stated values of x, with a constant probability for all of them

2a) Hint 2: know that all the probabilities should sum to 1, if this is to be a valid random variable

2b)i) Hint 3: use standard formulae for E(X) and V(X)

2b)i) Hint 4: you need to work out E(X²) to then calculate V(X)

2b)ii) Hint 5: use your answer from part (b)(i) for the value of μ
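
A minimal sketch of the standard formulae E(X) = Σxp and V(X) = E(X²) − [E(X)]². The five values of x used here are placeholders, not necessarily those in the question, and `mean_and_variance` is a hypothetical helper name.

```python
from fractions import Fraction

def mean_and_variance(values, probs):
    """Return (E(X), V(X)) for a discrete random variable."""
    assert sum(probs) == 1, "probabilities must sum to 1"
    e_x = sum(x * p for x, p in zip(values, probs))
    e_x2 = sum(x * x * p for x, p in zip(values, probs))
    return e_x, e_x2 - e_x ** 2

values = [1, 2, 3, 4, 5]        # placeholder values of x
probs = [Fraction(1, 5)] * 5    # constant probability k, with 5k = 1
mu, var = mean_and_variance(values, probs)
```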

Question 3

3a)i) Hint 1: draw on your real-world knowledge of how exam marks are, or are not, made public

3a)ii) Hint 2: for a random sample method, be sure to mention how random numbers are used

3a)ii) Hint 3: be sure to communicate as much detail as possible, being clear how your chosen method minimises the travelling distance

3b) Hint 4: recognise that you have two samples of non-paired data

3b) Hint 5: know that this is a t-test for a difference in population means

3b) Hint 6: know that this will involve pooling the samples to obtain the best estimate for the (equal) standard deviations

3b) Hint 7: make it clear what level of significance you have chosen for your test

3b) Hint 8: be sure to state your conclusion in terms of the context
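
The pooled two-sample t statistic can be sketched as below, assuming two independent samples from populations with equal variances. The function name `pooled_t` is hypothetical, and the result still has to be compared with the tabulated critical value at your chosen significance level.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample_1, sample_2):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t-test."""
    n1, n2 = len(sample_1), len(sample_2)
    # Pooled estimate of the common population variance.
    s2_pooled = ((n1 - 1) * variance(sample_1)
                 + (n2 - 1) * variance(sample_2)) / (n1 + n2 - 2)
    t = (mean(sample_1) - mean(sample_2)) / sqrt(s2_pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2
```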

Question 4

4a) Hint 1: standard process to calculate lower and upper fences: Q1 - 1.5×(Q3 - Q1) and Q3 + 1.5×(Q3 - Q1)
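
As a quick check of the fence formulae, a short Python sketch. The quartile values in the usage line are placeholders, not the data from the question.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) from the lower and upper quartiles."""
    iqr = q3 - q1  # interquartile range
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

lower, upper = fences(10, 20)  # placeholder quartiles
```

Any data value below the lower fence or above the upper fence would be flagged as a possible outlier.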

4b)i) Hint 2: recognise that the shoe/height data set is paired data

4b)ii) Hint 3: know what type of graph ought to be used to display paired data

4b)ii) Hint 4: know what sort of processes/tests could be used with a scatterplot of data

Question 5

5a)i) Hint 1: note that the sample size, n = 5

5a)i) Hint 2: state the mean and standard deviation of X̄ based upon the mean and standard deviation of X

5a)i) Hint 3: use a process similar to calculating 1σ, 2σ and 3σ limits, but use the number 6 instead
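
One way to sketch the limits, assuming they are taken as μ ± kσ/√n for the sample mean X̄ (with k = 6 here). The function name and the values in the usage line are illustrative only.

```python
from math import sqrt

def sigma_limits(mu, sigma, n, k):
    """Return (lower, upper) k-sigma limits for the mean of a sample of size n."""
    se = sigma / sqrt(n)  # standard deviation of the sample mean
    return mu - k * se, mu + k * se

lower, upper = sigma_limits(mu=100, sigma=5, n=5, k=6)  # placeholder mu and sigma
```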

5a)ii) Hint 4: use the provided control chart to identify the first point that is higher than the 6σ limit line

5b)i) Hint 5: if the process is in control, then it should be equally likely to be either above or below the centre line

5b)i) Hint 6: recognise that this is a Binomial distribution with n = 9 and p = ½

5b)i) Hint 7: be aware that the 9 points could either be all above, or all below, the centre line
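
The calculation described in Hints 5 to 7 can be sketched as follows, assuming the nine points are independent and each is above the centre line with probability ½.

```python
from fractions import Fraction

# P(all 9 points above the centre line), with p = 1/2 for each point.
p_all_above = Fraction(1, 2) ** 9

# All above OR all below: two equally likely, mutually exclusive cases.
p_run_of_nine = 2 * p_all_above
```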

5b)ii) Hint 8: scrutinise the provided control chart to identify which point(s) meet the stated conditions

Question 6

Hint 1: recognise that we have a total of 260 random variables, all being added together

Hint 2: know that the formula for the variance of the total requires all 260 random variables to be independent of each other

Hint 3: you should not be squaring 130, as it's not V(130X) but rather V(X₁ + X₂ + … + X₁₃₀)
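
The distinction in Hint 3 can be written out as follows, assuming the 130 variables are independent and share the same variance V(X):

```latex
\begin{aligned}
V(X_1 + X_2 + \dots + X_{130}) &= V(X_1) + V(X_2) + \dots + V(X_{130}) = 130\,V(X), \\
V(130X) &= 130^2\,V(X).
\end{aligned}
```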

Question 7

7a) Hint 1: know that a confidence interval is based upon the assumption that the population is normally distributed

7a) Hint 2: know that by estimating the population variance, we will therefore be using a t-distribution

7a) Hint 3: know that the t-distribution will have 17 degrees of freedom
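
A minimal sketch of a t-based confidence interval for a mean. The 95% level is an assumption (the hints do not state it), the sample mean and standard deviation are placeholders, and the critical value is read from tables rather than computed.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Return (lower, upper) limits of xbar ± t_crit * s / sqrt(n)."""
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width

# 17 degrees of freedom corresponds to n = 18; 2.110 is the tabulated
# two-sided 95% critical value of t with 17 degrees of freedom.
low, high = t_interval(xbar=6.0, s=0.5, n=18, t_crit=2.110)  # placeholder xbar, s
```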

7b)i) Hint 4: determine if the value 5.87 is within the confidence interval from part (a)

7b)i) Hint 5: communicate the meaning of the location of 5.87, relative to the confidence interval

7b)ii) Hint 6: consider whether there are any other possible influencing factors that might affect the result

Question 8

8a) Hint 1: know that 4 is the variance, so the standard deviation is not 4

8b) Hint 2: again, know that 4 is the variance, so the standard deviation is not 4

8c) Hint 3: use standard formulae to calculate E(Y) and V(Y), which then allow you to find SD(Y)

8c) Hint 4: draw a diagram of a uniform distribution to help visualise the interval required and the constant value of the probability density over that interval
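
For a continuous uniform distribution on [a, b], the standard results are E(Y) = (a + b)/2 and V(Y) = (b − a)²/12, and the probability of an interval is its length times the constant density 1/(b − a). A sketch with hypothetical helper names and placeholder end points:

```python
from fractions import Fraction

def uniform_summary(a, b):
    """Return (E(Y), V(Y)) for Y uniform on [a, b]."""
    a, b = Fraction(a), Fraction(b)
    return (a + b) / 2, (b - a) ** 2 / 12

def uniform_prob(a, b, lo, hi):
    """Return P(lo <= Y <= hi) for Y uniform on [a, b]."""
    a, b = Fraction(a), Fraction(b)
    # Clip the requested interval to the support [a, b].
    lo, hi = max(Fraction(lo), a), min(Fraction(hi), b)
    return max(hi - lo, Fraction(0)) / (b - a)

mean_y, var_y = uniform_summary(0, 12)  # placeholder end points
```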

Question 9

9a) Hint 1: consider the practical issues that might occur when dividing land into square grids and what may or may not be in each grid

9a) Hint 2: provide a practical suggestion, based upon common sense

9b) Hint 3: recognise that we do not know the type of distribution of X, but that we do know E(X) and V(X)

9b) Hint 4: notice that we have a sample size of 25, which is greater than 20

9b) Hint 5: recognise that this will require the Central Limit Theorem

9b) Hint 6: state that the distribution is approximately normal, and what its parameters are
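
In symbols, writing μ = E(X) and σ² = V(X), the Central Limit Theorem statement for the mean of a sample of size 25 is:

```latex
\bar{X} \approx N\!\left(\mu,\ \frac{\sigma^2}{25}\right)
```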

9c) Hint 7: realise that we could assume the standard deviation here to be the same as that from part (b)

9c) Hint 8: recognise that we now have a single sample z-test of a mean
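
A single-sample z-test can be sketched as below. The upper-tail alternative is an assumption (use the tail that matches the hypotheses in the question), and all numbers in the usage line are placeholders.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu_0, sigma, n):
    """Return (z statistic, upper-tail p-value) for H0: mu = mu_0."""
    z = (xbar - mu_0) / (sigma / sqrt(n))
    p_upper = 1 - NormalDist().cdf(z)
    return z, p_upper

z, p_value = z_test(xbar=52, mu_0=50, sigma=5, n=25)  # placeholder values
```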

Question 10

10a) Hint 1: recognise that we have 20 repetitions of an action, each with a probability of success of 0.2

10b) Hint 2: from part (a) we know we have a binomial distribution, so this part requires you to first specify the new binomial distribution

10b) Hint 3: note that the 'appropriate approximation' means that you need to approximate a binomial with a normal

10b) Hint 4: as we are going from a discrete distribution to a continuous distribution, we need to use continuity correction
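
The normal approximation with a continuity correction can be sketched as follows: for X ~ B(n, p), P(X ≤ k) is approximated by P(Y ≤ k + 0.5), where Y ~ N(np, np(1 − p)). The function name and the values of n, p and k in the usage line are placeholders.

```python
from math import sqrt
from statistics import NormalDist

def binomial_cdf_normal_approx(k, n, p):
    """Approximate P(X <= k) for X ~ B(n, p) using a continuity correction."""
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return approx.cdf(k + 0.5)

prob = binomial_cdf_normal_approx(k=20, n=100, p=0.2)  # placeholder values
```

For P(X ≥ k), the correction goes the other way: use 1 − P(Y ≤ k − 0.5).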

10c) Hint 5: the word 'association' is a clue to perform a chi-squared test, along with the obvious table of 2 rows and 2 columns

10c) Hint 6: be sure to state your conclusion in terms of the context
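
A sketch of the chi-squared statistic for a contingency table, where each expected frequency is (row total × column total) / grand total. No continuity correction is applied here; check whether your course expects one for a 2 × 2 table. The function name is hypothetical.

```python
def chi_squared_statistic(table):
    """Return the chi-squared test statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat
```

For a table with 2 rows and 2 columns the statistic is compared with a chi-squared distribution with (2 − 1)(2 − 1) = 1 degree of freedom.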

Question 11

11a) Hint 1: know that a random variable for a proportion stems from a random variable for a simple count, which is binomial

11a) Hint 2: therefore the binomial needs to be approximated by a normal distribution, to allow a division to occur to give a proportion

11a) Hint 3: know the expression for the standard deviation of this normal distribution, as it is not given in the formula booklet
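
The chain of reasoning in these three hints can be summarised as follows, where X is the count, n the sample size and p the population proportion:

```latex
X \sim B(n, p) \approx N\big(np,\ np(1-p)\big)
\quad\Longrightarrow\quad
\hat{p} = \frac{X}{n} \approx N\!\left(p,\ \frac{p(1-p)}{n}\right),
\qquad
\mathrm{SD}(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}
```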

11b) Hint 4: recognise that this part requires you to 'work backwards' from a condition, to a sample size

11b) Hint 5: know that for a majority to support a claim, you need to be confident that more than 50% support it

11b) Hint 6: realise that we are seeking to ensure that the entire confidence interval is located above 50%

11b) Hint 7: proceed by setting the lower limit of the confidence interval equal to the proportion 0.5, to obtain a lower bound for the sample size
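
Working backwards can be sketched as follows: requiring p̂ − z√(p̂(1 − p̂)/n) > 0.5 rearranges to n > z²p̂(1 − p̂)/(p̂ − 0.5)². The function name, the sample proportion and the z value below are placeholders, not values from the question.

```python
from fractions import Fraction
from math import floor

def min_sample_size(p_hat, z):
    """Smallest n for which the lower confidence limit exceeds 0.5 (needs p_hat > 0.5)."""
    # Exact rational arithmetic avoids rounding trouble at the boundary.
    p_hat, z = Fraction(str(p_hat)), Fraction(str(z))
    bound = z ** 2 * p_hat * (1 - p_hat) / (p_hat - Fraction(1, 2)) ** 2
    return floor(bound) + 1  # n must be strictly greater than the bound

n_needed = min_sample_size(p_hat=0.6, z=1.96)  # placeholder values
```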

Question 12

12a) Hint 1: know that a Mann Whitney test requires the assumption that the populations have the same shape and same spread, and that it is not paired data

12b)i) Hint 2: realise that with m = 20 and n = 20 we can use the data booklet tables, and that the situation is a 'symmetrical' one due to equal sample sizes

12b)i) Hint 3: perform a standard Mann Whitney test, using the provided rank sum value of 480
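
A small sketch of the rank-sum bookkeeping, assuming the tables give lower-tail critical values so that the smaller of the two rank sums is the one to compare. With m = n = 20, all 40 ranks add up to 40 × 41 / 2 = 820.

```python
def other_rank_sum(w, m, n):
    """Return the rank sum of the other sample, given rank sum w for one sample."""
    total = (m + n) * (m + n + 1) // 2  # sum of the ranks 1, 2, ..., m + n
    return total - w

m = n = 20
w = 480                                   # rank sum provided in the question
smaller = min(w, other_rank_sum(w, m, n))
```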

12b)ii) Hint 4: be sure to explain the different conclusion in terms of the context of the study

12b)iii) Hint 5: think of a practical step that could be done to improve the quality of the random sample, so that it is more representative of the population

Question 13

13a) Hint 1: make a comment about where most of the dots are, and how they are aligned

13a) Hint 2: make a second comment about the outlier dot and what it represents in terms of the data set

13b) Hint 3: use a standard process and formulae from the data booklet to obtain the value of r from the summary statistics provided

13b) Hint 4: square the value of r to obtain R² and know what this means in terms of a percentage of variation explained by the regression line
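
A sketch of the calculation from summary statistics, using r = Sxy / √(Sxx × Syy). The numbers in the usage lines are placeholders, not those provided in the question.

```python
from math import sqrt

def correlation(s_xy, s_xx, s_yy):
    """Return the product moment correlation coefficient r."""
    return s_xy / sqrt(s_xx * s_yy)

r = correlation(s_xy=6, s_xx=4, s_yy=16)  # placeholder summary statistics
r_squared = r ** 2  # proportion of variation explained by the regression line
```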

13c) Hint 5: note that the interval being asked for is not for a mean, but rather for an individual case

13c) Hint 6: proceed with calculating a prediction interval derived from the transformed x value of 2.5441 being substituted into the regression line equation

13c) Hint 7: note that the sample size, n = 13, which means that the t-distribution will have 11 degrees of freedom

13c) Hint 8: know that the resulting prediction interval is for the transformed number of species, and therefore its values need to be converted back to simply the number of species
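
A sketch of a prediction interval for an individual response at x₀, assuming the usual form ŷ ± t × s × √(1 + 1/n + (x₀ − x̄)²/Sxx) for a fitted line y = a + bx. All parameter names are illustrative, and the critical value comes from tables (for example, 2.201 for a 95% interval with 11 degrees of freedom, if that is the level asked for).

```python
from math import sqrt

def prediction_interval(a, b, x0, xbar, s_xx, s, n, t_crit):
    """Return (lower, upper) limits for an individual response at x0."""
    y_hat = a + b * x0  # fitted value on the transformed scale
    half_width = t_crit * s * sqrt(1 + 1 / n + (x0 - xbar) ** 2 / s_xx)
    return y_hat - half_width, y_hat + half_width
```

Both limits are still on the transformed scale, so each one then needs the inverse of whatever transformation the question applied.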
