Assignment I for Biostats Course VHM 801 at AVC - Fall semester 2022

The assignment is worth 10% of the final course mark. Please be aware that by handing in the home assignment you implicitly acknowledge that you have read and accepted the instructions for home assignments as described on the VHM 801 homepage.

This assignment is based on data collected on 443 cattle at the time that they entered a feedlot for 'fattening' prior to slaughter. The data consist of demographic information plus readings obtained from an ultrasonic evaluation of the animal. Ultrasound measurements of backfat thickness, ribeye area and the percentage of intramuscular fat were obtained. The objective of the study was to determine if ultrasound examination of the animal at the time of entry into the feedlot was able to predict final carcass grade (A, AA, or AAA, where AAA is the highest grade and A the lowest grade in terms of price). Carcass grade depends primarily on the amount of 'marbling' (intramuscular fat in the loin region) in the carcass at the time of slaughter. The data were compiled from the 'beef_ultra' dataset of the textbook Dohoo et al. (2009), Veterinary Epidemiologic Research, 2nd ed., by omitting some breed categories and farms only sparsely represented in the original data, but including the same selection of variables:

Our use of these data for the home assignment is unrelated to their use in the textbook and in the paper: Keefe et al. (2004), Ultrasonic imaging of marbling at feedlot entry as a predictor of carcass quality grade, Canadian Journal of Animal Science, 84, 165-170. However, the paper gives additional details and background information about the data, even if the present description should suffice. The dataset is available in Minitab format and as a comma-separated file, for import into Stata and other statistical software.

The home assignment has three questions, all of which should be answered.

  1. First, briefly describe the variable type of all the variables in the dataset, e.g. using one or several of the descriptors: nominal, ordinal, discrete, continuous. Next, select four variables in the dataset: two quantitative variables, one categorical variable (with more than two categories), and one dichotomous (or binary) variable. Apart from this restriction on the variable types you are free to select the variables as you want. Carry out a descriptive analysis of your four selected variables including both a graphical representation and descriptive statistics. Choose the graphical representation and the statistics you find most useful to show each of the distributions, in consideration of the variable's type and range of values. Where appropriate, comment specifically on the distribution's center, spread and shape, as well as potential outliers. If you note potential outliers, include also an assessment of whether these should be considered truly outlying observations, in the sense that they don't really belong to the distribution, or whether the values should be considered as part of the distribution.

  2. For each of your selected two quantitative variables, examine further whether it would seem reasonable to assume the data to be normally distributed. Describe carefully the tools you use for this, and how you arrive at your conclusions. If you conclude that a variable is not normally distributed, describe how its distribution seems to differ from a normal distribution. Explore also (briefly) whether the square-root or log-transformed (natural or base 10) values can be approximated better by a normal distribution.

  3. For this last question we consider the variable implant and its possible association with carcass grade and weight (some information about the use of hormone growth implants can be found here). Other studies have discussed how hormonal implants affect growth and carcass grade. Let us pretend that we would want to use the present data to discuss or evaluate this question.

    First, should the present study be considered observational or experimental? If you think the study is an experiment, describe the experimental design and discuss how the randomization of the use of hormonal implants might have been done (or should have been done). Make sure to consider the role of farms in the study design.

    If, on the other hand, you think the study is observational, use a diagram similar to those used in the course (Session 2) to discuss whether any association between implant and carcass characteristic(s) in these data is most likely a causal effect or may have been caused or influenced by some lurking variable(s). Make any suggestions for lurking variables as specific as possible. As you would not have the information or tools to draw definitive conclusions about causal effects, the focus in your answer should be on the discussion and explanation of plausible scenarios.
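
As an illustration of the kind of check asked for in Question 2, the following is a minimal sketch only, not a prescribed method. It uses Python's standard library rather than Minitab or Stata, and it runs on synthetic right-skewed numbers, not on the assignment data. The variable name backfat_like, the seed, and the choice of skewness and the correlation of the normal Q-Q points as summaries are all assumptions made for this example. In practice the Q-Q points would normally be plotted and judged by eye.

```python
import math
import random
from statistics import NormalDist, mean


def skewness(values):
    """Moment-based sample skewness; close to 0 for symmetric data."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def qq_points(values):
    """Pairs (theoretical normal quantile, ordered observation) for a Q-Q plot."""
    n = len(values)
    ordered = sorted(values)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n are one common convention among several.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def qq_correlation(values):
    """Pearson correlation of the Q-Q points; values near 1 suggest normality."""
    points = qq_points(values)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic, right-skewed stand-in values (hypothetical, NOT the assignment data).
rng = random.Random(801)
backfat_like = [rng.lognormvariate(1.0, 0.5) for _ in range(443)]

# The raw values and the three transformations mentioned in Question 2.
candidates = {
    "raw": backfat_like,
    "sqrt": [math.sqrt(v) for v in backfat_like],
    "ln": [math.log(v) for v in backfat_like],
    "log10": [math.log10(v) for v in backfat_like],
}

# For each version of the variable: (skewness, Q-Q correlation).
summary = {name: (skewness(vals), qq_correlation(vals))
           for name, vals in candidates.items()}

for name, (skew, r) in summary.items():
    print(f"{name:>5}: skewness = {skew:6.2f}, Q-Q correlation = {r:.4f}")
```

Note that the natural and base 10 logarithms differ only by a constant factor, so they give the same skewness and the same Q-Q correlation; the choice between them affects the scale of the values, not the shape of the distribution.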


Henrik Stryhn (hstryhn@upei.ca) 2022-09-29