Assignment III for Biostats Course VHM 801 at AVC - Fall semester 2024

The assignment is worth either 10% or 15% of the final course mark. Questions 1-3 constitute the 10% version, whereas Questions 1-5 constitute the 15% version. You must choose the version (percentage) you did not choose for Home assignment II. Please be aware that by handing in the home assignment you implicitly acknowledge that you have read and accepted the instructions for home assignments as described on the VHM 801 homepage.

The characteristics of diagnostic tests for diseases or conditions in subjects (e.g. animals or humans) are important aids to interpreting test results. The most commonly used characteristics are the (diagnostic) test sensitivity and test specificity. The test sensitivity is the chance (probability) that a truly diseased subject tests positive, whereas the test specificity is the chance (probability) that a truly non-diseased subject tests negative. Ideally, a test should have both high sensitivity and high specificity, but this may be difficult to achieve in practice. Thus it is important to know these characteristics; for example, if a test has a low specificity (say 0.80), it will give a fairly high rate of false positives (in the example, 20% of truly non-diseased subjects).
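
As a minimal illustration of these definitions, the Python sketch below computes sensitivity and specificity from the four cells of a test-by-true-status table. The function name and the counts are hypothetical and chosen only for illustration; they are not the assignment data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the cells of a 2x2 table.

    tp, fn: test-positive / test-negative counts among truly diseased subjects
    tn, fp: test-negative / test-positive counts among truly non-diseased subjects
    """
    sensitivity = tp / (tp + fn)   # P(test positive | truly diseased)
    specificity = tn / (tn + fp)   # P(test negative | truly non-diseased)
    return sensitivity, specificity


# Hypothetical counts (not the ISA data).
se, sp = sensitivity_specificity(tp=45, fn=5, tn=160, fp=40)
print(f"sensitivity = {se:.2f}, specificity = {sp:.2f}")
print(f"false-positive rate among non-diseased = {1 - sp:.2f}")
```

With these made-up counts the specificity is 0.80, which corresponds to the 20% false-positive rate mentioned above.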

The data for the assignment were collected as part of Dr. Carol McClure's PhD project at AVC. Several diagnostic tests for infectious salmon anemia (ISA) were evaluated using fish sampled from New Brunswick producers. Note that the different diagnostic tests were applied to the same fish. For the production season from April 1999 to January 2000, detailed information on clinical ISA outbreaks in the sampled cages was available, and this information was used as a reference ("gold standard") for the evaluation of the tests. That is, fish from cages that experienced clinical outbreaks in the period were considered clinically infected (i.e., truly diseased), and fish from cages that did not experience clinical outbreaks were considered clinically non-infected (i.e., truly non-diseased). We consider here only the data for two of the diagnostic tests: histopathology on fish tissues and an indirect fluorescent antibody technique (IFAT). For the histopathology test, we consider samples classified as either "suspect" or "positive" as positive. For the IFAT test, we consider all results reported as 1+ to 4+ on a fluorescence intensity scale as positive. The IFAT test was carried out at two diagnostic laboratories, but for the data shown here the results have been combined (by the rule that the test outcome was considered positive if at least one laboratory reported a positive result).

The table below shows the counts of fish for the different combinations of clinical status and test outcomes. A disease-positive status/result is indicated by a "1", and a disease-negative status/result by a "0".

                                    Clinical status
IFAT result    Histopath result        0        1
     0                0              332       12
     0                1              126       18
     1                0               18       31
     1                1                6       98

A dataset corresponding to the table is available in Minitab format and as a comma-separated file, for import into Stata and other statistical software.

The home assignment has five questions (1-5) of which you need to answer either 1-3 or 1-5, as described above. Recall that all answers should be accompanied by text explaining the procedures used; in particular, all statistical models and assumptions should be specified and motivated/justified.

Question 1.
For disease status defined by clinical infection, estimate for each of the diagnostic tests the sensitivity and specificity, and give also corresponding interval estimates. In conclusion, do these calculations suggest one of the two tests to be superior to the other one? (Note: You are not expected to conduct any additional analysis to discuss this question.)
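
The question does not prescribe a particular interval method. As one possible choice, the sketch below computes a Wilson score interval for a binomial proportion such as a sensitivity or specificity. The function name and the counts are hypothetical (not the ISA data); an exact (Clopper-Pearson) interval from statistical software would be an equally reasonable alternative.

```python
import math


def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion (default z gives 95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: 45 of 50 truly diseased subjects test positive.
low, high = wilson_interval(successes=45, n=50)
print(f"estimated sensitivity = {45 / 50:.2f}, 95% interval = ({low:.3f}, {high:.3f})")
```

Whatever method is used, it should be named and briefly justified in the answer.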

Question 2.
The outcomes of two diagnostic tests may be independent or dependent (correlated). The most direct way to assess dependence is from the table of overall counts of positive and negative outcomes by the two tests. Construct this table for the ISA data and the IFAT and histopathology tests, and use a significance test to assess whether the outcomes of the two tests are dependent. Draw conclusions, and discuss what your finding actually tells us, in practical terms, about the two tests. For this discussion, it may be useful to think about what we could (should) conclude if two tests were independent in this sense in a population containing both truly positive and truly negative subjects.
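
One possible significance test for such a cross-tabulation is Pearson's chi-square test of independence; the choice of test and its justification remain part of the answer. The sketch below is a minimal stdlib-only version for a 2x2 table, using hypothetical counts (not the ISA data) and a hypothetical function name. It relies on the fact that the upper tail of a chi-square distribution with 1 degree of freedom equals erfc(sqrt(x / 2)).

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    table[i][j] = number of samples with test A result i and test B result j.
    Returns (statistic, p_value); no continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows = test A (0, 1), columns = test B (0, 1).
table = [[60, 20],
         [15, 45]]
statistic, p_value = chi_square_2x2(table)
print(f"chi-square = {statistic:.2f}, P = {p_value:.2g}")
```

The chi-square approximation assumes that the expected counts are not too small; otherwise an exact test is usually preferred.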

Question 3.
Another, and potentially more useful, way of defining independence or dependence considers the performance of the two tests separately in the subpopulations of truly positive and negative subjects (this form of dependence is often called conditional (in)dependence). For the ISA data, use significance tests to assess whether the two tests are conditionally independent in each of these two subpopulations. Draw conclusions, and discuss what your finding actually tells us, in practical terms, about the two tests. For this discussion, it may be useful to think about what we could (should) conclude if there was a strong conditional dependence between two tests.
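
Conditional (in)dependence is assessed by repeating a test of independence within each true-status subpopulation. Because some cells may be small within a subpopulation, an exact test is one option. The sketch below implements a two-sided Fisher exact test and applies it to two strata; the function name, the stratum labels and all counts are hypothetical (not the ISA data).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # P(top-left cell = x) given the fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    # The small relative tolerance guards against floating-point ties.
    p_value = sum(prob(x) for x in range(x_min, x_max + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical counts per stratum, ordered as (a, b, c, d):
# a = both tests negative, b = A negative / B positive,
# c = A positive / B negative, d = both tests positive.
strata = {
    "truly positive": (4, 6, 10, 40),
    "truly negative": (165, 20, 12, 3),
}
for name, cells in strata.items():
    print(f"{name}: P = {fisher_exact_two_sided(*cells):.3f}")
```

A small P-value in a stratum would suggest that the two tests are conditionally dependent in that subpopulation.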

Question 4.
In studies with multiple tests, it is often of interest to assess their agreement when applied to the same samples. The so-called kappa statistic can be used to quantify agreement (specifically, "beyond chance agreement") between two dichotomous ratings or tests. The procedure is described below in terms of the agreement between two tests called A and B.

  1. From a cross-tabulation of the results of the two tests, determine the observed proportion of samples in agreement (i.e., either both testing negative or both testing positive).
  2. Determine the "expected agreement" (the proportion of samples expected to agree) if the outcomes of tests A and B were independent. (Hint: Compute - separately - the expected proportion of samples where both tests are negative, and the expected proportion of samples testing positive on both tests.)
  3. Finally compute the kappa statistic as:
    [observed agreement (from 1.) - expected agreement (from 2.)] / [1 - expected agreement (from 2.)].
  4. Interpret the value of kappa according to the following commonly used (but not uncontroversial) guidelines (Landis JR, Koch GG (1977), The measurement of observer agreement for categorical data, Biometrics 33:159-174):
       below 0.00     poor agreement
       0.00 - 0.20    slight agreement
       0.21 - 0.40    fair agreement
       0.41 - 0.60    moderate agreement
       0.61 - 0.80    substantial agreement
       0.81 - 1.00    almost perfect agreement

For each of the two subpopulations of clinically infected and clinically non-infected fish, compute the kappa statistic for the agreement between the IFAT and histopathology tests, and draw conclusions about their agreement. The kappa statistic is available in standard statistical software (e.g., Minitab and Stata), but you also need to demonstrate the calculation using the steps above (you may, however, limit the demonstration to one of the subpopulations).
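
Steps 1-3 above can be sketched in a few lines of Python. The function name and the counts below are hypothetical (not the ISA data); the sketch is meant only to show how the observed and expected agreement enter the formula.

```python
def kappa_2x2(table):
    """Kappa statistic for two dichotomous tests A and B.

    table[i][j] = number of samples with test A result i and test B result j
    (0 = negative, 1 = positive).
    """
    n = sum(sum(row) for row in table)
    # Step 1: observed agreement (both negative or both positive).
    observed = (table[0][0] + table[1][1]) / n
    # Step 2: expected agreement if A and B were independent.
    a_neg = (table[0][0] + table[0][1]) / n   # proportion negative on test A
    b_neg = (table[0][0] + table[1][0]) / n   # proportion negative on test B
    expected = a_neg * b_neg + (1 - a_neg) * (1 - b_neg)
    # Step 3: agreement beyond chance, relative to its maximum possible value.
    return (observed - expected) / (1 - expected)


# Hypothetical counts: rows = test A (0, 1), columns = test B (0, 1).
table = [[40, 10],
         [5, 45]]
kappa = kappa_2x2(table)
print(f"kappa = {kappa:.2f}")
```

For these made-up counts the observed agreement is 0.85 and the expected agreement is 0.50, so that kappa = 0.35 / 0.50 = 0.70.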

Question 5.
Another question of potential interest is whether the proportion of positive results is the same for two dichotomous tests when applied to the same samples. A statistical answer to that question uses McNemar's test (described in Session 7). For this part, you are allowed to use the McNemar test result (i.e., its P-value) obtained from standard statistical software without demonstrating the calculation, but you are also welcome to include a demonstration if you want.
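
For those who wish to include a demonstration, the sketch below shows one version of McNemar's test: the exact (binomial) form, which uses only the two discordant counts. The function name and the counts are hypothetical (not the ISA data), and statistical software may instead report the chi-square form of the test, which can give a slightly different P-value.

```python
from math import comb


def mcnemar_exact(b, c):
    """Exact two-sided McNemar p-value from the two discordant counts.

    b = samples positive on test A only, c = samples positive on test B only.
    Under the null hypothesis of equal proportions of positive results,
    b follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0
    # Probability of a result at least as extreme in the smaller direction.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2**n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts for one subpopulation.
p_value = mcnemar_exact(b=5, c=15)
print(f"McNemar exact P = {p_value:.4f}")
```

Samples on which the two tests agree do not enter the calculation, because they contribute equally to both proportions of positive results.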

Carry out statistical comparisons of the proportions of positive results by the IFAT and histopathology tests in each of the two above subpopulations (i.e., clinically infected and clinically non-infected fish), and draw conclusions. In an overall conclusion, summarize and discuss what your findings in Questions 4-5 tell us, in practical terms, about the two tests and their performance.


Henrik Stryhn (hstryhn@upei.ca) 2024-10-30