Assignment III for Biostats Course VHM 801 at AVC - Fall semester 2023

The assignment is worth 15% of the final course mark. Please be aware that by handing in the home assignment you implicitly acknowledge that you have read and accepted the instructions for home assignments as described on the VHM 801 homepage.

Using survey data from the Atlantic Seniors' Housing Research Alliance, a study was carried out to examine the current social support needs of Atlantic Canadian seniors living in their home, and explore how these needs relate to age and gender (sex). In this research, social support was defined as the "emotional and informational resources that persons perceive to be available or that are actually provided to them by nonprofessionals in the context of both formal support groups and informal helping relationships". The premise of the research question is that social support is an important factor for how long seniors can live in their home or in the community, as opposed to going to a nursing home.

Participation was restricted to community-dwelling adults 65 years of age and over. Community-dwelling was defined as not living in an institutional setting such as a nursing home, prison, or hospital. Participants were selected randomly in each of the four provinces (Prince Edward Island, Nova Scotia, New Brunswick, and Newfoundland and Labrador). For this assignment we will concentrate on the answers received to one of the social support survey questions, in which the respondent was asked to indicate how often the following kind of support was available to her/him: "Someone to do things with to help you get your mind off things". The allowed response categories for the question were: "None of the time" (0), "A little of the time" (1), "Some of the time" (2), "Most of the time" (3), and "All of the time" (4). In the data, the corresponding variable is denoted support, with the response categories 0-4 as indicated. The data additionally contain the variables gender (0 = Male, 1 = Female) and age (0 = 65-69 years, 1 = 70-74 years, 2 = 75 years and above), as well as the counts of responses in the individual categories.

A data set is available in two layouts, as tabular counts (Minitab format, comma-separated value format) and as individual responses (Minitab format, csv format). Due to confidentiality restrictions on access to the real data, the data supplied is a simulated data set adapted to some of the characteristics of the real data. Disclaimer: From the nature of the random process involved in generating the data it follows that some features of the simulated data are not reflections of the real data.
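The relation between the two layouts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column name `count` and the helper names `expand` and `collapse` are hypothetical, and the counts shown are invented placeholders, not values from the supplied data set. The variable names `gender`, `age`, and `support` follow the coding described above.

```python
from collections import Counter

# Hypothetical tabular layout: one row per (gender, age, support) cell.
# The counts are invented placeholders, NOT values from the course data.
tabular = [
    {"gender": 0, "age": 0, "support": 3, "count": 2},
    {"gender": 1, "age": 2, "support": 4, "count": 3},
]


def expand(rows):
    """Turn count rows into one record per respondent."""
    individual = []
    for row in rows:
        record = {k: v for k, v in row.items() if k != "count"}
        # Repeat the record once for every respondent in the cell.
        individual.extend(dict(record) for _ in range(row["count"]))
    return individual


def collapse(records):
    """Turn individual records back into count rows, sorted by cell."""
    cells = Counter((r["gender"], r["age"], r["support"]) for r in records)
    return [
        {"gender": g, "age": a, "support": s, "count": n}
        for (g, a, s), n in sorted(cells.items())
    ]


individual = expand(tabular)
```

Both layouts carry the same information. The tabular counts are convenient for contingency-table analyses, while rank-based procedures in most software expect one row per respondent.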

The home assignment has five questions (a)-(e) which should all be answered. Recall that all solutions should be accompanied by text explaining the procedures used; in particular, all statistical models and assumptions should be specified and motivated/justified.

  (a) For each gender, tabulate the support responses and estimate the corresponding distribution (i.e., the proportions for the 5 response categories). Informally compare these distributions with the overall response distribution. Use statistical inference to assess whether the response distributions for women and men appear to be similar, and draw conclusions. If you detect dissimilarities, carefully describe their nature. Explain the statistical model and setup you use for the analysis, and motivate your choice.

  (b) For this question we will consider an alternative method of analysis to compare the responses for women and men. The focus here should not be on the entire distribution across the response categories, but instead on the ordered responses (represented by the values 0-4) and on whether one of the two distributions is systematically larger than the other. Use a nonparametric test (with many ties) to assess this, and draw conclusions. Compare your conclusion with the one from (a) and try to explain any differences in your findings. (Hint: the concept of one distribution being systematically larger than another is explained in Chapter 27 of PSLS and Chapter 15 of IPS.)

  (c) Turning now to the variable age, carry out an analysis similar to the one in (a) to compare the response distributions between the 3 age groups. Draw conclusions and try to describe any differences you may find between the groups.

  (d) As an addition to your analysis and conclusions in (a)-(c), discuss (briefly) the utility and validity of assessing the effects of gender and age separately. Specifically, can you describe or outline a scenario under which such separate analyses might be misleading? (Hint: It may be helpful to think of an example discussed in the course where analysis disregarding an additional factor led to misleading results, and to try to relate such an example to the actual data.)

  (e) Continue the analyses from (a)-(b) or (c) by further analysis, in order to assess whether your conclusions from above were consistent across age and gender groups, respectively. Specifically, you are requested to perform one of the following two analyses (only one of i. and ii. is required):
    i. For each age group separately, compare the gender distributions as above, and assess whether your findings are consistent across age groups and agree with the results from (a)-(b).
    ii. For each gender group separately, compare the age distributions as above, and assess whether your findings are consistent across gender groups and agree with the results from (c).
    Summarize your conclusions from all analyses into a final conclusion, worded non-technically, about the effects of age and gender on the responses to the support question.
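For readers who want to see the arithmetic behind the two kinds of test mentioned in the questions, the following Python sketch may help. It is not a prescribed solution and it assumes particular choices the assignment leaves open: a Pearson chi-square test for comparing whole distributions, and a Wilcoxon rank-sum test using midranks, a tie-corrected variance, and a normal approximation without continuity correction for the ordered responses. The function names are hypothetical, the chi-square tail formula shown is valid only for even degrees of freedom, and the counts are invented placeholders rather than course data.

```python
import math


def chi_square_homogeneity(table):
    """Pearson chi-square statistic and df for an r x c table of counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_tot) - 1)
    return stat, df


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def rank_sum_z(counts_x, counts_y):
    """Tie-corrected rank-sum z and two-sided p from ordered category counts."""
    n1, n2 = sum(counts_x), sum(counts_y)
    n = n1 + n2
    w = 0.0        # rank sum of group x, using midranks
    tie_sum = 0.0  # sum of t^3 - t over tied blocks
    below = 0      # number of observations in lower categories
    for cx, cy in zip(counts_x, counts_y):
        t = cx + cy
        if t:
            w += cx * (below + (t + 1) / 2)
            tie_sum += t ** 3 - t
            below += t
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_sum / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    return z, math.erfc(abs(z) / math.sqrt(2))


# Invented placeholder counts for support categories 0-4, NOT course data.
men = [10, 20, 30, 25, 15]
women = [5, 15, 30, 30, 20]

stat, df = chi_square_homogeneity([men, women])
p_chi2 = chi2_sf_even_df(stat, df)  # df = 4 here, so the closed form applies
z, p_rank = rank_sum_z(men, women)
```

The chi-square statistic ignores the ordering of the categories, whereas the rank-sum statistic uses only the ordering, which is one reason the two approaches can lead to different conclusions on the same table.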

Henrik Stryhn (hstryhn@upei.ca) 2013-11-01