Etd

Bayesian Predictive Inference for a Study Variable Without Specifying a Link to the Covariates

Public Deposited

We perform Bayesian predictive inference of a finite population mean for a study variable without specifying a link between the study variable and covariates, thereby overcoming some limitations of traditional regression analysis. Not specifying the relationship between the response variable and the covariates adds flexibility and robustness to our models and allows for more applications. For real applications, we account for three effects (spatial, heterogeneity, and clustering) simultaneously. We explore several multinomial-Dirichlet models with a stick-breaking representation on the mean vector to address a polychotomous regression problem. We also address a continuous regression problem by including a spatial component in our Bayesian hierarchical model. Finally, we demonstrate a solution to the binary predictive inference problem while also incorporating a clustering stick-breaking prior. We illustrate all of these models with an application to body mass index (BMI) data.

First, to avoid defining the relationship between the study variable and covariates, we use a hierarchical Bayesian multinomial-Dirichlet model with a stick-breaking representation on the mean vector to make inference about the characteristics of a finite population. This type of model lets us make inference about the study variable without estimating regression coefficients, unlike the logistic regression model, which we use as a baseline for comparison. The multinomial-Dirichlet model uses polychotomous data from a contingency table rather than the binary data used by logistic regression. We explore several versions of the multinomial-Dirichlet model: an unordered and ungrouped model, an ordered and grouped model, a pooled area model, and a model with survey weights included. All versions use the same general setup, but each is modified slightly to incorporate new techniques.
The unordered and ungrouped model performs well compared to the logistic regression model, yielding a tighter population prediction interval and requiring fewer assumptions. We also show how survey weights can be included in our unordered and ungrouped model. The ordered and grouped model is introduced to reduce the number of parameters drawn in the Gibbs sampler. Here, ordering refers to the cells of the multinomial table, with adjustments to the Dirichlet prior. While the ordered and grouped multinomial-Dirichlet model does speed up computation by requiring fewer parameters to be drawn in the Gibbs sampler, this model depends almost entirely on the data, resulting in a tighter prediction interval. Lastly, the pooled area model is used to introduce small area estimation techniques.

Second, while we avoid specifying the parametric relationship between the study variable and covariates, we illustrate the advantage of including a spatial component to better account for the covariates in our models when making Bayesian predictive inference. We treat each unique covariate combination as an individual stratum, then use small area estimation techniques to make inference about the finite population mean of the continuous response variable. The two spatial models used are the conditional autoregressive (CAR) and simple conditional autoregressive (SCAR) models. We include the spatial effects by creating the incidence matrix via the Mahalanobis distance between covariates. We also show how to incorporate survey weights into the spatial models when dealing with probability survey data. We compare the spatial models with two non-spatial models: the Scott-Smith model and the Battese, Harter, and Fuller model. Our goal is for neighboring strata to yield similar predictions and for strata that are not neighbors to differ more.
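
As a rough illustration of building neighborhood relationships from the Mahalanobis distance between covariates, the following Python sketch treats each row of a covariate table as one stratum and marks two strata as neighbors when their distance falls below a cutoff. The function name `mahalanobis_adjacency`, the coded covariate values, and the threshold rule are all assumptions made for illustration; the abstract does not state how the distance is converted into the incidence matrix.

```python
import numpy as np


def mahalanobis_adjacency(covariates, threshold):
    """Return pairwise Mahalanobis distances and a 0/1 neighbor matrix.

    Assumption: two strata are neighbors when their distance is at most
    `threshold`. This cutoff rule is illustrative, not taken from the thesis.
    """
    X = np.asarray(covariates, dtype=float)
    # Sample covariance of the covariates (columns are variables).
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    # Pseudo-inverse guards against a singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)
    # diff[i, j, :] is the covariate difference between strata i and j.
    diff = X[:, None, :] - X[None, :, :]
    # Quadratic form diff' * cov_inv * diff for every pair (i, j).
    d2 = np.einsum("ijk,kl,ijl->ij", diff, cov_inv, diff)
    dist = np.sqrt(np.maximum(d2, 0.0))
    adjacency = (dist <= threshold).astype(int)
    np.fill_diagonal(adjacency, 0)  # a stratum is not its own neighbor
    return dist, adjacency


# Hypothetical strata: each row is one unique covariate combination,
# e.g. a coded age group and a binary indicator.
strata = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1]]
dist, adjacency = mahalanobis_adjacency(strata, threshold=2.0)
neighbor_counts = adjacency.sum(axis=1)
```

A matrix of this kind could then serve as the neighborhood structure of a CAR-type prior; the choice of cutoff controls how many strata share information.
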
Ultimately, the spatial models show less global pooling than the non-spatial models, which was the desired outcome.

Third, to gain robustness, we combine spatial, heterogeneity, and clustering components. After finding success with the CAR model, we join this spatial model with a clustering stick-breaking prior to gain more information from the covariates. The main advantage of including a stick-breaking component in our model is that the number of clusters of the strata is determined by the algorithm and is subject to change at every iteration of the blocked Gibbs sampler. This is unlike the spatial component, in which neighborhood relationships are defined manually before the Gibbs sampler is run. Allowing the number of clusters of the strata to fluctuate gives the model and the data the opportunity to select more relevant clusters. We compare our spatial stick-breaking model to the Fay-Herriot model, which includes covariates directly in the model. The results show that the spatial stick-breaking model outperforms the Fay-Herriot model in both accuracy and precision.

We address concerns with traditional regression analysis by providing multiple Bayesian hierarchical models that incorporate covariates and allow inference about a study variable without the need to estimate regression coefficients. Our successful results pave the way for further extensions of alternative regression models. The progression from the multinomial-Dirichlet model to the spatial CAR and SCAR models reduces the amount of global pooling seen in our predictions. Then, by adding a clustering component via the stick-breaking prior in our binary spatial model, we are able to extract even more information from the covariates without directly including them in the model. Our models expand the scope of applications we can explore with minimal assumptions compared to traditional regression models.
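
To make the stick-breaking idea concrete, here is a minimal Python sketch of one draw from a truncated stick-breaking prior, followed by a random assignment of strata to clusters. It is not the blocked Gibbs sampler described in the abstract: the concentration parameter `alpha`, the truncation level, the number of strata, and both function names are assumptions chosen only for illustration.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw truncated stick-breaking weights w_k = v_k * prod_{j<k} (1 - v_j).

    Each v_k is Beta(1, alpha); the last break takes the whole remaining
    stick so that the weights sum to one.
    """
    if alpha <= 0 or truncation < 1:
        raise ValueError("alpha must be positive and truncation at least 1")
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a cluster
    for k in range(truncation):
        v = 1.0 if k == truncation - 1 else rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


def assign_clusters(weights, n_strata, rng):
    """Assign each stratum to a cluster with probability given by the weights."""
    return rng.choices(range(len(weights)), weights=weights, k=n_strata)


rng = random.Random(2023)  # fixed seed so the sketch is reproducible
weights = stick_breaking_weights(alpha=1.0, truncation=10, rng=rng)
labels = assign_clusters(weights, n_strata=20, rng=rng)
n_occupied = len(set(labels))  # clusters actually used in this draw
```

Because `n_occupied` depends on the random draw, the number of occupied clusters can differ from one draw to the next, which mirrors the way the number of clusters is allowed to change across sampler iterations.
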

Identifier
  • etd-104401
Year
  • 2023
Date created
  • 2023-04-23
Source
  • etd-104401
Last modified
  • 2023-08-10

Permanent link to this page: https://digital.wpi.edu/show/dj52w8259