Variable selection on multivariate normal data with the Bayesian Lasso

Hi,

I am a university student currently learning Bayesian statistics, and this is my first post here. I would like to ask for some support, please.

I intend to build a variable-selection model using the Bayesian Lasso approach on the publicly available wine quality data. The data set is available at this link: https://archive.ics.uci.edu/ml/datasets/wine+quality.

I am using Metropolis-Hastings sampling implemented in base R, so I hope you will forgive me for bringing this here. Please redirect me if it would fit better in the PyMC framework.

Regarding the Metropolis-Hastings sampler in base R: I am aware there are other methods, but this is what I would like to use for now.

My current aim is to get the chains to converge, which I monitor with the Gelman-Rubin statistic.

I used the following steps to try to reach convergence for all parameters:

1. I created the first proposal variance-covariance matrix (cov.prop) with random values on the diagonal, together with a scaling factor (s = 1).

2. I ran the sampler 3 times, 1000 samples each, and I checked the acceptance ratio in each cycle; I kept the acceptance ratio in range by adjusting the 's' scaling factor (see the sketch after this list).

3. After the fourth run, I created a new variance-covariance matrix from the last 1000 samples.

4. I repeated steps 2 and 3 many times.
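The tuning in step 2 is roughly the following (a minimal sketch; the 0.23 target and the multiplicative update are just one common choice on my part, not something taken from a reference):

# sketch of the scaling-factor tuning between blocks of 1000 iterations
# (assumption: aiming for an acceptance rate of roughly 0.23)
acc.rate <- (accepted.prop - accepted.prop.print) / 1000
if (mean(acc.rate) < 0.15) s <- s * 0.8   # accepting too rarely -> shrink the proposal
if (mean(acc.rate) > 0.35) s <- s * 1.2   # accepting too often  -> widen the proposal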

I expected the chains to converge; this happened only for the first 12 columns of ‘draws’, and after around 120,000 samples even those stopped converging. I would like to find out (a) why only some of the parameters were converging, and (b) why even those became non-converging again.

The code I used:

  • for the very first cov matrix:

# diagonal proposal covariance: diag() on a vector of random variances
# builds the 25 x 25 diagonal matrix (diag() applied to a matrix would only
# extract the diagonal as a vector, which rmvnorm() cannot use as a covariance)
cov.prop <- diag(rnorm(25, mean = 0.1, sd = 0.01))
s <- 1   # proposal scaling factor

  • for the updated cov matrix:

# average the last 1000 draws over the four chains and use their
# empirical covariance as the new proposal covariance
int <- (step1 - 1000):step1
M <- (draws[int, , 1] + draws[int, , 2] + draws[int, , 3] + draws[int, , 4]) / 4
cov.prop <- cov(M)
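One thing I am not certain about in step 3 (my own assumption, not something taken from a reference): whether the empirical covariance should also be rescaled. For random-walk Metropolis I have seen the factor 2.38^2 divided by the number of parameters suggested, i.e. something like

s <- 2.38^2 / npars   # commonly suggested scaling for a random-walk proposal (assumption)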

  • for sampling:

library(mvtnorm)   # rmvnorm() for the multivariate normal proposal
library(coda)      # mcmc(), mcmc.list(), gelman.diag() for the convergence checks

# winequality_red: the red wine data, read in beforehand, e.g.
# winequality_red <- read.csv("winequality-red.csv", sep = ";")
d <- as.data.frame(winequality_red)
y <- d$quality
X <- as.matrix(d[, -12])               # drop the response (quality is column 12)
X <- cbind(rep(1, nrow(X)), scale(X))  # intercept plus standardised predictors

n <- length(y)
k <- ncol(X)               # 12: intercept plus 11 predictors
npars <- k + (k - 1) + 2   # k betas, k-1 taus, lambda2, sigma2 = 25 parameters

logprior <- function(pars) {
  b       <- as.matrix(pars[1:k])
  tau     <- pars[(k + 1):(2 * k - 1)]
  lambda2 <- pars[2 * k]
  sigma2  <- pars[npars]

  # reject values outside the support
  if (lambda2 <= 0) return(log(0))
  if (sigma2 <= 0)  return(log(0))
  if (any(tau < 0)) return(log(0))

  # Bayesian Lasso as a scale mixture: b_j | tau_j, sigma2 ~ N(0, sigma2 * tau_j),
  # tau_j ~ Exp(lambda2 / 2), plus priors on 1/sigma2 and lambda2
  dens <- sum(dnorm(abs(b[2:k]), 0, sqrt(sigma2 * tau), log = TRUE)) +
    dgamma(1 / sigma2, 4, 60, log = TRUE) +
    dgamma(lambda2, 0.025, 0.1, log = TRUE) +
    sum(dexp(tau, lambda2 / 2, log = TRUE))
  return(dens)
}

loglikelihood <- function(y, X, par) {
  b      <- as.matrix(par[1:k])
  sigma2 <- par[npars]
  if (sigma2 <= 0) return(log(0))
  # Gaussian likelihood with mean X %*% b and variance sigma2
  dens <- sum(dnorm(y, as.vector(X %*% b), sqrt(sigma2), log = TRUE))
  return(dens)
}

draws <- array(0, dim = c(200000, npars, 4))   # iterations x parameters x chains
draws[1, npars, 1:4]     <- 1 / rgamma(1, 4, 60)   # sigma2
draws[1, npars - 1, 1:4] <- 1                      # lambda2
draws[1, 1, 1:4]         <- mean(y)                # intercept
draws[1, 2:k, 1:4]       <- rep(0, k - 1)          # remaining betas
draws[1, (k + 1):(2 * k - 1), 1:4] <- rep(1, k - 1)  # taus

converged <- rep(0, npars)
step1 <- 2
proposed <- rep(-1, npars)

accepted.prop <- rep(1, 4)
accepted.prop.print <- rep(1, 4)   # needed before the first acceptance-rate printout

while (any(converged < 1)) {

  for (i in 1:4) {
    # random-walk proposal from the current state of chain i
    proposed <- rmvnorm(1, draws[step1 - 1, , i], s * cov.prop)
    r <- loglikelihood(y, X, proposed) +
      logprior(proposed) -
      loglikelihood(y, X, draws[step1 - 1, , i]) -
      logprior(draws[step1 - 1, , i])
    u <- runif(1)
    if (log(u) < r) {
      draws[step1, , i] <- proposed
      accepted.prop[i] <- accepted.prop[i] + 1
    } else {
      draws[step1, , i] <- draws[step1 - 1, , i]
    }
  }
  step1 <- step1 + 1

  # every 1000 iterations: check the Gelman-Rubin diagnostic for each parameter
  if (step1 %% 1000 == 0) {
    for (z in 1:npars) {
      chainlist <- mcmc.list(Chain1 = mcmc(draws[1:(step1 - 1), z, 1]),
                             Chain2 = mcmc(draws[1:(step1 - 1), z, 2]),
                             Chain3 = mcmc(draws[1:(step1 - 1), z, 3]),
                             Chain4 = mcmc(draws[1:(step1 - 1), z, 4]))
      # psrf[1, 2] is the upper confidence limit of the potential scale reduction factor
      if (gelman.diag(chainlist)$psrf[1, 2] < 1.4) {
        converged[z] <- converged[z] + 1
      }
    }
    print(converged)
    print(accepted.prop - accepted.prop.print)   # acceptances in the last 1000 steps
    accepted.prop.print <- accepted.prop
  }
}
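For looking into (b), a minimal way to inspect a single parameter's chains with coda (the parameter index z = 1 here is only for illustration) would be:

# trace/density plots and the evolution of the shrink factor for one parameter
z <- 1
chainlist <- mcmc.list(mcmc(draws[1:(step1 - 1), z, 1]),
                       mcmc(draws[1:(step1 - 1), z, 2]),
                       mcmc(draws[1:(step1 - 1), z, 3]),
                       mcmc(draws[1:(step1 - 1), z, 4]))
plot(chainlist)         # trace and density plots per chain
gelman.plot(chainlist)  # how the shrink factor changes with iteration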


I would recommend taking a look at https://avehtari.github.io/modelselection/winequality-red.html, which is a more modern approach to variable/model selection from @avehtari (https://github.com/avehtari/modelselection).

I have not seen a PyMC3 port of the wine quality example, but it should be straightforward for the horseshoe prior model at the bottom of that post.
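In R, the core of that case study looks roughly like this (a sketch, not verbatim from the post; the hs() hyperparameters are placeholders, see the case study for the recommended global_scale):

library(rstanarm)
library(projpred)

# regularised horseshoe prior on the coefficients
fit <- stan_glm(quality ~ ., data = winequality_red, family = gaussian(),
                prior = hs(global_scale = 0.01), seed = 1)

# projection predictive variable selection
vs <- cv_varsel(fit)
plot(vs, stats = c("elpd", "rmse"))
suggest_size(vs)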


Thank you! I will have a look at it!
