Cross Validated Asked by WalterB on December 8, 2020
I am unsure how to interpret credible interval results. How can a credible interval contain negative numbers when the collected data consist only of positive numbers? I would expect that, given data ranging from 1 to 20, the credible interval would tell me (with 95% certainty) that a value lies between x and y, where x and y are between 1 and 20.
Should I add the general mean to the produced results to obtain what I am looking for?
In my reproducible example below, I generate the following data points:
X – dependent variable, a random permutation of the integers 1 to 20
Y – condition variable
Z – study participant ID
I then look at the Y-A, Y-B, and Y-C rows of the “Quantiles for each variable” table and find the following credible interval for Y-A: [-5.749, 0.495], rather than an interval between 1 and 20. Am I simply looking at the wrong data? Thank you very much in advance for your help.
Reproducible example:
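library(BayesFactor)

# Simulated data: X is the outcome, Y the condition, Z the participant ID
Data <- data.frame(
  X = sample(1:20),
  Y = sample(c("A", "B", "C"), 20, replace = TRUE),
  Z = sample(c("P1", "P2", "P3", "P4"), 20, replace = TRUE)
)
Data$Y <- as.factor(Data$Y)
Data$Z <- as.factor(Data$Z)

# Bayes factor ANOVA with participant (Z) treated as a random effect
bayesfactor <- anovaBF(X ~ Y + Z, data = Data, whichRandom = c("Z"))
bayesfactor

# Draw 10,000 samples from the posterior and summarise them
bayesfactor_posterior <- posterior(bayesfactor, iterations = 10000)
summary(bayesfactor_posterior)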
My results:
Iterations = 1:10000
Thinning interval = 1
Number of chains = 1
Sample size per chain = 10000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
         Mean     SD Naive SE Time-series SE
mu   10.63868  2.918  0.02918        0.02918
Y-A  -2.40507  1.616  0.01616        0.02169
Y-B  -0.47454  1.419  0.01419        0.01473
Y-C   2.87961  1.916  0.01916        0.02677
Z-P1  0.04134  3.245  0.03245        0.03199
Z-P2 -1.56066  3.111  0.03111        0.03111
Z-P3  2.96385  3.117  0.03117        0.03189
Z-P4 -1.59330  3.500  0.03500        0.03573
sig2 27.99127 10.851  0.10851        0.15682
g_Y   1.20403  8.990  0.08990        0.08990
g_Z   1.02143  1.652  0.01652        0.02174
2. Quantiles for each variable:
         2.5%     25%      50%     75%   97.5%
mu    4.83704  9.0158 10.66341 12.2694 16.4573
Y-A  -5.74906 -3.4504 -2.32914 -1.2821  0.4952
Y-B  -3.37089 -1.3441 -0.44655  0.4431  2.2909
Y-C  -0.49649  1.5046  2.75880  4.1339  6.8901
Z-P1 -6.34286 -1.8575  0.02995  1.9121  6.5346
Z-P2 -7.77506 -3.3115 -1.52214  0.2191  4.3945
Z-P3 -2.82042  1.1496  2.83577  4.7147  9.3355
Z-P4 -9.05500 -3.5174 -1.46395  0.5629  4.9008
sig2 14.07844 20.5537 25.78330 32.7843 54.7558
g_Y   0.05299  0.1812  0.39724  0.9168  6.1523
g_Z   0.14551  0.3438  0.59676  1.1107  4.3755
Unfortunately, I have not been able to find a satisfactory answer using the posterior generated by the BayesFactor package (most likely because I am doing something wrong). However, I did find an alternative approach that might prove useful for others who come across this question.
Using the 'bayesboot' library, I can get the posterior summary (95% HDI) I wanted relatively easily. The code below demonstrates this approach; see the 'hdi.low' and 'hdi.high' columns.
library(bayesboot)
bayes_A <- bayesboot(Data[Data$Y == "A",]$X, weighted.mean, use.weights = TRUE)
summary(bayes_A)
Summary of the posterior (with 95% Highest Density Intervals):
 statistic     mean       sd  hdi.low hdi.high
        V1 13.60795 1.875804 9.951562 17.23397
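For readers who would rather stay with the BayesFactor posterior: the Y-A, Y-B, and Y-C rows in the summary appear to be deviations from the grand mean mu rather than condition means (their posterior means sum to roughly zero), which would explain the negative values. Under that reading, an interval for a condition mean comes from adding mu to the effect draw by draw and then taking quantiles of the sum. The sketch below illustrates this. The helper name `condition_interval` and the `toy_draws` matrix are hypothetical, and the toy numbers are made up so the code runs without BayesFactor; they are not the posterior from the question.

```r
# Hypothetical helper: equal-tailed interval for a condition mean.
# `draws` is a numeric matrix with one column per parameter,
# including "mu" and the named effect column (e.g. "Y-A").
condition_interval <- function(draws, effect, probs = c(0.025, 0.5, 0.975)) {
  # Add the grand mean to the effect within each draw, then summarise.
  condition_mean <- draws[, "mu"] + draws[, effect]
  quantile(condition_mean, probs = probs)
}

# Stand-in draws (made-up values, NOT the posterior from the question).
set.seed(1)
toy_draws <- cbind(
  "mu"  = rnorm(10000, mean = 10, sd = 1),
  "Y-A" = rnorm(10000, mean = -2, sd = 1)
)
interval_A <- condition_interval(toy_draws, "Y-A")
print(interval_A)

# With the real chains, the call would presumably look like this
# (assumes as.matrix() works on the object returned by posterior()):
# condition_interval(as.matrix(bayesfactor_posterior), "Y-A")
```

Summing within each draw, rather than adding the two sets of quantiles, keeps the posterior correlation between mu and the effect. Note that the result is an interval for the mean of condition A for an average participant, not a range in which individual observations are expected to fall.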
Answered by WalterB on December 8, 2020