The post Objective Bayes conference in June appeared first on Statistical Modeling, Causal Inference, and Social Science.


A quick search seems to imply that you haven’t discussed the Fermi equation for a while.

This looks to me to be in the realm of Miller and Sanjurjo: a simple probabilistic explanation sitting right under everyone’s nose. Comment?

“This” is an article, Dissolving the Fermi Paradox, by Anders Sandberg, Eric Drexler, and Toby Ord, which begins:

The Fermi paradox is the conflict between an expectation of a high ex ante probability of intelligent life elsewhere in the universe and the apparently lifeless universe we in fact observe. The expectation that the universe should be teeming with intelligent life is linked to models like the Drake equation, which suggest that even if the probability of intelligent life developing at a given site is small, the sheer multitude of possible sites should nonetheless yield a large number of potentially observable civilizations. We show that this conflict arises from the use of Drake-like equations, which implicitly assume certainty regarding highly uncertain parameters. . . . When the model is recast to represent realistic distributions of uncertainty, we find a substantial ex ante probability of there being no other intelligent life in our observable universe . . . This result dissolves the Fermi paradox, and in doing so removes any need to invoke speculative mechanisms by which civilizations would inevitably fail to have observable effects upon the universe.

I solicited thoughts from astronomer David Hogg, who wrote:

I have only skimmed it, but it seems reasonable. Life certainly could be rare, and technological life could be exceedingly rare. Some of the terms do have many-order-of-magnitude uncertainties.

That said, we now know that a large fraction of stars host planets and many host planets similar to the Earth, so the uncertainties on planet-occurrence terms in any Drake-like equation are now much lower than order-of-magnitude.

And Hogg forwarded the question to another astronomer, Jason Wright, who wrote:

The original questioner’s question (Thomas Basbøll’s submission from December) is addressed explicitly here.

In short, only the duration of transmission matters in steady-state, which is the final L term in Drake’s famous equation. Start time does not matter.

Regarding Andrew’s predicate “given that we haven’t heard any such signals so far” in the OP: despite the high profile of SETI, almost no actual searching has occurred because the field is essentially unfunded (until Yuri Milner’s recent support). Jill Tarter analogizes updating our priors based on the searching to date to concluding that there must not be very many fish in the ocean after inspecting the contents of a single drinking glass dipped in it (that’s a rough OOM, but it’s pretty close). And that’s just narrowband radio searches; other kinds of searches are far, far less complete.

And Andrew is not wrong that the amount of popular discussion of SETI has gone way down since the ’90s. A good account of the rise and fall of government funding for SETI is Garber (1999).

I have what I think is a complete list of NASA and NSF funding since the (final) cancellation of NASA’s SETI work in 1993, and it sums to just over $2.5M (not per year—total). True, Barney Oliver and Paul Allen contributed many millions more, but most of this went to developing hardware and paying engineers to build the (still incomplete and barely operating) Allen Telescope Array; it did not train students or fund much in the way of actual searches.

So you haven’t heard much about SETI because there’s not much to say. Instead, most of the literature is people in their spare time endlessly rearranging, recalculating, reinventing, modifying, and critiquing the Drake Equation, or offering yet another “solution” to the Fermi Paradox in the absence of data.

The central problem is that for all of the astrobiological terms in the Drake Equation we have a sample size of 1 (Earth), and since that one is us, we run into “anthropic principle” issues whenever we try to use it to estimate those terms.

The recent paper by Sandberg et al. calculates reasonable posterior distributions on N in the Drake Equation, and indeed shows that they are so wide that N=0 is not excluded, but the latter point has been well appreciated since the equation was written down, so this “dissolution” of the Fermi Paradox (“maybe spacefaring life is just really rare”) is hardly novel. It was the thesis of the influential book Rare Earth and the argument used by Congress as a justification for blocking essentially all funding to the field for the past 25 years.

Actually, I would say that an equally valid takeaway from the Sandberg paper is that very large values of N are possible, so we should definitely be looking for them!

So make of that what you will.
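
The mechanics of the “dissolution” are easy to see in a toy Monte Carlo sketch. The parameter ranges below are illustrative assumptions in the spirit of the paper’s order-of-magnitude uncertainties, not the distributions Sandberg et al. actually fit:

```python
import random

random.seed(1)

def draw_N():
    # Drake-style product with each factor drawn on a log scale to
    # represent order-of-magnitude uncertainty (illustrative ranges only).
    R_star = 10 ** random.uniform(0, 2)    # star formation rate per year
    f_p    = 10 ** random.uniform(-1, 0)   # fraction of stars with planets
    n_e    = 10 ** random.uniform(-1, 0)   # habitable planets per system
    f_l    = 10 ** random.uniform(-30, 0)  # Pr(life arises): vast uncertainty
    f_i    = 10 ** random.uniform(-3, 0)   # Pr(intelligence)
    f_c    = 10 ** random.uniform(-2, 0)   # Pr(detectable technology)
    L      = 10 ** random.uniform(2, 4)    # years a civilization transmits
    return R_star * f_p * n_e * f_l * f_i * f_c * L

draws = [draw_N() for _ in range(100_000)]
mean_N = sum(draws) / len(draws)
p_alone = sum(d < 1 for d in draws) / len(draws)
print(f"mean of N: {mean_N:.2g}; Pr(N < 1): {p_alone:.2f}")
```

Plugging single point estimates into the equation reports something like the mean, which is pulled up by rare large draws; the full distribution puts most of its mass on an observable universe with no other detectable civilizations.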

**P.S.** I posted this in July 2018. The search for extraterrestrial intelligence is one topic where I don’t think much is lost in our 6-month blog delay.

The post “Dissolving the Fermi Paradox” appeared first on Statistical Modeling, Causal Inference, and Social Science.



Here was our bracket, back in 2015:

And here were the 64 contestants:

– Philosophers:

Plato (seeded 1 in group)

Alan Turing (seeded 2)

Aristotle (3)

Friedrich Nietzsche (4)

Thomas Hobbes

Jean-Jacques Rousseau

Bertrand Russell

Karl Popper

– Religious Leaders:

Mohandas Gandhi (1)

Martin Luther King (2)

Henry David Thoreau (3)

Mother Teresa (4)

Al Sharpton

Phyllis Schlafly

Yoko Ono

Bono

– Authors:

William Shakespeare (1)

Miguel de Cervantes (2)

James Joyce (3)

Mark Twain (4)

Jane Austen

John Updike

Raymond Carver

Leo Tolstoy

– Artists:

Leonardo da Vinci (1)

Rembrandt van Rijn (2)

Vincent van Gogh (3)

Marcel Duchamp (4)

Thomas Kinkade

Grandma Moses

Barbara Kruger

The guy who did Piss Christ

– Founders of Religions:

Jesus (1)

Mohammad (2)

Buddha (3)

Abraham (4)

L. Ron Hubbard

Mary Baker Eddy

Sigmund Freud

Karl Marx

– Cult Figures:

John Waters (1)

Philip K. Dick (2)

Ed Wood (3)

Judy Garland (4)

Sun Myung Moon

Charles Manson

Joan Crawford

Stanley Kubrick

– Comedians:

Richard Pryor (1)

George Carlin (2)

Chris Rock (3)

Larry David (4)

Alan Bennett

Stewart Lee

Ed McMahon

Henny Youngman

– Modern French Intellectuals:

Albert Camus (1)

Simone de Beauvoir (2)

Bernard-Henri Lévy (3)

Claude Lévi-Strauss (4)

Raymond Aron

Jacques Derrida

Jean Baudrillard

Bruno Latour

We did single elimination, one match per day, alternating with the regular blog posts. See here and here for the first two contests, here for an intermediate round, and here for the conclusion.

**2019 edition**

Who would be the ultimate seminar speaker? I’m not asking for the most popular speaker, or the most relevant, or the best speaker, or the deepest, or even the coolest, but rather some combination of the above.

Our new list includes eight current or historical figures from each of the following eight categories:

– Wits

– Creative eaters

– Magicians

– Mathematicians

– TV hosts

– People from New Jersey

– GOATs

– People whose names end in f

All these categories seem to be possible choices to reach the sort of general-interest intellectual community that was implied by the [notoriously hyped] announcement of ~~Slavoj Zizek~~ Bruno Latour’s visit to Columbia a few years ago.

**The rules**

I’ll post one matchup each day at noon, starting sometime next week or so, once we have the brackets prepared.

Once each pairing is up, all of you can feel free (indeed, are encouraged) to comment. I’ll announce the results when posting the next day’s matchup.

I’ll decide each day’s winner not based on a popular vote but based on the strength and amusingness of the arguments given by advocates on both sides. So give it your best!

As with our previous contest four years ago, we’re continuing the regular flow of statistical modeling, causal inference, and social science posts. They’ll alternate with these matchup postings.

The post Back by popular demand . . . The Greatest Seminar Speaker contest! appeared first on Statistical Modeling, Causal Inference, and Social Science.


- R-squared for Bayesian regression models. *American Statistician*. (Andrew Gelman, Ben Goodrich, Jonah Gabry, and Aki Vehtari)
- Voter registration databases and MRP: Toward the use of large scale databases in public opinion research. *Political Analysis*. (Yair Ghitza and Andrew Gelman)
- Limitations of “Limitations of Bayesian leave-one-out cross-validation for model selection.” *Computational Brain and Behavior*. (Aki Vehtari, Daniel P. Simpson, Yuling Yao, and Andrew Gelman)
- Post-hoc power using observed estimate of effect size is too noisy to be useful. *Annals of Surgery*. (Andrew Gelman)
- Abandon statistical significance. *American Statistician*. (Blakeley B. McShane, David Gal, Andrew Gelman, Christian Robert, and Jennifer L. Tackett)
- The statistical significance filter leads to overconfident expectations of replicability. *Journal of Memory and Language* **103**, 151–175. (Shravan Vasishth, Daniela Mertzen, Lena A. Jäger, and Andrew Gelman)
- Large scale replication projects in contemporary psychological research. *American Statistician*. (Blakeley B. McShane, Jennifer L. Tackett, Ulf Bockenholt, and Andrew Gelman)
- Do researchers anchor their beliefs on the outcome of an initial study? Testing the time-reversal heuristic. *Experimental Psychology* **65**, 158–169. (Anja Ernst, Rink Hoekstra, Eric-Jan Wagenmakers, Andrew Gelman, and Don van Ravenzwaaij)
- Ethics in statistical practice and communication: Five recommendations. *Significance*. (Andrew Gelman)
- Bayesian inference under cluster sampling with probability proportional to size. *Statistics in Medicine*. (Susanna Makela, Yajuan Si, and Andrew Gelman)
- Yes, but did it work?: Evaluating variational inference. *Proceedings of the 35th International Conference on Machine Learning*. (Yuling Yao, Aki Vehtari, Daniel Simpson, and Andrew Gelman)
- Why high-order polynomials should not be used in regression discontinuity designs. *Journal of Business and Economic Statistics*. (Andrew Gelman and Guido Imbens)
- Gaydar and the fallacy of decontextualized measurement. *Sociological Science*. (Andrew Gelman, Greggor Mattson, and Daniel Simpson)
- Global shifts in the phenological synchrony of species interactions over recent decades. *Proceedings of the National Academy of Sciences*. (Heather M. Kharouba, Johan Ehrlén, Andrew Gelman, Kjell Bolmgren, Jenica M. Allen, Steve E. Travers, and Elizabeth M. Wolkovich)
- The Millennium Villages Project: A retrospective, observational, endline evaluation. *Lancet Global Health* **6**. (Shira Mitchell, Andrew Gelman, Rebecca Ross, Joyce Chen, Sehrish Bari, Uyen Kim Huynh, Matthew W. Harris, Sonia Ehrlich Sachs, Elizabeth A. Stuart, Avi Feller, Susanna Makela, Alan M. Zaslavsky, Lucy McClellan, Seth Ohemeng-Dapaah, Patricia Namakula, Cheryl A. Palm, and Jeffrey D. Sachs) Supplementary appendix.
- Don’t calculate post-hoc power using observed estimate of effect size. *Annals of Surgery*. (Andrew Gelman)
- Visualization in Bayesian workflow (with discussion). *Journal of the Royal Statistical Society A*. (Jonah Gabry, Daniel Simpson, Aki Vehtari, Michael Betancourt, and Andrew Gelman)
- Disentangling bias and variance in election polls. *Journal of the American Statistical Association*. (Houshmand Shirani-Mehr, David Rothschild, Sharad Goel, and Andrew Gelman)
- Don’t characterize replications as successes or failures. Discussion of “Making replication mainstream,” by Rolf A. Zwaan et al. *Behavioral and Brain Sciences*. (Andrew Gelman)
- Using stacking to average Bayesian predictive distributions (with discussion). *Bayesian Analysis* **13**, 917–1003. (Yuling Yao, Aki Vehtari, Daniel Simpson, and Andrew Gelman)
- Review of *New Explorations into International Relations: Democracy, Foreign Investment, Terrorism, and Conflict*, by Seung-Whan Choi. *Perspectives on Politics*. (Andrew Gelman)
- Benefits and limitations of randomized controlled trials. Discussion of “Understanding and misunderstanding randomized controlled trials,” by Angus Deaton and Nancy Cartwright. *Social Science & Medicine*. (Andrew Gelman)
- The failure of null hypothesis significance testing when studying incremental changes, and what to do about it. *Personality and Social Psychology Bulletin* **44**, 16–23. (Andrew Gelman)
- Bayesian aggregation of average data: An application in drug development. *Annals of Applied Statistics* **12**, 1583–1604. (Sebastian Weber, Andrew Gelman, Daniel Lee, Michael Betancourt, Aki Vehtari, and Amy Racine-Poon)
- How to think scientifically about scientists’ proposals for fixing science. *Socius*. (Andrew Gelman)
- Learning from and responding to statistical criticism. *Observational Studies*. (Andrew Gelman)
- Donald Rubin. In *Encyclopedia of Social Research Methods*, ed. Paul Atkinson, Sara Delamont, Melissa Hardy, and Malcolm Williams. Thousand Oaks, Calif.: Sage Publications. (Andrew Gelman)

Enjoy. They’re listed in approximate reverse chronological order of publication date, so I guess some of the articles at the top of the list will be officially published in 2019.

The post Published in 2018 appeared first on Statistical Modeling, Causal Inference, and Social Science.



I would like to ask you for advice on obtaining data for reanalysis purposes from an author who has multiple papers with statistical errors and doesn’t want to share the data.

Recently, I reviewed a paper in which some of the reported statistics were mathematically impossible. As the first author of that paper had written another paper in the past with one of my collaborators, I checked their paper too and also found multiple errors (GRIM, degrees of freedom, inappropriate statistical tests, etc.). I asked my collaborator about it, and she followed up with the first author, who had done the analysis; she said that he agreed to write an erratum.

Independently, I checked three further papers from that author, and all of them contained errors, in numbers comparable to what was found in Wansink’s case. At that stage I contacted the first author of these papers, asking him for the data for reanalysis purposes. As the email went unanswered, after two weeks I followed up, this time mentioning that I had found a number of errors in these papers and including his lab’s contact email address. This time I received a swift response and was told that these papers had been peer-reviewed, so if there were any errors they would have been caught (sic!); that for privacy reasons the data could not be shared with me; and I was asked to send a list of the errors I had found. In my response I sent the list of errors, emphasized the importance of independent reanalysis, and pointed out that the data come from lab experiments, so any personally identifiable information can be removed, as it is not needed for reanalysis. After three weeks of waiting, and another email sent in the meantime, the author wrote that he was busy but had found time to check the analysis of one of the papers. In his response, he said that some of the mathematically impossible degrees of freedom were wrongly copied numbers, while the inconsistent statistics were due to selecting the wrong cells in the Excel file, which supposedly doesn’t change much. Moreover, he blamed the reviewers for not catching these typos (sic!) and said that he found the errors only after I contacted him. The problem is that this is the same paper my collaborator said they had already checked, so he must have been aware of these problems even before my initial email (I didn’t mention that I know that collaborator).

So here is my dilemma about how to proceed. Considering that there are multiple errors of multiple types across multiple papers, it is really hard to trust anything else reported in them. The author clearly does not intend to share the data with me, so I cannot verify whether the data exist at all. If they don’t, then since I have sent him the list of errors, he could reverse-engineer which tools I used and come up with numbers that would pass the tests that can be run based solely on the reported statistics.

As you may have more experience dealing with such situations, I thought I might ask you for advice on how to proceed. Would you suggest contacting the publishers involved, going public, or something else?

My reply:

I hate to say it, but your best option here might be to give up. The kind of people who lie and cheat about their published work may also play dirty in other ways. So is it really worth it to tangle with these people? I have no idea about your particular case and am just speaking on general principles here.

You could try contacting the journal editor. Some journal editors really don’t like to find out that they’ve published erroneous work; others would prefer to sweep any such problems under the rug, either because they have personal connections to the offenders or just because they don’t want to deal with cheaters, as this is unpleasant.

Remember: journal editing is a volunteer job, and people sign up for it because they want to publish exciting new work, or maybe because they enjoy the power trip, or maybe out of a sense of duty—but, in any case, they typically aren’t in it for the controversy. So, if you do get a journal editor who can help on this, great, but don’t be surprised if the editors slink away from the problem, for example by putting the burden in your lap by saying that your only option is to submit your critique in the form of an article for the journal, which can then be sent to the author of the original paper for review, and then rejected on the grounds that it’s not important enough to publish.

Maybe you could get Retraction Watch to write something on this dude?

Also, is the paper listed on PubPeer? If so, you could comment there.
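
As an aside, for readers unfamiliar with it: the GRIM test mentioned in the letter checks whether a reported mean is arithmetically possible given the sample size. Here is a minimal sketch, assuming integer-valued responses and means reported to two decimal places (the helper `grim_consistent` is hypothetical, not from any published code, and the real procedure handles more cases):

```python
import math

def grim_consistent(reported_mean, n, decimals=2):
    """Can some integer total, divided by n, round to the reported mean?"""
    target = round(reported_mean, decimals)
    half = 0.5 * 10 ** -decimals
    # Integer totals whose exact mean lands near the reported value
    # (padded by 1 on each side to absorb floating-point edge cases).
    lo = math.floor((target - half) * n) - 1
    hi = math.ceil((target + half) * n) + 1
    return any(round(total / n, decimals) == target for total in range(lo, hi + 1))

print(grim_consistent(5.33, 3))  # True: 16/3 = 5.333... rounds to 5.33
print(grim_consistent(5.19, 3))  # False: no integer total over n = 3 gives 5.19
```

With n = 3 the only achievable two-decimal means are steps of 1/3, so a reported 5.19 is flagged immediately; checks like this require nothing beyond the reported statistics.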

The post What to do when you read a paper and it’s full of errors and the author won’t share the data or be open about the analysis? appeared first on Statistical Modeling, Causal Inference, and Social Science.


Mikhail Shubin has this great post from a few years ago on Bayesian visualization. He lists the following principles:

Principle 1: Uncertainty should be visualized

Principle 2: Visualization of variability ≠ Visualization of uncertainty

Principle 3: Equal probability = Equal ink

Principle 4: Do not overemphasize the point estimate

Principle 5: Certain estimates should be emphasized over uncertain

And this caution:

These principles (as any visualization principles) are contextual, and should be used (or not used) with the goals of this visualization in mind.

And this is not just empty talk. Shubin demonstrates all these points with clear graphs.

Interesting how this complements our methods for visualization in Bayesian workflow.

The post “Principles of posterior visualization” appeared first on Statistical Modeling, Causal Inference, and Social Science.



And here’s Part 2. Jordan Anaya reports:

Uli Schimmack posted this on Facebook and Twitter.

I [Anaya] was annoyed to see that it mentions “a handful” of unreliable findings, and points the finger at fraud as the cause. But then I was shocked to see the 85% number for the Many Labs project.

I’m not that familiar with the project, and I know there is debate on how to calculate a successful replication, but they got that number from none other than the “the replication rate in psychology is quite high—indeed, it is statistically indistinguishable from 100%” people, as Sanjay Srivastava discusses here.

Schimmack identifies the above screenshot as being from Myers and Twenge (2018); I assume it’s this book, which has the following blurb:

Connecting Social Psychology to the world around us. Social Psychology introduces students to the science of us: our thoughts, feelings, and behaviors in a changing world. Students learn to think critically about everyday behaviors and gain an appreciation for the world around us, regardless of background or major.

But according to Schimmack, there’s “no mention of a replication failure in the entire textbook.” That’s fine—it’s not necessarily the job of an intro textbook to talk about ideas that didn’t work out—but then why mention replications in the first place? And why try to minimize the issue by talking about “a handful of unreliable findings”? A handful, huh? Who talks like that? This is a “Politics and the English Language” situation, where sloppy language serves sloppy thinking and bad practice.

Also, to connect replication failures to “fraud” is just horrible, as it’s consistent with two wrong messages: (a) that to point out a failed replication is to accuse someone of fraud, and (b) that, conversely, honest researchers can’t have replication failures. As I’ve written a few zillion times, honesty and transparency are not enuf. As I wrote here, it’s a mistake to focus on “p-hacking” and bad behavior rather than the larger problem of researchers expecting routine discovery.

So, the blurb for the textbook says that students learn to think critically about everyday behaviors—but they won’t learn to think critically about published research in the field of psychology.

Just to be clear: I’m *not* saying the authors of this textbook are bad people. My guess is they just want to believe the best about their field of research, and enough confused people have squirted enough ink into the water to confuse them into thinking that the number of unreliable findings really might be just “a handful,” that 85% of experiments in that study replicated, that the replication rate in psychology is statistically indistinguishable from 100%, that elections are determined by shark attacks and college football games, that single women were 20 percentage points more likely to support Barack Obama during certain times of the month, that elderly-priming words make you walk slower, that Cornell students have ESP, etc etc etc. There are lots of confused people out there, not sure where to turn, so it makes sense that some textbook writers will go for the most comforting possible story. I get it. They’re not trying to mislead the next generation of students; they’re just doing their best.

There are no bad guys here.

Let’s just hope 2019 goes a little better.

A good start would be for the authors of this book to send a public note to Uli Schimmack thanking him for pointing out their error, and then to replace that paragraph with something more accurate in the next printing. They could also write a short article for Perspectives on Psychological Science on how they got confused on this point, as this could be instructive for other teachers of psychology. They don’t have to do this. They can do whatever they want. But this is my suggestion for how they could get 2019 off to a good start, in one small way.

The post Authority figures in psychology spread more happy talk, still don’t get the point that much of the published, celebrated, and publicized work in their field is no good (Part 2) appeared first on Statistical Modeling, Causal Inference, and Social Science.


The topic is the combination of apparently contradictory evidence.

Let’s start with a simple example: you have some ratings on a 1-10 scale. These could be, for example, research proposals being rated by a funding committee, or, umm, I dunno, gymnasts being rated by Olympic judges. Suppose there are 3 judges doing the ratings, and consider two gymnasts: one receives ratings of 8, 8, 8; the other is rated 6, 8, 10. Or, forget about ratings, just consider students taking multiple exams in a class. Consider two students: Amy, whose three test scores are 80, 80, 80; and Beth, who had scores 80, 100, 60. (I’ve purposely scrambled the order of those last three so that we don’t have to think about trends. Forget about time trends; that’s not my point here.)

How to compare those two students? A naive reader of test scores will say that Amy is consistent while Beth is flaky; or you might even say that you think Beth is better as she has a higher potential. But if you have some experience with psychometrics, you’ll be wary of overinterpreting results from three exam scores. Inference about an average from N=3 is tough; inference about *variance* from N=3 is close to impossible. Long story short: from a psychometrics perspective, there’s very little you can say about the relative consistency of Amy and Beth’s test-taking based on just three scores.
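
The near-impossibility of estimating a variance from three observations is easy to check numerically. Here is a quick sketch using Beth’s scores and the classical chi-square interval for a standard deviation (which assumes normal data):

```python
import math

scores = [80, 100, 60]                                # Beth's three test scores
n = len(scores)
mean = sum(scores) / n
s2 = sum((x - mean) ** 2 for x in scores) / (n - 1)   # sample variance

# Chi-square quantiles with n - 1 = 2 degrees of freedom; chi2(2) is
# exponential with mean 2, so the quantiles have a closed form.
chi2_lo = -2 * math.log(1 - 0.025)   # 0.025 quantile, about 0.051
chi2_hi = -2 * math.log(1 - 0.975)   # 0.975 quantile, about 7.378

sd_lower = math.sqrt((n - 1) * s2 / chi2_hi)
sd_upper = math.sqrt((n - 1) * s2 / chi2_lo)
print(f"s = {math.sqrt(s2):.1f}; 95% CI for sigma: ({sd_lower:.1f}, {sd_upper:.1f})")
# → s = 20.0; 95% CI for sigma: (10.4, 125.7)
```

The sample standard deviation is 20, but the data are consistent with a population standard deviation anywhere from about 10 to about 126, so declaring Beth flakier than Amy on three scores is hopeless.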

Academic researchers will recognize this problem when considering reviews of their own papers that they’ve submitted to journals. When you send in a paper, you’ll typically get a few reviews, and these reviews can differ dramatically in their messages.

Here’s a hilarious example supplied to me by Wolfgang Gaissmaier and Julian Marewski, from reviews of their 2011 article, “Forecasting elections with mere recognition from small, lousy samples: A comparison of collective recognition, wisdom of crowds, and representative polls.”

Here are some positive reviewer comments:

– This is a very interesting piece of work that raises a number of important questions related to public opinion. The major finding — that for elections with large numbers of parties, small non-probability samples looking only at party name recognition do as well as medium-sized probability samples looking at voter intent — is stunning.

– There is a lot to like about this short paper… I’m surprised by the strength of the results… If these results are correct (and I have no real reason to suspect otherwise), then the authors are more than justified in their praise of recognition-based forecasts. This could be an extremely useful forecasting technique not just for the multi-party European elections discussed by the authors, but also in relatively low-salience American local elections.

– This is a concise, high-quality paper that demonstrates that the predictive power of (collective) recognition extends to the important domain of political elections.

And now the fun stuff. The negative comments:

– This is probably the strangest manuscript that I have ever been asked to review… Even if the argument is correct, I’m not sure that it tells us anything useful. The fact that recognition can be used to predict the winners of tennis tournaments and soccer matches is unsurprising – people are more likely to recognize the better players/teams, and the better players/teams usually win. It’s like saying that a football team wins 90% (or whatever) of the games in which it leads going into the fourth quarter. So what?

– To be frank, this is an exercise in nonsense. Twofold nonsense. For one thing, to forecast election outcomes based on whether or not voters recognize the parties/candidates makes no sense… Two, why should we pay any attention to unrepresentative samples, which is what the authors use in this analysis? They call them, even in the title, “lousy.” Self-deprecating humor? Or are the authors laughing at a gullible audience?

So, their paper is either “a very interesting piece of work” whose main finding is “stunning”—or it is “an exercise in nonsense” aimed at “a gullible audience.”

The post Combining apparently contradictory evidence appeared first on Statistical Modeling, Causal Inference, and Social Science.



Typically, discrete choice modelers develop ever-more advanced models and estimation methods. Compared to the impressive progress in model development and estimation, model-checking techniques have lagged behind. Often, choice modelers use only crude methods to assess how well an estimated model represents reality. Such methods usually stop at checking parameter signs, model elasticities, and ratios of model coefficients. In this paper, I [Brathwaite] greatly expand the discrete choice modelers’ assessment toolkit by introducing model checking procedures based on graphical displays of predictive simulations. . . . a general and ‘semi-automatic’ algorithm for checking discrete choice models via predictive simulations. . . .

He frames model checking in terms of “underfitting,” a connection I’ve never seen before but which makes sense. To the extent that there are features in your data that are not captured in your model—more precisely, features that don’t show up, even in many different posterior predictive simulations from your fitted model—then, yes, the model is underfitting the data. Good point.
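
The flavor of such a check, on made-up data rather than anything from Brathwaite’s paper: fit a deliberately too-simple choice model, simulate replicated datasets from the fit, and see whether a statistic of interest is reproduced. Here the “model” assumes one common choice rate across markets, while the simulated markets genuinely differ:

```python
import random

random.seed(0)

n_markets, n_per = 20, 50

# Made-up choice data: markets differ in their rate of choosing option A,
# but the toy fitted model assumes a single common rate.
true_rates = [random.uniform(0.1, 0.9) for _ in range(n_markets)]
data = [sum(random.random() < p for _ in range(n_per)) for p in true_rates]
p_hat = sum(data) / (n_markets * n_per)   # MLE of the common rate

def spread(counts):
    # Test statistic: standard deviation of market-level choice shares.
    shares = [c / n_per for c in counts]
    m = sum(shares) / len(shares)
    return (sum((s - m) ** 2 for s in shares) / len(shares)) ** 0.5

observed = spread(data)
replicated = [
    spread([sum(random.random() < p_hat for _ in range(n_per))
            for _ in range(n_markets)])
    for _ in range(500)
]
p_value = sum(r >= observed for r in replicated) / len(replicated)
print(f"observed spread: {observed:.3f}; predictive p-value: {p_value:.3f}")
```

The observed spread sits far outside the replicated distribution, flagging the underfitting directly, which is more informative than checking coefficient signs and elasticities.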

The post “Check yourself before you wreck yourself: Assessing discrete choice models through predictive simulations” appeared first on Statistical Modeling, Causal Inference, and Social Science.



I have mused on drafting a simple paper inspired by your paper “Why we (usually) don’t have to worry about multiple comparisons”.

The initial idea is simply to revisit the frequentist “weak FWER” or “omnibus test” (which assumes the null everywhere), connecting it to a Bayesian perspective. To do this, I focus on the distribution of the posterior maximum or extrema (not the maximum a posteriori point estimate) of the joint posterior, given a data-set simulated under the omnibus null hypothesis. This joint posterior may, for example, be defined on a set of a priori exchangeable random coefficients in a multilevel model: its maximum just encodes my posterior belief in the magnitude of the largest of those coefficients (which “should” be zero for these data) and can be estimated, for example, by MCMC. The idea is that hierarchical Bayesian extreme values helpfully contract to zero with the number of coefficients in this setting, while non-hierarchical frequentist extreme values increase; the latter is more typically quantified by other “error” parameters such as the FWER (the “multiple comparisons problem”) or MSE (“overfitting”). Thus, this offers a clear way to show that hierarchical inference can automatically control the (weak) FWER, without Bonferroni-style adjustments to the test threshold. Mathematically, I imagine some asymptotic – in the number of coefficients – argument for this behavior of the maxima, which I would need time or collaboration to formalize (I am not a mathematician by any means). In any case, the intuition is that because the posterior coefficients are all increasingly shrunk, so is their maximum. I have chosen to study the maxima because they are applicable across the very different hierarchical and frequentist models used in practice in the fields I work on (imaging, genomics): spatial, cross-sectional, temporal, neither, or both. For example, the posterior maximum is defined for a discretely indexed, exchangeable random process, or a continuously indexed, non-stationary process. As a point of interest, the frequentist distribution of spatial maxima is used for the standard multiple-comparisons-adjusted p-values in mainstream neuroimaging, e.g. SPM.

I am very keen to learn more about the possible pros or cons of the idea above.

– Its “novelty”

– How it fares relative to alternative Bayesian omnibus “tests,” e.g. based on comparison of posterior model probabilities for an omnibus null model – a degenerate spike prior – versus some credible alternative model.

– How generally it might be formalized.

– How to integrate type II error and bias into the framework.

… and any more!

My reply:

This idea is not really my sort of thing—I’d prefer a more direct decision analysis on the full posterior distribution. But given that many researchers are interested in hypothesis testing but still want to do something better than classical null hypothesis significance testing, I thought there might be interest in these ideas. So I’m sharing them with the blog readership. Comment away!
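
The correspondent’s core intuition (because posterior coefficients are all shrunk, so is their maximum) can be illustrated with a toy empirical-Bayes sketch: a plain normal hierarchy, not the imaging or genomics settings they have in mind. Under a global null, the maximum of the raw estimates grows with the number of coefficients, while the maximum of the shrunken estimates stays near zero:

```python
import random

random.seed(42)

def max_abs(xs):
    return max(abs(x) for x in xs)

for J in (10, 100, 1000):
    # Global null: every true coefficient is zero; we observe noisy estimates.
    y = [random.gauss(0, 1) for _ in range(J)]

    # Empirical-Bayes normal hierarchy: theta_j ~ N(0, tau^2), y_j ~ N(theta_j, 1).
    # The posterior mean shrinks each estimate by tau^2 / (tau^2 + 1),
    # with tau^2 estimated by the method of moments.
    tau2_hat = max(0.0, sum(v * v for v in y) / J - 1.0)
    shrink = tau2_hat / (tau2_hat + 1.0)
    theta_hat = [shrink * v for v in y]

    print(f"J = {J:4d}: max |raw| = {max_abs(y):.2f}, "
          f"max |shrunk| = {max_abs(theta_hat):.2f}")
```

The raw maximum grows roughly like sqrt(2 log J), which is the multiple-comparisons problem in miniature, while the estimated tau^2 contracts toward zero under the null and drags the posterior maximum down with it, with no Bonferroni-style threshold adjustment.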

The post Using multilevel modeling to improve analysis of multiple comparisons appeared first on Statistical Modeling, Causal Inference, and Social Science.

