Bayesian Epistemology: Perspectives and Challenges (10-14 August 2020)
The conference on 12-14 August 2020 is preceded by a Summer School on 10-11 August 2020.
Idea & Motivation
Bayesian epistemology remains the dominant account of rational belief; it underpins the dominant account of decision making in science and beyond, as well as many of our statistical methods.
While important applications continue to emerge, work on the foundations of Bayesian epistemology never stops, and a number of challenges are emerging.
The aim of this conference is to bring together scholars exploring applications, challenges and foundations of Bayesian epistemology.
Topics of interest (in alphabetical order) include, but are not limited to:
- Bayesianism and Artificial Intelligence
- Bayesian Networks
- Bounded Rationality
- Evidence Aggregation
- Foundational Aspects of Bayesian Statistics
- Higher Order Evidence
- Imprecise Bayesian Approaches
- Interpretations of Probabilities
- Judgement Aggregation
- Maximum Entropy (Applications, Inference and Methods)
- Multi Agent Epistemology
- Objective Bayesian Epistemology
- Principles of Bayesianism (Conditionalisation, Probabilism, Total Evidence)
- Updating Procedures (Jeffrey, KL, L&P)
Speakers for the Conference
Speakers for the Summer School
- Leah Henderson (Groningen)
- James Joyce (Michigan)
- Anna Mahtani (LSE)
- Gerhard Schurz (Düsseldorf)
- Naftali Weinberger (MCMP, LMU Munich)
- Jürgen Landes (MCMP, LMU Munich)
In order to register for the conference, please send an email to Juergen.Landes@lrz.uni-muenchen.de with the subject line: Registration: Bayesian Epistemology.
The conference will be held online. Selected lectures can be watched on YouTube.
Summer School 10.08 - 11.08.2020
|09:40 - 10:00||Welcome|
|10:00 - 11:00|
|11:00 - 11:15||Break|
|11:15 - 12:15|
|12:15 - 13:45||Lunch Break|
|13:45 - 14:45||Leah Henderson: Hierarchical Bayesian Modelling - Theory|
|14:45 - 15:45||Jürgen Landes: Objective Bayesian Epistemology|
|15:45 - 16:00||Break|
|16:00 - 17:00|
|10:00 - 11:00|
|11:00 - 11:15||Break|
|11:15 - 12:15|
|12:15 - 13:45||Lunch Break|
|13:45 - 14:45|
|14:45 - 15:45|
|15:45 - 16:00||Break|
|16:00 - 17:00|
Conference 12.08. - 14.08.2020
|09:40 - 10:00||Welcome|
|10:00 - 11:00|
|11:00 - 11:15||Break|
|11:15 - 12:15|
|12:15 - 13:45||Lunch Break|
|13:45 - 14:30||Mario Günther & Borut Trpin: Bayesians still don't learn from conditionals||Alicja Kowalewska & Rafal Urbaniak: Story Coherence with Bayesian Networks|
|14:35 - 15:20||Miriam Bowen: Comparative beliefs and Imprecise Credences||Pavel Janda: Accuracy and Games with Absentmindedness|
|15:25 - 15:35||Break|
|15:35 - 16:20||Sven Neth: Rational Aversion to Information||Cancelled: Krzysztof Mierzewski: Probabilistic Stability and Statistical Learning|
|16:25 - 17:10||Cancelled: Francesca Zaffora Blando: Pride and Probability: A Tale of (Co-)Meagre Success|
|09:30 - 10:30|
|10:30 - 10:45||Break|
|10:45 - 11:30|
|11:35 - 12:25||Andree Weber: Conciliatory Views on Peer Disagreement|
|12:25 - 14:00||Lunch Break|
|14:00 - 15:00||Leah Henderson: Emergent Compatibilism for IBE and Bayesianism|
|15:00 - 15:15||Break|
|15:15 - 16:00||Alex Meehan: Kolmogorov Conditionalizers Can Be Dutch Booked||Aviezer Tucker: Testimony and the analysis of disinformation|
|16:05 - 16:50||Snow Zhang: Trilemma about Deference, Judgment Aggregation and Disagreement|
|16:55 - 17:40||David Kinney: Why Average When You Can Stack?||Palash Sarkar & Prasanta Bandyopadhyay: Simpson's Paradox and Causality|
|09:30 - 10:15||Mario Günther: An Analysis of Actual Causation|
|10:15 - 10:30||Break|
|10:30 - 11:15||Patrick Klösel: Graphical Causal Modeling in Econometrics|
|11:20 - 12:05||Rafal Urbaniak: Imprecise credences can increase accuracy wrt. claims about expected frequencies|
|12:05 - 13:50||Lunch Break|
|13:50 - 14:35||Cancelled: Andrea G. Ragno: A Vindication of Rudner-Steele's Argument||Patryk Dziurosz-Serafinowicz: The Value of Uncertain Evidence|
|14:40 - 15:25||Margherita Harris: Model-based Robustness Analysis||Michal Godziszewski: Fairness and Justified Representation in Judgment Aggregation and Belief Merging|
|15:25 - 15:40||Break|
|15:40 - 16:40||James Joyce: Accuracy First, or Accuracy Centered|
|16:40 - 16:45||Closing Words|
Abstracts Summer School
Hierarchical Bayesian modelling is a technique which allows for learning at multiple levels of abstraction. I will introduce the fundamentals of these models and some of the practical challenges in using them.
Hierarchical Bayesian models have been used to tackle a variety of problems in cognitive science and in philosophy of science. I will give an overview of some of these applications.
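A minimal sketch of the kind of multi-level learning these tutorials discuss (my own toy example, not material from the lectures): in a beta-binomial hierarchy, each group's success rate is drawn from a shared Beta prior, and the posterior mean for each group "shrinks" its raw rate towards the prior mean.

```python
# Toy hierarchical Bayesian model (beta-binomial), illustrative only:
# each group j has a success rate theta_j drawn from a shared Beta(a, b)
# prior; observed counts x_j ~ Binomial(n_j, theta_j).  With the
# hyperparameters a, b treated as known, the posterior mean for group j is
#   E[theta_j | x_j] = (a + x_j) / (a + b + n_j),
# which shrinks the raw rate x_j / n_j towards the prior mean a / (a + b).

def posterior_means(counts, trials, a=2.0, b=2.0):
    """Posterior mean of each group's rate under a Beta(a, b) prior."""
    return [(a + x) / (a + b + n) for x, n in zip(counts, trials)]

counts = [9, 1, 5]      # successes per group
trials = [10, 10, 10]   # attempts per group
raw = [x / n for x, n in zip(counts, trials)]
shrunk = posterior_means(counts, trials)
prior_mean = 0.5        # a / (a + b) for a = b = 2

# Every shrunken estimate lies between the raw rate and the prior mean.
for r, s in zip(raw, shrunk):
    assert min(r, prior_mean) <= s <= max(r, prior_mean)
```

The interesting hierarchical step, fitting the hyperparameters a and b from the pooled data so that groups inform one another, is exactly where the practical challenges mentioned above arise.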
This tutorial will present an overview of the theory of credal accuracy and its use in justifying basic norms of Bayesian epistemology, including the requirements of probabilistic coherence and Bayesian updating. Topics to be covered include probabilism, truth-value estimation, strictly proper scoring rules, accuracy dominance, and Bayesian conditioning. Following this, the concept of an “expert” probability will be introduced. It will serve as our main model for the structure of evidential norms. The concept of objective chance and its associated evidential norm, the Principal Principle, will be used as our paradigm of an expert probability.
This tutorial will begin by considering some attempts to derive epistemic norms from accuracy norms using epistemic utility theory. This proves to be more difficult than it appears. We then consider some putative conflicts between norms of evidence and norms of accuracy. We will see that, when both sorts of norms are properly understood, no such conflicts can arise. Concepts to be covered here include chance dominance, epistemic utility theory, and surrogates for truth.
Bayesians hold that rational degrees of belief are obtained by (Jeffrey) updating prior probabilities. They are, however, split about the choice of prior probabilities. Objective Bayesians insist that the relation between evidence and rational beliefs is, to a degree, objective. Hence, the choice of prior probabilities has to be objective, in some sense. In my talk, I will first briefly motivate and introduce this epistemology, then talk about recent advances from objectivists and end with some open problems.
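Objective Bayesians often operationalise the objective choice of priors via the maximum entropy principle. As a generic illustration (not taken from the talk), here is Jaynes' classic die example: among all distributions over the faces with a fixed expected value, the entropy-maximising one has exponential-family form, and we can find it numerically.

```python
import math

# Maximum-entropy sketch: among all distributions over a die's faces
# {1,...,6} whose expected value equals a given mean, the entropy-
# maximising one has the form p_i proportional to exp(lam * i).
# We find lam by bisection on the mean constraint.

def maxent_die(target_mean, lo=-5.0, hi=5.0, tol=1e-10):
    faces = range(1, 7)

    def mean_for(lam):
        weights = [math.exp(lam * i) for i in faces]
        z = sum(weights)
        return sum(i * w for i, w in zip(faces, weights)) / z

    # mean_for is increasing in lam, so bisection converges.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_for(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    weights = [math.exp(lam * i) for i in faces]
    z = sum(weights)
    return [w / z for w in weights]

p = maxent_die(4.5)  # Jaynes' "Brandeis dice" constraint
assert abs(sum(p) - 1) < 1e-9
assert abs(sum(i * pi for i, pi in zip(range(1, 7), p)) - 4.5) < 1e-6
# A target mean of 3.5 (the unconstrained die average) recovers the
# uniform distribution, i.e. maximal equivocation.
q = maxent_die(3.5)
assert all(abs(qi - 1/6) < 1e-6 for qi in q)
```

The last assertion illustrates the objectivist idea: absent substantive evidence, the principle mandates the maximally equivocal (uniform) prior.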
A wide range of different disciplines work with the idea of credences (economics, formal epistemology, decision theory, philosophy), but little attention has been paid to the question: what are the objects of credence? In contrast, an analogous question - what are propositions? - has received a great deal of attention in the philosophy of language. I show how the two issues relate, and discuss some of the implications for users of the credence framework.
A prominent theory of propositions has been given by David Chalmers: his two-dimensionalist account. I explain this account, and then consider how it can be applied by users of the credence framework to answer the question: what are the objects of credence? I discuss some implications of applying two-dimensionalism in this way, and conclude that it leads to some serious problems.
Gerhard Schurz (Düsseldorf): First lecture: Metainduction - Basic Account: A New Solution to the Problem of Induction?
The problem of induction, or Hume's problem, consists in the apparent impossibility of establishing a non-circular justification of induction, i.e. of the transfer of observed regularities from the past to the future. Hume's problem exemplifies with particular sharpness the regress problem of traditional foundationalism. This talk introduces a new approach to Hume's problem and to the regress problem in general: the method of optimality justifications. This account concedes the force of Hume's sceptical arguments against the possibility of a non-circular demonstration of the reliability of induction. What it demonstrates is that one can nevertheless give a non-circular justification of the optimality of induction, more precisely of meta-induction.
Results in mathematical learning theory have shown that it is impossible to demonstrate the optimality of an 'object-level' prediction method in comparison to all other possible methods. This is the reason why Reichenbach's "best alternative" account of induction fails. The breakthrough of the optimality program lies in its application at the level of meta-methods. Meta-inductive methods use all cognitively accessible object-level methods and their track records as their input and attempt to construct from them an optimal method. Based on results in machine learning, it can be demonstrated that there are meta-inductive prediction strategies whose predictive success is long-run optimal in all possible worlds with regard to all accessible methods, with tight upper bounds for short-run losses that quickly vanish as the number of predictions increases. Moreover, the a priori justification of meta-induction generates a non-circular a posteriori justification of object-induction.
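The machine-learning results alluded to here concern forecasters like the exponentially weighted average predictor. The following sketch (my own illustration, not Schurz's formulation) shows the basic mechanism: the meta-inductivist tracks each accessible method's cumulative loss and predicts a loss-weighted average of their forecasts, which keeps its regret against the best method small.

```python
import math

# Exponentially weighted average forecaster, a standard meta-level
# prediction strategy (illustrative sketch).  The meta-inductivist
# weights each method by exp(-eta * cumulative squared loss) and
# predicts the weighted average of the methods' forecasts.

def meta_induction(expert_predictions, outcomes, eta=0.5):
    """expert_predictions[t][i]: method i's forecast at round t (in [0,1]).
    Returns (meta-forecaster's total loss, best single method's total loss)."""
    n = len(expert_predictions[0])
    cum_loss = [0.0] * n
    meta_loss = 0.0
    for preds, y in zip(expert_predictions, outcomes):
        weights = [math.exp(-eta * L) for L in cum_loss]
        z = sum(weights)
        forecast = sum(w * p for w, p in zip(weights, preds)) / z
        meta_loss += (forecast - y) ** 2
        for i, p in enumerate(preds):
            cum_loss[i] += (p - y) ** 2
    return meta_loss, min(cum_loss)

# Two methods: one always right, one always wrong, on alternating outcomes.
outcomes = [t % 2 for t in range(200)]
preds = [[y, 1 - y] for y in outcomes]
meta, best = meta_induction(preds, outcomes)
# The meta-inductive forecaster's loss stays within a small additive
# constant of the best accessible method's loss.
assert meta <= best + math.log(2) / 0.5 + 1.0
```

The weights shift exponentially fast towards methods with good track records, which is why short-run losses vanish as the number of predictions grows.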
Literature: Gerhard Schurz: Hume's Problem Solved: The Optimality of Meta-induction. MIT Press, Cambridge/MA, 2019.
The basic account of meta-induction has two restrictions:
(i) it is restricted to prediction games with a finite number of competing prediction methods that are accessible to the meta-inductivist, and
(ii) it assumes that the events to be predicted are real-valued, so that probabilities or weighted averages of events can be predicted.
There are two important extensions that overcome these restrictions. The finiteness restriction can be relaxed by allowing the class of competing prediction methods to grow unboundedly. A universal long-run optimality result is provable even for this case.
Restriction (ii) is relaxed in so-called discrete prediction games. In these games, predictions have to coincide with possible events. These games can be handled by allowing for probabilistic prediction methods and considering their expected, as opposed to their actual, success. While the optimality of this method (dominant in machine learning) is not fully universal, a method of collective meta-induction has been developed that is universally optimal.
Discrete prediction games are of particular importance for a further generalization of the meta-induction approach: its generalization from prediction games to action games.
Finally, it is shown that, besides their universality, a variety of dominance results can be established for meta-induction. These results may provide a new solution to the no-free-lunch problem.
In my talk I cover some of the basics of graphical causal models, focusing on bridge principles for linking causal hypotheses to probability distributions, and the role of such principles in causal search and in eliminating confounding. I then discuss the interpretation of probability in causal models and criticize an argument relying on the uniform assignment of probabilities to causal parameters.
In "Bayesian Orgulity" (2013), Belot argues that Bayesian agents are plagued by a pernicious type of epistemic immodesty. By the very nature of the Bayesian framework, they are bound to invariably expect that their beliefs will converge to the truth—and this is so even when, from a topological point of view, there are many data streams on which they will in fact fail to be inductively successful. In this talk, I will propose one possible strategy for evading Belot's worry. By appealing to the theory of algorithmic randomness, I will show that Belot's objection does not apply if one restricts attention to computable open-minded Bayesian agents and computable inductive problems. More precisely, we will see that, when a Bayesian agent with a computable open-minded prior estimates the values of a computable random variable, their successive estimates are guaranteed to converge to the truth both almost surely and on a topologically large set of data streams (i.e., on a co-meagre set of data streams).
While Imprecise Probabilities (IP) are, in some respects, a modest and useful generalisation of the standard Bayesian probabilistic approach to epistemology, IP has some issues that are perhaps holding back its widespread adoption. One such problem is “Belief Inertia”: the alleged inability of agents with imprecise priors to learn from evidence. In this paper I will demonstrate that a small change to the rule for updating imprecise probabilities solves the problem of belief inertia.
Michal Godziszewski (MCMP): Fairness and Justified Representation in Judgment Aggregation and Belief Merging
We put forth an analysis of actual causation. The analysis centers on the notion of a causal model that provides only partial information as to which events occur, but complete information about the dependences between the events. The basic idea is this: c causes e just in case there is a causal model that is uninformative on e and in which e will occur if c does. Notably, our analysis has no need to consider what would happen if c were absent. We show that our analysis captures more causal scenarios than any counterfactual account to date.
One of the open questions in Bayesian epistemology is how to rationally learn from indicative conditionals (Douven, 2016). Eva et al. (2019) propose a strategy to resolve this question. They claim that their strategy provides a "uniquely rational response to any given learning scenario". We show that their updating strategy is neither very general nor always rational. Even worse, we generalize their strategy and show that it still fails. Bad news for the Bayesians.
In science, obtaining the same result through different means (i.e. obtaining a ‘robust’ result) is often seen as a valid way to further confirm a hypothesis. The Bayesian should of course have something to say about the logic underpinning this method of confirmation. But, as Schupbach (2018) persuasively argues, Bayesian accounts of robustness analysis (RA) which rely on probabilistic independence to explicate the notion of RA diversity are in many cases woefully inadequate. Schupbach’s explanatory account of RA is a promising attempt to fill this gap. Indeed, by having ‘as its central notions explanation and elimination’, this account fits nicely with many empirically driven cases of RA in science, while at the same time providing important normative implications.
In this talk, however, I will assess Schupbach’s further claim that his explanatory account of RA ‘applies to model-based RAs just as well as it does to empirically driven RAs’. I will argue that applying his explanatory account of RA in the context of models is considerably more difficult than Schupbach suggests. Finally, I will consider what lessons we might learn from this difficulty, lessons about the viability of model-based robustness analysis as a method of confirmation.
A number of different views of the relationship between Inference to the Best Explanation (IBE) and Bayesianism have been proposed. I argue for a position I call ‘emergent compatibilism’, according to which the explanatory considerations involved in IBE emerge from an independently motivated Bayesian account. I discuss the assumptions behind this view, and show how it allows the Bayesian account to shed light on the relationship between different explanatory virtues.
Kierland and Monton presented an accuracy-based argument in favour of a credence of 1/2 as a solution to the duplicating Sleeping Beauty problem. I will argue, using the accuracy-first framework, that a credence of 1/2 should not be a solution to the duplicating case because it is rational only if the agent, upon her awakening, is sure that she is the original Beauty. This contradicts one of the key assumptions about the duplicating case, namely that the agent is uncertain whether she woke up on Monday or Tuesday.
Richard Pettigrew’s “accuracy first” epistemology seeks to derive all legitimate epistemic norms from considerations of credal accuracy, thereby treating the alethic prescription “Hold accurate credences!” as the fundamental epistemic requirement. An alternative “accuracy centered” epistemology sees the relationship between alethic and evidential norms as one of symbiosis rather than subservience. While any system of epistemic norms must promote credal accuracy, the way we assess credal accuracy is beholden to our views about the correct epistemic norms. This leads to a pleasing picture of the relationship between accuracy and evidence in credal epistemology.
Formal and social epistemologists have devoted significant attention to the question of how to aggregate the credences of a group of agents who disagree about the probabilities of events. Most of this work focuses on strategies for calculating the mean credence function of the group. In particular, Moss (2011) and Pettigrew (2019) argue that group credences should be calculated by taking a linear mean of the credences of each individual in the group, on the grounds that this method leads to more accurate group credences than all other methods. In this paper, I argue that if the epistemic value of a credence function is determined solely by its accuracy, then we should not generate group credences by finding the mean of the credences of the individuals in a group. Rather, where possible, we should aggregate the underlying statistical models that individuals use to generate their credence function, using "stacking" techniques from statistics and machine learning first developed by Wolpert (1992). My argument draws on a result by Le and Clarke (2017) that shows the power of stacking techniques to generate predictively accurate aggregations of statistical models, even when all models being aggregated are highly inaccurate.
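A minimal two-model version of the stacking idea (an illustrative sketch; the talk's proposal applies it to full statistical models): instead of taking a plain average of predictions, learn a convex weight on held-out data that minimises squared prediction error.

```python
# Stacking two predictive models, minimal sketch.  For squared loss, the
# optimal convex weight w for the combination w*a + (1-w)*b has a closed
# form, clipped to [0, 1].

def stack_weight(preds_a, preds_b, targets):
    """Optimal w for w*a + (1-w)*b under squared loss, clipped to [0, 1]."""
    num = sum((a - b) * (y - b) for a, b, y in zip(preds_a, preds_b, targets))
    den = sum((a - b) ** 2 for a, b in zip(preds_a, preds_b))
    return min(1.0, max(0.0, num / den)) if den > 0 else 0.5

def mse(preds, targets):
    return sum((p - y) ** 2 for p, y in zip(preds, targets)) / len(targets)

# Model A is roughly accurate; model B is systematically biased upward.
targets = [1.0, 2.0, 3.0, 4.0]
preds_a = [1.1, 1.9, 3.2, 3.9]
preds_b = [2.0, 3.0, 4.0, 5.0]

w = stack_weight(preds_a, preds_b, targets)
stacked = [w * a + (1 - w) * b for a, b in zip(preds_a, preds_b)]
averaged = [(a + b) / 2 for a, b in zip(preds_a, preds_b)]

# The stacked combination is at least as accurate as the plain average,
# since w = 0.5 is always an admissible choice.
assert mse(stacked, targets) <= mse(averaged, targets)
```

This is the sense in which stacking dominates linear averaging on accuracy grounds: the average is one point in the space of combinations the stacker optimises over.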
In this talk I introduce philosophy of methodology as a subfield of philosophy of science engaging closely with the actual scientific practice in empirical sciences. The main example illustrating this approach will be directed acyclic graphs (DAGs) for causal inference in epidemiology and econometrics. I present six reasons why econometrics can be enriched by a more widespread use of DAGs.
Alicja Kowalewska & Rafal Urbaniak (Gdansk University of Technology): Story Coherence with Bayesian Networks
The accuracy-first programme tries to lay the normative foundations of Bayesianism. In particular, accuracy-firsters argue that accuracy, i.e. closeness to truth, is the sole source of epistemic value and that credences satisfying the Bayesian norms are somehow systematically more accurate than others. An important part of this task is to justify a mathematical characterisation of accuracy that delivers the desired result. In this talk, I question the possibility of such a justification.
Typically, accuracy-firsters justify their characterisation by arguing that the different aspects of it are “intuitive” or “natural” or that the alternatives are “absurd”. This line of argument seems to presuppose that characterising accuracy is about analysing the ordinary language concept of accuracy, where this analysis is justified by linguistic intuition. If this is indeed the case, then a general worry is not far away: Is our ordinary language concept of accuracy really determinate and precise enough to warrant the rather narrow kind of mathematical characterisation that accuracy-firsters require? A natural suspicion is that it is not.
To corroborate this suspicion, I examine two influential characterisations of accuracy. First, Joyce’s famous original one in the 1998 paper that launched the accuracy-first programme. Second, the latest one given by Richard Pettigrew in his 2016 book about the accuracy-first programme. In both cases, I pick out one central aspect of the characterisation and show that it is doubtful that (linguistic) intuition supports it in the right way. In fact, I argue that intuition even undermines it.
Moreover, I identify a common property of both characterisations that causes the trouble: Both characterisations entail that distance to truth has the mathematical property of being strictly convex, but this does not seem to be part of our ordinary language concept of accuracy. Since strict convexity is apparently indispensable for the desired mathematical results, this is a significant challenge for the accuracy-first programme. Finally, I briefly explore the option of taking the characterisations of accuracy to be fruitful precisifications instead of conceptual analyses. In this case, accuracy-firsters would be less bound to the ordinary language concept. Instead, they would have to argue for the theoretical fruitfulness of their characterisations.
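The strict convexity at issue can be made concrete with the Brier score, the standard example of a strictly proper inaccuracy measure (a generic illustration, not the talk's own argument).

```python
# For a binary proposition with objective probability p, the p-expected
# Brier inaccuracy of a credence q is
#   p * (1 - q)**2 + (1 - p) * q**2,
# which is uniquely minimised at q = p (strict propriety) and is a
# strictly convex function of q.

def expected_brier(p, q):
    return p * (1 - q) ** 2 + (1 - p) * q ** 2

p = 0.3
# Search a fine grid: the minimiser coincides with p itself.
best_q = min((q / 1000 for q in range(1001)), key=lambda q: expected_brier(p, q))
assert abs(best_q - p) < 1e-9

# Strict convexity in q: the midpoint's score beats the average of the
# endpoint scores.
q1, q2 = 0.1, 0.9
mid = expected_brier(p, (q1 + q2) / 2)
avg = (expected_brier(p, q1) + expected_brier(p, q2)) / 2
assert mid < avg
```

It is exactly this last property, strict convexity of distance to truth, that the talk argues is a mathematical artefact rather than part of the ordinary language concept of accuracy.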
A vexing question in Bayesian epistemology is how an agent should update on evidence which she assigned zero prior credence. Some theorists have suggested that she should update by Kolmogorov conditionalization (a norm based on Kolmogorov's theory of regular conditional distributions). However, it turns out that, in some situations, a Kolmogorov conditionalizer will plan to always assign a posterior credence of zero to the evidence she learns. Intuitively, such a plan is irrational and easily Dutch bookable. In this talk, based on joint work with Snow Zhang, I propose a revised norm, Kolmogorov-Blackwell conditionalization, which avoids this problem. I present our main result, a Dutch book and converse Dutch book theorem for this new norm, and relate it to the results of Rescorla (2018).
Leitgeb offered an acceptance rule based on the notion of probabilistically stable hypotheses: that is, hypotheses that maintain sufficiently high probability under conditioning on new information. According to the stability rule, a proposition ought to be accepted whenever it is logically entailed by some probabilistically stable hypothesis. When applied to discrete probability spaces, the stability rule guarantees logically closed and consistent belief sets, and it suggests a promising account of the relationship between subjective probabilities and qualitative belief.
Yet, most natural inductive problems—particularly those commonly occurring in statistical inference—are best modelled with continuous probability distributions and statistical models with a richer internal structure. In this talk, I discuss the behaviour of Leitgeb’s stability rule on Bayesian statistical models. I show that, for a very wide class of probabilistic learning problems, Leitgeb's rule yields a notion of acceptance that either fails to be conjunctive (accepted hypotheses are not closed under finite conjunctions) or is trivial (only hypotheses with probability one are accepted). These results apply to most canonical Bayesian models involving exchangeable random variables, such as parametric models with Dirichlet priors over discrete distributions (and, in particular, to every method in Carnap’s family of inductive methods). Analogous results also affect refined notions of stability which take into account the evidence structure in the learning problem at hand.
These results exhibit a serious tension for the stability rule: in Bayesian statistical models, important properties of priors that are conducive to inductive learning—open-mindedness, as well as certain symmetries in the agent’s probability assignments—act against conjunctive belief. We will see that the main selling points of the stability account of belief—its good logical behaviour and its close connection to the Lockean thesis—do not survive the passage to richer probability models.
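On a small discrete space, the stability rule discussed above can be checked by brute force. The sketch below uses the simplest threshold (1/2) and enumerates all evidence propositions; it is my own illustration of the rule, not the talk's statistical setting, where precisely this finite tractability is lost.

```python
from itertools import chain, combinations

# A hypothesis H (a set of worlds) is probabilistically stable iff
# P(H | E) > 1/2 for every evidence proposition E with P(E) > 0 that is
# consistent with H.

def p_stable(H, probs):
    worlds = range(len(probs))
    events = chain.from_iterable(
        combinations(worlds, r) for r in range(1, len(probs) + 1))
    for E in events:
        pE = sum(probs[w] for w in E)
        if pE == 0 or not set(E) & H:
            continue  # skip null or H-inconsistent evidence
        pHE = sum(probs[w] for w in E if w in H)
        if pHE / pE <= 0.5:
            return False
    return True

probs = [0.6, 0.3, 0.06, 0.04]    # a probability over four worlds
assert p_stable({0}, probs)        # world 0 outweighs any compatible evidence
assert p_stable({0, 1}, probs)
assert not p_stable({1}, probs)    # on E = {0, 1}, P({1}|E) = 1/3 <= 1/2
```

Accepting exactly what is entailed by some stable hypothesis then yields a consistent, logically closed belief set on such finite spaces, which is the behaviour the talk shows breaks down for continuous statistical models.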
Is more information always better? Or are there some situations in which more information can make us worse off? Good (1966) famously argued that expected utility maximizers should always accept more information, provided that the information is cost-free. I argue that Good presupposes that we are certain that we will update on any new information by Bayesian conditionalization. If we relax this assumption and assign a non-zero probability to Non-Bayesian updating, then it can be rational to reject free information – from both a pragmatic and an epistemic point of view.
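Good's result can be seen in a small worked example (numbers are my own toy decision problem): for an expected utility maximiser who is certain to conditionalize, the expected utility of deciding after observing free evidence is never lower than that of deciding now.

```python
# Good (1966) on the value of free evidence, worked toy instance.  An
# agent chooses between two acts whose utilities depend on a state H
# with P(H) = 0.5; a free test E has P(E|H) = 0.9, P(E|not-H) = 0.2.

p_h = 0.5
p_e_given = {True: 0.9, False: 0.2}  # likelihood of E given H / not-H
utility = {("act1", True): 10, ("act1", False): 0,
           ("act2", True): 4,  ("act2", False): 6}

def expected_utility(act, prob_h):
    return prob_h * utility[(act, True)] + (1 - prob_h) * utility[(act, False)]

def best_eu(prob_h):
    return max(expected_utility(a, prob_h) for a in ("act1", "act2"))

# Deciding now, without looking at the test:
eu_now = best_eu(p_h)

# Deciding after conditionalising on the test result:
p_e = p_e_given[True] * p_h + p_e_given[False] * (1 - p_h)
post_e = p_e_given[True] * p_h / p_e                    # P(H|E)
post_not_e = (1 - p_e_given[True]) * p_h / (1 - p_e)    # P(H|not-E)
eu_later = p_e * best_eu(post_e) + (1 - p_e) * best_eu(post_not_e)

# Good's theorem: looking first is never worse (here strictly better).
assert eu_later >= eu_now
```

The talk's point is that this guarantee leans on certainty that updating will be Bayesian: once a non-zero probability of non-Bayesian updating is admitted, the inequality can fail.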
Epistemic risk refers to the danger of committing to a wrong claim due to the state of uncertainty of our knowledge. To accept or reject a hypothesis, Rudner (1953) argued, the "scientist qua scientist makes value judgments" in virtue of this risk. Levi (1960), by contrast, believed that such risk involves no ethical trade-off between science and socio-political goals, and thus he retained the value-free ideal (VFI) of science. Levi defends this position within the Bayesian framework.
An important stance in this debate is offered by Steele (2012), who sharpens Rudner's central thesis into the following: "scientist qua policy advisor makes value judgments". In fact, the way scientists are instructed to transmit information is (very often) either too orthodox or it consists of a different probability function and does not properly represent their credal state. In other words, the degrees of belief held by a science advisor are transmitted in a crude shape for different reasons.
This means that scientists cannot avoid making value judgments. Steele's claim seems to be well-founded because of the pressure policy-makers put on scientists. Therefore, one might ask: if there were no immediate policy consequences determining scientists' prior beliefs (as in the case of policies which act upon climate change), could science be value-free with respect to Steele's argument? In this article, we will outline why it is difficult to talk about VFI in cases where immediate policy consequences are absent.
In the first section, we will give an overview of Steele's main argument. In the second section, we will introduce the case where policy consequences do not inform scientists' priors and explain why Levi's contention with Rudner seems prima facie plausible. Finally, in the last section, we will object to Levi's view and to two further arguments for the VFI, raising two different objections. Both will criticise objective Bayesianism and its view on priors, focusing on the work of Wheeler and Williamson (2011).
Palash Sarkar (Indian Statistical Institute) & Prasanta Bandyopadhyay (Montana State University): Simpson's Paradox and Causality
This abstract is available as a PDF file (Abstract Schurz, 66 KB).
Rafal Urbaniak (Gdańsk): Imprecise Credences Can Increase Accuracy wrt. Claims about Expected Frequencies
The evidence that we get from peer disagreement is especially problematic from a Bayesian point of view since the belief revision caused by a piece of such evidence cannot be modelled along the lines of Bayesian conditionalisation. In my talk, I will explain how exactly this problem arises, what features of peer disagreements are responsible for it, and what lessons should be drawn for both the analysis of peer disagreements and Bayesian conditionalisation as a model of evidence acquisition.