Thoughts on rationalism and the rationalist community from a skeptical perspective. The author rejects rationality in the sense that he believes it isn't a logically coherent concept, that the larger rationalism community is insufficiently critical of its beliefs and that ELIEZER YUDKOWSKY IS NOT THE TRUE CALIF.
Even after reading the counterarguments it seems clear that part of the mathematical community suppressed a mathematical [article](https://arxiv.org/pdf/1703.04184.pdf) because it was ideologically inconvenient. As detailed in the Quillette piece, a paper accepted for publication (and actually published online) at the Mathematical Intelligencer was yanked from publication because some mathematicians felt its mathematical modeling of the greater male variability hypothesis might be discouraging to prospective female mathematicians and/or be picked up by conservative media.
Obviously (if the account is close to accurate) this is a horrific violation of academic norms. If there is anything that academic freedom means it’s that ideas can’t be suppressed because they might support politically inconvenient conclusions and it appears that is exactly what happened. What makes the whole situation truly absurd is that the people complaining about the paper were the ones doing the real damage to gender equality. Obviously, one unheralded paper entertaining a greater male variability hypothesis will have a lot less of a harmful effect than even the chance of a controversy over pulling it and the subsequent attention and overreactions[1]. Not to mention that explaining the gender imbalance via bias and harassment is far more discouraging to potential female mathematicians than invoking a biological difference in male/female variability[2].
Some mathematical blogs criticize the argument in the paper (see back and forth in comments) but don’t allege it’s absurd or incoherent. A more substantial counterargument is made in the comments at ycombinator, where it is suggested that a rogue editor deliberately pushed the piece through and directed changes to make it an overtly political piece supporting his own views. However, looking at the actual [article](https://arxiv.org/pdf/1703.04184.pdf) reveals a piece perfectly appropriate to the Intelligencer or indeed many peer reviewed journals (perhaps excepting the relatively unspecialized nature of the content)[3]. Moreover, if the editor had truly gone rogue rather than just having made a decision the board disagreed with she would have been summarily fired.
Of course, at this point it’s just one isolated incident (and new facts could always emerge to change the narrative) but it’s critical to express our disapproval now so this doesn’t become acceptable behavior. Ideally, I’d like to see some kind of resolution or statement of principles at an AMS meeting so hopefully we don’t need to move to a boycott or anything. Of course, political and ideological considerations do affect what gets published all the time, but it’s critical to reinforce the norm that such considerations shouldn’t matter, which makes our response in the few clear cut cases like this all the more important.
What to do about the original paper is more of a puzzle. I’m tempted to say that some other periodical should publish it merely to avoid giving even an apparent victory to the forces of censorship. However, there aren’t many equivalents to the Intelligencer and maybe it really isn’t a great model (though hopefully peer review could tell us). Also, at this point publishing the paper really would send a very different politicized message that other periodicals would be understandably reluctant to send. So other than moving quickly to the end of the journal format I don’t really know what should be done about the paper.
[1] I do not expect many women inclined to fight through group theory and differential geometry (as well as any gender specific barriers) to be deterred because some mathematicians aired an idea about gender differences in a journal.
[2] Moreover, prospective female mathematicians have no reason to believe that conditional on being the sort of woman even considering going into math they are any less likely to be good than their male colleagues. Indeed, after conditioning they might expect to be better if it is true boys are more encouraged to enter these fields as well.
[3] Indeed, the inclusion of the appendix citing support for the greater male variability hypothesis seems necessary to insulate against criticisms that it’s perpetuating an unacademic, unsupported talking point.
After listening to the Last Week Tonight episode on gene editing (it’s pretty good) and seeing this debate about paying organ donors, I’m compelled to call out the practice of simply asserting that something is ethically fraught or troublesome.
Both with respect to not compensating organ donors (something which could save huge numbers of lives) and with (mostly prospective) limits on eliminating genetic disease or even barring improvement I think we let people who are simply uncomfortable with change off the hook by constantly repeating the supposed truism that the issue is ethically fraught or there are serious ethical concerns. It’s basically a free pass that excuses the fact that they are putting their discomfort ahead of people’s welfare.
Under all the scenarios/conditions seriously being considered: no, there aren’t ethical concerns. Fears like letting a bank repossess your kidney are no more relevant to the proposals on the table than the fear that creditors will enslave people is to wages. Similarly, concerns about racially motivated eugenics programs have no plausible relationship to any kind of gene therapy even being prospectively considered.
Of course, we should hear potential concerns about such policies just like we would for any other policy/technology. However, opponents should be on the spot to either shut up or come up with compelling arguments suggesting harms. The fact that the opponent in the WSJ to paying for organ donation is reduced to arguments like “The introduction of money for a precious good comes at the cost of the ability for one to aspire to virtue” makes me doubt they can come up with such arguments.
I’d add that I think philosophers are partially to blame on this point. As a matter of philosophical interest we correctly find clever new arguments seeking to show that paid organ donation is actually somehow problematically coercive or otherwise wrong more interesting than the obvious argument that it saves lives. However, just as physicists need to convey to the public that the very thing which makes theories which deviate from the standard model interesting also makes them less likely to be true, I think philosophers need to convey that the novelty which makes these arguments interesting also makes them less likely to be right.
So the following letter is being widely reported online as if it is evidence for the importance of gun control. I’m skeptical of the results as I detail in the next post but even if one takes the results at face value the letter is pretty misleading and the media reporting is nigh fraudulent.
In particular if one digs into the appendix to the letter one finds the following statement: “many of the firearm injuries observed in the commercially insured patient population may reflect non-crime-related firearm injuries.” This is unsurprising as using health insurance data means you are only looking at patients rich enough to be insured and willing to report their injury as firearms related: so basically excluding anyone injured in the commission of a crime or who isn’t legally allowed to use a gun. As a result they also analyzed differences in crime rates and found no effect.
So even on its face this study would merely show that people who choose to use firearms are sometimes injured in that use. That might be a good reason to stay away from firearms yourself but not additional reason for regulation as is being suggested in the media.
Moreover, if the effect is really just about safety at gun ranges then it’s unclear if the effect is from lower use of such ranges or that the NRA conference encourages greater care and best practices.
Reasons To Suspect The Underlying Study
Also, I’m pretty skeptical of the underlying claim in the study. The size of the effect claimed is huge relative to the number of people who attend an NRA conference. I mean about 40% of US households are gun owners but only ~80,000 people attend nationwide NRA conventions, or ~.025% of the US population or ~.0625% of US gun owners. Thus, for this statistic to be true because NRA members are busy at the conference we would have to believe NRA conference attendees were a whopping 320 times more likely to inflict a gun related injury than the average gun owner.
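The 320 figure can be reconstructed with a few lines of arithmetic. This is only a sketch: the 320 million population and the roughly 20% drop in injuries on convention days are assumptions, picked because they reproduce the percentages quoted above, and are not read off the letter's data.

```python
# Back-of-envelope check of the figures in the paragraph above.
US_POPULATION = 320_000_000   # assumption: makes 80,000 people come out to ~.025%
GUN_OWNER_SHARE = 0.40        # from the text (household share used as a proxy)
ATTENDEES = 80_000            # from the text
CLAIMED_INJURY_DROP = 0.20    # assumption: the drop implied by the 320x figure


def required_risk_ratio(attendees, population, owner_share, injury_drop):
    """How many times riskier attendees must be than the average gun owner
    if their absence alone explains the whole drop in injuries."""
    gun_owners = population * owner_share
    attendee_share_of_owners = attendees / gun_owners
    return injury_drop / attendee_share_of_owners


share_of_population = ATTENDEES / US_POPULATION
share_of_owners = ATTENDEES / (US_POPULATION * GUN_OWNER_SHARE)
ratio = required_risk_ratio(
    ATTENDEES, US_POPULATION, GUN_OWNER_SHARE, CLAIMED_INJURY_DROP
)

print(f"{share_of_population:.3%} of the population")
print(f"{share_of_owners:.4%} of gun owners")
print(f"required risk ratio: {ratio:.0f}x")
```

Changing either assumed input moves the ratio proportionally, so the conclusion that the implied multiplier is in the hundreds does not hinge on the exact numbers.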
Now if we restrict our attention to homicides this is almost surely not the case. Attending an NRA convention requires a certain level of financial wealth and political engagement which suggests membership in a socioeconomic class less likely to commit gun violence than the average gun owner. And indeed, the study finds no effect in terms of gun related crime. Even if we look to non-homicides, gun deaths from suicides far outweigh those from accidents and I doubt those who go to an NRA convention are really that much more suicidally inclined.
An alternative likely explanation is that the NRA schedules its conferences for certain times of the year when people are likely to be able to attend and we are merely seeing seasonal correlations masquerading as effects from the NRA conference (a factor they don’t control for). Also, as they run all subgroup analyses and don’t report the results for census tracts and other possible subgroups, the possibility for p-hacking is quite real. Looking at the graph they provide I’m not exactly overwhelmed.
The claim gets harder to believe when one considers the fact that people who attend NRA meetings almost surely don’t give up going to firing ranges during the meeting. Indeed, I would expect (though haven’t been able to verify) that there would be any number of shooting range expeditions during the conference and that this would actually mean many attendees would be more likely to handle a gun during that time period.
Though, once one realizes that the data set one is considering is only those who make insurance claims relating to gun related injuries it is slightly more plausible but only at the cost of undermining the significance of the claim. Deaths and suicides are much less likely to produce insurance claims and the policy implications aren’t very clear if all we are seeing is a reduction in people injured because of incorrect gun grips (see the MythBusters episode about this; such injuries can be quite serious).
So my understanding (which might be wrong) is that (with a few rare exceptions) the paleontological value of fossil bones is entirely a function of their 3D shape (and perhaps a small sample of the material they are made of) and the information about where and in what conditions they are found.
Given that we now have 3D scanners shouldn’t museums and universities be selling off the originals to finance more research? Or am I missing something?
I’d add that the failure to have greater funding for new expeditions means we are constantly losing potential fossils to erosion, looters, damage etc… It’s crazy to think that the optimal overall scientific end is served by selling none of the fossils in institutional collections (even the low value ones) while knowing that there are probably high value fossils being lost because we aren’t finding them before they are damaged or that land is developed or whatever.
Also, one could simply include buy-back, borrowing or sampling clauses in any sale. Thus, at worst, when the museum wants to do later sampling it must buy back or partially compensate the current private owner putting them in a strictly better situation.
So I see people posting this Vox article suggesting Trump, but not Clinton, supporters are racist and I want to advise caution and urge people to actually read the original study.
Vox’s takeaway is,
All it takes to reduce support for housing assistance among Donald Trump supporters is exposure to an image of a black man.
Which they back up with the following description:
In a randomized survey experiment, the trio of researchers exposed respondents to images of either a white or black man. They found that when exposed to the image of a black man, white Trump supporters were less likely to back a federal mortgage aid program. Favorability toward Trump was a key measure for how strong this effect was.
If you look at the actual study it’s chock full of warning signs. They explicitly did not find any statistically significant difference between Trump voters shown black aid recipients and those shown white aid recipients in degree of support for the program, degree of anger they felt, or blame they assigned towards those recipients. Given that this is the natural reading of Vox’s initial description it’s already disappointing (Vox does elaborate to some extent but not in a meaningfully informative way).
What the authors of the study did is ask for a degree of Trump support (along with many other questions such as liberal/conservative identification, vote preference and racial resentment, giving researchers a worryingly large range of potential analyses they could have conducted). Then they regressed the conditional effect of the black/white prompt on the level of blame, support and anger against degree of Trump support, controlling for a whole bunch of other crap (though they do claim ‘similar’ results without controls), and are using some dubious claims about this regression to justify their claims. This should already raise red flags about researcher degrees of freedom, especially given the pretty unimpressive R^2 values.
But what should really cause one to be skeptical is that the regression of Hillary support with conditional effect of black/white prompt shows a similar upward slope (visually the slope appears only slightly less for Hillary support than it did for Trump) though at the extreme high end of Hillary support the 95% confidence interval just barely includes 0 while for Trump it just barely excludes it. Remember, as Andrew Gelman would remind us, the difference between significant and non-significant results isn’t itself significant, and indeed the study didn’t find a significant difference between how Hillary and Trump support interacted with the prompt in terms of degree of support for the program. In other words, if we take the study at face value it suggests, at only a slightly lower confidence level, that increasing support for Hillary makes one more racist.
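To make Gelman’s point concrete, here is a small sketch with made-up numbers. The slopes and standard errors below are hypothetical, not values from the study, and the two estimates are assumed independent. One interval just barely excludes 0, the other just barely includes it, and yet the difference between the two is nowhere near significant.

```python
from math import sqrt


def z_score(estimate, std_err):
    """Estimate expressed in units of its standard error."""
    return estimate / std_err


def excludes_zero(estimate, std_err, z_crit=1.96):
    """True if the 95% interval estimate +/- z_crit * std_err excludes 0."""
    return abs(z_score(estimate, std_err)) > z_crit


# Hypothetical slopes and standard errors, NOT taken from the study.
slope_a, se_a = 0.20, 0.10   # interval just barely excludes 0
slope_b, se_b = 0.19, 0.10   # interval just barely includes 0

# For independent estimates the standard errors add in quadrature.
diff = slope_a - slope_b
se_diff = sqrt(se_a**2 + se_b**2)

print("A significant:    ", excludes_zero(slope_a, se_a))
print("B significant:    ", excludes_zero(slope_b, se_b))
print("A - B significant:", excludes_zero(diff, se_diff))
```

The comparison that matters for a claim like “Trump supporters but not Hillary supporters” is the last line, not the first two.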
So what should we make of this strange seeming result? Is it really the case that Hillary support also makes one more racist but just couldn’t be captured by this survey? No, I think there is a more plausible explanation: the primary effect this study is really capturing is how willing one is to pick larger numbers to describe one’s feelings. Yes, there is a real effect of showing a black person rather than a white person on support for the program (though showing up as not significant on its own in this study) but if you are more willing to pick large numbers on the survey this effect looks larger for you and thus correlates with degree of support for both Hillary and Trump.
To put this another way imagine there are two kinds of people who answer the survey. Emoters and non-emoters. Non-emoters keep all their answers away from the extremes and so the effect of the black-white prompt on them is numerically pretty small and they avoid expressing strong support for either candidate (support is only a positive variable) while Emoters will show both a large effect of the black-white prompt (because changes in their opinion result in larger numerical differences) and a greater likelihood of being a strong Trump or Hillary supporter.
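The emoter story is easy to simulate. The sketch below is a toy model under my own assumptions, not a reanalysis of the study’s data: every respondent gets an “expressiveness” factor that scales both how strongly they report supporting their preferred candidate and how large the same underlying prompt effect looks numerically. The names, scales, cutoff and noise level are all hypothetical.

```python
import random
from collections import namedtuple
from statistics import mean

Respondent = namedtuple("Respondent", "trump hillary black_prompt opposition")


def simulate(n=20_000, seed=0):
    """Toy survey: expressiveness scales both candidate support and prompt effect."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        expressiveness = rng.uniform(0.2, 1.0)   # non-emoters low, emoters high
        prefers_trump = rng.random() < 0.5
        support = 10 * expressiveness            # support is a non-negative scale
        black_prompt = rng.random() < 0.5
        # Same underlying shift for everyone, reported on a stretched scale.
        opposition = expressiveness * (1.0 if black_prompt else 0.0) + rng.gauss(0, 0.5)
        people.append(Respondent(
            trump=support if prefers_trump else 0.0,
            hillary=0.0 if prefers_trump else support,
            black_prompt=black_prompt,
            opposition=opposition,
        ))
    return people


def prompt_effect(people):
    """Mean opposition with the black prompt minus mean with the white prompt."""
    black = [p.opposition for p in people if p.black_prompt]
    white = [p.opposition for p in people if not p.black_prompt]
    return mean(black) - mean(white)


def effect_by_support(people, candidate, cutoff=6.0):
    """Prompt effect among strong supporters of a candidate and among everyone else."""
    high = [p for p in people if getattr(p, candidate) >= cutoff]
    low = [p for p in people if getattr(p, candidate) < cutoff]
    return prompt_effect(high), prompt_effect(low)


people = simulate()
for candidate in ("trump", "hillary"):
    high, low = effect_by_support(people, candidate)
    print(f"{candidate}: effect among strong supporters {high:.2f}, others {low:.2f}")
```

In this toy world nobody’s underlying reaction to the prompt depends on which candidate they like, yet the measured prompt effect rises with strength of support for both candidates, which is the pattern described above.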
This seems to me to be a far more plausible explanation than thinking that increasing Hillary support correlates with increasing racism and I’m sure there are any number of other plausible alternative interpretations like this. Yes, the study did seem to suggest some difference between Trump and Hillary voters on the slopes of the blame and anger regressions (but not support for the program) but this may reflect nothing more pernicious than the unsurprising fact that conservative voters are more willing to express high levels of blame and anger toward recipients of government aid.
However, even if you don’t accept my alternative interpretation the whole thing is sketchy as hell. Not only do the researchers have far too many degrees of freedom (both in terms of the choice of regression to run but also in criteria for inclusion of subjects in the study) for my comfort but the data itself was gathered via a super lossy survey process creating the opportunity for all kinds of bias to enter into the process. Moreover, the fact that all the results are about regressions is already pretty worrisome as it is often far too easy to make strong seeming statistical claims about regressions, a worry which is amplified by the fact that they don’t actually plot the data. I suspect that there is far more wrong with this analysis than I’m covering here so I’m hoping someone with more serious statistical chops than I have such as Andrew Gelman will analyze these claims.
But even if we take the study’s claims at face value the most you could infer (and technically not even this) is that there are some more people who are racist among strong Trump supporters than among those who have low support for Trump which is a claim so unimpressive it certainly doesn’t deserve a Vox article much less support the description given. Indeed, I think it borders on journalistically unethical to show the graphs showing the correlation between increasing support for Trump and prompt effect but not the ones showing similar effects for support of Hillary. However, I’m willing to believe this is the result of the general low standards for science literacy in journalism and the unfortunate impression that statistical significance is some magical threshold.
All it takes to reduce support for housing assistance among Trump supporters is exposure to an image of a black man. That’s the takeaway from a new study by researchers Matthew Luttig, Christopher Federico, and Howard Lavine, set to be published in Research & Politics.
A Request For Clarification On What Predictive Processing Rules Out
So Scott Alexander has an interesting book review up about Surfing Uncertainty which I encourage everyone to read themselves. However, most of the post is really an exploration of the “predictive processing” model for brain function. I’ll leave a more in depth explanation of what this model is to Scott and just offer the following excerpt for those readers too lazy to click through.
Predictive processing begins by asking: how does this happen? By what process do our incomprehensible sense-data get turned into a meaningful picture of the world?
The key insight: the brain is a multi-layer prediction machine. All neural processing consists of two streams: a bottom-up stream of sense data, and a top-down stream of predictions. These streams interface at each level of processing, comparing themselves to each other and adjusting themselves as necessary.
As these two streams move through the brain side-by-side, they continually interface with each other. Each level receives the predictions from the level above it and the sense data from the level below it. Then each level uses Bayes’ Theorem to integrate these two sources of probabilistic evidence as best it can. This can end up a couple of different ways.
The upshot of these different ways is that when everything happens as predicted the higher levels remain unnotified of any change but that when there is a mismatch it draws attention from these higher layers. However, in some circumstances a strong prediction from a higher layer can cause lower layers to “rewrite the sense data to make it look as predicted.”
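For readers who want a picture of what “uses Bayes’ Theorem to integrate these two sources” could mean at a single level, here is a minimal sketch. It assumes both the top-down prediction and the bottom-up sense data are Gaussian estimates of one number, which is my simplification and not something the review or the book commits to; the function name is made up for illustration.

```python
def integrate(pred_mean, pred_var, sense_mean, sense_var):
    """Combine a Gaussian prediction with a Gaussian observation.

    Each source is weighted by its precision (1 / variance), which is the
    standard Bayesian update for two Gaussian estimates of one quantity.
    """
    pred_precision = 1.0 / pred_var
    sense_precision = 1.0 / sense_var
    post_var = 1.0 / (pred_precision + sense_precision)
    post_mean = post_var * (pred_precision * pred_mean + sense_precision * sense_mean)
    return post_mean, post_var


# Equally confident prediction and sense data: the result splits the difference.
balanced = integrate(pred_mean=0.0, pred_var=1.0, sense_mean=10.0, sense_var=1.0)

# Very confident prediction: the result barely moves toward the sense data,
# a toy version of the data being "rewritten to look as predicted".
strong_prediction = integrate(pred_mean=0.0, pred_var=0.01, sense_mean=10.0, sense_var=1.0)

print("balanced:         ", balanced)
print("strong prediction:", strong_prediction)
```

Note how little this constrains anything: any weighting of the two sources can be reproduced by choosing the variances, which is part of why I wonder how much the model rules out.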
I admit that I’m intrigued by the idea of predictive processing, especially the suggestion that our muscle control is actually effectuated merely by ‘predicting’ our arm will be in a certain state and acting to minimize prediction error. However, my first reaction is to wonder how much content there is in this model.
Describing some kind of processing or control task in terms of predictions has a certain feel of universality to it. This is only a vague sense based on a book review but I worry that invoking the predictive processing model to describe how our brains work is much like invoking the lambda calculus model to describe how a particular computer functions. Namely, I worry that predictive processing is such a powerful model that virtually anything remotely plausible as a mechanism for processing sense data and effectuating control over our limbs could be fit into the model, meaning it offers no real insight.
I mean it was already apparent before this model came on to the scene that how we see even low level visual data is affected by high level classifications. The various figure-ground illusions make this point quite clearly. It was also already apparent that attention to one task (counting passes) could limit our ability to notice some other kind of oddity (a guy in a gorilla suit). However, it’s far from clear that the predictive processing model really adds anything to our understanding here.
Indeed, to even make sense of these examples we have to understand the relevant predictions to happen at a very abstract level that is highly context dependent so that by focusing on the number of basketball passes in a game it no longer counts as a sufficiently unpredicted event when a man in a gorilla suit walks past (or allows some other story about why paying one sort of attention suppresses this kind of notice). That’s fine but allowing this level of abstraction/freedom in describing the thing to be predicted makes me wonder what couldn’t be suitably described in terms of this model.
The attempt to describe our imagination, e.g., our ability to picture a generic police officer in our minds, as utilizing the mental machinery that would generate a sense-data stream as a prediction to match against reality raises more questions. Obviously, the notion of matching must be a very high level one quite removed from the actual pictorial representation if the mental image we conjure when we think of policemen is to be seen as matching the sense-data stream experienced when we encounter a policeman. Yet if the level at which we are evaluating a predictive match is so abstract why do we imagine a particular image when we think of a policeman and not merely whatever vague high level abstracta we will judge to match when we actually view a policeman. I’m sure there is a plausible theory to tell here about invoking the same lower level machinery we use to process sense-data when we imagine and leveraging that same feedback but, again, I’m left wondering what work predictive processing is really doing here.
More generally, I wonder to what extent all these predictions wouldn’t result from just assuming, as we know to be true, that the brain processes information in ‘layers’, there can be feedback between these layers and frequently the goal of our mental tasks is to predict events or control actions. It’s not even obvious to me that the claimed predictions of the theory like the placebo effect couldn’t have equally well been spun the other way if the effect had been different, e.g., when your high level processes predict that you won’t feel pain it will be particularly salient when you nevertheless do feel pain so placebo pain meds should result in more people reporting pain.
But I haven’t read the book myself yet so maybe predictive processing has been suitably precisified in the book so as to rule out many plausible ways the brain might have behaved and to clearly predict outcomes like the placebo effect. However, I wrote this post merely to raise the possibility that a paradigm like this can fail precisely because it is too good at describing phenomena. Hopefully, my worries are misplaced and someone can explain to me in the comments just what kind of plausible models of brain function this paradigm rules out.
NASA’s Cassini probe is plunging to its death. The nuclear-powered spacecraft has orbited Saturn for 13 years, and sent back hundreds of thousands of images. The photos include close-ups of the gaseous giant, its famous rings, and its enigmatic moons – including Titan, which has its own atmosphere, and icy Enceladus, which has a subsurface ocean that could conceivably harbour microbial life.
Personally, I think the proposal to ‘change’ the p-value for significant results from .05 to .005 is a mistake. The only sense in which this proposal has any real bite is if journals and hiring committees respond by treating research that doesn’t meet p < .005 as less important but all that does is make the incentives for the kind of behavior causing all the problems much stronger.
I’d much rather have a well designed (ideally pre-registered) trial at p < .05 than a p < .005 result that is cherry picked through after the fact choice of analysis. Rather than making the distinction between well designed, appropriate methodology and dangerous, potentially misleading methodology more apparent, this further obscures it. It tells any scientist who was standing on principle that they need to stop hoping their better methodology will be appreciated and do something to compete on p-value with papers published using problematic data analysis.
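A quick calculation shows why I’d take the pre-registered result. This is a sketch under a simplifying assumption that the candidate analyses are independent, which real forking-path analyses are not; the count of 20 analyses is likewise just an illustrative number.

```python
def chance_of_false_positive(alpha, n_analyses):
    """Probability that at least one of n independent analyses of a
    true-null effect produces a p-value below alpha."""
    return 1 - (1 - alpha) ** n_analyses


# One pre-specified analysis at the old threshold.
preregistered = chance_of_false_positive(alpha=0.05, n_analyses=1)

# Best of 20 after-the-fact analyses at the proposed stricter threshold.
cherry_picked = chance_of_false_positive(alpha=0.005, n_analyses=20)

print(f"pre-registered, p < .05:          {preregistered:.3f}")
print(f"best of 20 analyses, p < .005:    {cherry_picked:.3f}")
```

Under these assumptions the stricter threshold with analytic freedom ends up with a higher false positive rate than the looser threshold without it, so the nominal p-value alone tells you little about which result to trust.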
In particular, I think this kind of proposal doesn’t take sufficient account of the economics and incentives of researchers. Yes, p < .005 studies would be more convincing but they also cost more (both in $ and time) so by telling fledgling researchers they need p < .005 you force them to put all their eggs in one basket making dubious data analysis choices that much more tempting when their study fails to meet the threshold.
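To put a rough number on the extra cost, the usual normal-approximation formula says the sample size needed for a two-sided test scales with the square of (critical z-value plus the z-value for the desired power). The sketch below assumes 80% power and a fixed effect size, both of which are my choices for illustration.

```python
from statistics import NormalDist


def relative_sample_size(alpha, power=0.80):
    """Quantity proportional to the sample size needed for a two-sided test
    at significance level alpha (normal approximation, fixed effect size)."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2


# How much larger a study must be to hit p < .005 instead of p < .05.
cost_ratio = relative_sample_size(0.005) / relative_sample_size(0.05)
print(f"sample size multiplier: {cost_ratio:.2f}")
```

A multiplier of this size is a real burden for a fledgling researcher with one grant and one shot, which is exactly the eggs-in-one-basket pressure described above.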
What we need is more results blind publication processes (in which journals publish the results based merely on a description of the experimental process without knowledge of what the results found). That would both help combat many of these biases and truly evaluate researchers on their ability not their luck. Ideally such studies would be pre-accepted before results were actually analyzed. Of course there still needs to be a place for merely suggestive work that invites further research but it should be regarded as such without any particular importance assigned to p-value.
However, as these are only my brief immediate thoughts I’m quite open to potential counterarguments.