Are GWAS studies of IQ/educational attainment problematic?

Nick Matzke writes:

I wonder if you or your blog-colleagues would be interested in giving a quick blog take on the recent studies that do GWAS (Genome-Wide-Association Studies) on “traits” like IQ, educational attainment, and income?

Matzke begins with some background:

The new method for these studies is to claim that a “polygenic score” can be constructed — these postulate that there are thousands of SNPs (single-nucleotide polymorphisms) that have tiny independent effects on the trait, and that by adding these up, the trait can be predicted to some degree. (The SNPs could themselves be causal, or perhaps in linkage disequilibrium (LD) with causal SNPs.)

I am an evolutionary biologist/phylogeneticist, but I do not work in GWAS. However, my sense of it is that the main way these studies work is that they construct hundreds of thousands of individual linear models, one for each (non-linked) SNP, do something like a Bonferroni correction, and then take all the SNPs beyond the p-value cutoff (something like 5×10⁻⁸, although even within a paper there seem to be multiple cutoffs used) as the interesting ones. Then the individual effects are summed to produce a polygenic score for educational attainment, IQ, etc.
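To make the procedure described above concrete, here is a toy sketch in plain Python. It is not the pipeline used by Lee et al. or any real GWAS software; the sample size, number of SNPs, allele frequency, effect sizes, and function names are all made up for illustration. It fits one simple regression per SNP, applies a Bonferroni cutoff (the conventional 5×10⁻⁸ corresponds to 0.05 divided by roughly a million tests), and sums the estimated effects of the surviving SNPs into a score.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def snp_regression(genotypes, phenotype):
    """Fit phenotype ~ a + b * genotype; return (b, two-sided p-value).

    The p-value uses a normal approximation to the t statistic,
    which is reasonable only for large samples.
    """
    n = len(genotypes)
    mx, my = sum(genotypes) / n, sum(phenotype) / n
    sxx = sum((g - mx) ** 2 for g in genotypes)
    if sxx == 0:  # monomorphic SNP: nothing to estimate
        return 0.0, 1.0
    sxy = sum((g - mx) * (y - my) for g, y in zip(genotypes, phenotype))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * g) ** 2
              for g, y in zip(genotypes, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, math.erfc(abs(slope / se) / math.sqrt(2))


def run_gwas_sketch(n_people=1000, n_snps=100, seed=0):
    """Simulate data, run per-SNP regressions, build a polygenic score."""
    rng = random.Random(seed)
    causal = {0, 1, 2, 3, 4}  # assumption: only the first five SNPs matter
    # genotype[j][i] = copies (0, 1, or 2) of SNP j carried by person i
    genotype = [[(rng.random() < 0.3) + (rng.random() < 0.3)
                 for _ in range(n_people)] for _ in range(n_snps)]
    phenotype = [sum(0.5 * genotype[j][i] for j in causal) + rng.gauss(0.0, 1.0)
                 for i in range(n_people)]

    cutoff = 0.05 / n_snps  # Bonferroni correction for n_snps tests
    weights = {}
    for j in range(n_snps):
        slope, p = snp_regression(genotype[j], phenotype)
        if p < cutoff:
            weights[j] = slope  # keep the estimated effect of each "hit"

    # Polygenic score: sum of estimated effect times allele count over hits
    score = [sum(w * genotype[j][i] for j, w in weights.items())
             for i in range(n_people)]
    return causal, set(weights), pearson(score, phenotype)


if __name__ == "__main__":
    causal, hits, r = run_gwas_sketch()
    print("hits:", sorted(hits), "score-phenotype correlation:", round(r, 2))
```

Note that this toy version evaluates the score on the same people used to pick the SNPs, which flatters the prediction; the simulated effects are also far larger than the tiny per-SNP effects discussed in this post.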

This work today has received a huge publicity boost in the New York Times:

– An editorial by a psychologist arguing that progressives should welcome these new results, with few hints about the limitations and problems with these kinds of studies:

Why Progressives Should Embrace the Genetics of Education
By Kathryn Paige Harden
Dr. Harden is a psychologist who studies how genetic factors shape adolescent development.
July 24, 2018

– A news report by Carl Zimmer on the results, which seems much more responsible and mentions some of the limitations stated in the paper, but not what I think are possible bigger statistical issues:

Years of Education Influenced by Genetic Makeup, Enormous Study Finds
More than a thousand variations in DNA were involved in how long people stayed in school, but the effect of each gene was weak, and the data did not predict educational attainment for individuals.
By Carl Zimmer
July 23, 2018

Here’s the referenced paper:

Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals
James J. Lee, Robbee Wedow, […]David Cesarini
Nature Genetics (2018)
Published: 23 July 2018
https://www.nature.com/articles/s41588-018-0147-3

There was another round of this a few months back in various publications, about race and intelligence, that also involved GWAS, and seemed to try to prepare/repair the ground for people to accept the idea of genetic differences in IQ between races.

– How Genetics Is Changing Our Understanding of ‘Race’
By David Reich
March 23, 2018

– DNA is not our destiny; it’s just a very useful tool
Ewan Birney
Yes, our genes affect everything we do, from educational attainment to health, but they are only a contributing factor
https://www.theguardian.com/science/2018/apr/05/dna-sequencing-educational-attainment-height

– Denying Genetics Isn’t Shutting Down Racism, It’s Fueling It
By Andrew Sullivan
http://nymag.com/daily/intelligencer/2018/03/denying-genetics-isnt-shutting-down-racism-its-fueling-it.html

Some pushback:

– Genetic Intelligence Tests Are Next to Worthless
And not just because one said I was below average.
Carl Zimmer, May 29, 2018
https://www.theatlantic.com/science/archive/2018/05/genetic-intelligence-tests-are-next-to-worthless/561392/

He then expresses some concerns:

I [Matzke] am worried that (a) these GWAS statistical methods might be fundamentally flawed, despite their widespread popularity, leading to wrong or largely wrong conclusions both about the genetics of intelligence/education and perhaps many other traits (medical traits etc.), and (b) if flawed methods are contributing to the same bad old narratives about genetic causes of inequality (going back to eugenics, anti-immigration propaganda, genetic racism, etc.), we really need to know that!

Things that make me worry:

* The Lee et al. study reports that their polygenic score, derived from ~1 million individuals, still explains only 11% of the variance in educational attainment, and the median effect for an individual SNP was ~1 week of education.

* The effect sizes go down by 40% when family-level variation is used (e.g. siblings where one has the SNP and one doesn’t).

* The polygenic score’s predictive ability, such as it was for a European-derived population, didn’t work well for an African-American population. Another case of GWAS predictions outside of the training population being problematic is this one on schizophrenia:

https://www.biorxiv.org/content/early/2018/03/23/287136
Polygenic risk score for schizophrenia is more strongly associated with ancestry than with schizophrenia
David Curtis
doi: https://doi.org/10.1101/287136

Key quote: “There are striking differences in the schizophrenia PRS between cohorts with different ancestries. The differences between subjects of European and African ancestry are much larger, by a factor of around 10, than the differences between subjects with schizophrenia and controls of European ancestry. . . . Two kinds of explanation suggest themselves. The most benign, from the point of view of the usefulness of the PRS, is that the PRS does indeed indicate genetic susceptibility to schizophrenia and that the contributing alleles are under stronger negative selection in African than non-African environments. The least benign would be to say that the PRS is basically an indicator of African ancestry and that for some reason, perhaps through mechanisms such as social adversity, subjects in the PGC with schizophrenia have a higher African ancestry component than controls. It does not seem that the latter can be a full explanation, because it does seem that the PRS is associated with schizophrenia risk in a homogeneous sample after correction for principal components. On the other hand, it is difficult to accept that the PRS does not index ancestry to at least some extent. . . . Whatever the explanation, these results have important implications for the interpretation of the PRS. . . .”

Much of the GWAS data comes from sources like the UK Biobank. We know, even if all the samples are from “European” individuals, that there will be genetic structure in the data due to ancestral geography and isolation-by-distance. All of these social “traits” (education, income, and IQ, which correlates with both; Ken Richardson [https://scholar.google.co.nz/scholar?q=%22Ken+Richardson%22++IQ&hl=en&as_sdt=0%2C5&as_ylo=1990&as_yhi=2018] argues that IQ may be nothing more than an index of these middle/upper-class attributes) also have geographically-structured variation, simply due to the history of economic development (among many other things). It seems to me that all it would take would be some regional historical variation in wealth/education, and some spatial structure in the genetics, to lead to weak correlations between certain alleles and educational attainment. That would apply in the UK with deep ancestral genetic structure, or in the USA where the history of immigration (even just European immigration) has been highly nonrandom, as has the wealth and status accumulation by ethnic group. I think it is not a stretch to say that there might be a difference in wealth and educational attainment between USA people with different European ethnicities, say, classic WASP populations in New England that date back to before the American Revolution, versus southern and eastern European populations that came later. This difference in wealth/average education would not have a genetic cause, but it would definitely have genetic correlations.
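The mechanism in the paragraph above is easy to reproduce in a simulation. The sketch below is purely illustrative: the two groups, their allele frequencies, and their average years of education are invented numbers, and the allele has no effect on education at all. Pooling the groups still yields a clearly positive regression slope, while centering both variables within each group (equivalent to adding a group indicator to the regression) brings the slope back near zero.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line y ~ a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def simulate_stratification(n_per_group=5000, seed=1):
    """Return (pooled slope, group-adjusted slope) for a non-causal allele."""
    rng = random.Random(seed)
    # Hypothetical groups: (allele frequency, mean years of education).
    # The education gap is assumed to be entirely historical, not genetic.
    groups = {"A": (0.2, 12.0), "B": (0.6, 14.0)}
    rows = []
    for name, (freq, mean_edu) in groups.items():
        for _ in range(n_per_group):
            g = (rng.random() < freq) + (rng.random() < freq)  # 0, 1, or 2
            edu = rng.gauss(mean_edu, 2.0)  # does not depend on g
            rows.append((name, g, edu))

    # Naive analysis: ignore group membership entirely
    pooled = ols_slope([r[1] for r in rows], [r[2] for r in rows])

    # Adjusted analysis: subtract each group's mean before regressing
    centered_g, centered_edu = [], []
    for name in groups:
        members = [r for r in rows if r[0] == name]
        mean_g = sum(r[1] for r in members) / len(members)
        mean_edu = sum(r[2] for r in members) / len(members)
        centered_g += [r[1] - mean_g for r in members]
        centered_edu += [r[2] - mean_edu for r in members]
    adjusted = ols_slope(centered_g, centered_edu)
    return pooled, adjusted


if __name__ == "__main__":
    pooled, adjusted = simulate_stratification()
    print("pooled slope:", round(pooled, 3))
    print("slope after adjusting for group:", round(adjusted, 3))
```

Here the adjustment works because the group labels are known exactly. In real data the structure is continuous and unlabeled, so studies rely on corrections such as genetic principal components, and the open question is how much residual structure survives that correction.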

Matzke concludes:

I wonder if these GWAS studies for wealth/IQ/education are mostly picking up accidental correlations due to ancestry (perhaps with a moderate proportion of genuinely causal alleles, perhaps mostly ones with a pathological effect). This would be a ready explanation of why polygenic scores can be nonpredictive or pathological outside of the training population, and why the effect sizes decrease dramatically when studying variants within families.
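One way to see why within-family estimates would shrink under this story is a sibling simulation. The following is a sketch under invented assumptions (two hypothetical ancestry groups, a small true effect of 0.2 years per allele, and a group-level difference in family environment); it is not meant to reproduce the 40% figure from the paper. Siblings share parents and family environment, so regressing the sibling difference in education on the sibling difference in genotype removes the ancestry-linked confounding.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line y ~ a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def transmit(parent_genotype, rng):
    """Allele (0 or 1) passed on by a parent carrying 0, 1, or 2 copies."""
    return 1 if rng.random() < parent_genotype / 2 else 0


def simulate_sibling_pairs(n_families=5000, true_effect=0.2, seed=2):
    """Return (population slope, within-family slope)."""
    rng = random.Random(seed)
    # Hypothetical groups: (allele frequency, mean family environment in years)
    groups = [(0.2, 12.0), (0.6, 14.0)]
    pop_g, pop_y, diff_g, diff_y = [], [], [], []
    for i in range(n_families):
        freq, env_mean = groups[i % 2]
        mother = (rng.random() < freq) + (rng.random() < freq)
        father = (rng.random() < freq) + (rng.random() < freq)
        family_env = rng.gauss(env_mean, 1.0)  # shared by both siblings
        sibs = []
        for _ in range(2):
            g = transmit(mother, rng) + transmit(father, rng)
            y = family_env + true_effect * g + rng.gauss(0.0, 1.5)
            sibs.append((g, y))
        # Population-style analysis uses one sibling per family
        pop_g.append(sibs[0][0])
        pop_y.append(sibs[0][1])
        # Within-family analysis uses sibling differences
        diff_g.append(sibs[0][0] - sibs[1][0])
        diff_y.append(sibs[0][1] - sibs[1][1])
    return ols_slope(pop_g, pop_y), ols_slope(diff_g, diff_y)


if __name__ == "__main__":
    population, within = simulate_sibling_pairs()
    print("population estimate:", round(population, 2))
    print("within-family estimate:", round(within, 2))
```

In this setup the population estimate is inflated well beyond the true effect, while the sibling-difference estimate lands near it. A drop in effect size within families is therefore what one would expect if some of the population-level signal reflects ancestry or family environment, though the simulation says nothing about how large that share is in the real data.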

In other words, are GWASes on education and perhaps many other social traits mostly bunk? Are we perhaps going to see another great statistical crisis (like the crises in small-data psychology, or the p-value/replicability crises), but in the “big data” arena of Genome-Wide Association Studies?

And, is Harden’s essay in the New York Times, “Why Progressives Should Embrace the Genetics of Education”, thus wildly misguided, expressing confidence about statistical results that we shouldn’t be confident in, and dissuading skepticism about the Very Long And Bad history of people trying to explain systematic inequalities through genetics, when in fact we should be maintaining or increasing our skepticism in the modern world of genomics and GWAS?

PS: There is an extensive FAQ from the authors of the Nature Genetics study, which makes me feel somewhat better about the population stratification issue:
https://www.thessgac.org/faqs

Also this from Graham Coop about the generic topic of between-population differences:
Polygenic scores and tea drinking

This stuff is so technical, and I have not tried to follow the details. But the topic seems important enough that I thought I’d share with all of you. Speaking generally, I can see the appeal of both sides of the argument. On one hand, even noisy data can provide some insight, and it seems reasonable to start by drawing hypotheses and tentative conclusions based on what we have; on the other, when a variable or set of variables explains only a small percentage of the variation in the outcome, you have to be concerned that selection biases will overwhelm any effect of interest. We can draw an analogy here to surveys with 10% response rates: for many purposes this is just fine, as long as we adjust for relevant differences between sample and population, but there will be questions for which the results of any comparisons are driven by biases that are hard to adjust for.