Produced in partnership with NAVEL

Huge genetic databases are hurting marginalized people's health

White people could be offered genetic tests for certain health conditions, while other people could be offered incorrect or no testing at all

Rebecca Muir


University College London

Imagine this: you are a cash-strapped early-career health scientist, looking for your next big project. One day, you get your big break — the chance to study half a million people, and the freedom to focus on virtually any topic you like, from DNA mutations to blue cheese intake. Best of all, this study will cost you virtually nothing.

It's easy to imagine that organizations like the UK Biobank make anything possible. Biobanks are huge repositories containing health, genetic, and demographic information from volunteers. Researchers look through the vast amount of data to find new health patterns and trends. There are few limits: you can analyze scans of volunteers' hearts, infer their sexual behaviors, or study their reasoning skills.

Over 850 UK Biobank papers have been published, with new studies appearing in journals constantly. Some of the findings so far could improve global health: one study showed that anyone, regardless of their genetic background, can reduce their risk of dementia with a change in lifestyle.

However, as promising as biobanks might seem, the data may tell only partial or even misleading stories. Critics argue that the research coming out of the UK Biobank will only benefit certain people, and that even then, the usefulness of the health associations it finds is in question.

Compared to the 2011 UK census, Black, Indian, Pakistani and Chinese participants are all underrepresented in the Biobank by at least one third. David Curtis, at University College London, tested whether this under-representation of ethnic minority groups has any impact on schizophrenia genetics research.

Black, Indian, Pakistani and Chinese participants are all underrepresented in the UK Biobank. 

He found that calculating the risk for schizophrenia when using Biobank data is only accurate for white European populations. This means that in the future, white people could be offered genetic tests for certain health conditions, while other people could be offered incorrect or no testing at all.

This is because of the complex evolutionary history of humans. While humans who migrated out of Africa and settled in Europe faced bottlenecks where their genetic diversity was reduced dramatically, Africans have maintained large and diverse populations, and so carry more genetic variation.

Other researchers are investigating the Biobank's data as well. Na Cai, a statistical geneticist at the Wellcome Trust Sanger Institute and European Bioinformatics Institute, began thinking, much as Curtis did in his schizophrenia study, about how what gets put into the Biobank affects the conclusions that come out of it.

In her study, currently a pre-print posted on bioRxiv, Cai and colleagues decided to focus on major depressive disorder. Depression is one of the most common mental health disorders, and has been a major topic of investigation in genetic association studies.

Because of this, Cai was concerned that researchers might not be investigating depression specifically, but instead looking at the genetics of poor mental health in general.

Cai defined depression in five different ways, using both strict and loose criteria. For example, some people might tell their doctor that they feel depressed, but not meet the specific psychiatric definition of major depressive disorder. She looked to see if the same genetic variants were associated with each different definition of depression.

The results were surprising. She found less of a genetic contribution towards all the "looser" definitions of depression compared to the full assessment used by psychiatrists.

This shows that researchers do not have the power in their studies that they assume they do. Previously, it was assumed that it didn't matter too much if researchers defined depression loosely. It could be that these broader definitions are just milder cases of depression, or that they show less of a genetic association because more people in these groups are misdiagnosed, which dilutes the signal.

However, when the researchers controlled for these factors, nothing changed. The strict psychiatric definition of depression was still genetically distinct from the other versions: it had more genes associated with it, and there was little overlap between those genes and the ones linked to the looser definitions.

This throws into question whether papers which have found links between depression and genes are coming to the right conclusions. Are they finding a genetic basis for major depressive disorder, or are they showing something else — like the less specific genetic basis for poor mental health in general?

We need to rethink how we collect data, especially when it comes to biobanks.

By Shopify Partners 

Both Cai and Curtis conclude that we need to rethink how we collect biobank data. Both issues are the result of design flaws present since the UK Biobank's inception. Cai does not necessarily think all participants need to be assessed by a psychiatrist. She suggests that we use new technologies, such as computer assessments and smartphone behavioral tracking, to diagnose people with clinical depression. 

But tackling the lack of diversity in biobank data requires those in charge to recognize that the current design excludes marginalized and hard-to-reach groups.

John Savill, the Chief Executive of the UK Medical Research Council, the organization which provided major funding for the Biobank, was reported by the Guardian as saying in response to Curtis’ research, “I do not think it is helpful to cast concerns over experimental design as ‘equalities issues’”.

However, David Heel, who is the Chief Investigator of the East London Genes & Health Project, which aims to improve the health of South Asian people in the UK, thinks that the UK Biobank's recruitment tactic of mailing a letter meant British-Bangladeshi and British-Pakistani people missed out. When reached via email, Heel said, with regard to volunteers in the project, that "a much better response rate comes from a face to face discussion" or "a trusted setting," such as a doctor's office.

Curtis also thinks more can be done, but is not optimistic that we can save the UK Biobank from this bias. He said, “It may be too late to try to make the UK Biobank more representative. We may need to look to other initiatives...and to look to samples recruited in other countries.”

Peer Commentary

We ask other scientists from our Consortium to respond to articles with commentary from their expert perspective.

This is well written, Rebecca!

You know, the other day I was reading about an organization that is trying to bridge this very gap in biobanking by creating a bank based by/from Africa, called “54 gene”. Has anyone heard of them?

Marnie Willman


University of Manitoba Bannatyne and National Microbiology Laboratory

This definitely plays into so much of the research we do that is later stored for epidemiology, genetic counseling, tracking of conditions, etc. Are we sure we have reached a diverse enough group and stored data from all sectors, to ensure these results are accurate? I think the Biobank was a revolutionary first step forward, and certainly not a bad thing. However, it’s time for an upgrade taking into account what we now know is true - without proper representation, which entails getting out there and getting the information from extremely remote areas sometimes, we can’t claim we have “a complete data set”, and the resulting findings may or may not be true for all ethnicities. I’m looking forward to seeing future changes to Biobank that either include a massive overhaul of representation, or the next generation of the system. Well written, Rebecca.