Uncovering genetic clues to long COVID: Insights from a global GWAS

Long COVID genetics

Although long COVID cases have surged in recent years, scientific insight into its underlying mechanisms is still limited, and few studies have explored the genetic factors that may influence who develops the condition. The Sano GOLD dataset offers a valuable resource for addressing this gap, as it combines genotype and phenotype data from 1,996 individuals who have experienced long COVID. In a new study just published in Nature Genetics, investigators conducted a genome-wide association study (GWAS) and replication analysis using data from 33 cohorts, including the Sano GOLD cohort, spanning 19 countries. In this blog post, we cover key insights from the study.

According to the World Health Organization, long COVID is characterized by symptoms that begin within 3 months of infection and persist for at least 2 months. Its prevalence is unclear, with estimates suggesting that 10% to 70% of individuals infected with COVID may develop long COVID. In total, the GWAS, conducted as part of the COVID-19 Host Genetics Initiative (COVID-19 HGI), included data from 15,950 individuals with long COVID and approximately 1.8 million controls.

The main finding of the study was a genome-wide significant association at the FOXP4 locus, with certain variants associated with increased risk of long COVID. In line with previous studies, variants in the FOXP4 region were also associated with COVID severity. Importantly, vaccination was associated with decreased risk of long COVID.

Interestingly, the frequency of the variant differed widely across ancestry groups, from 1.6% in non-Finnish Europeans to 36% in East Asians. Because most individuals in the cohorts were of European ancestry, this variation underscores the importance of diverse representation in genomic studies.
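To put those two frequencies in perspective, here is a rough sketch of how many people in each group would be expected to carry at least one copy of the variant. It rests on two assumptions that the study summary does not state: that the reported figures are allele frequencies, and that genotypes follow Hardy-Weinberg proportions. The function name carrier_fraction is ours, chosen for illustration only.

```python
def carrier_fraction(allele_freq: float) -> float:
    """Expected fraction of people with at least one copy of an allele.

    Assumes Hardy-Weinberg proportions: a person carries zero copies
    with probability (1 - p)^2, so carriers make up 1 - (1 - p)^2.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must be between 0 and 1")
    return 1.0 - (1.0 - allele_freq) ** 2


# Frequencies quoted in this post, treated here as allele frequencies (assumption).
frequencies = {"non-Finnish European": 0.016, "East Asian": 0.36}

carriers = {group: carrier_fraction(p) for group, p in frequencies.items()}

for group, fraction in carriers.items():
    print(f"{group}: about {fraction:.1%} expected to carry at least one copy")
```

Under these assumptions, roughly 3.2% of non-Finnish Europeans and 59.0% of East Asians would carry at least one copy, which illustrates why a cohort drawn mostly from one ancestry group can have limited power to detect such a variant.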

Analysis of blood samples showed that FOXP4 levels were higher in non-acute COVID cases. Higher FOXP4 levels were associated with increased risk of long COVID in non-acute COVID samples but not in acute COVID samples. Under normal conditions, FOXP4 expression was found to be high in type 2 alveolar cells and granulocytes (a type of immune cell) in the lung, indicating that its presence in the lungs was not simply an effect of infection.

Most FOXP4 variants associated with long COVID were located within active enhancers or transcription factor binding sites, suggesting a potential role in regulating gene expression. Furthermore, one of the risk alleles for long COVID was associated with lung cancer in Biobank Japan samples.

By combining data across 33 cohorts worldwide, including the Sano GOLD dataset, researchers identified genetic variants associated with long COVID and pointed to their possible involvement in lung pathology. This study underscores the power of large-scale data collaboration and genomic research in uncovering the biological underpinnings of complex conditions like long COVID.
