Connecting genes to diseases through proteins

An international research team has uncovered hundreds of connections between different human diseases through their shared origin in our genome, challenging the categorisation of diseases by organ, symptoms, or clinical speciality.

A new study published in Science generated data on thousands of proteins circulating in our blood and combined these with genetic data to produce a map showing how genetic differences that affect these proteins link seemingly diverse, as well as related, diseases.

Proteins are essential functional units of the human body that are composed of amino acids and coded for by our genes. Malfunctions of proteins cause diseases across most medical specialties and organ systems, and proteins are also the most common target of drugs that exist today.

The published findings help explain why seemingly unrelated symptoms can occur at the same time in patients, and suggest that we should reconsider how diverse diseases can be caused by the same underlying protein or mechanism. Where a protein is a drug target, this information can point to new strategies for treating a variety of conditions, as well as minimising adverse effects.

Using blood samples from over 10,000 participants in the Fenland study, the team demonstrated that natural variation in 2,500 regions of the human genome is robustly associated with differences in the abundance or function of 5,000 proteins circulating in the blood.

This approach addresses an important bottleneck in the translation of basic science to clinically actionable insights. While large-scale studies of the human genome have identified many thousands of variants in our DNA sequence that are associated with disease, the underlying mechanisms often remain poorly understood due to uncertainties in mapping those variants to genes. By linking such disease-related DNA variations to the abundance or function of an encoded protein, the team produced strong evidence for which genes are involved, and identified novel mechanisms by which proteins translate genetic risk into disease onset.

For example, multiple genome-wide association studies (GWAS) have linked a region of the human genome known as KAT8 with Alzheimer’s disease, but failed to identify which gene in this region was involved. By combining data on both proteins and genes, the team was able to identify a gene in the KAT8 region named PRSS8, which codes for the protein prostasin, as a novel candidate gene in Alzheimer’s disease. Similarly, they identified a novel risk gene for endometrial cancer (RSPO3).

The authors used these new insights to systematically test which of these protein-encoding genes affected a large range of diseases. They discovered more than 1,800 examples in which more than one disease was driven by variations in an individual gene and its protein products. What emerged was a network-like structure of human diseases, because many of the genes connected a range of seemingly diverse, as well as related, conditions in different tissues. In each such case, this provides strong evidence that the respective protein is the shared origin of the connected diseases, and points to new potential strategies for treatment.
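
To make the idea of a network-like structure of diseases more concrete, the following is a minimal Python sketch, not the study's actual method. It assumes a simple mapping from each gene to the diseases it affects and connects any two diseases that share at least one gene. The function name build_disease_network and the placeholder labels (GENE_A, disease_1, and so on) are hypothetical and do not represent data from the study.

```python
from collections import defaultdict
from itertools import combinations


def build_disease_network(gene_to_diseases):
    """Link every pair of diseases that share at least one gene.

    Returns a dict mapping (disease_a, disease_b) to the set of genes
    associated with both diseases. Pairs are stored in sorted order.
    """
    edges = defaultdict(set)
    for gene, diseases in gene_to_diseases.items():
        # A gene linked to n diseases contributes an edge for each pair.
        for a, b in combinations(sorted(set(diseases)), 2):
            edges[(a, b)].add(gene)
    return dict(edges)


# Hypothetical placeholder input, for illustration only.
toy = {
    "GENE_A": ["disease_1", "disease_2", "disease_3"],
    "GENE_B": ["disease_2", "disease_3"],
    "GENE_C": ["disease_4"],  # linked to a single disease: no edge
}

network = build_disease_network(toy)
for (a, b), genes in sorted(network.items()):
    print(a, "<->", b, "via", sorted(genes))
```

In this sketch, a gene that affects only one disease adds no connection, while a gene that affects several diseases links all of them to each other, which is how a small number of genes can tie together many conditions.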