Unraveling Novel Therapeutic Targets for Coronary Artery Disease Via Gene Signatures of Plasma Proteome: a Mendelian Randomization Study


Introduction
Coronary artery disease (CAD) imposes a significant epidemiological burden as one of the leading causes of mortality globally. Despite advances in pharmacological and procedural interventions, the prevention, diagnosis, and management of CAD remain challenging for healthcare providers. Neither darapladib (a PLA2G7 inhibitor) nor percutaneous coronary intervention (PCI; COURAGE trial) significantly reduced cardiovascular events, including myocardial infarction and stroke, in CAD patients [1,2]. The FOURIER trial demonstrated that the PCSK9 inhibitor evolocumab significantly reduced the risk of cardiovascular events, but it is indicated only in patients at high risk [3,4]. Thus, there is a pressing need for novel target-based therapies for CAD. Traditional approaches to target discovery have relied on omic-scale sequencing. While these approaches provide rigorous insights into potential avenues for treatment, they are also associated with high costs and with difficulties in sample acquisition, preparation, and processing. Here, we propose an in silico approach that combines plasma proteomics and Mendelian randomization (MR) to prioritize potential targets for the treatment of CAD. MR is an increasingly popular epidemiological tool that leverages genome-wide association study (GWAS) data, using genetic variants as instruments to infer causal relationships between an exposure and the phenotype of interest. The proteome is the central mediator of the etiology and pathophysiology of diseases. The plasma proteome is of particular interest for cardiovascular diseases because plasma proteins act within the circulatory system, which makes them ideal candidates for omic-scale analyses [5,6]. Here, we used the genetic signatures of proteins (protein quantitative trait loci, or pQTLs) as the instrumental variables to identify potential therapeutic targets for CAD through MR.

Mendelian Randomization
The main workflow of our study is outlined in Figure 1A. The TwoSampleMR package (version 0.5.9) was used to perform the MR analysis following the standard protocol, as previously described [7,8]. Briefly, we first downloaded the GWAS data from Aragam et al., the largest CAD GWAS to date (discovery cohort), and from Sun et al., the newest plasma proteome GWAS from the UK Biobank (UKB) [9,10]. Three assumptions must be satisfied for MR. The relevance assumption requires that the single-nucleotide polymorphisms (SNPs), which are the instrumental variables (IVs) used for the MR analysis, are significantly associated with the exposure. To this end, we first extracted the pQTL SNPs with a P-value smaller than $5 \times 10^{-6}$ to ensure that the SNPs are significantly associated with the protein level. We then clumped the SNPs with the parameters $R^2 < 0.001$ and kb = 10,000. This ensures that only one SNP is extracted from each defined chromosomal region, so that the selected SNPs are inherited independently of one another. Furthermore, we computed the F-statistic using the formula $F = \frac{R^2 (n - k - 1)}{k (1 - R^2)}$, where $R^2$ indicates the proportion of variance in the exposure explained by the SNP, $n$ denotes the sample size, and $k$ equals the number of IVs included ($k = 1$ for an individual SNP). $R^2$ was obtained using the formula $R^2 = \frac{\beta^2}{\beta^2 + SE^2 \cdot n}$, where $\beta$ is the effect size for the SNP and $SE$ is the standard error of $\beta$. SNPs that are not strongly correlated with the exposure (F < 10) were excluded from the study. The second assumption is the exclusion restriction assumption. This was addressed by running the built-in harmonization function of the TwoSampleMR package. Lastly, the independence assumption requires that the SNPs are free from pleiotropy. This was assessed with the pleiotropy test; a pleiotropy P-value smaller than 0.05 indicates significant pleiotropy. Odds ratios and P-values were generated with the built-in functions of the TwoSampleMR package using the inverse-variance weighted method. The analysis was replicated in van der Harst et al. (replication cohort) [11]. The Bonferroni correction threshold for the pQTL analysis was $0.05 / 2942 \approx 1.7 \times 10^{-5}$.
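
The instrument-strength filter and the inverse-variance weighted estimate described above can be illustrated with a minimal Python sketch. This is not the TwoSampleMR implementation: the function names, the dictionary keys, the fixed-effect form of the estimator, the 1.96 normal quantile for the confidence interval, and the form $R^2 = \beta^2 / (\beta^2 + SE^2 \cdot n)$ are assumptions made for illustration only.

```python
import math


def r_squared(beta, se, n):
    """Approximate share of exposure variance explained by one SNP (assumed form)."""
    return beta ** 2 / (beta ** 2 + se ** 2 * n)


def f_statistic(r2, n, k=1):
    """F = R^2 (n - k - 1) / (k (1 - R^2)); k = 1 for a single SNP."""
    return r2 * (n - k - 1) / (k * (1 - r2))


def keep_strong_instruments(snps, n, p_max=5e-6, f_min=10.0):
    """Keep SNPs with exposure P-value < p_max and F >= f_min.

    snps: list of dicts with the hypothetical keys 'beta', 'se', and 'pval'.
    """
    kept = []
    for snp in snps:
        f = f_statistic(r_squared(snp["beta"], snp["se"], n), n)
        if snp["pval"] < p_max and f >= f_min:
            kept.append(snp)
    return kept


def ivw_odds_ratio(instruments):
    """Fixed-effect inverse-variance weighted estimate as (OR, lower, upper).

    instruments: list of (beta_exposure, beta_outcome, se_outcome) tuples.
    """
    num = sum(bx * by / se ** 2 for bx, by, se in instruments)
    den = sum(bx ** 2 / se ** 2 for bx, by, se in instruments)
    beta = num / den
    se_ivw = math.sqrt(1.0 / den)
    return (
        math.exp(beta),
        math.exp(beta - 1.96 * se_ivw),
        math.exp(beta + 1.96 * se_ivw),
    )


if __name__ == "__main__":
    # Hypothetical SNP summary statistics, not taken from the study data.
    example = [
        {"beta": 0.05, "se": 0.01, "pval": 1e-7},  # strong instrument
        {"beta": 0.02, "se": 0.01, "pval": 1e-7},  # weak instrument (F < 10)
    ]
    print(len(keep_strong_instruments(example, n=50000)))
    print(ivw_odds_ratio([(0.1, 0.01, 0.005), (0.2, 0.02, 0.005)]))
```

With this form of $R^2$, the F-statistic of a single SNP is close to $(\beta / SE)^2$, so the F < 10 cutoff roughly corresponds to dropping SNPs whose effect estimate is less than about 3.2 standard errors from zero.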

Target prioritization
The tiers of targets are based on the following algorithm. Tier 1: a direct inhibitor already exists that can be repurposed or investigated in clinical trials, and the target passes the pleiotropy test in both cohorts. Tier 2: no direct inhibitor exists, but the target is potentially targetable and passes the pleiotropy test in at least one cohort. Tier 3: the target has a high likelihood of off-target effects and fails the pleiotropy test in both cohorts. Potential targets: the protein has a relevant biological function according to the cited references.
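
The tiering rules can be written down as a small decision function. The sketch below is one possible reading of the algorithm, not the code used in the study: the `Candidate` fields, the 0.05 cutoff for passing the pleiotropy test, the "Unassigned" fallback for combinations the rules do not cover, and the example proteins are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    has_direct_inhibitor: bool
    is_targetable: bool
    high_off_target_risk: bool
    pleiotropy_p_discovery: float
    pleiotropy_p_replication: float


def passes_pleiotropy(p_value, alpha=0.05):
    """A cohort passes when the pleiotropy test is not significant."""
    return p_value >= alpha


def assign_tier(c):
    """Map a candidate to a tier following the rules in the text."""
    passed = [
        passes_pleiotropy(c.pleiotropy_p_discovery),
        passes_pleiotropy(c.pleiotropy_p_replication),
    ]
    if c.has_direct_inhibitor and all(passed):
        return "Tier 1"
    if not c.has_direct_inhibitor and c.is_targetable and any(passed):
        return "Tier 2"
    if c.high_off_target_risk and not any(passed):
        return "Tier 3"
    return "Unassigned"  # combination not covered by the stated rules


if __name__ == "__main__":
    # Hypothetical candidates; the P-values are invented for illustration.
    examples = [
        Candidate("protein_A", True, True, False, 0.92, 0.40),
        Candidate("protein_B", False, True, False, 0.004, 0.30),
        Candidate("protein_C", False, False, True, 0.02, 0.01),
    ]
    for c in examples:
        print(c.name, assign_tier(c))
```

Writing the rules this way makes the edge cases explicit, for example a protein with a direct inhibitor that passes the pleiotropy test in only one cohort, which the stated tiers do not address.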

KEGG Analysis
The clusterProfiler package (version 4.3) was used with reference to the KEGG pathways to annotate the protein list in supplemental table 3. Standard clusterProfiler code was used; the vignette can be found at https://bioconductor.org/packages/release/bioc/vignettes/clusterProfiler/inst/doc/clusterProfiler.html
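
Pathway over-representation analysis of this kind is conventionally a one-sided hypergeometric test: given a background of annotated genes, it asks how likely it is to see at least the observed overlap between the protein list and a pathway by chance. The following Python sketch shows only that calculation. It is not clusterProfiler code; the function name and the example counts are hypothetical, and multiple-testing adjustment across pathways is left out.

```python
from math import comb


def over_representation_p(overlap, background, pathway_size, list_size):
    """One-sided hypergeometric P-value, P(X >= overlap).

    background:   number of annotated genes in the universe
    pathway_size: number of background genes that belong to the pathway
    list_size:    number of genes in the query list
    overlap:      number of query genes that belong to the pathway
    """
    upper = min(pathway_size, list_size)
    total = comb(background, list_size)
    tail = sum(
        comb(pathway_size, i) * comb(background - pathway_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 8 of 60 query genes fall in a 100-gene pathway
    # drawn from a background of 8000 annotated genes.
    print(over_representation_p(8, 8000, 100, 60))
```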

GSEA Analysis
GSEA was performed using the GSEA software (version 4.3.2) with normalized gene expression from CAD patients and healthy individuals as the input. The documentation for the GSEA software can be found at https://www.gsea-msigdb.org/gsea/index.jsp. The RNA-seq data were downloaded from GSE20680 [12]. PRISM (version 10.2.2) was used to plot the enrichment results.
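
For readers unfamiliar with GSEA, the core quantity is a running-sum enrichment score computed over a ranked gene list. The Python sketch below illustrates that statistic only, under simplifying assumptions: the genes are already ranked, the score weighting exponent is 1, and permutation-based significance and normalization are omitted. The function name and the toy gene labels are hypothetical, and this is not the code of the GSEA software.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: gene identifiers, ordered from most up- to most down-regulated
    scores:       ranking metric for each gene, in the same order
    gene_set:     set of gene identifiers that define the pathway
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    hit_weights = [
        abs(score) ** weight if hit else 0.0
        for score, hit in zip(scores, in_set)
    ]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(in_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for w, hit in zip(hit_weights, in_set):
        # Step up on a pathway gene (weighted by its score), down otherwise.
        running += w / total_hit if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best


if __name__ == "__main__":
    genes = ["g1", "g2", "g3", "g4"]
    metric = [2.0, 1.0, -1.0, -2.0]
    print(enrichment_score(genes, metric, {"g1", "g2"}))
    print(enrichment_score(genes, metric, {"g3", "g4"}))
```

A positive score indicates that the gene set is concentrated at the top of the ranking (here, higher expression in CAD patients under the assumed ordering), and a negative score indicates concentration at the bottom.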

Plasma-proteomic-wide Mendelian Randomization reveals possible therapeutic targets
Out of the complete panel of 3072 plasma proteins, we were able to extract sufficient instrumental variables to estimate the causal effect of 2942 proteins on CAD in the Aragam et al. GWAS [9]. Over 300 plasma risk proteins with OR > 1.05 and P < 0.05 were identified as associated with an increased risk of CAD (Figure 1B). To enhance specificity, we applied Bonferroni correction (OR > 1.05 and P < 1.7E-5), which yielded a list of 9 candidate proteins that met the statistical requirement. The final list of candidate proteins is shown in Figure 1C. The GWAS from van der Harst et al. was included as a replication cohort [11]. PLA2G7, LPA, and PCSK9 are known to be associated with CAD, substantiating the validity of our analysis.
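
The Bonferroni cutoff quoted above follows from dividing the nominal significance level by the number of proteins tested. A minimal Python sketch of the threshold and the candidate filter is given below; the function names and the example odds ratios and P-values are hypothetical and are not taken from the supplemental tables.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def is_candidate(odds_ratio, p_value, n_tests, alpha=0.05, min_or=1.05):
    """True when a protein meets both the OR and the corrected P-value cutoffs."""
    return odds_ratio > min_or and p_value < bonferroni_threshold(alpha, n_tests)


if __name__ == "__main__":
    n_proteins = 2942  # proteins with sufficient instrumental variables
    print(bonferroni_threshold(0.05, n_proteins))
    # Hypothetical results for three proteins: (odds ratio, P-value).
    for odds_ratio, p_value in [(1.10, 1e-6), (1.10, 1e-3), (1.02, 1e-8)]:
        print(odds_ratio, p_value, is_candidate(odds_ratio, p_value, n_proteins))
```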

Prioritization of therapeutic targets
We prioritized targets based on the availability of direct inhibitors, the annotated function of the proteins, and the pleiotropy test results (Figure 1D). C1S [OR = 1.07 (1.04 to 1.10), pleiotropy P-value = 0.92], C1R [OR = 1.08, pleiotropy P-value = 0.07], and ENO2 [OR = 1.09 (1.05 to 1.14), pleiotropy P-value = 0.24] were determined to be tier 1 targets. SNAP25 [OR = 1.12 (1.07 to 1.17), pleiotropy P-value = 0.004] and A1BG [OR = 1.12 (1.07 to 1.18), pleiotropy P-value = 0.92] were determined to be tier 2 targets. CA11 [OR = 1.35 (1.18 to 1.53), pleiotropy P-value = 0.02] was determined to be a tier 3 target. The complete analysis results are included in supplemental tables 1 and 2 for the Aragam et al. and van der Harst et al. cohorts, respectively. The repurposing of the C1S inhibitor sutimlimab and the targeting of ENO2 with POMHEX are of particular interest because of their availability and mechanistic relevance. C1S is a critical component of the complement system, which is involved in a myriad of biological and pathophysiological processes [13]. ENO2 is an enolase isoenzyme responsible for catalyzing the conversion of 2-phosphoglycerate to phosphoenolpyruvate in the glycolysis pathway [14]. For tier 2 targets, a clinical trial has shown that injection of botulinum toxin type A (an inhibitor of SNAP25) into the pericardial fat pad suppressed atrial fibrillation following coronary artery bypass grafting (CABG) in longitudinal follow-up [15]. Interestingly, a polymorphism of A1BG, a glycoprotein with limited characterization, has been reported to be involved in CAD [16]. CA11 is a catalytically inactive carbonic anhydrase that is difficult to target selectively. In addition, it failed the pleiotropy test (P < 0.05) in both the primary and replication cohorts and is thus not prioritized as a therapeutic target.

Risk proteins are involved in immune-related signature and lipid metabolism
To elucidate the functional significance of the risk proteins in the pathophysiology of CAD, we selected a list of proteins by overlapping the risk proteins identified from the two CAD GWAS (supplemental table 3). Subsequently, we performed KEGG pathway enrichment analysis and showed that these proteins are involved in chemokine signaling, complement activation, chronic myeloid leukemia, and lipid metabolism pathways, among others (Figure 1E, 1F). These pathway analyses provide context for the biological significance of the targets identified.

GSEA analysis shows the enrichment of complement and glycolysis pathways in CAD patients
We further explored RNA-seq data from the whole blood of CAD patients versus healthy individuals (HI) from GSE20680. Using gene set enrichment analysis (GSEA), we revealed hallmark pathways similar to the KEGG results, such as complement, cholesterol homeostasis, JAK-STAT signaling (involved in leukemia), and interferon response. These results corroborated our findings on the biological processes of the risk proteins. Of note, the enrichment of the glycolysis pathway provides additional rationale for targeting ENO2. Based on these findings, we delved into the risk protein list and further identified potential targets for CAD (Figure 1D). LCAT, SEMA3G, LAG3, and TGFB1 were considered to be potential targets based on their biological function. LCAT is involved in lipid metabolism, obesity, and cardiovascular events [17]. SEMA3G is a purported adipokine whose knockout (KO) suppressed high-fat diet (HFD)-induced obesity in mice [18]. LAG3 is involved in T-cell regulation and is the target of the recently approved inhibitor relatlimab [19]. Dysregulated TGF-β signaling is proposed to be involved in vascular diseases, which is corroborated by our RNA-seq and MR analyses [20].

Volume 9; Issue 01 Cardiolog Res Cardiovasc Med, an open access journal ISSN: 2575-7083

Discussion
Overall, our study leveraged the pQTLs of the plasma proteome and unraveled potential avenues for the treatment of CAD. We identified C1R and C1S, two critical components of the complement system, as potential therapeutic targets for the treatment of CAD. Previous studies have highlighted that the complement system plays an important role in atherosclerotic lesions and that suppressing the overactivation of the complement system can have therapeutic value [21,22]. Our results are consistent with these findings while providing evidence from a genetic perspective. Additionally, we found that ENO2, an isozyme of enolase (ENO), a key component of glycolysis, can be a promising target for CAD. Prior studies have reported that the genetic silencing of ENO can suppress apoptosis and mitigate mitochondrial dysfunction by inhibiting the release of mitochondrial cytochrome C, an important mediator of the apoptotic response, into the cytoplasm [23]. Our finding that ENO2 is associated with the risk of CAD is consistent with this observation. Furthermore, our RNA-seq analysis identified the complement activation and glycolysis pathways as enriched in CAD patients relative to healthy controls, corroborating our MR findings. In short, our study has highlighted two potential avenues for the treatment of CAD: suppression of 1) the complement system and 2) glycolysis.
Our study has a few strengths. First, our study is statistically robust, as we stratified the candidate proteins that survived Bonferroni correction into three tiers, ensuring that each prioritized target has genomic significance. Second, most studies applying MR do not validate their MR findings with other modalities, whereas we elucidated the biological function of the risk proteins and corroborated our findings with RNA-seq data. However, our study is not free from limitations. First, PLA2G7 and PCSK9 were both identified. However, the former is not an efficacious target, while the latter is indicated only for high-risk CAD patients, highlighting the fallibility of targeted therapy for CAD. Second, while we have endeavored to minimize sample overlap, some of the reference samples in the outcome GWAS overlap with the UK Biobank cohort, lowering the confidence of our conclusions.

Conclusion
In conclusion, although we have provided new insights into the treatment of CAD and demonstrated the application of pQTLs as valuable instruments in revealing drug targets for diseases, we must proceed with caution as in silico analysis cannot fully capture the complexity of disease biology in humans, especially for multifactorial diseases like CAD.

Figure 1 A (Plasma Proteome-based Mendelian Randomization Analysis of CAD Therapeutic Targets): Overview of the study design (created with Biorender.com).

Figure 1 B: Volcano plot showing the MR analysis results. Y-axis: -log(P-value); X-axis: ln(odds ratio). The purple line indicates the P = 0.05 threshold. The pink line indicates the P-value threshold after Bonferroni correction.

Figure 1 C: Forest plot showing the odds ratio, confidence interval, and pleiotropy test results of selected targets.

Figure 1 D: List of targets in the descending order of priority.

Figure 1 E: KEGG enrichment results from the overlapped proteins identified from MR.

Figure 1 F: GSEA enrichment results from RNA-seq data of CAD patients versus healthy controls.