Laboratory Based Non-Invasive Markers are Suboptimal in Detecting Advanced Fibrosis in Patients with Non-Alcoholic Steatohepatitis

Background and Aim: Hepatic fibrosis is a major determinant of clinical outcomes in patients with non-alcoholic steatohepatitis (NASH). We aimed to investigate the diagnostic performance of non-invasive tests in detecting advanced fibrosis (F3-4) in a large NASH cohort from central Ohio, the United States. Methods: Data from all patients with biopsy-proven NASH between 2014 and 2017 were collected. The diagnostic performance of the aspartate aminotransferase (AST) to platelets ratio index (APRI), fibrosis-4 index (FIB-4), and NAFLD fibrosis score (NFS) was studied. Results: A total of 284 NASH patients were included, 27.82% of whom had F3-4. The cohort was predominantly female (60.92%) and White (88.38%) with a mean age of 50±13 years. The most common comorbidities were obesity (77.11%) and type 2 diabetes (49.65%). There was a significant difference in NFS between fibrosis stages F0-2 and F3-4 (-0.43±1.99 and 0.30±2.28, p=0.01). The sensitivities of APRI <1, FIB-4 <1.3, and NFS <-1.455 were 28%, 64%, and 73.33%, respectively. The specificities of APRI ≥2, FIB-4 ≥3.25, and NFS ≥0.675 were 93.1%, 84.73%, and 74.26%, respectively. The negative predictive values of the three models ranged between 72.59% and 77.72%, and the positive predictive values were consistently low (<40.38%). The areas under the receiver operating characteristic curve (AUROC) of APRI, FIB-4, and NFS were 0.52, 0.55, and 0.59, respectively. The diagnostic performance of these models appeared to be better in older (>35 years) and male patients. Conclusion: Overall, APRI, FIB-4, and NFS were suboptimal in detecting advanced fibrosis in our NASH cohort. Newer non-invasive tests with robust diagnostic accuracy are needed.

However, many patients fall into the indeterminate zone for fibrosis assessment with the scoring models. Factors such as age, liver enzyme levels, and the prevalence of obesity, diabetes, and fibrosis may influence the diagnostic accuracy of these scoring models [10,11]. In addition, regional differences and practice patterns (e.g. the decision to perform liver biopsy) may also affect sample selection in NASH populations. Imaging-based tests such as elastography appear to be promising but are not readily available in primary care settings or small hospitals. Therefore, the majority of facilities use laboratory-based non-invasive tests (NITs) despite their limitations. The literature on the utility of NITs is growing worldwide. Most reported studies are based on relatively small samples, and larger studies on the utility of NITs for staging fibrosis in NASH are needed. It is also clinically relevant to test these NITs in region-specific NASH populations. Therefore, we aimed to examine the diagnostic performance of three commonly used fibrosis scoring models, FIB-4, NFS, and APRI, for advanced fibrosis in our NASH population from central Ohio, the United States.

Methods
This study was conducted at the Ohio State University, Wexner Medical Center (OSUWMC), Columbus, Ohio, where patients from central Ohio are referred. We reviewed the records of all patients with biopsy-proven steatohepatitis from 2014 to 2017. Patients who had a history of excessive alcohol use or other competing liver etiologies were excluded. Excessive alcohol use was defined among men as consuming ≥21 standard drinks a week or ≥30 grams per day, and among women as consuming ≥14 drinks a week or ≥20 grams per day. Other liver etiologies including hepatitis B, hepatitis C, autoimmune hepatitis, hemochromatosis, alpha 1 antitrypsin deficiency, Wilson's disease, and history of liver transplant were excluded. We also excluded patients who had fatty liver disease due to chronic use of drugs (corticosteroids, methotrexate, tamoxifen) or total parenteral nutrition. We collected clinical data including age, gender, race, body mass index, and comorbidities (obesity, type 2 diabetes, dyslipidemia, hypertension, hypothyroidism, obstructive sleep apnea, ischemic heart disease). We also collected information on history of bariatric surgery, history of alcohol use and smoking, and family history of liver and metabolic disorders.
Laboratory data including aspartate aminotransferase (AST), alanine aminotransferase (ALT), total and indirect bilirubin, alkaline phosphatase, albumin, hemoglobin, white blood cell count, platelet count, creatinine, and international normalized ratio (INR) were collected from the measurement closest to the liver biopsy visit, within a 6-month window. Patients with more than 5% missing data were not included in the analysis. These data included triglyceride, low-density lipoprotein, high-density lipoprotein, glucose, ferritin, iron saturation, anti-smooth muscle antibody, and anti-mitochondrial antibody.
The body mass index (BMI) was calculated using the formula: weight (kg) / height (m)². The APRI was calculated as [AST (U/L) / upper limit of normal AST (U/L)] / platelet count (× 10⁹/L) × 100 [7]. The FIB-4 score was calculated according to the following formula: [age (years) × AST (U/L)] / [platelet count (× 10⁹/L) × √ALT (U/L)] [4,5]. The NFS was calculated according to the following formula: -1.675 + 0.037 × age (years) + 0.094 × BMI (kg/m²) + 1.13 × impaired fasting glycaemia or diabetes (yes=1, no=0) + 0.09 × AST/ALT ratio - 0.013 × platelet count (× 10⁹/L) - 0.06 × albumin (g/dL) [6]. We used literature-reported low and high cut-offs of 1 and 2 for APRI, 1.3 and 3.25 for FIB-4, and -1.455 and 0.675 for NFS, respectively [5-7].
Specimens of liver pathology were fixed in formalin solution and stained with hematoxylin & eosin. Reticulin stain was used to assess stage of fibrosis. The mean length of the liver biopsy samples was 20 mm, with at least 11 portal tracts. All of the biopsies were reviewed by two experienced liver pathologists at the OSUWMC.
Histological features of non-alcoholic steatohepatitis (NASH) and fibrosis stage were scored according to the NAFLD Clinical Research Network criteria [12]. The Institutional Review Board of the OSUWMC approved the study.
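The three scoring formulas and the paired cut-offs described in this section can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's analysis code: the function names, the `risk_band` helper, and the example patient are hypothetical. The NFS coefficients default to the values printed in the text; because the widely cited published NFS formula uses 0.99 for the AST/ALT ratio and 0.66 for albumin, these two coefficients are exposed as parameters so that either set can be checked.

```python
import math

# Low (rule-out) and high (rule-in) cut-offs quoted in the Methods text.
CUTOFFS = {"APRI": (1.0, 2.0), "FIB-4": (1.3, 3.25), "NFS": (-1.455, 0.675)}


def apri(ast, ast_uln, platelets):
    """APRI = (AST / upper limit of normal AST) / platelets (10^9/L) x 100."""
    return (ast / ast_uln) / platelets * 100.0


def fib4(age, ast, alt, platelets):
    """FIB-4 = (age x AST) / (platelets (10^9/L) x sqrt(ALT))."""
    return (age * ast) / (platelets * math.sqrt(alt))


def nfs(age, bmi, diabetes_or_ifg, ast, alt, platelets, albumin,
        ast_alt_coef=0.09, albumin_coef=0.06):
    """NAFLD fibrosis score; default coefficients are those printed in the text.

    Assumption: pass ast_alt_coef=0.99 and albumin_coef=0.66 to use the
    coefficients of the widely cited published formula instead.
    """
    return (-1.675
            + 0.037 * age
            + 0.094 * bmi
            + 1.13 * (1 if diabetes_or_ifg else 0)
            + ast_alt_coef * (ast / alt)
            - 0.013 * platelets
            - albumin_coef * albumin)


def risk_band(score_name, value):
    """Classify a score as 'low', 'indeterminate', or 'high' (hypothetical helper)."""
    low, high = CUTOFFS[score_name]
    if value < low:
        return "low"
    if value >= high:
        return "high"
    return "indeterminate"


# Made-up patient for illustration; not taken from the study cohort.
patient = dict(age=50, bmi=32.0, diabetes_or_ifg=True,
               ast=40.0, alt=50.0, platelets=200.0, albumin=4.0)

scores = {
    "APRI": apri(patient["ast"], 40.0, patient["platelets"]),  # ULN of 40 U/L assumed
    "FIB-4": fib4(patient["age"], patient["ast"], patient["alt"], patient["platelets"]),
    "NFS": nfs(**patient),
}
for name, value in scores.items():
    print(name, round(value, 3), risk_band(name, value))
```

For this made-up patient the printed NFS coefficients give a score of about 1.5, whereas the published coefficients (0.99 and 0.66) give about -0.1, which moves the patient from the high band to the indeterminate band. The coefficients actually used are therefore worth verifying before reusing the formula.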

Statistical Analysis
All statistical analyses were conducted using SAS 9.4 (SAS Institute, Cary, NC). As the identification of patients with advanced fibrosis is of clinical importance, the patients were divided into two groups: patients with no/mild fibrosis (F0-2) and patients with advanced fibrosis (F3-4). Categorical variables were expressed as weighted frequency (percentage), and differences between groups were analyzed by χ² tests or Fisher exact tests in the case of small cell sizes. Continuous variables were expressed as mean ± SD, and differences were analyzed with Student's t tests or Wilcoxon rank-sum tests. Statistical significance was defined as a p-value < 0.05. The AUROC with 95% confidence interval (CI) was calculated for each scoring model treated as a continuous variable.
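As a rough illustration of the last step, the AUROC of a continuous score equals the probability that a randomly chosen F3-4 patient has a higher score than a randomly chosen F0-2 patient, with ties counted as one half. The sketch below computes it that way in Python. The text does not state how the 95% CI was derived in SAS, so the Hanley-McNeil approximation used here is an assumption chosen for simplicity, and the function names and toy scores are hypothetical.

```python
import math


def auroc(scores_advanced, scores_mild):
    """AUROC as the fraction of (advanced, mild) pairs ranked correctly.

    Ties between an advanced-fibrosis score and a mild-fibrosis score count 0.5.
    """
    wins = 0.0
    for a in scores_advanced:
        for m in scores_mild:
            if a > m:
                wins += 1.0
            elif a == m:
                wins += 0.5
    return wins / (len(scores_advanced) * len(scores_mild))


def hanley_mcneil_ci(auc, n_advanced, n_mild, z=1.96):
    """Approximate 95% CI for an AUROC (Hanley-McNeil standard error).

    Assumption: this is one common approximation, not necessarily the
    method used by the authors' SAS procedure.
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (auc * (1.0 - auc)
                + (n_advanced - 1) * (q1 - auc ** 2)
                + (n_mild - 1) * (q2 - auc ** 2)) / (n_advanced * n_mild)
    se = math.sqrt(variance)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Toy NFS-like values, invented for illustration only.
advanced = [0.3, 1.2, 2.5]
mild = [-0.5, 0.3, 1.0, 1.4]

auc = auroc(advanced, mild)
low, high = hanley_mcneil_ci(auc, len(advanced), len(mild))
print(f"AUROC = {auc:.3f}, approximate 95% CI = ({low:.3f}, {high:.3f})")
```

An AUROC of 0.5 corresponds to a score that ranks patients no better than chance, which is the reference point for reading values such as 0.52 to 0.59.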

Results
A total of 462 patients with liver biopsy-proven steatohepatitis were identified at OSUWMC during the study period. After chart review, 284 patients met the inclusion criteria for NASH for analysis. Baseline characteristics of these patients are shown in as compared to patients in the F3-4 group. The mean NFS for patients with F0-2 and F3-4 was -0.43 ± 1.99 and 0.30 ± 2.28, respectively (p=0.01). No significant differences in APRI and FIB-4 scores were found between the two groups.
Note: NASH, non-alcoholic steatohepatitis; PPV, positive predictive value; NPV, negative predictive value; CI, confidence interval; APRI, aspartate aminotransferase to platelets ratio index; FIB-4, fibrosis-4 index; NFS, non-alcoholic fatty liver disease fibrosis score.

Subgroup Analysis
We performed various subgroup analyses to identify a group of patients who may benefit more from NITs. To examine the impact of  (Table 3). Sensitivity, specificity, PPV, and NPV of APRI, FIB-4, and NFS are shown in Table 3. Overall, at the high cutoff values, all three models had good specificity (>90%) for identifying F3-4 fibrosis in NASH patients younger than 35 years but had poor sensitivity (<50%). As age advanced, test sensitivity improved at the cost of lower specificity.
Note: NASH, non-alcoholic steatohepatitis; ALT, alanine aminotransferase; PPV, positive predictive value; NPV, negative predictive value; CI, confidence interval; APRI, aspartate aminotransferase to platelets ratio index; FIB-4, fibrosis-4 index; NFS, non-alcoholic fatty liver disease fibrosis score. Elevated ALT is defined as women ≤30 U/L, men ≤45 U/L.
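The trade-off between sensitivity and specificity at the low and high cutoffs can be made concrete with a small Python sketch that builds the 2×2 table for one cutoff. It assumes that a score at or above the cutoff counts as a positive test for F3-4, which the text does not state explicitly; the function name and the toy data are hypothetical and are not taken from the cohort.

```python
def diagnostic_metrics(scores, has_f3_4, cutoff):
    """Sensitivity, specificity, PPV, and NPV when score >= cutoff is 'positive'."""
    tp = fp = tn = fn = 0
    for score, advanced in zip(scores, has_f3_4):
        positive = score >= cutoff
        if positive and advanced:
            tp += 1
        elif positive and not advanced:
            fp += 1
        elif not positive and advanced:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined metrics (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy APRI-like scores and biopsy results, invented for illustration only.
toy_scores = [0.5, 1.5, 2.5, 0.8, 3.0, 1.1]
toy_f3_4 = [False, False, True, True, True, False]

for cutoff in (1.0, 2.0):  # low and high APRI cut-offs from the Methods
    print(cutoff, diagnostic_metrics(toy_scores, toy_f3_4, cutoff))
```

In this toy data, raising the cutoff from 1.0 to 2.0 leaves sensitivity unchanged and raises specificity, because the two false positives at the low cutoff become true negatives. In general, a higher cutoff increases specificity and can only lower or preserve sensitivity. PPV and NPV also depend on how common F3-4 is in the group being tested.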

Discussion
With the enormous global prevalence of NAFLD, it is imperative to develop non-invasive diagnostic tools to identify high-risk populations. Multiple studies have addressed the role of laboratory-based scoring models for assessment of fibrosis stage, especially advanced fibrosis, in patients with NASH [5,6,13-15].
The majority of these studies are small, with sample sizes of fewer than 200 [9]. The diagnostic performance of APRI, FIB-4, and NFS in our NASH cohort from central Ohio is consistent with but lower than that of other large studies [5,6,9,13]. The PPVs for detecting advanced fibrosis are consistently poor in our cohort. The impact of gender on the performance of NITs has not been reported previously. The underlying reason remains unclear and could be related to gender differences in NASH development and progression. A recent meta-analysis showed that women have a lower risk of non-alcoholic fatty liver disease but a higher risk of advanced fibrosis than men, especially after age 50 years [16].
In addition, differences may exist between genders regarding laboratory values and NASH-related comorbidities such as diabetes and obesity [17,18] Ohio. Therefore, the present findings may not be generalizable to other NASH populations. Second, this is a retrospective study and all data were collected from medical records.
Given the significant difference in the diagnostic performance of NITs between our study and other published studies, we made every effort to ensure accurate data collection. All patient records were reviewed by two study authors separately. This reduced our sample size by 57 cases without any major changes in the findings. Third, this is a cross-sectional study with laboratory test results collected closest to the time of liver biopsy within a six-month window. The laboratory test results that are used to calculate the fibrosis scores may fluctuate over time.
Longitudinal studies assessing the value of these scoring models are needed to determine their utility in clinical practice. Despite these limitations, our study is in line with other studies demonstrating suboptimal performance of these laboratory-based NITs, which are probably more useful for ruling out than for ruling in advanced fibrosis. Combined or sequential use with other NITs, particularly elastography, has been suggested to improve the diagnostic value of NITs for advanced fibrosis [14,20,21]. In summary, the diagnostic accuracy of APRI, FIB-4, and NFS is suboptimal for predicting advanced hepatic fibrosis in our NASH patient cohort from central Ohio, the United States. NFS performs relatively better than FIB-4 or APRI. Age and gender appear to affect performance, in addition to regional differences between NASH populations. Clinicians should be aware of the limitations of current NITs and apply them to clinical practice appropriately. There is a need for further studies to develop strong NITs to detect advanced fibrosis in patients with

Disclosure Statement
All authors declare no conflict of interest.