Variations in Abnormal Nipple Discharge Management in Women- a Systematic Review and Meta-analysis
Alison Leong, Alison Johnston, Michael Sugrue*
Department of Breast Surgery, Breast Centre North West, Donegal Clinical Research Academy, Letterkenny University Hospital, Donegal, Ireland
*Corresponding author: Michael Sugrue, Department of Breast Surgery, Breast Centre North West, Donegal Clinical Research Academy, Letterkenny University Hospital, Donegal, Ireland. Tel: +353749188823; Fax: +353749188816; Email: michael.sugrue@hse.ie
Received Date: 13 July, 2018; Accepted Date: 19 July, 2018; Published Date: 26 July, 2018
Citation: Leong A, Johnston A, Sugrue M (2018) Variations in Abnormal Nipple Discharge Management in Women- a Systematic Review and Meta-analysis. J Surg 2018: 1154. DOI: 10.29011/2575-9760.001154
1. Abstract
Nipple discharge accounts for 5% of referrals to breast units; breast cancer in image negative nipple discharge patients varies from 0 to 21%. This systematic review and meta-analysis determined variability in breast cancer rates in nipple discharge patients, diagnostic accuracy of modalities and surgery rates. An ethically approved meta-analysis was conducted using databases PubMed, EMBASE, and Cochrane Library from January 2000 to July 2015. For the breast cancer rates' review, studies were excluded if no clinical follow-up data was available. For the diagnostic accuracy meta-analysis, studies were excluded if there was no reference standard, or the number of true and false positives and negatives were not known. Pooled sensitivities were determined using Mantel-Haenszel method. For the surgery rates' review, only studies with consecutive nipple discharge patients were included. Average risk of having a breast cancer is 10.2% in nipple discharge patients. Most studies reported an age threshold of 50 above which breast cancer risk greatly increases. Pooled sensitivities of ultrasound, mammogram, mammogram and ultrasound, breast MRI, conventional galactography, smear cytology, ductal lavage cytology and ductoscopy were 0.64, 0.34, 0.65, 0.81, 0.75, 0.37, 0.49 and 0.82 respectively. Average surgery rate was 43.4%. Malignancy rate of 10.2% indicates the need to continue surgery, especially for patients aged over 50. Patients below 50, in the absence of risk factors such as family history, can be managed conservatively with close follow up.
2. Keywords: Breast Cancer; Breast Diseases; Management; Nipple Discharge; Pathologic; Women
1. Introduction
Nipple discharge accounts for about 3-5% of referrals to a breast unit, and is the third most common presenting symptom after a mass or breast pain [1-3]. The clinical challenge is differentiating physiological from pathological nipple discharge. The latter is usually spontaneous, persistent, unilateral, uniductal and may be bloodstained but can be clear, pink, serous or serosanguinous [4]. There are various guidelines and algorithms reported for nipple discharge management [5-22]. While an abnormal mammogram and subareolar ultrasound allow fine needle aspiration cytology, core biopsy, or excision to determine the pathology, a particular challenge occurs in patients with nipple discharge who have a normal physical examination and imaging. The risk of breast cancer in image negative nipple discharge patients varies between 0 and 21%, depending on the patient cohort [23-27]. In such cases, while some algorithms propose observation, microdochectomy, or major duct excision, others propose the use of galactography. A significant number of countries have used intraductal approaches in nipple discharge evaluation, such as Japan, China, the USA and Turkey, which have used ductoscopy and ductal lavage cytology, and Germany, Taiwan, Bulgaria, and the USA, which have used galactography. For countries where galactography use for suspicious nipple discharge is routine, the European Society of Breast Cancer Specialists (EUSOMA) recommends the use of breast MRI when galactography fails for technical reasons [28]. Studies have examined the need for a duct excision as compared to conservative management involving a clinical follow-up [23,29]. This systematic review and meta-analysis aims to provide an overview of the variability in breast cancer rates of nipple discharge patients across different age groups, the diagnostic accuracy of modalities for nipple discharge management, and the differences in surgery rates of pathological nipple discharge patients.
2. Methods
2.1. Review Questions
The overarching questions for this review were:
1. The risk of breast cancer in women with pathological nipple discharge in different age groups.
2. The diagnostic accuracy of tests used in nipple discharge evaluation.
3. The rates of nipple discharge patients undergoing surgery.
2.2. Search Strategy
This systematic review was conducted in accordance with the PRISMA statement for systematic reviews and meta-analysis. An electronic search was conducted using PubMed, EMBASE and the Cochrane Library from January 2000 to July 2015 and the results were limited to those in English. The search strategy combined the Medical Subject Heading (MeSH), Emtree terms and free text words. The search terms used were: nipple discharge AND (breast diseases OR breast cancer OR intraductal OR lesion OR pathologic).
2.3. Inclusion and Exclusion Criteria
For the first review question, studies were included if they reported the number of breast cancer cases, or the percentage of women with breast cancer stratified by age, among women with a pathological nipple discharge. To avoid bias, studies were excluded if there was no clinical follow-up for those who did not undergo a biopsy or surgery. For the second review question, all studies that reported the diagnostic accuracy of tests for evaluation of women with a pathological nipple discharge were included. Studies were excluded if i) there was no reference standard for the index tests via pathology (surgical excision or tissue biopsy) or clinical follow-up, ii) a four-field contingency table for sensitivities and specificities could not be constructed because the numbers of true and false positives and negatives were not clear or could not be calculated, or the criteria for classifying findings as positive or negative were not stated in the case of cytology, and iii) the population involved only those who already had breast cancer, as they were not reflective of the general population and positive predictive values are known to increase with a higher disease prevalence. Studies were also excluded if they were non-English publications, or were conference abstracts, letters, reviews, case series or case reports, as these usually present limited data for analysis (Figure 1).
For the third review question, studies were included only if they were prospective studies involving the evaluation of women with abnormal nipple discharge or were retrospective studies involving the selection of consecutive patients from a prospective database. Studies were excluded if it was not clear how many underwent surgery, or the patients were pre-selected on the basis of having undergone certain diagnostic tests.
2.4. Methodological Quality Assessment
For the diagnostic test accuracy review, included studies were assessed using the QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies) tool, which grades the quality of primary diagnostic studies across four domains: patient selection, index test, reference standard, and flow and timing [30]. The risk of bias in each domain is assessed as “low”, “high”, or “unclear” via signalling questions. In addition, the generalizability (applicability) of studies is assessed for the first three domains. These data are available in the supplementary material.
2.5. Data Extraction
For studies that met the inclusion criteria for breast cancer rates in age groups, numbers with breast cancer determined from surgery, biopsy, and clinical follow-up together with numbers in cohort were extracted. Numbers or proportions of those with breast cancer stratified for age were also extracted. For the diagnostic test accuracy review, the following data were extracted from each study: first author, publication year, country, study design, setting, sample size, mean age and age range, type of index test, criteria for a positive test, contrast type and dose if applicable, and number of True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN). TP, FP, FN, and TN were calculated according to the sample size of those who had breast pathology and those without. The following formulae were used: TP = number of patients with breast pathology × sensitivity; FN = number of patients with breast pathology × (1 - sensitivity); TN = number of patients without breast pathology × specificity; FP = number of patients without breast pathology × (1 - specificity). For studies that met the inclusion criteria for surgery rates, numbers undergoing surgery and the numbers in cohort were extracted. Numbers with the various etiologies of a pathological nipple discharge were also extracted.
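The back-calculation of a four-field table from a reported sensitivity and specificity can be sketched as follows. This is a minimal illustration rather than the authors' extraction procedure: the function name, the rounding to whole patients, and the example study (20 patients with breast pathology, 80 without, sensitivity 0.75, specificity 0.90) are all hypothetical.

```python
def reconstruct_two_by_two(n_with_pathology, n_without_pathology,
                           sensitivity, specificity):
    """Rebuild TP/FP/FN/TN counts from reported accuracy figures.

    Sensitivity applies to patients WITH breast pathology,
    specificity to patients WITHOUT it. Counts are rounded to
    whole patients (an assumption of this sketch).
    """
    tp = round(n_with_pathology * sensitivity)
    fn = n_with_pathology - tp          # remaining diseased patients
    tn = round(n_without_pathology * specificity)
    fp = n_without_pathology - tn       # remaining non-diseased patients
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical study, for illustration only
example = reconstruct_two_by_two(20, 80, 0.75, 0.90)
print(example)
```

Deriving FN and FP by subtraction keeps the row totals equal to the reported group sizes even when rounding is needed.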
2.6. Data Analysis
All data were entered into an Excel spreadsheet for analysis. For the diagnostic test accuracy review, data were analysed using Meta-DiSc 1.4. For each study, the sensitivities and the 95% confidence intervals were calculated. The primary objective was to determine the pooled sensitivities of diagnostic tests for nipple discharge evaluation using the Mantel-Haenszel method. The Cochran Q statistic and the I² statistic were used to assess heterogeneity. For the reviews on breast cancer rates in different age groups and surgery rates of abnormal nipple discharge patients, proportions were calculated, and data summarised in tables. A funnel plot was used to show the surgery rates.
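As a rough illustration of pooling and heterogeneity assessment, the sketch below pools sensitivity as total true positives over total diseased patients and computes a chi-square homogeneity statistic Q with the derived I². The exact computations performed by Meta-DiSc are not described in this article, so these formulas are an assumption. The input is the ultrasound (TP, FN) pairs from Table 6; with these counts the simple pooled estimate comes to about 0.64, the value reported for ultrasound in the abstract.

```python
def pooled_sensitivity(studies):
    """Pool sensitivity as sum(TP) / sum(TP + FN) over (TP, FN) pairs."""
    tp = sum(tp_i for tp_i, _ in studies)
    fn = sum(fn_i for _, fn_i in studies)
    return tp / (tp + fn)


def heterogeneity(studies):
    """Chi-square homogeneity statistic Q, its degrees of freedom, and I² (%).

    Q compares each study's sensitivity with the pooled value;
    I² = max(0, (Q - df) / Q) * 100.
    """
    p = pooled_sensitivity(studies)
    q = sum((tp + fn) * (tp / (tp + fn) - p) ** 2
            for tp, fn in studies) / (p * (1 - p))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# (TP, FN) per study for ultrasound, taken from Table 6
ultrasound = [(8, 0), (10, 8), (5, 1), (39, 19), (7, 28),
              (6, 4), (16, 28), (12, 7), (102, 21)]

q, df, i2 = heterogeneity(ultrasound)
print(f"pooled sensitivity = {pooled_sensitivity(ultrasound):.2f}")
print(f"Q = {q:.1f} on {df} df, I2 = {i2:.0f}%")
```

A Q value far above its degrees of freedom, and hence a high I², indicates that the study-level sensitivities differ by more than sampling error alone would explain.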
3. Results
3.1. Review question 1: The risk of breast cancer in women with pathological nipple discharge in different age groups
Using the aforementioned search terms and inclusion criteria, a total of 15 studies were identified: 10 reported the number of breast cancer cases detected from observation or clinical follow-up in addition to biopsy and surgery, and 5 grouped breast cancer risk into those under 50 years or 50 years and over.
The average risk of having breast cancer is 10.2% in patients with features of a pathological nipple discharge, i.e. unilateral, spontaneous, bloodstained or serous discharge (Table 1).
Most studies reported an age threshold of 50 years, at and above which the risk of breast cancer with a pathological nipple discharge is greatly increased (Table 2).
According to Morrogh et al. (2007), the incidence of cancer in nipple discharge patients with a negative standard evaluation was 7% in patients younger than 40 years old, 9% between 40 and 60 years old, and 14% over the age of 60 [34]. In a heterogeneous series of 116 patients with pathological discharge, 4 of the 9 patients identified with cancer were premenopausal [38]. Sabel et al. reported that four of seven cancer cases were in women aged 40 years or younger [23]. Yoon et al. (2015) reported that 23.5% of cancer cases were in those below 40 [36]. In contrast, Lau et al. (2005) reported that 10 of 11 patients with cancer were postmenopausal and recommended that all postmenopausal women with pathological nipple discharge undergo excision [39]. Moreover, Cabioglu et al. have shown that age 40 years and younger is a statistically significant predictor of clinically benign disease [18].
3.2. Review question 2: The diagnostic accuracy of tests used in nipple discharge evaluation
A total of 34 studies were included, and the diagnostic test type and number of participants are summarised in Table 3. Table 4 lists the quality assessment of the 34 eligible studies. In terms of patient selection, in 1/34 studies it was not clear whether a consecutive or random sample of patients was enrolled.
Variability was minimised as we only included consecutive patients presenting with nipple discharge and excluded studies involving only breast cancer patients, those with proliferations, and those known to be high risk undergoing surgery [60-64]. In terms of the risk of bias in the index test, in 4/34 studies it was not clear what the criteria for a positive test were, as they were not pre-specified. In terms of reference standard, 1/34 studies used duct excision via ductoscopy rather than the usual biopsy or surgery. In terms of flow and timing, in 20/34 studies it was not clear what the interval between the index test and the reference standard was. Of the 34 studies included in the meta-analysis, 10 (29.4%) were prospective, and 10 countries were represented. The overall number of patients was 6997, with a mean age of 48.7 ± 4.1 years (Table 5).
Of the 9 studies reporting the sensitivity and specificity of ultrasound, the mean sensitivity was 0.63 (range 0.2-1). Specificity is shown in Table 6.
Twelve studies assessed the sensitivity and specificity of mammography, with a mean sensitivity of 0.3 and specificity of 0.7 (Table 7).
The combination of ultrasound and mammogram, reported in 4 studies, had a mean sensitivity and specificity of 0.7 and 0.7 respectively (Table 8).
Ten studies assessed the sensitivity and specificity of breast MRI, with a mean sensitivity of 0.7 and specificity of 0.7 (Table 9).
The accuracy of galactography was reported in 12 studies, with a mean sensitivity of 0.7 and specificity of 0.6 (Table 10).
The accuracy of smear cytology was reported in 11 studies, with a mean sensitivity of 0.4 and specificity of 0.8 (Table 11).
Six studies assessed ductal lavage cytology, with a mean sensitivity of 0.5 and specificity of 0.9 (Table 12).
For ductoscopy, while some studies focused on overall lesion detection, others classified lesions into malignant and non-malignant. Across all 8 studies, ductoscopy had a mean sensitivity of 0.8 and specificity of 0.6 (Table 13). Overall, variability exists in the criteria for a positive test.
Mammogram followed by nipple discharge smear cytology had the lowest pooled sensitivities. Mammogram combined with ultrasound resulted in a higher pooled sensitivity. The tests with the highest pooled sensitivities were breast MRI followed by galactography. Ductoscopy had a pooled sensitivity of 0.82 in lesion detection.
3.3. Review question 3: The rates of nipple discharge patients undergoing surgery
A total of 13 studies met the inclusion criteria. The average surgery rate among women with a pathological nipple discharge was 43.4%, with the highest being 83.0% (Cabioglu 2002) and the lowest being 24.0% (Yoon 2015) [18,36]. Figure 3 shows the surgery rates.
Etiology data was extracted from 3 studies which we were able to group into complementary categories (Table 14). Papillomas (48.1%) followed by ductal ectasia (14.9%) were the main causes of a pathological nipple discharge in women with a tissue diagnosis.
4. Discussion
Pathological nipple discharge characterised by unilateral spontaneous bloody or serous exudate from a single duct can be a cause for concern as it can be the underlying sign of breast cancer. Surgery is used to identify those with cancer and pre-malignant changes as well as to manage symptoms [5]. Among those who undergo duct excision for nipple discharge, few cases reveal an underlying carcinoma [13]. This raises the question of whether a duct excision is necessary for all cases of suspicious nipple discharge. While studies have reported rates of breast cancer in nipple discharge patients ranging from 5.8% to 20.2%, it is important to note that these only include patients undergoing surgery and do not account for others [5,18]. Our calculations, taking into account patients who underwent surgery, biopsy, or observation, show that the average risk of breast cancer is 10.2%. The risk of breast cancer in women with pathological nipple discharge increases with age and has been shown by most studies to be much higher in those aged beyond 50. Hence, a surgical approach via central duct excision or microdochectomy is favored to rule out breast cancer in this age group. However, it would be useful if studies could be conducted to explore the risk of breast cancer in nipple discharge patients according to decades. Other factors to take into account regarding risk of breast cancer include a positive family history of breast cancer and a previous biopsy history [18]. A palpable mass in nipple discharge patients has also been shown to be a predictor of a malignant nipple discharge [11,12,32,36]. Morrogh et al. (2007) reported that only large volume nipple discharge appeared to be predictive of breast cancer [32].
Patients with spontaneous nipple discharge are also at an increased risk of breast cancer if they had a higher number of pregnancies and a longer period of lactation, and the possibility of breast cancer in patients with provoked nipple discharge should also be considered [42].
Sabel et al. (2012) mentioned that all women with pathological nipple discharge were previously recommended to undergo duct excision because of the inadequate sensitivity of diagnostic modalities in the past [23]. This is supported by Gray et al., who noted that there was no clear consensus on what diagnostic modality could reliably differentiate the benign etiologies found in the large majority of patients from the carcinoma found in relatively few [13]. Our meta-analysis showed that in patients with nipple discharge, breast MRI had the highest pooled sensitivity followed by galactography, and that mammography and nipple discharge smear cytology had the lowest pooled sensitivities. However, combinations of modalities, such as ultrasound and mammogram, led to a higher sensitivity. As highlighted by Dolan et al., the limitations of cytology are that it has a high non-diagnostic rate, diagnoses a low number of cancers, and is unable to distinguish between carcinoma in situ and invasive cancers [11]. Technical problems also result from insufficient retrieval of cellular material, leading to an inconclusive result. Mammography has low sensitivity for nipple discharge patients as retroareolar lesions are often small and intraductal and lack calcifications. In addition, Bahl et al. reported that the 70% of patients who had a normal mammogram but abnormal ultrasound findings had extremely dense breasts, which can obscure breast and intraductal abnormalities [31]. Breast MRI has been shown to demonstrate the location and distribution of lesions most clearly, especially for ductal carcinoma in situ [65]. It also has a high sensitivity for papillomas [66]. However, according to van Gelder et al., it does not have an added value in the evaluation of patients who have no signs of a malignancy on conventional diagnostic examinations, with malignancy being demonstrated in less than 2% [42].
5. Explanation for Variations
The Cochran Q value and I² statistic showed that there was statistically significant variability in the sensitivities of the various diagnostic tests. This can be attributed to patient selection and differing criteria for a positive test, as well as local expertise and interpretation. For example, Hou et al. (2002) reported good results for galactography that were due to the use of a monofilament polypropylene guiding suture that eased cannulation, and the availability of a pathologist on site to identify intraductal lesions together with the surgeon once the affected ducts were opened during the operation [45]. For studies on ductoscopy, while some studies focused on lesion detection, others differentiated between benign and malignant appearing lesions. Classifying lesions according to the system proposed by Makita et al. and Al Sarakbi et al. would depend on surgeon experience in differentiating the lesions. Another factor leading to variability would be the number of features of pathological nipple discharge, i.e. unilateral, clear or bloody, and spontaneous, considered in each study [30]. Variations in the breast cancer rates in consecutive nipple discharge patients are likely to be due to patient selection [30]. While 4.9% of patients were found to have cancer in the studies of Sabel et al. and Vargas et al., 23.7% of patients in the study of Morrogh et al. (2010) were found to have breast cancer (Table 1) [23,5,33]. This could be because the patients in the Morrogh et al. study presented over a ten-year period and were selected to further undergo cytologic examination, ductography, or MRI followed by needle biopsy with or without surgery [30]. Variations in the breast cancer rates of nipple discharge patients with a tissue diagnosis are likely to be due to patient selection as well, along with differing diagnostic tests used to select patients for surgery. For example, some practices use ductography or MRI in addition to ultrasound and mammogram for nipple discharge.
Some studies looked at biomarkers in nipple aspirate fluid, nipple discharge and ductal washings such as microsatellite alterations, chromosomal aneusomy and proteins and carbohydrates [67-69]. These were not included in our meta-analysis as there were too few reported for each biomarker for a pooled analysis. Novel methods such as mammary ductoscopy by helical CT, direct and indirect galactography and scintimammography were also not included for the same reason.
6. Conclusion
As the yield of malignancy can be low when nipple discharge patients undergo excision, stricter guidelines regarding the need for interventions are needed. A malignancy rate of 10.2% indicates the need to continue surgery, especially for patients aged over 50 years. Patients below 50, in the absence of risk factors such as family history, a palpable mass, an increased period of lactation or a high-volume discharge, can be managed conservatively with close follow-up. We also carried out the first meta-analysis involving pooled sensitivities of diagnostic modalities for women with pathological nipple discharge. Our final recommendation for the practical diagnostic work-up of patients with nipple discharge is a step-up approach from combined clinical examination and mammography with ultrasound to selective additional investigation, including individualized requests for nipple fluid cytology, breast MRI and, in selected units, ductoscopy. Irrespective of all investigations, in high-risk patients either microdochectomy or central duct excision may be indicated.
Figure 1: Flow diagram of study selection.
Figure 3: Funnel plot showing the percentage of pathological nipple discharge patients undergoing surgery.
Author | Year | No. of patients with cancer | Total number of patients | Percentage with cancer (%)
Dinkel [31] | 2001 | 16 | 384 | 4.2
Vargas [5] | 2006 | 4 | 82 | 4.9
Gray [13] | 2007 | 7 | 124 | 5.6
Morrogh [32] | 2007 | 31 | 306 | 10.1
Morrogh [33] | 2010 | 68 | 287 | 23.7
Khan [34] | 2011 | 6 | 59 | 10.1
Sabel [23] | 2012 | 7 | 142 | 4.9
Ashfaq [26] | 2014 | 9 | 142 | 6.3
Bahl [35] | 2015 | 20 | 273 | 7.3
Yoon [36] | 2015 | 35 | 198 | 17.7
Average (n) | | 20 | 200 | 9
Total | | 203 | 1997 | 10.2
*Inclusion criteria vary with referral practice
Table 1: Average risk of a patient with pathological nipple discharge having breast cancer diagnosed via biopsy, surgery, or clinical follow-up*.
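The 10.2% figure in Table 1 is a pooled proportion (all cancers over all patients), which weights larger cohorts more heavily than a simple mean of the per-study percentages. A short sketch using the counts from Table 1 makes the difference explicit; the variable names are illustrative only.

```python
# (cancers, total patients) per study, from Table 1
table1 = {
    "Dinkel 2001": (16, 384),
    "Vargas 2006": (4, 82),
    "Gray 2007": (7, 124),
    "Morrogh 2007": (31, 306),
    "Morrogh 2010": (68, 287),
    "Khan 2011": (6, 59),
    "Sabel 2012": (7, 142),
    "Ashfaq 2014": (9, 142),
    "Bahl 2015": (20, 273),
    "Yoon 2015": (35, 198),
}

total_cancers = sum(c for c, _ in table1.values())
total_patients = sum(n for _, n in table1.values())

# Pooled rate: every patient counts equally
pooled_rate = 100 * total_cancers / total_patients

# Unweighted mean: every study counts equally
mean_of_rates = sum(100 * c / n for c, n in table1.values()) / len(table1)

print(f"{total_cancers}/{total_patients} = {pooled_rate:.1f}% pooled")
print(f"{mean_of_rates:.1f}% unweighted mean of study rates")
```

The pooled rate is about 10.2%, while the unweighted mean of the ten study rates is about 9.5%, which is the distinction between the "Total" and "Average" rows.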
Author | Year | No. of patients | % cancer in <50 yrs old | % cancer in ≥50 yrs old
Seltzer [2] | 2004 | 318 | 1.3 | 9.5
Gray [13] | 2007 | 204 | 0.0 | 6.0
Dolan [11] | 2010 | 313 | 2.0 | 15.0
Lubina [25] | 2015 | 56 | 7.7 | 20.0
Yang [37] | 2015 | 207 | 18.5 | 46.4
Table 2: Breast cancer risk in patients with nipple discharge stratified for age.
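To put the age effect in Table 2 on a common scale, the crude ratio of the cancer percentage in women aged 50 or over to that in women under 50 can be computed per study. This is only a descriptive sketch: the per-age-group patient counts are not given in Table 2, so no confidence intervals can be derived, and the helper name is hypothetical.

```python
# (study, % cancer in <50 yrs, % cancer in >=50 yrs), from Table 2
table2 = [
    ("Seltzer 2004", 1.3, 9.5),
    ("Gray 2007", 0.0, 6.0),
    ("Dolan 2010", 2.0, 15.0),
    ("Lubina 2015", 7.7, 20.0),
    ("Yang 2015", 18.5, 46.4),
]


def crude_risk_ratio(pct_under_50, pct_50_and_over):
    """Ratio of the two percentages; undefined when the younger group has 0%."""
    if pct_under_50 == 0:
        return None
    return pct_50_and_over / pct_under_50


ratios = {study: crude_risk_ratio(young, old) for study, young, old in table2}

for study, ratio in ratios.items():
    label = "undefined (no cancers under 50)" if ratio is None else f"{ratio:.1f}x"
    print(f"{study}: {label}")
```

In every study with cancers in both groups, the percentage in the older group is at least about 2.5 times that in the younger group.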
Diagnostic Test | No. of studies looking at test | Participants
Ultrasound | 9 | 1100
Mammogram | 12 | 1318
Mammogram and ultrasound | 4 | 403
Breast MRI | 10 | 470
Conventional Galactography | 12 | 1007
ND Smear Cytology | 11 | 1036
Ductal Lavage | 6 | 582
Ductoscopy | 8 | 1169
Table 3: Number of studies for each diagnostic test in meta-analysis and the corresponding number of participants.
Study | Year | Risk of bias: Patient selection | Risk of bias: Index test | Risk of bias: Reference standard | Risk of bias: Flow and timing | Applicability concerns: Patient selection | Applicability concerns: Index test | Applicability concerns: Reference standard
Ashfaq [26] | 2014 | L | L | L | L | L | L | L
Bahl [35] | 2015 | L | L | L | L | L | L | L
Baitchev [40] | 2013 | L | ? | L | L | L | L | L
Dietz [41] | 2002 | L | L | ? | L | L | L | L
Dinkel [31] | 2001 | L | L | L | L | L | L | L
Gelder [42] | 2014 | L | L | L | ? | L | L | L
Gray [13] | 2007 | L | L | L | ? | L | L | L
Grunwald [43] | 2007 | L | L | L | ? | L | L | L
Hahn [17] | 2009 | L | L | L | ? | L | L | L
Hou [44] | 2001 | L | L | L | L | L | L | L
Hou [45] | 2002 | L | ? | L | L | L | L | L
Kalu [46] | 2012 | L | L | L | L | L | L | L
Kamali [47] | 2014 | L | L | L | ? | L | L | L
Kaplan [48] | 2011 | ? | L | L | ? | ? | L | L
Khan [34] | 2011 | L | L | L | L | L | L | L
Kooistra [27] | 2008 | L | L | L | L | L | L | L
Lau [39] | 2005 | L | ? | L | L | L | L | L
Lee [49] | 2002 | L | L | L | ? | L | L | L
Liu [12] | 2008 | L | L | L | ? | L | L | L
Lorenzon [50] | 2011 | L | L | L | ? | L | L | L
Lubina [25] | 2015 | L | L | L | ? | L | L | L
Manganaro [51] | 2015 | L | L | L | ? | L | L | L
Morrogh [34] | 2007 | L | L | L | ? | L | L | L
Ohlinger [52] | 2014 | L | L | L | ? | L | L | L
Pritt [53] | 2004 | L | L | L | L | L | L | L
Sabel [23] | 2011 | L | L | L | ? | L | L | L
Shen [54] | 2000 | L | L | L | ? | L | L | L
Shen [55] | 2001 | L | L | L | ? | L | L | L
Simmons [56] | 2003 | L | L | L | ? | L | L | L
Tokuda [57] | 2009 | L | L | L | ? | L | L | L
Vargas [5] | 2006 | L | ? | L | ? | L | L | L
Vaughan [58] | 2009 | L | L | L | L | L | L | L
Yamamoto [59] | 2001 | L | L | L | ? | L | L | L
Yoon [36] | 2015 | L | L | L | ? | L | L | L
L: Low risk, ?: Unclear risk, H: High risk.
Table 4: QUADAS-2 risk of bias assessment.
Study | Year | Study Design | Country | No. of patients | Mean age
Ashfaq [26] | 2014 | Retrospective | USA | 192 |
Bahl [35] | 2015 | Retrospective | USA | 273 | 48
Baitchev [40] | 2013 | Retrospective | Bulgaria | 172 |
Dietz [41] | 2002 | Retrospective | USA | 121 | 52
Dinkel [31] | 2001 | Retrospective | Germany | 384 | 47.5
Gelder [42] | 2014 | Retrospective | Netherlands | 111 | 52
Gray [13] | 2007 | Retrospective | USA | 153 | 55
Grunwald [43] | 2007 | Retrospective | Germany | 64 |
Hahn [17] | 2009 | Prospective | Germany | 33 | 51.7
Hou [44] | 2001 | Retrospective | Taiwan | 487 | 44.7
Hou [45] | 2002 | Retrospective | Taiwan | 215 | 47.6
Kalu [46] | 2012 | Retrospective | USA | 89 | 49.3
Kamali [47] | 2014 | Prospective | Turkey | 430 |
Kaplan [48] | 2011 | Retrospective | USA | 50 | 50
Khan [34] | 2011 | Prospective | USA | 59 | 45
Kooistra [27] | 2008 | Retrospective | Netherlands | 618 |
Lau [39] | 2005 | Retrospective | Germany | 116 | 56.7
Lee [49] | 2002 | Retrospective | Taiwan | 174 | 41.5
Liu [12] | 2008 | Prospective | China | 1048 |
Lorenzon [50] | 2011 | Retrospective | Italy | 38 | 51.8
Lubina [25] | 2015 | Prospective | Germany | 50 | 51.2
Manganaro [51] | 2015 | Retrospective | Italy | 53 | 42
Morrogh [32] | 2007 | Retrospective | USA | 376 |
Ohlinger [52] | 2014 | Retrospective | Germany | 214 | 52.2
Pritt [53] | 2004 | Retrospective | USA | 39 |
Sabel [23] | 2011 | Retrospective | USA | 175 | 50.4
Shen [54] | 2000 | Prospective | China | 415 |
Shen [55] | 2001 | Prospective | China | 259 | 46
Simmons [61] | 2003 | Retrospective | USA | 108 | 49
Tokuda [56] | 2009 | Prospective | Japan | 47 | 49
Vargas [5] | 2006 | Retrospective | USA | 82 | 42
Vaughan [58] | 2009 | Prospective | USA | 89 |
Yamamoto [59] | 2001 | Prospective | Japan | 65 |
Yoon [36] | 2015 | Retrospective | Korea | 198 | 44.8
Table 5: Characteristics of studies included in meta-analysis.
Study | Criteria for abnormality | TP | FP | FN | TN | Sensitivity [95% CI] | Specificity [95% CI]
Ashfaq [26] | Mass or intraductal mass(es) | 8 | 38 | 0 | 102 | 1.00 [0.63, 1.00] | 0.73 [0.65, 0.80]
Bahl [35] | Subareolar and intraductal masses are coded BI-RADS category 4 or 5 if the patient has nipple discharge. Aim was to detect DCIS and invasive adenocarcinoma | 10 | 58 | 8 | 170 | 0.56 [0.31, 0.78] | 0.75 [0.68, 0.80]
Gray [13] | Mass or intraductal mass(es). Aim was to detect carcinoma | 5 | 38 | 1 | 5 | 0.83 [0.36, 1.00] | 0.12 [0.04, 0.25]
Grunwald [43] | Suspected papilloma or malignancy | 39 | 5 | 19 | 8 | 0.67 [0.54, 0.79] | 0.62 [0.32, 0.86]
Hou [45] | | 7 | 21 | 28 | 120 | 0.20 [0.08, 0.37] | 0.85 [0.78, 0.91]
Lau [39] | | 6 | 28 | 4 | 49 | 0.60 [0.26, 0.88] | 0.64 [0.52, 0.74]
Liu [12] | | 16 | 26 | 28 | 3 | 0.36 [0.22, 0.52] | 0.10 [0.02, 0.27]
Lorenzon [50] | Cases scored as BI-RADS 3, BI-RADS 4 or BI-RADS 5 with a final histological diagnosis of a malignant or high-risk lesion were considered as true positive, while cases assessed as BI-RADS 1 or BI-RADS 2 with a final histological diagnosis of malignant or high-risk lesions were considered as false negative | 12 | 3 | 7 | 16 | 0.63 [0.38, 0.84] | 0.84 [0.60, 0.97]
Ohlinger [52] | DEGUM (German equivalent of BIRADS) | 102 | 73 | 21 | 16 | 0.83 [0.75, 0.89] | 0.18 [0.11, 0.28]
Table 6: Accuracy rates for use of ultrasound in women with pathological nipple discharge.
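The 95% confidence intervals in Table 6 appear consistent with exact (Clopper-Pearson) binomial intervals, although the article does not name the interval method, so that is an assumption here. The sketch below computes such an interval with a simple bisection on the binomial distribution using only the standard library; the function names are illustrative. For the Ashfaq sensitivity (8 true positives, 0 false negatives) it gives approximately [0.63, 1.00], matching the table.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def solve_increasing(g, iters=60):
    """Root of an increasing function g on (0, 1) by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes out of n trials."""
    # Lower bound: p at which P(X >= x) equals alpha / 2
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper bound: p at which P(X <= x) equals alpha / 2
    upper = 1.0 if x == n else solve_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


# Sensitivity for Ashfaq in Table 6: TP = 8, FN = 0
low, high = exact_ci(8, 8)
print(f"sensitivity 1.00 [{low:.2f}, {high:.2f}]")
```

Exact intervals stay within 0 and 1 and remain informative when a study has no false negatives, which is why they suit the small per-study counts seen here.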
Study | Criteria for abnormality | TP | FP | FN | TN | Sensitivity [95% CI] | Specificity [95% CI] |
Ashfaq [26] | Mass, indeterminate/suspicious calcifications, or architectural distortion. Aim was carcinoma detection. | 2 | 11 | 7 | 157 | 0.22 [0.03, 0.60] | 0.93 [0.89, 0.97] |
Bahl [35] | BIRADS | 3 | 5 | 17 | 237 | 0.15 [0.03, 0.38] | 0.98 [0.95, 0.99] |
Dietz [41] | BIRADS 3-5 | 2 | 100 | 3 | 4 | 0.40 [0.05, 0.85] | 0.04 [0.01, 0.10] |
Gray [13] | Indeterminate/suspicious calcifications, or architectural distortion. Aim was to detect carcinoma. | 3 | 5 | 3 | 3 | 0.50 [0.12, 0.88] | 0.38 [0.09, 0.76] |
Grunwald [43] | BIRADS 3-5 for any abnormality | 22 | 1 | 36 | 12 | 0.38 [0.26, 0.52] | 0.92 [0.64, 1.00] |
Hou [45] | | 1 | 4 | 34 | 137 | 0.03 [0.00, 0.15] | 0.97 [0.93, 0.99] |
Lau [39] | | 4 | 27 | 5 | 56 | 0.44 [0.14, 0.79] | 0.67 [0.56, 0.77] |
Liu [12] | | 18 | 16 | 19 | 10 | 0.49 [0.32, 0.66] | 0.38 [0.20, 0.59] |
Lorenzon [50] | Cases scored as BI-RADS 3, BI-RADS 4 or BI-RADS 5 with a final histological diagnosis of a malignant or high-risk lesion were considered as true positive, while cases assessed as BI-RADS 1 or BI-RADS 2 with a final histological diagnosis of malignant or high-risk lesions were considered as false negative | 5 | 1 | 14 | 18 | 0.26 [0.09, 0.51] | 0.95 [0.74, 1.00] |
Ohlinger [52] | BIRADS | 41 | 79 | 31 | 40 | 0.57 [0.45, 0.69] | 0.34 [0.25, 0.43] |
Simmons [56] | Masses, nodules, microcalcifications | 4 | 20 | 3 | 32 | 0.57 [0.18, 0.90] | 0.62 [0.47, 0.75] |
Vargas [5] | | 3 | 0 | 38 | 25 | 0.07 [0.02, 0.20] | 1.00 [0.86, 1.00] |
Table 7: Accuracy rates for use of mammogram in women with pathological nipple discharge.
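The per-study values in Table 7 follow from the 2×2 counts: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below is a minimal illustration, not the authors' analysis code. It assumes the bracketed intervals are exact (Clopper-Pearson) binomial intervals, which is consistent with the Ashfaq [26] row but is not stated in the text; the function names `binom_cdf`, `exact_ci` and `accuracy` are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n (assumed CI method)."""

    def bisect(f, target):
        # f must be increasing in p on [0, 1]
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha / 2
    lower = 0.0 if k == 0 else bisect(lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2
    upper = 1.0 if k == n else bisect(lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


def accuracy(tp, fp, fn, tn):
    """Return (sensitivity, sensitivity CI, specificity, specificity CI)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens, exact_ci(tp, tp + fn), spec, exact_ci(tn, tn + fp)


# Ashfaq [26] row of Table 7: TP=2, FP=11, FN=7, TN=157
sens, sens_ci, spec, spec_ci = accuracy(2, 11, 7, 157)
print(f"sensitivity {sens:.2f} [{sens_ci[0]:.2f}, {sens_ci[1]:.2f}]")
print(f"specificity {spec:.2f} [{spec_ci[0]:.2f}, {spec_ci[1]:.2f}]")
```

For this row the sensitivity line reproduces the tabulated 0.22 [0.03, 0.60]; the same call can be applied to any other row of Tables 6 to 13.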
Combined ultrasound and mammography, reported in 4 studies, had a mean sensitivity of 0.7 and a mean specificity of 0.7 (Table 8).
Study | Criteria for abnormality | TP | FP | FN | TN | Sensitivity [95% CI] | Specificity [95% CI] |
Liu [12] | | 21 | 15 | 16 | 11 | 0.57 [0.39, 0.73] | 0.42 [0.23, 0.63] |
Lorenzon [50] | Cases scored as BI-RADS 3, BI-RADS 4 or BI-RADS 5 with a final histological diagnosis of a malignant or high-risk lesion were considered as true positive, while cases assessed as BI-RADS 1 or BI-RADS 2 with a final histological diagnosis of malignant or high-risk lesions were considered as false negative. | 14 | 3 | 5 | 16 | 0.74 [0.49, 0.91] | 0.84 [0.60, 0.97] |
Sabel [23] | MMG: benign findings, dilated retroareolar ducts, suspicious mass or asymmetry and suspicious microcalcifications; US: benign changes, dilated ducts, subareolar mass, intraductal mass or filling defect | 60 | 1 | 39 | 4 | 0.61 [0.50, 0.70] | 0.80 [0.28, 0.99] |
Yoon [36] | US: BIRADS | 28 | 74 | 6 | 90 | 0.82 [0.65, 0.93] | 0.55 [0.47, 0.63] |
Table 8: Accuracy rates for use of ultrasound and mammogram in women with pathological nipple discharge.
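A mean of per-study sensitivities is not the same quantity as a pooled estimate. As an illustration, the sketch below compares the unweighted mean with a simple count-pooled value, ΣTP / (ΣTP + ΣFN), using the Table 8 counts. Simple count pooling is an assumption made here for illustration and is not necessarily the Mantel-Haenszel procedure named in the abstract; the variable names are hypothetical.

```python
# (study, TP, FP, FN, TN) taken from Table 8
table8 = [
    ("Liu [12]", 21, 15, 16, 11),
    ("Lorenzon [50]", 14, 3, 5, 16),
    ("Sabel [23]", 60, 1, 39, 4),
    ("Yoon [36]", 28, 74, 6, 90),
]

# Unweighted mean: every study counts equally, regardless of size
per_study = [tp / (tp + fn) for _, tp, _, fn, _ in table8]
mean_sens = sum(per_study) / len(per_study)

# Count pooling: sum the counts first, so larger studies carry more weight
total_tp = sum(tp for _, tp, _, _, _ in table8)
total_fn = sum(fn for _, _, _, fn, _ in table8)
pooled_sens = total_tp / (total_tp + total_fn)

print(f"unweighted mean sensitivity: {mean_sens:.2f}")
print(f"count-pooled sensitivity:    {pooled_sens:.2f}")
```

With these counts the unweighted mean is about 0.68 and the count-pooled value about 0.65. The latter agrees with the pooled sensitivity of 0.65 given in the abstract for mammogram and ultrasound combined.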
Ten studies assessed the sensitivity and specificity of breast MRI, with a mean sensitivity of 0.7 and a mean specificity of 0.7 (Table 9).
Study | Contrast agent type | Dose | Criteria for abnormality | TP | FP | FN | TN | Sensitivity [95% CI] | Specificity [95% CI] |
Ashfaq [26] | | | Mass or suspicious enhancement pattern. Aim was carcinoma detection. | 1 | 3 | 1 | 4 | 0.50 [0.01, 0.99] | 0.57 [0.18, 0.90] |
Gelder [42] | | | BIRADS ≥ 5 | 2 | 3 | 3 | 99 | 0.40 [0.05, 0.85] | 0.97 [0.92, 0.99] |
Gray [13] | | | Mass or suspicious enhancement pattern. Aim was carcinoma detection. | 1 | 1 | 0 | 1 | 1.00 [0.03, 1.00] | 0.50 [0.01, 0.99] |
Grunwald [43] | | | Suspected papilloma or malignancy | 15 | 3 | 8 | 1 | 0.65 [0.43, 0.84] | 0.25 [0.01, 0.81] |
Lorenzon [50] | Gadobenate dimeglumine 0.5 M (Multihance, Bracco, Milan, Italy), administered IV as an automated bolus injection | 0.1 mL/kg body weight at a flow rate of 2 mL/s, followed by flushing with 20 mL of saline | Cases scored as BI-RADS 3, BI-RADS 4 or BI-RADS 5 with a final histological diagnosis of a malignant or high-risk lesion were considered as true positive, while cases assessed as BI-RADS 1 or BI-RADS 2 with a final histological diagnosis of malignant or high-risk lesions were considered as false negative | 18 | 4 | 1 | 15 | 0.95 [0.74, 1.00] | 0.79 [0.54, 0.94] |
Lubina [25] | | | MR-BI-RADS® ratings of 1, 2, and 3 were regarded as benign, and 4 and 5 as malignant | 6 | 6 | 2 | 42 | 0.75 [0.35, 0.97] | 0.88 [0.75, 0.95] |
Manganaro [51] | Gadobutrol | 0.1 mmol per kilogram of body weight at a rate of 2 mL/s, together with a 10 mL saline bolus | NA. Overall benign, papillomatous, malignant and DCIS lesion detection | 44 | 0 | 1 | 8 | 0.98 [0.88, 1.00] | 1.00 [0.63, 1.00] |
Morrogh [32] | | | BI-RADS 4 or 5 considered suspicious, while score of 1-3 considered negative | 11 | 8 | 4 | 29 | 0.73 [0.45, 0.92] | 0.78 [0.62, 0.90] |
Ohlinger [52] | | | | 45 | 30 | 9 | 4 | 0.83 [0.71, 0.92] | 0.12 [0.03, 0.27] |
Tokuda [57] | Gadopentetate dimeglumine | 0.1 mmol per kilogram of body weight at a rate of 3 mL/s | Clustered ring enhancement evaluation for malignant detection | 9 | 2 | 6 | 20 | 0.60 [0.32, 0.84] | 0.91 [0.71, 0.99] |
Table 9: Accuracy rates for use of breast MRI in women with pathological nipple discharge.
The accuracy of galactography was reported in 12 studies, with a mean sensitivity of 0.7 and a mean specificity of 0.6 (Table 10).
Study | Contrast | Criteria for abnormality | TP | FP | FN | TN | Sensitivity [95% CI] | Specificity [95% CI] |
Ashfaq [26] | | Duct cutoff or filling defect. Aim was carcinoma detection. | 1 | 11 | 0 | 4 | 1.00 [0.03, 1.00] | 0.27 [0.08, 0.55] |
Baitchev [40] | Urografin (0.5-2.0 mL) | | 11 | 44 | 5 | 73 | 0.69 [0.41, 0.89] | 0.62 [0.53, 0.71] |
Gray [13] | | Duct cutoff or filling defect. Aim was to detect carcinoma. | 1 | 16 | 0 | 1 | 1.00 [0.03, 1.00] | 0.06 [0.00, 0.29] |
Grunwald [43] | | Intraductal mass | 9 | 0 | 7 | 3 | 0.56 [0.30, 0.80] | 1.00 [0.29, 1.00] |
Hahn [17] | | NA. Aim was to assess intraductal epithelial proliferation | 17 | 5 | 5 | 4 | 0.77 [0.55, 0.92] | 0.44 [0.14, 0.79] |
Hou [44] | | Intraductal filling defects | 32 | 32 | 3 | 109 | 0.91 [0.77, 0.98] | 0.77 [0.69, 0.84] |
Hou [45] | Urografin (0.5-2 mL) | Malignant: ductal obstructions and irregular intraductal defects. To a smaller extent, ductal wall irregularity, surr duct torsion and displacement; Benign: lobular (smooth) intraductal defect and ductal dilatation | 34 | 31 | 3 | 113 | 0.92 [0.78, 0.98] | 0.78 [0.71, 0.85] |
Lau [39] | 0.5-1 mL of a 1:1 solution of sterile, water-soluble contrast material (Solutrast; Altana Pharma GmbH, Konstanz, Germany) and toluol blue | | 8 | 38 | 3 | 60 | 0.73 [0.39, 0.94] | 0.61 [0.51, 0.71] |
Manganaro [51] | Nonionic iodinated contrast agent (iopamidol 300) up to a maximum of 1-1.5 mL | Findings were classified according to the Gregl scheme. Values calculated for the detection of ductal pathologies. | 22 | 0 | 23 | 8 | 0.49 [0.34, 0.64] | 1.00 [0.63, 1.00] |
Morrogh [32] | | Filling defect or duct ectasia | 21 | 91 | 10 | 17 | 0.68 [0.49, 0.83] | 0.16 [0.09, 0.24] |
Ohlinger [52] | | Possible intraductal lesion | 70 | 19 | 16 | 15 | 0.81 [0.72, 0.89] | 0.44 [0.27, 0.62] |
Simmons [56] | | Intraductal or suggestive mass | 0 | 1 | 2 | 9 | 0.00 [0.00, 0.84] | 0.90 [0.55, 1.00] |
Table 10: Accuracy rates for use of galactography in women with pathological nipple discharge.
The accuracy of smear cytology was reported in 11 studies, with a mean sensitivity of 0.4 and a mean specificity of 0.8 (Table 11).
Study | Criteria for abnormality | TP | FP | FN | TN | Sensitivity [95% CI] | Specificity [95% CI] |
Dinkel [31] | Categories of “normal,” “dubious,” and “no cells” are considered to be negative, and those of “suspicious” and “positive” to be positive | 5 | 4 | 11 | 153 | 0.31 [0.11, 0.59] | 0.97 [0.94, 0.99] |
Grunwald [43] | Suspected papilloma or malignancy. | 18 | 2 | 31 | 7 | 0.37 [0.23, 0.52] | 0.78 [0.40, 0.97] |
Hahn [17] | NA. Assessed epithelial intraductal proliferations via papillomatous cell detection | 1 | 1 | 20 | 8 | 0.05 [0.00, 0.24] | 0.89 [0.52, 1.00] |
Hou [45] | Negative, atypia, suspicious, positive, inadequate | 13 | 14 | 22 | 127 | 0.37 [0.21, 0.55] | 0.90 [0.84, 0.94] |
Kalu [46] | Negative cytology was defined as the presence of histiocytes, proteinaceous fluid, and the absence of epithelial cells. Atypical, suspicious, and papillary results were grouped together and designated as positive cytology. Benign non-papillary results were designated as negative cytology. | 44 | 21 | 15 | 9 | 0.75 [0.62, 0.85] | 0.30 [0.15, 0.49] |
Kaplan [48] | Positive (invasive and intraductal carcinoma), papillary, atypical, negative, unsatisfactory | 1 | 2 | 9 | 38 | 0.10 [0.00, 0.45] | 0.95 [0.83, 0.99] |
Kooistra [27] | National Cancer Institute-recommended diagnostic categories: benign, atypical, suspicious or malignant (for accuracy calculation, suspicious and malignant considered positive) | 6 | 43 | 30 | 84 | 0.17 [0.06, 0.33] | 0.66 [0.57, 0.74] |
Lee [49] | Nondiagnostic, benign, papilloma, indeterminate papillary lesion, atypia, suspicious or malignant. Cytological categories of ‘nondiagnostic’, ‘benign’, ‘papilloma’, ‘indeterminate papillary lesion’, and ‘atypia’ are considered to be negative, and those of ‘suspicious’ and ‘malignant’ to be positive | 10 | 0 | 8 | 64 | 0.56 [0.31, 0.78] | 1.00 [0.94, 1.00] |
Ohlinger [52] | Unremarkable ductal epithelium vs papilloma or carcinoma | 18 | 8 | 61 | 47 | 0.23 [0.14, 0.34] | 0.85 [0.73, 0.94] |
Pritt [53] | Categories: negative, atypical, suspicious, and positive. Diagnoses of “negative” and “atypical” were considered negative for malignancy and diagnoses of “suspicious” and “positive” were considered positive for malignancy. | 11 | 1 | 2 | 31 | 0.85 [0.55, 0.98] | 0.97 [0.84, 1.00] |
Simmons [56] | Benign, atypical, malignant | 1 | 1 | 8 | 26 | 0.11 [0.00, 0.48] | 0.96 [0.81, 1.00] |
Table 11: Accuracy rates for use of nipple discharge smear cytology in women with pathological nipple discharge.
Six studies assessed ductal lavage cytology, with a mean sensitivity of 0.5 and a mean specificity of 0.9 (Table 12).
Study | Criteria for abnormality | TP | FP | FN | TN | Sensitivity [95% CI] | Specificity [95% CI] |
Khan [34]* | Abnormal cytology = mild atypia or malignant cytology/ severe atypia or malignancy. Aim was detection of papilloma or cancer. | 12 | 5 | 23 | 19 | 0.34 [0.19, 0.52] | 0.79 [0.58, 0.93] |
Ohlinger [52] | Unremarkable vs suspicious | 17 | 6 | 13 | 36 | 0.57 [0.37, 0.75] | 0.86 [0.71, 0.95] |
Shen [54] | Malignant vs benign cells | 39 | 15 | 52 | 51 | 0.43 [0.33, 0.54] | 0.77 [0.65, 0.87] |
Shen [55] | Cytological findings were grouped into three categories: clumps of ductal cells (. 50 cells), clumps with atypia (based on nuclear pleomorphism, chromatin staining, and size), and single ductal cells or small clumps. For the purposes of this study, they assumed that large ductal clumps reflected the exfoliation of an intraductal papillary lesion and that single ductal cells reflected the absence of the same. Positive findings: clumps with atypia and clumps with ductal cells | 7 | 0 | 4 | 155 | 0.64 [0.31, 0.89] | 1.00 [0.98, 1.00] |
Vaughan [58] | Positive cytology = malignancy, papilloma, or atypia | 44 | 1 | 34 | 10 | 0.56 [0.45, 0.68] | 0.91 [0.59, 1.00] |
Yamamoto [59] | Positive cytology = malignancy, papilloma, or atypia | 2 | 2 | 2 | 33 | 0.50 [0.07, 0.93] | 0.94 [0.81, 0.99] |
* Involved use of brush.
Table 12: Accuracy rates for use of ductal lavage cytology in women with pathological nipple discharge.
For ductoscopy, some studies focused on overall lesion detection, while others classified lesions as malignant or non-malignant. Across all 8 studies, ductoscopy had a mean sensitivity of 0.8 and a mean specificity of 0.6 (Table 13). Overall, the criteria for a positive test varied between studies.
Study | Year | Use of ductoscopy | TP | FP | FN | TN | Sensitivity [95% CI] | Specificity [95% CI] |
Shen [54] | 2000 | Lesion detection | 76 | 16 | 12 | 53 | 0.86 [0.77, 0.93] | 0.77 [0.65, 0.86] |
Grunwald [43] | 2007 | Lesion detection | 32 | 5 | 26 | 8 | 0.55 [0.42, 0.68] | 0.62 [0.32, 0.86] |
Hahn [17] | 2009 | Lesion detection | 18 | 5 | 2 | 4 | 0.90 [0.68, 0.99] | 0.44 [0.14, 0.79] |
Vaughan [58] | 2009 | Lesion detection | 77 | 11 | 1 | 0 | 0.99 [0.93, 1.00] | 0.00 [0.00, 0.28] |
Kamali [47] | 2014 | Lesion detection | 115 | 98 | 11 | 131 | 0.91 [0.85, 0.96] | 0.57 [0.51, 0.64] |
Ohlinger [52] | 2014 | Lesion detection | 89 | 45 | 36 | 44 | 0.71 [0.62, 0.79] | 0.49 [0.39, 0.60] |
Total mean | | | 68 | 30 | 15 | 40 | | |
Range | | | 18-115 | 5-98 | 1-36 | 0-131 | | |
Shen [55] | 2001 | Malignant lesion detection | 8 | 2 | 3 | 153 | 0.73 [0.39, 0.94] | 0.99 [0.95, 1.00] |
Liu [12] | 2008 | Malignant lesion detection | 49 | 2 | 3 | 34 | 0.94 [0.84, 0.99] | 0.94 [0.81, 0.99] |
Total mean (n) | | | 29 | 2 | 3 | 94 | | |
Range | | | 8-49 | 2-2 | 3-3 | 34-153 | | |
Table 13: Accuracy rates for use of ductoscopy in women with pathological nipple discharge.
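The "Total mean" and "range" rows for the six lesion-detection studies in Table 13 can be checked directly from the per-study counts. The sketch below is an illustrative check only. The FN count for Ohlinger [52] is taken as 36, which is an assumption inferred from that study's reported sensitivity of 0.71 (89 / (89 + 36)) and from the stated FN range of 1-36; the variable names are hypothetical.

```python
# (study, TP, FP, FN, TN) for the lesion-detection studies in Table 13.
# Ohlinger FN = 36 is an inferred value (assumption, see the text above).
lesion = [
    ("Shen [54]", 76, 16, 12, 53),
    ("Grunwald [43]", 32, 5, 26, 8),
    ("Hahn [17]", 18, 5, 2, 4),
    ("Vaughan [58]", 77, 11, 1, 0),
    ("Kamali [47]", 115, 98, 11, 131),
    ("Ohlinger [52]", 89, 45, 36, 44),
]

# Mean and range of each count column across the six studies
summary = {}
for idx, name in enumerate(["TP", "FP", "FN", "TN"], start=1):
    values = [row[idx] for row in lesion]
    summary[name] = {
        "mean": sum(values) / len(values),
        "range": (min(values), max(values)),
    }

for name, s in summary.items():
    print(f"{name}: mean {s['mean']:.1f}, range {s['range'][0]}-{s['range'][1]}")
```

Rounded to whole numbers, the means come out as 68, 30, 15 and 40, matching the summary row of the table.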
Author | Year | No. with pathology results | Papilloma n (%) | Ductal Ectasia n (%) | Benign/ non-specific changes n (%) | LCIS/ ADH/ Papilloma with Atypia n (%) | Carcinoma n (%) |
Cabioglu [63] | 2003 | 94 | 62 (66.0) | 4 (4.3) | 8 (8.5) | 1 (1.1) | 19 (20.2) |
Morrogh [32] | 2007 | 182 | 88 (48.4) | 37 (20.3) | 11 (6.0) | 16 (8.8) | 30 (16.5) |
Morrogh [33] | 2010 | 287 | 121 (42.2) | 43 (15.0) | 28 (9.8) | 30 (10.5) | 65 (22.6) |
Total n (%) | | 563 (100.0) | 271 (48.1) | 84 (14.9) | 47 (8.3) | 47 (8.3) | 114 (20.2) |
Abbreviations: ADH = atypical ductal hyperplasia; LCIS = lobular carcinoma in situ
Table 14: Cause of pathological nipple discharge in women who had a tissue diagnosis.
4. Sickles EA (2000) Galactography and other imaging investigations of nipple discharge. Lancet 356: 1622-1623.
9. Salzman B, Fleegle S, Tully AS (2012) Common breast problems. Am Fam Physician 86: 343-349.
15. Ingvar C (2002) Papillary secretion. Diagnostic assessment and treatment. Scand J Surg 91: 246-250.
22. BCMA (2013) Breast Disease and Cancer: Diagnosis. Clin Pract Guidel Protoc Br Columbia 2013.
© by the Authors & Gavin Publishers. This is an Open Access Journal Article Published Under Attribution-Share Alike CC BY-SA: Creative Commons Attribution-Share Alike 4.0 International License.