IEEE REVIEWS IN BIOMEDICAL ENGINEERING, VOL. 2, 2009

Potential of Computer-Aided Diagnosis to

Improve CT Lung Cancer Screening

Noah Lee, Student Member, IEEE, Andrew F. Laine, Senior Member, IEEE, Guillermo Márquez, Jeffrey M. Levsky, and John K. Gohagan

ABSTRACT— The development of low-dose spiral computed tomography (CT) has rekindled hope that effective lung cancer screening might yet be found. Screening is justified when there is evidence that it will extend lives at reasonable cost and acceptable levels of risk. A screening test should detect all extant cancers while avoiding unnecessary workups. Thus optimal screening modalities have both high sensitivity and specificity. Due to the present state of technology, radiologists must opt to increase sensitivity and rely on follow-up diagnostic procedures to rule out the incurred false positives. There is evidence in published reports that computer-aided diagnosis technology may help radiologists alter the benefit–cost calculus of CT sensitivity and specificity in lung cancer screening protocols. This review provides insight into the current discussion of the effectiveness of lung cancer screening and assesses the potential of state-of-the-art computer-aided diagnosis developments.

INDEX TERMS— Computer-aided diagnosis, lung cancer screening, machine learning, receiver operating characteristics.

I. INTRODUCTION

CANCER affects everyone: the young and old, the rich and poor, men, women, and children. Lung cancer is the leading cancer killer in the United States and causes 1.3 million deaths worldwide annually [76]. Estimates for 2008 were for 215 020 new lung cancer cases and 161 840 deaths from lung cancer in the United States [78] (footnote 1). In 2009, the estimated new cases and deaths are 219 440 and 159 390, respectively (footnote 2). More than 75% of lung cancers are diagnosed in advanced stages. The average five-year survival rate after lung cancer diagnosis is about 15%. If lung cancer is detected at its earliest stage, the five-year survival rate can reach 70% [14], [16], [19]. Approximately $9.6 billion is spent in the United States each year for lung cancer treatment (footnote 3). These figures call for effective cancer control and prevention strategies such as lung cancer screening programs.

Lung cancer screening is the presumptive identification of unrecognized malignant tissue in high-risk asymptomatic individuals. Screening may include medical examinations such as sputum cytology (SC) [15], [79], chest X-ray (CXR) [15], low-dose spiral computed tomography (LDCT) [8], [15], [31], [46], [68], gene expression tests [88], and accompanying computer-aided diagnosis (CADx) and detection (CADe) schemes [81]–[87]. Current CAD technology shows potential to improve CT lung cancer diagnosis, yet the question of whether state-of-the-art screening technology can decrease the mortality rate remains unresolved [1], [3]–[5], [9]–[15], [20], [64], [67], [69]–[75], [77], [80], [86]. To address the mortality question by direct comparison of LDCT technology with CXR, several clinical trials are active or planned [1]. The largest and most advanced of these is the National Lung Screening Trial (NLST) [6], under the direction of the U.S. National Cancer Institute, in which the targeted 50 000 asymptomatic former or current heavy smokers were randomized to receive an initial and three annual screens by LDCT or CXR. Conclusive results are expected to be available by 2011. Outside the United States, randomized controlled trials include the ITALUNG trial (footnote 4), NELSON (footnote 5), and the UK Lung Cancer Screening Trial (UKLS) (footnote 6). Other clinical trials [90] (footnote 7) do not follow the current gold-standard clinical research guidelines, which call for randomized controlled studies to evaluate objectively the superiority of a medical intervention for lung cancer screening.

In this paper, we take a top-down approach in discussing the potential of CAD for lung cancer screening. We hypothesize that state-of-the-art CAD technology has the potential to improve LDCT lung cancer screening, but the technology needs comparative and careful assessment with respect to clinically relevant performance measures [7], [103]–[105], [118]. We will provide insight into the current effectiveness of LDCT lung cancer screening and assess state-of-the-art CAD developments in commercial and academic research. In this context, this paper branches into three main components: 1) an overview of CAD and associated image-processing methodologies to aid the diagnostic decision process in lung cancer diagnosis; 2) the various evaluation criteria, in particular, high sensitivity and high specificity, in order to assess the potential of CAD for lung cancer screening; and 3) the integration of CAD into clinical practice.

Manuscript received February 18, 2009. First published October 16, 2009; current version published December 09, 2009.

N. Lee and A. F. Laine are with the Heffner Biomedical Imaging Lab, Department of Biomedical Engineering, Columbia University, New York, NY 10027 USA (e-mail: nl2168@columbia.edu; laine@columbia.edu).

G. Márquez is with the Early Detection Research Group, National Cancer Institute, Bethesda, MD 20892 USA (e-mail: marquezg@mail.nih.gov).

J. M. Levsky is with the Division of Cardiothoracic Imaging, Department of Radiology, Montefiore Medical Center and Albert Einstein College of Medicine, Bronx, NY 10467 USA (e-mail: jlevsky@montefiore.org).

J. K. Gohagan is with the Basic Prevention Sciences Research Group, National Cancer Institute, Bethesda, MD 20892 USA (e-mail: gohaganj@mail.nih.gov).

Digital Object Identifier 10.1109/RBME.2009.2034022

1 http://www.cancer.gov/cancertopics/types/lung (2008).

2 http://www.cancer.gov/cancertopics/types/lung (2009).

3 http://progressreport.cancer.gov.

4 http://www.cspo.it.

5 http://www.nelsonproject.nl/.

6 http://www.hta.ac.uk/1752.

7 http://clinicaltrials.gov/ct2/show/NCT00963651.


II. HISTORICAL DEVELOPMENT OF CAD

The first concept of CAD was introduced half a century ago [17], [18], when Lusted discussed automated diagnosis of radiographs by computers in 1955 [77]. Early attempts to build a CAD system were initiated in the early 1950s. Several decades of research passed until this dream bore fruit in 1998 with the first commercial mammography CAD system [77] approved by the FDA. Large-scale systematic research began in 1980, but new automated systems were not immediately successful [89]. Development shifted from the initial concept of fully automated computer diagnosis to computer-aided diagnosis, where the human relies on the machine as a second reader. Since then, many CAD applications have been developed to help radiologists interpret images [77], [89]. CAD remains a major research subject, and many CAD systems and applications have been proposed. Currently, the major applications of CAD involve breast cancer [71], lung cancer [90]–[98], colon cancer, and prostate cancer. CAD systems have become part of routine clinical work for the detection of breast cancer in some clinics [89]. Starting from ad hoc and heuristic approaches [24], [25], [29], CAD technology moved to sophisticated machine-learning and data-mining techniques [82], [93], [98], [106]–[109]. In recent years, sophisticated machine-learning schemes have been developed [21]–[23], [90], [110] and have entered the field of automated, semiautomated, and interactive CAD systems [82], [90]. Machine learning for CAD has become one of the principal research areas in medical imaging and diagnostic radiology. The reported literature [2], [24], [25], [27], [29], [91], [92], [98], [99] gives evidence that current CAD schemes used as a second reader often outperform manual grading by experts alone. Nishikawa and Doi provide an in-depth review of the historical and current developments of CAD from a clinical perspective [77], [89]. In-depth reviews of CAD methodologies for lung cancer are given by Sluimer et al. [54] and Chan et al. [86].

III. CAD OVERVIEW FOR CT LUNG CANCER SCREENING

CAD systems for lung tissue discrimination, nodule discrimination, and nodule characterization are increasingly being used as a second reader to aid the diagnostic decision process and to reduce the number of overlooked lung cancers. There is evidence in published reports that CAD technology may help radiologists alter the benefit–cost calculus of CT sensitivity and specificity in lung cancer screening protocols to the benefit of patients and radiologists alike [85]. Current CAD schemes include lung tissue discrimination [21]–[29], nodule detection and classification [34], [36], [39], [43]–[46], [82], [87], [97], interstitial disease detection, differential diagnosis of interstitial disease, distinction between benign and malignant pulmonary nodules [93], [94], and estimation of malignancy potential as well as growth measurement [55]. CAD in this context has improved since 2000, but major challenges persist in three areas: 1) discrimination of lung tissue and regions of abnormality; 2) nodule detection and classification; and 3) nodule characterization and growth measurement.

The pool of existing CAD systems and approaches is broad, ranging from hybrid image-processing systems [37] including registration [114], [115] and segmentation [116], [117] to lung image retrieval and search. The field is advancing quickly, with new CAD schemes being developed and investigated for the task of lung cancer diagnosis and detection [106]–[109]. In what follows, we give a brief snapshot of current CAD schemes for the three areas as well as present-day and future areas of investigation.

A. Lung Tissue and Regions of Abnormality Discrimination

The task of discriminating lung tissue and abnormal lung regions involves the analysis of large thoracic three-dimensional CT image datasets (see Fig. 1). Images containing diffuse abnormalities have been especially problematic in nodule screening due to partial volume effects and ambiguous image artifacts, making the distinction between nodule tissue and abnormal lung tissue difficult. Early approaches defined lung boundaries and used thresholding methods to differentiate between vessels and nodules [24]–[26]. Subsequent approaches attempted to remove ambiguous structures by comparisons between neighboring slices [27], while others applied true three-dimensional algorithms [28]. The notion of subtracting known anatomical structures to simplify the detection and classification task was suggested by the work of Mori et al. [29]. The authors described a method for the automated anatomical labeling of the tracheobronchial tree extracted from three-dimensional CT data and its application to virtual bronchoscopy. Work in this area is manifold, and the reported approaches can be grouped into binary two-class discrimination [98] and finer multiclass discrimination [91] of the respective tissue types.

Depeursinge et al. [91] presented a texture classification system for lung tissue multiclass classification into five different lung tissue patterns (i.e., healthy, emphysema, ground glass, fibrosis, and micronodules). They used overcomplete wavelet frames combined with gray-level histogram features and obtained a classification accuracy of 92.5%. Classification was performed using k-nearest neighbor (KNN). In 2008, the authors reported a system [92] that integrated additional clinical context information to perform lung tissue classification with a further 8% performance improvement compared to [91] using an optimized support vector machine (SVM).

Arzhaeva et al. [98] proposed a system for the localization of interstitial lesions in chest radiographs. The system used a two-class supervised classification approach to distinguish between normal and diseased texture. Texture analysis was performed by multiscale Gaussian filter banks, linear discriminant analysis (LDA), and an SVM classifier. They evaluated the method on 44 abnormal and eight normal cases, obtaining an area under the ROC curve (AUC) value of 78%.

Kato et al. [111] presented a bag-of-features approach for lung tissue multiclass classification in diffuse lung disease to classify disease patterns with inhomogeneous texture distributions within a region of interest (ROI). They used a scale-invariant feature transform (SIFT) descriptor over many ROI samples for local feature extraction and to account for translation and rotation invariance. The authors report a classification accuracy of 92.8% using 1109 ROIs from 211 patients.

B. Lung Nodule Discrimination

Lung nodule discrimination consists of two main components: a) nodule detection and b) nodule classification.


Fig. 1. From CT lung data to lung cancer diagnosis. (Top) CT lung dataset from the LIDC database with several hundred slices. (Middle) True positive nodules

with different characteristics (solid, spiculated, and low contrast) surrounded in red. (Bottom) False positive nodules surrounded in yellow.

Recently, a comparative CAD assessment (footnote 8) was performed for the first time on the NLST data through standardized databases such as the Lung Image Database Consortium (LIDC). The results of this assessment have not been published yet. The European counterparts for comparative CAD assessment are the NELSON and ANODE09 studies (footnote 9). Other openly available

datasets include the database of Lung Test images from Motol

Environment (Lung TIME) [113]. The Lung TIME database

consists of 157 CT scans with 394 annotated nodules of various

types, including solitary, regular, irregular, pleural, and vessel

attached nodules.

In the last decades, a large body of research has been re-

ported in the ﬁeld of lung nodule detection and classiﬁcation

[2], [31]–[39], [43]–[46], [82], [87], [97], [112]. A central

concern in nodule detection is the high rate of false positives

when sensitivity is increased to detect subtle nodules. A nodule

is deemed a false positive result if it led to a completely negative

workup or more than 12 months of follow-up with no cancer

diagnosis. Reducing false positive rates while maintaining high

sensitivity is still a difﬁcult problem (see Fig. 1). Techniques

include LDA [34], rule-based approaches (a set of “if-then”

statements) [38], combinations of these two [39], artiﬁcial

neural networks (ANNs) [40], and maximum-margin based

discriminators such as the SVM [92]. Novel methodologies

for searching have been introduced, which include template

matching for detection [36], unsupervised clustering techniques

[39], and a local density maximum algorithm [41]. Methods

to improve discrimination of nodules from lung tissue include

subtraction of vessels by region-growing [30], knowledge-con-

strained routines based on anatomical models of the thorax

[31], and deformable models [32]. Various approaches have

been taken to deﬁne the pleural interface and distinguish

juxta-pleural nodules, including morphological image ﬁltering

[33], [34], curvature analysis of the pleural interface [35],

and adaptive template-matching for the appropriate shape of a

nodule given a location on the pleural wall [36].

Lee et al. [36] proposed a novel template-matching technique

based on genetic algorithms and template matching for the de-

tection of nodules. They evaluated their method on 557 sectional

images with a detection rate of 72% and a false positive rate of

1.1 per sectional image.

Armato et al. [38] reported an extension of their earlier two- and three-dimensional automated lung CT analysis method [34]

to segment lung volume on a section-by-section basis. A rule-

based [42] approach combined with LDA was applied to reduce

the number of nodule candidates. They evaluated their method

on 43 CT scans with an AUC value of 90% and a nodule detec-

tion sensitivity of 70%. The false-positive detections per section

were 1.5.

Li et al. [45] reported on a CAD scheme to help radiolo-

gists improve the detection of pulmonary nodules in chest ra-

diographs by focusing on false positive reduction. They could

reduce the number of false positives to 44.3% with a small in-

crease in the number of true positives of 2.3%.

Katsuragawa et al. [2] described an automated method to dis-

tinguish benign and malignant solitary nodules. Fifty-ﬁve chest

radiographs were discriminated using LDA and ANN for fea-

ture combination and classiﬁcation. Comparisons with manual

grading showed that LDA had an AUC value of 88.6%, whereas

manual identiﬁcation resulted in an AUC value of 85.4%.
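The AUC values quoted throughout this review summarize how well a score separates diseased from non-diseased cases. As a minimal sketch, assuming only that each case has a numeric malignancy score (the function name and the toy scores are hypothetical and not taken from any cited study), the AUC can be computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, which is the Mann-Whitney formulation of the area under the ROC curve:

```python
def auc_mann_whitney(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores for illustration only.
malignant = [0.9, 0.8, 0.4]
benign = [0.7, 0.3, 0.2]
auc = auc_mann_whitney(malignant, benign)
```

The quadratic loop is adequate for the small case counts typical of the observer studies summarized here; a rank-based implementation would be preferable for large datasets.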

8 http://skynet.ohsu.edu/lungnodule09/.

9 http://anode09.isi.uu.nl/index.php.

Brown et al. [46] reported patient-specific models for detecting lung nodules for use in screening and follow-up surveillance. Baseline image data facilitated segmentation of subsequent images so that changes in size and/or shape of nodules

could be measured automatically. The system performed with

an 86% detection rate and an average of 11 false positives per

case on the baseline scans of 17 subjects. Follow-up scans per-

formed with a detection rate of 81%. Brown et al. [62] also de-

veloped an automated system for detecting lung micronodules

and applied it to data from 15 subjects with 77 lung nodules.

Preliminary results indicated that the automated system consid-

erably improved the radiologist’s performance in micronodule

detection but with a compensatory loss of speciﬁcity.

Gurcan et al. [39] developed a CAD system for lung nodule

detection on CT images in which, in the first stage, lung regions were identified by k-means clustering. After rule-based classification, LDA was used to further reduce the number of false

positives. They used 1454 CT slices from 34 patients with 63

lung nodules and obtained a sensitivity of 84% with 5.48 false

positives per slice.

Arimura et al. [43] reported a CAD system for nodule detec-

tion using a difference-image technique. They compared several

rule-based schemes for identifying nodules. A massive-training

ANN (MTANN) [44] reduced the false positives. The method

was evaluated on a conﬁrmed cancer database of 106 CT scans

with 109 cancer lesions from 73 patients. They reported a sen-

sitivity of 83% and 5.8 false positives per scan.

Suzuki et al. [82] developed a technique that used a multiple

MTANN (multi-MTANN) for false-positive reduction. The in-

vestigators found that use of the trained multi-MTANN elimi-

nated 68.3% of false-positive ﬁndings with a reduction of one

true-positive result. The false-positive rate of the original CAD

scheme was improved from 4.5 to 1.4 false positives per image,

at an overall sensitivity of 81.3%, suggesting that this technique

reduced the false-positive rate of the CAD scheme for nodule

detection on chest radiographs while maintaining a high level

of sensitivity.

Shiraishi et al. [87] investigated the effect of a CAD scheme

on radiologist performance in the detection of lung cancers

on chest radiographs. They combined two independent CAD

schemes for the detection and classiﬁcation of nodules into one

new CAD scheme by use of a database of 150 chest images.

Performance of the CAD scheme indicated that sensitivity in

detecting lung nodules was 80.6%, with 1.2 false-positive re-

sults per image, and sensitivity and speciﬁcity for classiﬁcation

of nodules by use of the same database for training and testing

the CAD scheme were 87.7% and 66.7%, respectively. The

AUC value for detection of lung cancers improved significantly from 72.4% without CAD to 77.8% with CAD. Shiraishi et al. [100] also developed a CAD system for detection

of nodules in the lateral views of chest radiographs in order to

improve the overall performance in combination with a CAD

scheme for posterior–anterior (PA) views.

Murphy et al. [112] presented a large-scale evaluation study

of automatic nodule detection in chest CT using local image fea-

tures (shape index and curvedness) and two successive iterations

of KNN classiﬁcation for false-positive reduction. On 813 ran-

domly selected scans, a sensitivity of 80% was achieved with an

average of 4.2 false positives/scan. The same group participated

in the ANODE09 benchmark and achieved top performance

among six different CAD systems. Most of the work reported

for nodule detection and classiﬁcation uses binary two-class de-

cisions to discriminate nodules.
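Most of the detection studies above report an operating point as a pair: nodule sensitivity and the average number of false positives per scan (or per image or section). As a minimal sketch of that bookkeeping, assuming per-scan counts are already available (the `ScanResult` record, its field names, and the numbers are hypothetical, not drawn from any cited system):

```python
from dataclasses import dataclass


@dataclass
class ScanResult:
    true_nodules: int      # nodules annotated in the reference standard
    detected_true: int     # annotated nodules that the CAD scheme marked
    false_positives: int   # CAD marks that match no annotated nodule


def detection_summary(scans):
    """Return (sensitivity, false positives per scan) pooled over all scans."""
    total_nodules = sum(s.true_nodules for s in scans)
    if not scans or total_nodules == 0:
        raise ValueError("need at least one scan and one annotated nodule")
    total_hits = sum(s.detected_true for s in scans)
    total_fp = sum(s.false_positives for s in scans)
    return total_hits / total_nodules, total_fp / len(scans)


# Hypothetical two-scan example.
scans = [ScanResult(3, 2, 5), ScanResult(2, 2, 3)]
sens, fp_per_scan = detection_summary(scans)
```

Pooling nodules over all scans, as done here, is one convention; averaging per-scan sensitivities is another and can give different numbers, so comparisons between studies should check which was used.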

C. Nodule Characterization by Malignancy Potential

The characterization of nodules by their malignancy potential

involves the analysis of nodule candidates into different nodule

type categories such as subtlety, texture, margin, sphericity,

calciﬁcation, internal structure, lobulation, spiculation, and

malignancy. A major application of CAD for lung CT is the

classiﬁcation of nodules by likelihood of malignancy using

automated feature analysis algorithms [47], [96]. Here, the common approach has been to calculate many features by which to measure nodules and to attempt to find correlations between particular features (e.g., size, shape, attenuation) and histologically confirmed cancers. Promising results have been

demonstrated using classiﬁers based on classical nodule texture

features [48]. More recently, fractal analysis of lung-nodule

interfaces [49] and LDA of multiple features [50] have shown

promise. Other important efforts in distinguishing benign and

malignant nodules are measurement of size change over time

[35], [46] and quantiﬁcation of nodule uptake of intravenously

administered contrast enhancement [51]. The solitary pul-

monary nodule is a commonly encountered ﬁnding that might

represent lung cancer. Morphological characteristics including

lesion size, contour, edge, calciﬁcation, nodule density, and

contrast enhancement can help differentiate malignant from

benign nodules. Temporal change in lung nodule size raises

concern for malignancy, while size stability is traditionally

considered an indicator of benignity [52].

Yankelevitz et al. [55] sought to determine the accuracy of

LDCT volumetric measurements of small pulmonary nodules to

assess growth and malignancy via three-dimensional image ex-

traction and isotropic resampling. The synthetic nodule studies

revealed that volume could be measured accurately to within

3%.

Ko et al. [35] developed a CAD system that automatically

identiﬁed nodules from chest CT, quantiﬁed nodule diam-

eter, and estimated temporal change in size. High correlation

between the algorithm and thoracic radiologists on change

in nodule size was achieved (Spearman rank correlation co-

efﬁcient ). The automated nodule detection system

identiﬁed 86% of 370 nodules in 16 studies from eight patients

with known nodules.

Li et al. [95] evaluated a system to investigate whether a

CAD scheme can assist radiologists in distinguishing small

benign from small malignant nodules on LDCT data. The

dataset used consisted of 28 primary lung cancers (6–20 mm)

and 28 benign nodules. Cancer cases included nodules with

pure ground-glass opacity, mixed ground-glass opacity, and

solid opacity. The AUC of the CAD scheme alone was 83.1%

for distinguishing benign from malignant nodules. The average

AUC value for radiologists was improved with the aid of the

CAD scheme from 78.5% to 85.3% . The radi-

ologists’ diagnostic performance with the CAD scheme was

more accurate than that of the CAD scheme alone

and that of radiologists alone. Li et al. [93] also described

the current status of the development and evaluation of CAD

schemes for the detection and characterization of lung nodules


in thin-section CT. They also reviewed a number of observer performance studies that attempted to assess the potential clinical usefulness of CAD schemes for nodule detection and characterization in thin-section CT.

Petkovska et al. [94] studied whether conventional nodule

densitometry or contrast enhancement maps of indeterminate

lung nodules can distinguish benign from malignant nodules.

Conventional nodule densitometry was performed to obtain

the maximum difference in mean enhancement values for each

nodule from a circular ROI. Treating higher enhancement values as an indicator of malignancy yielded an AUC value of 76%. The visually scored magnitude of enhancement

was found to be less effective in distinguishing malignant from

benign lesions, with an AUC value of 62%. The visually scored

pattern of enhancement was found to be more effective with an

average AUC value of 79%.

Zhen et al. [114] proposed a new tumor growth measure for

pulmonary nodules, which could account for tumor deforma-

tion using nonrigid registration combined with nodule detection

and segmentation. They proposed an adaptive doubling time

measure and reported comparative results to the standard

doubling time growth-rate measure. Results were based on

two successive scans for ten benign and nine malignant nodule

datasets.
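For reference, the standard doubling-time measure is commonly computed from two volume measurements under an exponential-growth assumption, DT = Δt · ln 2 / ln(V2 / V1). The sketch below illustrates only this conventional formula; it is not the adaptive measure of [114], and the function name and example volumes are hypothetical:

```python
import math


def volume_doubling_time(v1, v2, interval_days):
    """Doubling time in days, assuming exponential growth between two scans.

    v1, v2: nodule volumes at the first and second scan (same units).
    interval_days: time between the two scans.
    Returns math.inf when the volume is unchanged; the result is negative
    when the nodule has shrunk.
    """
    if v1 <= 0 or v2 <= 0 or interval_days <= 0:
        raise ValueError("volumes and interval must be positive")
    growth = math.log(v2 / v1)
    if growth == 0.0:
        return math.inf
    return interval_days * math.log(2.0) / growth


# Hypothetical example: a nodule growing from 100 to 400 mm^3 in 90 days
# has doubled twice, so the doubling time is about 45 days.
dt = volume_doubling_time(100.0, 400.0, 90.0)
```

Because the formula depends on the ratio of two volumes, small segmentation errors on small nodules translate into large changes in the estimate, which is one reason volumetric accuracy matters for growth assessment.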

Dreiseitle [109] proposed a learning scheme for training mul-

ticlass classiﬁers by maximizing the volume under the ROC sur-

face, which could beneﬁt multiclass lung nodule characteriza-

tion tasks. Rather than having a binary decision on the malig-

nancy of a test case, a multiclass grading on the malignancy

decision would further provide additional measures that could

improve lung cancer screening.

The mentioned CAD schemes were not applied directly to

the domain of lung cancer screening, yet they provide theoret-

ical justiﬁcation and potential to be effective tools to improve

CT lung cancer screening. One has to examine carefully the re-

ported results and their applicability to lung cancer screening.

D. Machine Learning for Lung Cancer Screening

In recent years, the machine-learning community developed sophisticated tools and learning paradigms to address the issue of CAD schemes that show clinically relevant performance measures. Feature selection methods and temporal learning schemes are being employed successfully for the task of nodule characterization and growth measurement. Recently, Vapnik et al. [123] proposed a new framework called learning with hidden information, which would enable the integration of hidden information and could further improve CAD technology for lung cancer diagnosis.

Barreno et al. [106] described a theoretical analysis of how to combine classifiers with an optimal decision rule and optimal ROC curve. The combination of different CAD schemes also found interest in comparative CAD studies such as the ANODE09 study. The issue of unbalanced class distribution in medical diagnostic applications and different class importance needs to be addressed when developing CAD schemes for effective lung cancer screening. Most standard classification methods, however, are designed to maximize the overall accuracy and cannot explicitly assign different costs to different classes. Liu and Tan [107] proposed a method to directly maximize the weighted specificity and sensitivity of the ROC curve. They reported excellent generalization properties with the ability to assign different error costs to different classes to account for the difference in the importance of the class distribution. Mozer et al. [108] took an approach of constrained optimization to obtain a reduced solution space that directly models the problem domain and has relevant performance characteristics on a specific target region of the ROC curve. They showed significant performance improvements in the domain of telecommunications that could also benefit the application domain of lung cancer screening.

IV. CLINICALLY RELEVANT SENSITIVITY AND SPECIFICITY

We emphasize that for cancer screening and performance assessment of available CAD schemes, the focus of attention should be put on clinically relevant performance measures. In the context of cancer screening, the sensitivity-specificity calculus of CAD systems is an essential factor when it comes to treatment cost and patient outcome. One should consider that validation of false positives and ground-truth generation is still a very timid approach. Problems of inter- and intraobserver variability (see Fig. 2) and manual time-intensive grading call for minimally invasive methodologies to obtain ground truth information that would further alter the sensitivity-specificity calculus of CAD and its potential acceptance. Minimally invasive lung cancer surgery such as thoracoscopic lobectomy or video-assisted thoracic surgery (VATS), or even noninvasive surgeries such as the CyberKnife method [121], are advances in this direction.

To provide clinically relevant definitions for sensitivity and specificity [103]–[105], we follow the definitions in [100] and [101] and point out how these diagnostic performance measures should be interpreted for cancer screening. Sensitivity and specificity reflect the numbers of false negatives and false positives, respectively, and are useful in evaluating the effectiveness of screening methods. Sensitivity is also known as the true-positive rate (TPR); specificity equals one minus the false-positive rate (FPR). The terms "positive" and "negative" are used to refer to the presence or absence of lung cancer.

Sensitivity and specificity are defined as follows. The sensitivity of a screening test is its ability to detect those individuals with cancer. It is computed by taking the number of true positives (TPs) and dividing it by the total number of cancer cases (TP + FN), where FN denotes the number of false negatives. The specificity of a test is its ability to identify those individuals who actually do not have cancer. It is computed by dividing the number of true negatives (TNs) by the sum of the TN and false-positive (FP) cases, TN + FP. From these probabilities, one can compute confidence intervals [102] and ROC curves that summarize diagnostic performance for comparative assessment. However, the majority of published research does not provide confidence intervals even though they could be obtained from the algorithms.
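As a worked illustration of the sensitivity and specificity measures used throughout this review, the sketch below computes both from the four confusion-matrix counts and attaches a confidence interval to each. The Wilson score interval is used here as one common choice for a binomial proportion; it is an assumption of this sketch, not necessarily the method of [102], and the function names and counts are hypothetical:

```python
import math


def sensitivity(tp, fn):
    """True-positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True-negative rate: TN / (TN + FP); equals 1 - FPR."""
    return tn / (tn + fp)


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1.0 + z ** 2 / total
    center = (p + z ** 2 / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z ** 2 / (4.0 * total ** 2)) / denom
    return center - half, center + half


# Hypothetical screening cohort: 100 cancers, 1000 cancer-free subjects.
tp, fn, tn, fp = 80, 20, 900, 100
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)
```

In this toy cohort the interval around sensitivity is much wider than the one around specificity because far fewer cancer cases than cancer-free subjects contribute to it, which is the typical situation in screening populations.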

A. Detection Theory and ROC

The ROC curve was ﬁrst developed by electrical engineers

and radar engineers during World War II for detecting enemy

objects in battleﬁelds and was soon introduced in psychology to

account for perceptual detection of signals [17], [18]. The use

of ROC in medicine to assess diagnostic test performance was
