RadioGraphics
(Radiographics. 2002;22:1271-1289.)
© RSNA, 2002


infoRAD

Principal Component Analysis for Content-based Image Retrieval¹

Usha Sinha, PhD and Hooshang Kangarloo, MD

¹ From the Department of Radiological Sciences, University of California, Los Angeles, School of Medicine, 924 Westwood Blvd, Suite 420, Los Angeles, CA 90024. Presented as an infoRAD exhibit at the 2000 RSNA scientific assembly. Received March 27, 2001; revision requested May 7; final revision received March 12, 2002; accepted April 17. Address correspondence to U.S. (e-mail: usinha@itmedicine.net).


    Abstract
 
Most picture archiving and communication systems provide image search capabilities that support queries based on patient demographics and study descriptions. In a preliminary study, principal component analysis was used to represent and retrieve images on the basis of content. Principal component analysis reduces the dimensionality of the search to a basis set of prototype images that best describes the images. Each image is described by its projection on the basis set; a match to a query image is determined by comparing its projection vector on the basis set with that of the images in the database. The training image database consisted of 100 axial brain images from a three-dimensional T1-weighted magnetic resonance imaging study. The algorithm was evaluated by using 96 axial images from eight patients. Image retrieval was considered accurate if the automated algorithm returned the match section to within 3 mm of an expert-selected section; the retrieval accuracy was 83% when the images were preprocessed for uniformity in intensity and geometry. Principal component analysis can be applied to content-based retrieval of medical images. The algorithm is designed to be part of an automated image selection module that filters relevant images from an imaging study.


Index Terms: Computers • Images, storage and retrieval • Picture archiving and communication system (PACS)


    Introduction
 
Radiologic studies such as magnetic resonance (MR) imaging and computed tomographic (CT) examinations yield a large number of images. In most cases, only a small subset of these images is associated with the relevant findings mentioned in the associated radiology report. There are several reasons to identify the relevant images of a given study, including intelligent prefetching, effective communication with the referring physician, and clinical compression. It is desirable to create an automated technique that can assign each image to a certain level or class, since manual classification methods are both tedious and time-consuming. Once this assignment or matching is performed, images of the given patient study can be considered labeled. In the technique proposed herein for selecting relevant images of a patient imaging study, two methods are combined: natural language processing (NLP) of free-text radiology reports and automated image classification. However, the focus of this article is only on the automated technique for image classification and not on the entire process. The Methods section does include a brief overview of the entire process to provide the context for the image selection algorithm discussed in this article.

Content-based image retrieval is an active area of research, as reflected in the large number of systems that are currently available as either commercial or prototype implementations, including the QBIC (1), VIRAGE (2), Photobook (3), Blobworld (4), and Netra (5) systems. The underlying technique of content-based image retrieval algorithms is to extract a signature for every image based on its pixel values and then provide a rule for comparing signatures. The signature serves as an image representation, and the components of the signature are called features. Color, texture, and shape have been incorporated into many of the content-based image retrieval systems as features for image representation (1–5). Of these three features, color and texture can be extracted on a pixel-by-pixel basis, whereas characterization by the shape feature requires extraction of a region or object, in itself a complicated problem. After extraction of signatures, the next step is to determine a comparison rule, including a querying scheme and the definition of a similarity measure between images. Several querying schemes are available: region-based searching, where the retrieval is based on a particular region in the image, or searching by specifying the color histogram or object shape of the images/objects to be retrieved (4,5). Although this type of indexing allows efficient image representation, significant semantic information is lost. For example, a color indexing scheme will make the retrieval system "think" that two objects with the same color histograms are similar when in reality they may represent images of two completely different objects (eg, red apples and a red car with the same color histograms). An alternate approach that achieves semantics-preserving image compression is the one adopted in the Photobook system (3,6). In this system, which is for face recognition, principal component analysis is used to find uncorrelated prototype images that best describe a set of images. This article discusses an application of the principal component analysis technique to classification of MR images.

In addition to the methods listed in the preceding paragraph, which have been developed for a broad range of images, there have also been several systems specifically tailored for content-based retrieval of medical images. The I2C information system (7) allows indexing and retrieval of medical images by visual content. This system integrates tools for defining image analysis routines based on specific image classes; some of the algorithms are interactive, while others are automated. The system is integrated into a mini–picture archiving and communication system (PACS) and is also deployed on the World Wide Web. A content-based image retrieval algorithm that combines semantic indexing using the Unified Medical Language System (UMLS) metathesaurus and knowledge-based image analysis has been reported (8). The CANDID system (9), which is based on extension of the N-gram method for searching free-text documents, has been applied to image indexing. In this system, a global signature is computed, which represents the content of the image in an abstract sense. The system has been evaluated for retrieval of chest CT images, and the image signature is based on texture features. The ASSERT system (10) requires a physician to delineate the region of interest of the pathologic condition. The image is indexed by features calculated for the region of interest. This system is currently being evaluated clinically on high-resolution lung CT images. A content-based image retrieval system for neurologic images that combines text information, three-dimensional (3D) alignment, and feature extraction to automatically select an image that best matches a query image has been evaluated with images containing pathologic conditions (stroke, bleeding, and tumor) (11). Users are required to enter some text data related to the query image to narrow the image search space. Another system, the Image Reference Databases (IRDB) system (12), incorporates medical image indexing based on principal component analysis and was evaluated on retrieval of brain MR images. A recent work discusses the extension of a general content-based image retrieval algorithm for multiresolution region-based searching of images of pathologic conditions (13). A content-based retrieval system integrated into a picture archiving and communication system environment that provides an infrastructure for hosting different image-matching algorithms has also been described recently (14).

The method described in this article is similar to the method described in reference 12. We have extended the work reported in reference 12 in several ways: (a) quantitative estimation of the sensitivity of the technique to image appearance by simulating image variations in geometry, scale, and contrast; (b) exploring methods to decrease sensitivity to image contrast and brightness by use of a log index; (c) integration of a 3D alignment algorithm into the preprocessing to permit accurate image standardization; (d) development of an accurate validation scheme by using a quantitative index to evaluate the retrieval efficiency of the algorithm; and (e) integration with a novel application of the algorithm to classify an image at an anatomic level rather than to retrieve the closest match image.


    Methods
 
Principal Component Analysis for Image Classification
The idea behind the principal component analysis method is briefly outlined herein: An image can be viewed as a vector by concatenating the rows of the image one after another. If the image has square dimensions (as in MR images) of L x L pixels, then the size of the vector is L². For typical image dimensions of 256 x 256, the vector length (dimensionality) is 65,536. Each new image has a different vector, and a collection of images will occupy a certain region in an extremely high dimensional space. The task of comparing images in this 65,536-dimensional space is a formidable one. The brain image vectors are large because they belong to a vector space that is not optimal for image description. However, knowledge of brain anatomy provides us with similarities between these images: an elliptical shape (for axial orientation) and essentially three tissue types: gray matter, white matter, and cerebrospinal fluid. It is because of the similarities that we can deduce that brain image vectors will be located in a small cluster of the entire image space. The idea behind principal component analysis is to find a more appropriate representation for the image vectors so that the dimensionality of the space used to represent them can be reduced.

The mathematical steps used to determine the principal components of a training set of brain images are outlined in this paragraph (6): A set of N training images is represented as vectors of length L x L, where L is the number of pixels in the x (or y) direction. The average image m of the N training set images is given by the following formula:

$$m = \frac{1}{N}\sum_{i=1}^{N} x_i,$$

where x_i is the L x L dimension vector corresponding to the ith image in the training set. An N x N matrix O is formed, whose elements O_ij are given by the inner product of image vectors (x_i - m) and (x_j - m). Let ν_n and λ_n be the eigenvectors and the eigenvalues of O, respectively; then, there will be N - 1 eigenvectors of length N. These eigenvectors determine linear combinations of the N training set images to form the basis set of images, u_i, that best describe the variations in the training set images:

$$u_i = \sum_{k=1}^{N} \nu_{ik}\,(x_k - m),$$

for i = 1, 2, ... N, where ν_ik is the kth component of the ith eigenvector. These basis set images are called eigenimages. The eigenimages associated with the largest eigenvalues capture most of the information in the training set images. Each image in the set can then be approximated with a linear combination of these eigenimages:

$$x_k - m \approx \sum_{p} w_p\,u_p.$$

The coefficients w_p are the feature description for the image x_k, each of which is assigned to a different class k. A new query image, q_i, is projected similarly onto the eigenspace and the coefficients w_q are computed. The class that best describes the query image is determined by a similarity measure defined in terms of the euclidean distance of the coefficients w_q and w_p (where p = 1, 2 ... k for k classes in the training set). The training set image whose coefficients are closest (in the euclidean sense) to those of the query image is selected as the match image. If the minimum euclidean distance exceeds a preset threshold, the query image is assigned to a new class.
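
The steps in the preceding paragraph can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function names `train_eigenimages` and `match`, the unit-length normalization of the eigenimages, and the `n_keep` and `threshold` parameters are introduced here for illustration only.

```python
import numpy as np

def train_eigenimages(images, n_keep):
    """images: (N, P) array, one flattened image per row (P = L * L).
    n_keep: number of eigenimages to retain; assumed to be at most N - 1."""
    X = np.asarray(images, dtype=float)
    m = X.mean(axis=0)                       # average image m
    D = X - m                                # rows are (x_i - m)
    O = D @ D.T                              # N x N matrix of inner products
    evals, evecs = np.linalg.eigh(O)         # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_keep] # indices of the largest eigenvalues
    U = evecs[:, order].T @ D                # u_i = sum_k nu_ik (x_k - m)
    U /= np.linalg.norm(U, axis=1, keepdims=True)  # assumption: unit-length eigenimages
    W = D @ U.T                              # coefficients w_p, one row per training image
    return m, U, W

def match(query, m, U, W, threshold=np.inf):
    """Project a flattened query image and return (index of closest training
    image or None if the distance exceeds the threshold, euclidean distance)."""
    w_q = (np.asarray(query, dtype=float) - m) @ U.T
    dist = np.linalg.norm(W - w_q, axis=1)
    best = int(np.argmin(dist))
    return (best if dist[best] <= threshold else None), float(dist[best])
```

Working with the N x N matrix O rather than the L² x L² pixel covariance keeps the eigenvalue problem small when the number of training images is much smaller than the number of pixels. A return value of `None` corresponds to the case in which the query image would be assigned to a new class.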

The method outlined in the preceding paragraph is summarized as follows: Eigenimages are computed for a training set of images. The eigenimages are ordered, each one accounting for a different amount of the variation among the images. These eigenimages can be thought of as a set of features that together characterize the variation among the images. The space spanned by the eigenimages is called eigenspace. Each image location contributes more or less to each eigenimage, so that the eigenimage appears like a ghostly brain, which we term the eigenbrain (in analogy to the term eigenface used in face recognition applications [3,6]). Each eigenbrain deviates from uniform gray where some feature differs among the set of training images; the eigenbrains are a map of the variations between the images. Images from the training set as well as the query images are then represented by the coefficients of the projections onto each eigenimage. It has been shown, in several studies, that for face recognition, each image in the training set can be approximated by using only the "best" eigenimages: those that have the largest eigenvalues and that therefore account for the most variance within the set of face images (6).

Overview
The eigenimage algorithm described in the previous section is integrated with an automated image selection module; a brief overview is provided in this section. This module combines two techniques: natural language processing to structure the findings of the radiology report and image classification based on the eigenimage-matching algorithm. Natural language processing structures the free-text radiology report and identifies the relevant anatomic structures. Details of this algorithm, developed and evaluated by our group, are given elsewhere (15). Patient images are classified by using the image-matching algorithm, and images containing the structures identified by the natural language processing algorithm are selectively filtered from the imaging study. Although the focus of this article is on image classification, a brief description of the overall architecture of the relevant image filtering is included to define the scope and requirements of the image-matching algorithm (Fig 1).



 
Figure 1.  Overview of the relevant image filtering process.

 
Training Set Images. A set of training images from different subjects is preprocessed to ensure uniform orientation, image geometry, and intensity scale. An expert manually classifies each section; the classification procedure sorts each image by the anatomic level. Each image in a serial study is nominally assigned to a separate class. Structures visible at each level (class) are enumerated. Note that structures are merely listed; no manual segmentation, a far more tedious task, is performed. Anatomic structures are represented in a standard nomenclature (eg, SNOMED RT [16]). A set of prototype images (the eigenimages) is created from the labeled training set, and each image is represented as a set of weights or coefficients on the eigenimages. Separate training sets (and associated eigenimages) are created for the three orientations (axial, coronal, and sagittal) and the two contrast types (T1-weighted and T2-weighted) routinely used in clinical brain MR imaging acquisitions.

The manual classification of the training set images by the expert is detailed in this paragraph: The radiologist assigns images (from the different imaging studies) at approximately the same anatomic location to the same class. For example, in the axial orientation, all images at the level of the largest extent of the lateral ventricle are assigned to the same class. Note that image variation due to differences in alignment and/or intensity is reduced prior to classification by preprocessing steps. The idea is to collect images that reflect variability from normal physiologic variations and/or pathologic processes. This process is analogous to creating a face class of the same person under different conditions: illumination, facial expression, and face orientation (7). The role of the radiologist is to assign images to one class if they contain approximately the same anatomic regions. The number of images required to represent a class is ultimately determined by the variance of images at any given anatomic level. For instance, the ventricles may have a larger variance among normal subjects than a structure such as the hippocampus. It follows that images from a larger number of subjects will be required to capture the variance in the ventricles than in the hippocampus. Images containing pathologic conditions are grouped under the same class as "normal" images as long as both are at the same anatomic level.

Patient Images. The orientation and contrast of the patient images (not belonging to the training set) are determined from the Digital Imaging and Communications in Medicine (DICOM) headers, and the appropriate training set is chosen. To reduce image variations arising from changes in orientation and intensity, the patient images are processed to match the training set images in orientation and intensity. The projections of each patient image onto the training set eigenimages are calculated, and assignment of an image to a class is based on the similarity measure (discussed in the section entitled "Principal Component Analysis for Image Classification"). Anatomic structures that are visible in each class are listed during the classification process. It follows that once the images are assigned to a class, the mapping of structures to a particular image is straightforward. In parallel, the radiology report associated with a given study is structured by using natural language processing techniques developed by members of our group (15). The natural language processing algorithm extracts and structures the report findings and associated attributes, including anatomic location (eg, finding = mass, location = temporal lobe). The final step in relevant image selection is to identify, from among the labeled patient study images, the sections at the anatomic location of the radiology finding, that is, the images that contain the structures identified with natural language processing.

An example of relevant image selection for a specific patient image study is outlined in this paragraph: Images of the patient study are processed (in the Image Processor module [Fig 1]) to obtain uniform alignment and intensity to match the training set images. The Study Identifier module extracts information from the DICOM header and identifies images as obtained with a 3D sequence in the axial orientation with T1-weighted contrast (120 images). The Training Set Selector module selects the eigenimage set generated from T1-weighted axial training images. The patient images are classified by the image-matching algorithm (Classifier module). The natural language processing extracts findings and associated anatomic locations from the radiology report (eg, the sylvian fissure). The expert-created mapping of structures to image levels now permits identification of the images containing this structure (four to five levels at a section thickness resolution of 1 mm). The fairly large image set of 120 images is now reduced to five images that contain the relevant finding (assuming that "sylvian fissure" is the only structure mentioned in the report). The potential for large clinical compression ratios can be readily appreciated from this simple example.
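
The final filtering step in this example amounts to a lookup from anatomic structures to image classes. A minimal sketch follows; the function name `select_relevant_images` and the dictionary layout are hypothetical and are not taken from the modules in Figure 1.

```python
def select_relevant_images(image_classes, class_structures, findings):
    """image_classes: dict mapping image id -> class (anatomic level) from the classifier.
    class_structures: dict mapping class -> set of structure names listed by the expert.
    findings: structure names extracted from the report (hypothetical NLP output).
    Returns the ids of images whose class contains at least one reported structure."""
    wanted = {f.lower() for f in findings}
    selected = []
    for image_id, cls in sorted(image_classes.items()):
        visible = {s.lower() for s in class_structures.get(cls, ())}
        if wanted & visible:          # any reported structure visible at this level
            selected.append(image_id)
    return selected
```

With 120 classified images of which five levels list the sylvian fissure, the function would return those five image identifiers, which corresponds to a 24:1 reduction of the study.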

Retrieval Accuracy. Previous efforts in application of this algorithm to the face recognition problem have highlighted the need for "well-framed" images: similar illumination, orientation, size, and location of each face image. This type of image standardization requires uniform acquisition strategies and/or image preprocessing steps (6,12). We first determine the sensitivity of the retrieval algorithm to image acquisition parameters and then devise methods to reduce sensitivity by modification of the retrieval algorithm and/or image preprocessing. For image preprocessing, the requirements were that the steps be completely automated and, if possible, computationally nonintensive. It is important to determine the sensitivity of the algorithm to image geometry and intensity/contrast, since this determines both the extent of postprocessing and the number and type of images in a training set.

There are many sources of variation in medical images: differences in acquisition strategies and patient positioning can cause images to differ in intensity, orientation, and scale. Beyond these sources of variation, there are inevitable differences arising from the biologic variability in brains. We investigated the sensitivity to differences in orientation and intensity by synthesizing images with a known amount of change (in rotation, geometric scale, or intensity scale). Further, we also investigated several preprocessing methods to standardize image presentation to increase retrieval accuracy.

Effect of intensity scale and geometry on image matching: The sensitivity of the matching algorithm to scale (geometry and intensity) and orientation was determined by deliberately altering images from the training set by known changes in scale and orientation. These synthetic data simulate differences in images between patients arising from positioning variability and imaging unit variability. Each image was queried against the training data to retrieve the closest matching image.

Optimization: The following sections focus on methods to improve the retrieval accuracy of the content-based image retrieval algorithm for MR images. The aim is to minimize the amount of preprocessing required as well as to completely automate any required steps. We also tested the range of applicability (ie, to identify the type of images that can be classified given a certain set of training images). For example, given a training set of images generated from normal brain data acquired with a 3D axial T1-weighted sequence, would it be possible to classify a brain image containing a pathologic condition or a normal brain image acquired with a different contrast (T2-weighted contrast)?

Image intensity optimization: The intensity scale in MR images differs considerably between studies even for identical acquisition protocols and imaging units (in fact, even within a study, there are both in-plane and out-of-plane intensity variations). This type of intensity variation does not affect image interpretation, since contrast is the more important parameter for diagnosis. Unlike the human visual process, however, the image-matching algorithm is sensitive to absolute values of image intensity, since the matrix O is formed from vectors of image difference values (difference from the mean image of the training data set). We investigated two methods to decrease the sensitivity to intensity scaling: (a) contrast standardization and (b) an intensity scaling–insensitive index.

Standardization of contrast: A typical image section and the associated histogram are shown in Figure 2a. Most pixels are clustered around the lower intensities with loss of image detail. Each image was scaled to range between 0 and 255 so as to have a common dynamic range for the images from different subjects. Further, histogram equalization was performed to obtain similar contrast enhancement in the image sets. Histogram equalization is a mathematical process that increases the contrast in the image by spreading the pixel distribution equally among all the available intensities. This results in a flatter histogram for each image (compare the histograms of Figs 2a and 2b). The histogram-equalized image (Fig 2b) clearly shows greater image detail, especially in the dark portions of the images.
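
The two contrast standardization steps described above can be sketched as follows. This is one common formulation of histogram equalization, given as an assumption; the exact variant used in the study is not specified, and the function names are illustrative.

```python
import numpy as np

def rescale_to_8bit(image):
    """Linearly map the image intensities onto the common range 0..255."""
    img = np.asarray(image, dtype=float)
    lo, hi = img.min(), img.max()
    if hi == lo:                                  # constant image: nothing to stretch
        return np.zeros(img.shape, dtype=np.uint8)
    return np.round(255.0 * (img - lo) / (hi - lo)).astype(np.uint8)

def equalize_histogram(img8):
    """Spread the pixel distribution over the available intensities (uint8 input)."""
    hist = np.bincount(img8.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(float)           # cumulative pixel counts
    cdf_min = cdf[np.nonzero(hist)[0][0]]         # count at the lowest occupied intensity
    denom = cdf[-1] - cdf_min
    if denom == 0:                                # only one intensity present
        return img8.copy()
    lut = np.round(255.0 * (cdf - cdf_min) / denom)
    lut = np.clip(lut, 0, 255).astype(np.uint8)   # lookup table: old -> new intensity
    return lut[img8]
```

Applying `equalize_histogram(rescale_to_8bit(section))` to each training and query section gives every image the same dynamic range and a flatter histogram before the eigenimage projection.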



Figure 2.  (a) Typical MR image without any changes in the intensity scale. The histogram shows the pixel intensities clustered at the lower end of the gray-scale intensities. (b) Same MR image after histogram equalization. The histogram clearly shows spread of the pixel intensity values as a result of histogram equalization.

 
Intensity scale–insensitive measure for the eigenimage-matching algorithm: We investigated use of the ratio of image intensities rather than the difference in image intensities to form the covariance matrix. The ratio of image intensities is less sensitive to intensity scale differences between training and query image sets. To reduce the bias introduced by noise pixels, the logarithms of the ratios were obtained, with the O matrix now given by the inner product of the following image vectors (ratios taken pixel by pixel with respect to the average image m):

$$\log\left(\frac{x_i}{m}\right)$$

and

$$\log\left(\frac{x_j}{m}\right).$$
Further, to eliminate the salt-and-pepper appearance of the background, all pixels that were below a threshold value in the average image were set to zero in the log-ratio image. In the rest of the article, we refer to this method as the log-ratio index and to the original method as the subtraction index.
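
A minimal sketch of the log-ratio index follows. It assumes that the ratio is taken pixel by pixel against the average image of the training set; the small constant `eps`, which guards against division by zero, and the function names are assumptions introduced here.

```python
import numpy as np

def log_ratio(images, mean_image, background_threshold, eps=1e-6):
    """Return log(x / m) for each flattened image x (one per row), with pixels
    whose average-image value is below the threshold set to zero."""
    X = np.atleast_2d(np.asarray(images, dtype=float))
    m = np.asarray(mean_image, dtype=float)
    R = np.log((X + eps) / (m + eps))        # eps is an assumed guard, not from the text
    R[:, m < background_threshold] = 0.0     # suppress background (noise) pixels
    return R

def log_ratio_matrix(R):
    """N x N matrix O of inner products between log-ratio vectors."""
    return R @ R.T
```

Under this formulation, multiplying a query image by a constant factor c adds the same offset log(c) to every foreground pixel of its log-ratio vector, whereas under the subtraction index the change grows with the pixel intensity itself.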

Geometric orientation and scale: The image-matching algorithm is also sensitive to the orientation (both in-plane and out-of-plane) and scale of the images. Three preprocessing methods were investigated: (a) Image translation so that the center of the image coincided with the center of the field of view and linear scaling so that the training set and query images were of similar size. (b) Three-dimensional registration to align the training set and query images to a common reference (17). (c) Spatial filtering of the images to emphasize the central section of the image.

The first method was computationally nonintensive and easy to implement. Centering the image in the field of view shifted the center of mass of the object in each two-dimensional image to the center of the image field of view (eg, in the images of size 256 x 256, the object center of mass was shifted to [128,128]). Scaling was performed so that the size of the subject’s head did not bias the image retrieval algorithm. The maximum extent of the head in the anteroposterior direction from the images of an image series was determined, and this was scaled to a constant value. The other images in that series were then scaled to this value to produce a magnified or minified version of the original volume. Shifting the image to the center and scaling do not bring the images into a standard alignment, but the method was adopted for ease of implementation and speed of processing.
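
The centering step and the measurement that drives the scaling step can be sketched as follows. The sketch assumes that the anteroposterior direction runs along the row axis and that the background is close to zero, so that the wraparound of `np.roll` is harmless; the resampling that applies the scale factor is omitted. All function names are illustrative.

```python
import numpy as np

def center_of_mass(img):
    """Intensity-weighted center of mass (row, column) of a two-dimensional image."""
    img = np.asarray(img, dtype=float)
    rows, cols = np.indices(img.shape)
    total = img.sum()
    return (rows * img).sum() / total, (cols * img).sum() / total

def center_image(img):
    """Shift the image by whole pixels so its center of mass sits at the
    center of the field of view (eg, near [128,128] for a 256 x 256 image)."""
    img = np.asarray(img)
    r, c = center_of_mass(img)
    dr = int(round(img.shape[0] / 2 - r))
    dc = int(round(img.shape[1] / 2 - c))
    return np.roll(img, (dr, dc), axis=(0, 1))   # wraps around; assumes empty borders

def ap_extent(volume, threshold):
    """Maximum head extent in pixels along the row axis over all sections
    of a series; volume has shape (sections, rows, columns)."""
    occupied = (np.asarray(volume) > threshold).any(axis=2)
    spans = [np.flatnonzero(r) for r in occupied]
    return max(s[-1] - s[0] + 1 for s in spans if s.size)

def scale_factor(volume, threshold, target_extent):
    """Single magnification factor applied to every image in the series."""
    return target_extent / ap_extent(volume, threshold)
```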

We used the Automated Image Registration (AIR) program, version 3.0, of Woods et al (17) to bring all the image volumes into a common frame of alignment. This algorithm was chosen due to our requirement for minimal user intervention; it is based on matching of voxel intensities and has been tested for accuracy by using both intersubject and intrasubject registration. The output of the registration was a matrix that contained the translation, rotation, and scaling parameters to register to a reference standard image volume. Thus, 3D registration ensured that all the image volumes were aligned and scaled to a common reference image volume.

Spatial image filtering: An examination of the brain images reveals (for images centered in the field of view) that most of the information is in the central 60%, with the background filling the edges. Application of a two-dimensional gaussian-shaped filter centered in the field of view with an appropriate bandwidth serves to emphasize the relevant portions of the image. The two-dimensional gaussian-shaped filter is designed so that the image has the maximum intensity at the image center and is attenuated away from the center in both the x and y directions. The attenuation profiles in both directions are gaussian, with a standard deviation of 48 pixels. This filtering also served to reduce the influence of nonbrain structures, which are present at the object (brain) periphery. Another advantage of the gaussian filter is that selective attenuation of the image edges decreases the sensitivity of the matching algorithm to image orientation, since changes in patient orientation affect the central portions of the brain less than the edges (the central portions are closer to the axis of rotation). Training and query images were preprocessed with a gaussian filter in the x and y directions with a full width at half maximum (FWHM) of 48 pixels (image size = 256 pixels). A typical image and the corresponding gaussian filtered one are shown along with intensity profiles in Figure 3; the overlaid intensity profiles confirm the selective attenuation of the image edges by the filter.
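
The spatial weighting can be sketched as a separable gaussian window multiplied into each image. The text quotes 48 pixels both as a standard deviation and as a full width at half maximum; the sketch below takes the FWHM reading, which matches the caption of Figure 3, as an assumption. The function names are illustrative.

```python
import numpy as np

def gaussian_window(shape, fwhm):
    """Two-dimensional gaussian weight, equal to 1 at the center of the field of view."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))   # convert FWHM to standard deviation
    rows = np.arange(shape[0]) - (shape[0] - 1) / 2.0
    cols = np.arange(shape[1]) - (shape[1] - 1) / 2.0
    w_rows = np.exp(-0.5 * (rows / sigma) ** 2)         # attenuation profile along y
    w_cols = np.exp(-0.5 * (cols / sigma) ** 2)         # attenuation profile along x
    return np.outer(w_rows, w_cols)

def apply_window(img, fwhm=48.0):
    """Attenuate the image edges while leaving the central portion nearly unchanged."""
    img = np.asarray(img, dtype=float)
    return img * gaussian_window(img.shape, fwhm)
```

If the 48-pixel value is instead read as a standard deviation, the same window is obtained by passing `fwhm = 48 * 2.3548`, which gives a considerably wider central region.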



Figure 3.  (a) MR image with a horizontal profile bar to show the intensities across the section. (b) Same MR image after two-dimensional gaussian filtering (full width at half maximum = 48 pixels) and the corresponding horizontal profile bar. The intensities along the horizontal profile bars are shown above each image.

 
Evaluation. The evaluation was performed on the synthesized images, and a quantitative index was assigned as follows: The retrieved image was assigned an index of 0 when it exactly matched the "original" image (from which the query image was synthesized). A nonzero value was assigned to the retrieval index if the match image was not the original image. The actual value of the index (when the retrieved image was different from the original image) depended on the spatial separation of the retrieved image from the original image. The spatial separation is a known quantity, since the training set consists of images spaced 1 mm apart. A similar quantitative index was used to evaluate the accuracy of the retrieval algorithm by using acquired images to query the training set, the difference being that the "original" image used in the evaluation of the synthesized images is now replaced by the image selected manually by an expert as the closest-match image. The quantitative evaluation of the retrieval provides an index of the anatomic distance between the closest match as determined by an expert and that retrieved by the algorithm. For small values of the index, the images retrieved by the automated algorithm will still contain the structures of interest extracted by the natural language processing algorithm from the free-text radiology report.
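
The quantitative index can be sketched as the spatial separation between the retrieved section and the reference section. The function names are illustrative; the 1-mm spacing and the 3-mm tolerance are the values quoted in the text and the abstract, exposed here as parameters.

```python
def retrieval_index(retrieved_section, reference_section, spacing_mm=1.0):
    """Distance in millimeters between the retrieved section and the reference
    section (the original image, or the expert-selected closest match)."""
    return abs(retrieved_section - reference_section) * spacing_mm

def retrieval_accuracy(pairs, tolerance_mm=3.0, spacing_mm=1.0):
    """Fraction of (retrieved, reference) section pairs within the tolerance."""
    pairs = list(pairs)
    hits = sum(retrieval_index(r, e, spacing_mm) <= tolerance_mm for r, e in pairs)
    return hits / len(pairs)
```

An index of 0 corresponds to an exact match, and a retrieval is counted as accurate when the index does not exceed the tolerance.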

Training and Test Images. The training set consisted of 100 images derived from a T1-weighted imaging study with the following protocol: axial 3D spoiled gradient-echo sequence, repetition time (msec)/echo time (msec) = 36/9, flip angle = 30°, section thickness = 1.4 mm, field of view = 240 mm, matrix size = 256 x 256. The test set consisted of 96 images from two protocols: a T1-weighted protocol identical to that used for the training set (seven subjects) and a T2-weighted spin-echo sequence (one subject). The protocol for the T2-weighted sequence was as follows: axial two-dimensional spin-echo sequence, repetition time/echo time = 4,000/90, section thickness = 1.5 mm, field of view = 240 mm, matrix size = 256 x 256. The test set images thus included images with an entirely different contrast from that of the training set images.


    Results
 
A subset of the 100 images used in the training set is shown in Figure 4. The eigenimages with the top 16 eigenvalues are shown in Figure 5. The eigenimages in Figure 5 do not specifically correspond to any brain structures, such as the ventricle, frontal lobe, or orbits. The eigenimages only remotely resemble a brain and in this article are referred to as eigenbrains; they are a set of important features that describe the variations in the brain image set. Further, eigenimages with higher eigenvalues provide more information on the brain image variation than those with smaller eigenvalues. This is in contrast to the euclidean space representation, where all axes are of the same importance. The difference is visible in Figure 5, where the eigenimages in the lowest row, which have smaller eigenvalues, contain less information and are noisier than those in the top row, which have higher eigenvalues. Intuitively, this suggests that eigenimages with small eigenvalues can be discarded, leading to a more compact descriptor of the brain images.
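The link between eigenvalues and the information carried by each eigenimage can be illustrated with a small, dependency-free Python sketch. It assumes the "snapshot" formulation commonly used for eigenfaces, in which the eigenvectors of the small image-by-image inner-product matrix are mapped back to pixel space, and it uses power iteration with deflation as a simple eigen-solver. The function names, the two-pixel toy "images", and the solver are assumptions made here for illustration; the article does not state how its eigenimages were computed numerically.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_eigenpairs(matrix, count, iterations=200):
    """Leading eigenpairs of a symmetric matrix (power iteration with deflation)."""
    size = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _ in range(count):
        vec = [1.0 / (i + 1) for i in range(size)]  # fixed starting vector
        norm = math.sqrt(dot(vec, vec))
        vec = [x / norm for x in vec]
        for _ in range(iterations):
            nxt = [dot(row, vec) for row in work]
            norm = math.sqrt(dot(nxt, nxt))
            if norm < 1e-12:  # nothing left to extract
                break
            vec = [x / norm for x in nxt]
        value = dot(vec, [dot(row, vec) for row in work])  # Rayleigh quotient
        pairs.append((value, vec))
        # Deflate: subtract the component that was just found.
        work = [[work[i][j] - value * vec[i] * vec[j] for j in range(size)]
                for i in range(size)]
    return pairs


def eigenimages(images, count):
    """Mean image and (eigenvalue, unit-norm eigenimage) pairs, largest first."""
    n = len(images)
    mean = [sum(col) / n for col in zip(*images)]
    centered = [[p - m for p, m in zip(img, mean)] for img in images]
    # Inner-product matrix is n x n (images), not pixels x pixels.
    gram = [[dot(a, b) for b in centered] for a in centered]
    result = []
    for value, vec in top_eigenpairs(gram, count):
        # Map the small eigenvector back to pixel space and normalize it.
        image = [sum(w * img[j] for w, img in zip(vec, centered))
                 for j in range(len(mean))]
        norm = math.sqrt(dot(image, image))
        if norm > 1e-12:
            result.append((value, [x / norm for x in image]))
    return mean, result


# Four toy two-pixel "images" (hypothetical values).
training = [[13, 10], [7, 10], [10, 11], [10, 9]]
mean, pairs = eigenimages(training, count=2)
values = [v for v, _ in pairs]      # eigenvalues of the scatter matrix
retained = values[0] / sum(values)  # share of variation in the first eigenimage
print([round(v, 6) for v in values], round(retained, 3))  # [18.0, 2.0] 0.9
```

In this toy case the first eigenimage alone accounts for 90% of the variation, which is the kind of imbalance that makes discarding low-eigenvalue eigenimages attractive.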



 
Figure 4.  A subset of the images used in the training set.

 


 
Figure 5.  Eigenimages corresponding to the top 16 eigenvalues of the training set images.

 
An important finding of earlier work in the principal component analysis technique for face recognition is that faces can be reconstructed (and also recognized) with sufficient accuracy by using only a subset of the eigenimages (6). A similar analysis was performed to assess the retrieval performance for brain images as a function of the number of eigenimages. Figure 6 shows the distribution of images retrieved with different percentages of eigenimages. The plot shows the number of times (expressed as a percentage) that the retrieved image (for the specified percentage of eigenimages) was identical to the image retrieved by using all the eigenimages.
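The comparison described here, between retrieval with a truncated set of eigenimages and retrieval with the full set, might be computed along the following lines. The sketch assumes that each image is already represented by its projection coefficients, ordered by decreasing eigenvalue, and that the match is the nearest neighbor in Euclidean distance; the function names and the coefficient values are hypothetical.

```python
def nearest_index(query, database, n_components):
    """Index of the database vector closest to the query, using only the
    first n_components projection coefficients."""
    def sq_dist(candidate):
        return sum((q - c) ** 2
                   for q, c in zip(query[:n_components], candidate[:n_components]))
    return min(range(len(database)), key=lambda i: sq_dist(database[i]))


def agreement_curve(queries, database):
    """Percentage of queries whose truncated-search match equals the
    match obtained with all coefficients, for each truncation length."""
    total = len(database[0])
    full = [nearest_index(q, database, total) for q in queries]
    curve = {}
    for k in range(1, total + 1):
        same = sum(nearest_index(q, database, k) == f
                   for q, f in zip(queries, full))
        curve[k] = 100.0 * same / len(queries)
    return curve


# Hypothetical projection coefficients (ordered by decreasing eigenvalue).
database = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [10.0, 5.0, 0.0]]
queries = [[1.0, 0.0, 0.0], [9.0, 4.0, 0.0]]
curve = agreement_curve(queries, database)
print(curve)  # {1: 50.0, 2: 100.0, 3: 100.0}
```

Note that the curve measures agreement with the full-set retrieval, not agreement with an expert-selected section.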



 
Figure 6.  Plot of the percentage of matches versus the percentage of eigenimages used in the image search algorithm. The percentage of matches corresponds to the percentage of retrieved images that matched the image obtained with the entire set of eigenimages.

 
Six representative sections from the training set were processed to synthesize images with known amounts of in-plane rotation, geometric scaling, and intensity scaling. The synthesized image was then submitted as the query image to determine the possible range of variations (in rotation, scaling, and contrast) that would still result in recognition of the correct match section. The Table lists the types of preprocessing and the section identified. (The exact match has the number 0 assigned to it; any other number indicates the distance of the identified section from the correct match.) Figure 7 shows the projection of the images onto the first three eigenimages for 48 images of the training set. The projection of the synthesized images (marked with a •) is also shown in the same figure. This provides a graphical illustration of the distribution of the eigenspace projections of the processed images in relation to the original and other images in the training set. Processed images corresponding to only one of the six representative sections are included in Figure 7 to avoid cluttering the plot. The distribution of the processed images of the other sections was qualitatively similar.
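A minimal Python sketch of this experiment for the intensity-scaling case is given below. It assumes that the projection coefficients are inner products of the mean-subtracted image with each eigenimage and that the match is the nearest training image in coefficient space. The two-pixel "images", the basis, and the scaling factors are toy values chosen here so that the effect is easy to follow; they are not data from the study.

```python
def project(image, mean, eigenimages):
    """Projection coefficients w_k = eigenimage_k . (image - mean)."""
    centered = [p - m for p, m in zip(image, mean)]
    return [sum(e * c for e, c in zip(eig, centered)) for eig in eigenimages]


def retrieve(query_coeffs, training_coeffs):
    """Index of the training image whose coefficients are closest (Euclidean)."""
    def sq_dist(coeffs):
        return sum((q - c) ** 2 for q, c in zip(query_coeffs, coeffs))
    return min(range(len(training_coeffs)),
               key=lambda i: sq_dist(training_coeffs[i]))


# Toy two-pixel "images" and an orthonormal basis, purely illustrative.
mean = [10.0, 10.0]
eigenimages = [[1.0, 0.0], [0.0, 1.0]]
training = [[13.0, 10.0], [12.0, 14.0], [10.0, 17.0]]
training_coeffs = [project(img, mean, eigenimages) for img in training]

original = training[0]
matches = {}
for factor in (1.0, 1.1, 1.5):
    query = [factor * p for p in original]  # synthesized intensity scaling
    matches[factor] = retrieve(project(query, mean, eigenimages), training_coeffs)
print(matches)  # {1.0: 0, 1.1: 0, 1.5: 1}
```

In this toy setting a small scaling leaves the query close to the original in coefficient space, whereas a larger scaling moves it far enough that a different training image becomes the nearest neighbor.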



 
Retrieval Table for Images Processed with Various Degrees of Rotation, Geometric Scaling, or Intensity Scaling

 


 
Figure 7.  Distribution of the training set images along the first three principal components. w1, w2, and w3 correspond to the projection coefficients of each image along the first three eigenimages. Each data point in this plot corresponds to an image in the training set (only data corresponding to 48 images are shown). Also shown are the projection coefficients of images that were synthesized from one of the images in the training set (synthesized and original images are indicated by •). Solid arrow indicates the location of the original image. Open arrows indicate the processed images that resulted in retrieval of the original image. Arrowheads indicate the processed images that resulted in retrievals far removed from the original image.

 
Figure 8 is a histogram of the retrieval distribution for a total of 96 images (12 images from each of eight patient data sets). Each data series corresponds to a different combination of image preprocessing steps and retrieval index (subtraction or log-ratio index); the figure legend gives the preprocessing step(s) and the retrieval index for each series. An expert manually assigned a match image (from the training set) for each query image. Because the training set consists of images from one imaging series with known section locations and a 1-mm separation, the extent of mismatch of the automated image retrieval can be quantified. For instance, a mismatch index of 0 implies that the retrieved section was the same as the expert-selected image. If the automated image retrieval is one section away from the expert choice, an index of 1 is assigned to it (±1 mm away from the match section). An image retrieved by the automated algorithm was rated accurate if it was within 3 mm of the expert-selected section. The percentage of sections (of the total of 96 images) that were identified accurately by the automated algorithm depended on both the retrieval index and the preprocessing. The best performance was obtained by using the subtraction index on images that were normalized for both spatial orientation and intensity (Fig 8). With this choice of retrieval index and image preprocessing, 80 of the 96 query images resulted in an accurate match (75 images at 1–3 mm and five exact matches [Fig 8]), yielding an accuracy of 83%.
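The bookkeeping behind such a histogram and the accuracy figure can be sketched as follows. The function names and the eight example offsets are hypothetical; only the 3-mm tolerance and the counts used in the final arithmetic check (five exact matches, 75 matches at 1–3 mm, 96 queries) come from the text.

```python
from collections import Counter


def retrieval_histogram(mismatch_mm):
    """Number of queries at each absolute mismatch distance (mm)."""
    return Counter(abs(m) for m in mismatch_mm)


def accuracy_percent(mismatch_mm, tolerance_mm=3):
    """Percentage of queries retrieved within the tolerance of the expert choice."""
    hits = sum(1 for m in mismatch_mm if abs(m) <= tolerance_mm)
    return 100.0 * hits / len(mismatch_mm)


# Hypothetical signed offsets (mm) for eight queries, for illustration only.
offsets = [0, 1, -2, 3, -1, 5, 2, -7]
print(sorted(retrieval_histogram(offsets).items()))
print(accuracy_percent(offsets))  # 75.0

# Arithmetic check of the reported figure: (5 exact + 75 at 1-3 mm) of 96 queries.
reported = 100.0 * (5 + 75) / 96
print(round(reported))  # 83
```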



 
Figure 8.  Image retrieval accuracy for images processed in various ways. Series 1-4 used the subtraction index. Series 1 = two-dimensional alignment, histogram equalization. Series 2 = two-dimensional alignment, histogram equalization, gaussian filtering. Series 3 = 3D registration, original intensities. Series 4 = 3D registration, histogram equalization. Series 5 = 3D registration, original intensities, log index.

 
The algorithm and user interface were implemented in the Java programming language. Figure 9 is a photograph of a computer screen showing the user interface for the content-based image retrieval module. The training set images are shown in the left-most panel. Users can browse through the images using the slider located below the training set image panel. The next panel allows the users to browse through the eigenimages, which are ordered by eigenvalues. The query images can be selected from the third panel. The image from the training set that best matches the query image is shown in the last panel. This panel also contains all the training set images ordered by the extent of match to the query image.



 
Figure 9.  User interface for the content-based image query module. Photograph shows the image data in the upper panel: training set images (first column), eigenimages ordered by eigenvalues (second column), query images (third column), and match images (fourth column). The lower panel provides instructions for use.

 

    Discussion
 
Image retrieval by using 45% or more of the eigenimages resulted in the same match image as that obtained from the entire set of eigenimages (Fig 6). Here, the comparison is between use of the entire set and a subset of eigenimages and not to the expert-selected match section. The retrieval accuracy decreases dramatically when less than 25% of the entire set of eigenimages is used. The finding that a smaller set of eigenimages is sufficient for retrieval is similar to observations in the domain of face recognition, where investigators have found that accurate face recognition is possible with a subset of the eigenimages (3,6). The earlier work by Bucci et al (12) evaluated content-based image retrieval for brain MR images and confirmed the finding that a subset of eigenimages can accurately retrieve images. The actual number of eigenimages that is sufficient for accurate retrieval depends on the training set data: The greater the variation among the images, the larger the number of eigenimages required for accurate retrieval. The advantage of use of a smaller subset of eigenimages is that it can increase retrieval speed when large image databases are searched. In this preliminary implementation, the image database was formed from the 100 images of the training set and computation time was not an issue.

The multifeature scatter plot shows that adjacent images of the training set are located close in feature space (Fig 7). This is expected, since adjacent images are visually similar in appearance and the training set is composed of images 1 mm apart. A consequence of this is that the automated match for a query image may be a section that is adjacent to the expert-selected match image. However, this is not too critical for the purpose of appropriate section selection, since the likelihood of the relevant structure being found in an adjacent section will be high. The Table shows that in-plane rotations up to 30° still resulted in the correct match. However, the match algorithm was more sensitive to scale and intensity changes. Figure 7 also shows the synthesized images to be clustered around the original image for a small range of image manipulations. However, for larger image manipulations, the synthesized image is far removed from the original image in feature space (see arrowheads in Fig 7). The dependency of the retrieval accuracy on image standardization can be gauged from the distribution of the synthesized images in eigenspace (Fig 7); a close cluster around the original image implies that the retrieval will be accurate. However, Figure 7 shows that tight clusters around the original image are obtained only for a limited range of image manipulations. This sensitivity made us consider standardization in both image orientation and intensity as a means to improve retrieval accuracy.

The histograms of retrieval distribution clearly show that simple preprocessing operations such as centering and scaling are not sufficient to yield high retrieval accuracy (Fig 8, series 1). There is some improvement after introduction of a gaussian spatial filter (Fig 8, series 2). However, this presumes that the area of interest is at the center of the field of view, which may not always be the case. The best retrievals were obtained by using images that were 3D aligned and histogram equalized, which ensured a standard orientation and contrast for the training and query images (Fig 8, series 4). Standardization of orientation alone was not sufficient to increase the accuracy of retrieval (Fig 8, series 3). This can be seen in Figure 10, which shows the match image for an image that is standardized only for orientation (Fig 10a, 10b) and the match image for the same image after intensity normalization (Fig 10c, 10d). However, the log index gave comparable results to series 4 on images that were only 3D aligned and not histogram equalized (Fig 8, series 5). This is as expected, since the log index is less sensitive to intensity scaling. The usefulness of the log index is also shown by the match image retrieved for a T2-weighted query image (Fig 11).
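The role of histogram equalization in standardizing contrast can be illustrated with the textbook global formulation, sketched below in plain Python. Whether the study used this exact variant is not stated, so the function and the toy 2 x 2 images should be read as assumptions. The example shows that two images that differ only by a uniform intensity scaling map to the same equalized image, because equalization depends on the rank order of intensities rather than on their absolute values.

```python
def equalize(image, levels=256):
    """Histogram-equalize a 2D list of integer intensities in [0, levels)."""
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = min(c for c in cdf if c > 0)
    span = len(flat) - cdf_min
    if span == 0:  # constant image: nothing to equalize
        return [row[:] for row in image]
    # Lookup table mapping each input level to its equalized level.
    lut = [max(0, round((c - cdf_min) * (levels - 1) / span)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


image = [[10, 20], [30, 40]]
brighter = [[2 * p for p in row] for row in image]  # same pattern, doubled intensity
print(equalize(image))     # [[0, 85], [170, 255]]
print(equalize(brighter))  # [[0, 85], [170, 255]]
```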



Figure 10.  (a, b) Query image (a) and match image (b) after 3D alignment. (c, d) Same query image (c) and corresponding match image (d) after 3D alignment and histogram equalization. It is clear that the match image with histogram equalization (d) is closer to the query image than is the match image without histogram equalization (b).

 


Figure 11.  Query image (a) and match image (b) obtained by using the log index. The images were also 3D aligned and histogram equalized. The log index can accommodate contrast differences between the training and query images (eg, T1-weighted training image and T2-weighted query image) better than the subtraction index.

 
An important consideration for the image-matching algorithm is that all steps, including the preprocessing steps, be completely automated, as the final goal is to integrate it within an automated image selection module. Both 3D alignment and histogram equalization are performed by automated algorithms and require no user intervention. The subtraction and log indexes are comparable in performance when dealing with images preprocessed for geometry and intensity uniformity. However, in cases where the image contrast is markedly different from that of the training set, the log index is the optimal choice (Fig 11). The matching algorithm can also successfully handle variations from biologic differences (Fig 12). In the case of images with pathologic conditions, the matching algorithm selected the correct image from the training set of normal images (Fig 13). This result depends on the extent of the pathologic condition and the consequent deviation from normal appearance for an image at that level. However, as the training set grows to include images with pathologic conditions, the matching algorithm will be able to select the image at the same level with a similar pathologic condition. Further, images from a given anatomic level can be assigned to several subclasses such as normal, tumor in frontal area, and resection in posterior lobe. Identification of a level could then be followed by a second eigenimage search using only images within that class. The idea behind the hierarchical search is that the use of images from one given class should emphasize the differences within that class. A second query to this subset of images may result in a finer classification that goes beyond allocation to an anatomic location to details within the section at that location.
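The proposed hierarchical search could be organized as in the following Python sketch. Everything in it is hypothetical: the record layout, the field names `global` and `local`, the callback that projects the query onto the eigenimages of the selected level, and the coefficient values. The article describes the idea only in outline and does not report an implementation of it.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hierarchical_search(query_global, project_within_level, training):
    """Two-stage search: anatomic level first, then subclass within that level."""
    # Stage 1: nearest neighbor over the whole training set gives the level.
    coarse = min(training, key=lambda r: sq_dist(query_global, r["global"]))
    level = coarse["level"]
    # Stage 2: search only images at that level, using coefficients derived
    # from eigenimages computed from that class alone.
    candidates = [r for r in training if r["level"] == level]
    query_local = project_within_level(level)
    fine = min(candidates, key=lambda r: sq_dist(query_local, r["local"]))
    return level, fine["subclass"]


# Hypothetical training records with global and within-level coefficients.
training = [
    {"level": 1, "subclass": "normal", "global": [0.0, 0.0], "local": [0.0]},
    {"level": 1, "subclass": "tumor in frontal area", "global": [0.5, 0.0], "local": [5.0]},
    {"level": 2, "subclass": "normal", "global": [10.0, 0.0], "local": [0.0]},
]
result = hierarchical_search([0.1, 0.0], lambda level: [4.0], training)
print(result)  # (1, 'tumor in frontal area')
```

In this toy case the global search alone would return the normal image at level 1; the second, within-level search is what separates the subclasses.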



Figure 12.  Query image (a) and match image (b) show that the algorithm can accommodate physiologic variations in shape and size. The match is good despite marked differences in brain shape.

 


Figure 13.  Query image (a) and match image (b) show that the algorithm can classify images to the correct level even in the presence of a pathologic condition.

 
In the context of other content-based image retrieval systems, the image-matching algorithm reported herein is used to label images by the anatomic level represented in the image. Other content-based image retrieval algorithms have been proposed for classifying the pathologic type (9,10) by retrieving an image with features matching the whole image or a user-identified subimage of interest. Obviously, each content-based image retrieval algorithm is tailored for the specific application. However, it is possible to integrate the algorithm proposed herein into the architectures described in references 7, 8, and 14. The eigenfeatures derived herein can also be part of the feature set used in the neuroimage indexing proposed in reference 11.

The current work extends the principal component analysis method proposed earlier (12) in several ways. An important contribution is the use of the new log index and its lower sensitivity to variations in the image intensity scale and image contrast. This is important, since histogram equalization does not necessarily match the contrasts of images acquired under different acquisition strategies. This is especially true for MR imaging, where tissue contrast is a strong function of the image acquisition parameters. Further, we have also evaluated use of a quantitative index of the retrieval accuracy; this serves to define the applicability of the algorithm to the automated image selection process.

The context of the image retrieval algorithm is the intelligent selection of images from a given study. The present preliminary study shows that with appropriate preprocessing for image standardization, images within a study can be assigned to the correct anatomic levels by the eigenimage-matching algorithm. The entire process is automated and can run off-line when the database is updated with an imaging study and its associated report. The selected images can then be labeled as such in the image database, so that applications requesting a study will be presented with the report and the relevant images of a study. Selective image presentation goes beyond the prefetching that has been proposed to date in picture archiving and communication systems (18) and constitutes the first step in intelligent filtering of a large imaging study. This will considerably reduce the overhead related to transmission times of entire studies, storage requirements at workstations, and time spent in navigating large data sets for the sections of interest. Further, it would be a great help to the referring physician and radiology resident, since the relevant images will be filtered for their use. Another application of image selection is integration with the electronic medical record. Even though the inclusion of images in the electronic medical record has been identified as important to clinical case management, the large volume of data generated in imaging studies has hindered effective integration of images. The image selection procedure described herein could form the basis for effective image integration into the electronic medical record by automatically filtering relevant images from a large-volume image study. It is also possible to tailor the algorithm discussed herein to other image indexing problems that are of clinical relevance. For example, the algorithm can be trained to automatically classify brain images as normal or abnormal or to classify imaging studies according to the body part that is imaged. However, each application will require creation of the appropriate training sets as well as specific image preprocessing strategies.


    Footnotes
 
Abbreviation: 3D = three-dimensional


    References
 

  1. Faloutsos C, Barber R, Flickner M, et al. Efficient and effective querying by image content. J Intelligent Inform Syst 1994; 3:231-262.
  2. Gupta A, Jain R. Visual information retrieval. Commun Assoc Comput Machinery 1997; 40:70-79.
  3. Pentland A, Picard RW, Sclaroff S. Photobook: content-based manipulation of image databases. Proc SPIE 1994; 2185:34-47.
  4. Carson C, Thomas M, Belongie S, Hellerstein JM, Malik J. Blobworld: a system for region-based image indexing and retrieval. Presented at the Third International Conference on Visual Information Systems, Amsterdam, the Netherlands, June 2–4, 1999.
  5. Ma WY, Manjunath B. Netra: a toolbox for navigating large image databases. Proceedings of the International Conference on Image Processing. Vol 1. Piscataway, NJ: Institute of Electrical and Electronics Engineers, 1997; 568-571.
  6. Turk M, Pentland A. Eigenfaces for recognition. J Cogn Neurosci 1991; 3:71-86.
  7. Chronaki CE, Zabulis X, Orphanoudakis SC. I2Cnet medical image annotation service. Med Inform 1997; 22:337-347.
  8. Lowe HJ, Antipov I, Hersh W, Smith CA. Towards knowledge-based retrieval of medical images: the role of semantic indexing, image content representation and knowledge-based image analysis. Proc AMIA Symp 1998; 882-886.
  9. Kelly P, Cannon M. Experience with CANDID: comparison algorithm for navigating digital image databases. Proc SPIE 1995; 2368:64-74.
  10. Shyu CR, Brodley CE, Kak AC, Kosaka A, Aisen AM, Broderick LS. ASSERT: a physician-in-the-loop CBIR system for medical imagery. Comput Vision Image Understanding 1999; 75:111-132.
  11. Yanxi L, Rothfus WE, Kanade T. Content-based 3D neuroradiologic image indexing and retrieval: preliminary results. Proceedings of the International Workshop on Content-based Access of Image and Video Databases. Piscataway, NJ: Institute of Electrical and Electronics Engineers, 1998; 1-25.
  12. Bucci G, Cagnoni S, De Dominicis R. Integrating content-based retrieval in a medical image reference database. Comput Med Imaging Graph 1996; 20:231-241.
  13. Wang JZ. Pathfinder: multiresolution region-based searching of pathology images using IRM. Proc AMIA Symp 2000; 883-887.
  14. El-Kwae EA, Xu H, Kabuka MR. Content-based retrieval in picture archiving and communication systems. J Digit Imaging 2000; 13:70-81.
  15. Taira RK, Soderland SG, Jacobvits RM. Automatic structuring of radiology free-text reports. RadioGraphics 2001; 21:237-245.
  16. Spackman KA, Campbell KE, Cote RA. SNOMED RT: a reference terminology for health care. Proc AMIA Fall Symp 1997; 640-644.
  17. Woods RP, Grafton ST, Watson JD, Sicotte NL, Mazziotta JC. Automated image registration. II. Intersubject validation of linear and nonlinear models. J Comput Assist Tomogr 1998; 22:153-165.
  18. Wilson DL, Smith D, Rice B. Intelligent prefetch strategies for historical images in a large PACS. Proc SPIE 1994; 2165:112-123.




