Automated Bleeding Detection in Wireless Capsule Endoscopy Videos
By: Rohit Unnimadhavan, Software Engineer at PathPartner Technology
Imaging of the human gastrointestinal (GI) tract helps physicians detect and diagnose various pathological problems, including infections, bleeding, ulcers, polyps and even tumors. The human GI tract mainly comprises the mouth, the esophagus, the stomach, the small intestine (or small bowel) and the large intestine, as shown in Figure 1. The upper part of the GI tract can be observed using traditional endoscopic equipment, which in general is a wired camera that is pushed in through the mouth. Similar equipment is used in colonoscopies to visualize parts of the large intestine. Such traditional equipment causes considerable discomfort to the patient and is often unable to reach regions like the small intestine, because of its convoluted shape with many loops. It is in this context that non-invasive techniques like Wireless Capsule Endoscopy (WCE) find great popularity among physicians. WCE particularly helps to diagnose GI pathologies in the jejunum and ileum (see Figure 1), which can be sites of bleeding, inflammation or, in rare cases, tumors.
With a WCE procedure, the patient swallows a small capsule camera, similar to the one shown in Figure 2, which typically captures two images per second and transmits them via RF to an array of sensors attached to the patient's body. The images from the data recorder attached to the patient's body are then downloaded to a computer and examined by trained physicians to diagnose potential problems.
Figure 1: Human gastro-intestinal (GI) tract
Figure 2: An example WCE capsule camera
A significant hurdle in adopting WCE as a routine service is the relatively large amount of time required to analyze the WCE images. Since the capsule moves passively through the GI tract, it can take up to eight hours to complete its journey. This results in large amounts of data (around 50,000 images at 2 fps) and demands a lot of time and effort from the physician to make a diagnosis. Also, because capsule endoscopy videos are so long, even experts may miss relevant cases due to oversight. Such problems can be reduced considerably by automated analysis of the WCE frames, which in turn leads to a better diagnosis. It would also bring down the costs involved, as the expert would spend much less time per case, making WCE a more affordable method. Automated analysis of WCE images involves tasks such as topographic video segmentation, bleeding detection, abnormal tissue detection, capsule retention detection, adaptive viewing speed adjustment, non-informative frame filtering, and intestinal fluid and contraction detection. In this paper, we focus only on bleeding detection in WCE images, as this is one of the most common issues physicians look for in WCE videos and is a challenge in itself.
Bleeding is a common symptom associated with many GI disorders, so detecting bleeding regions in WCE frames is a primary aim of automated WCE analysis. Once the bleeding frames are flagged, the physician can manually inspect the regions and make the diagnosis. However, automated bleeding detection is a challenging task. Bleeding occurrences are not always clear; many cases cannot be recognized by an observer without medical training. Several other factors also make it hard to analyze WCE images automatically for bleeding: the multiple forms of noise that tend to occur in endoscopic images, natural findings like digestive juices, food debris, other fluids and bubbles, and blurry images or light-related distortions. Normal vascular patterns on the GI epithelium, which have color patterns similar to those of bleeding regions, complicate the task further.
Despite these significant difficulties, multiple methods of detecting gastrointestinal bleeding in WCE videos have been developed. From a technical point of view, automatic bleeding detection is a combination of image processing and pattern recognition, sometimes involving machine learning as well. Bleeding detection algorithms, in general, have three basic stages: preprocessing, feature extraction and classification.
Most of the existing algorithms preprocess the WCE frames in some way before extracting the relevant features and deciding whether a frame is a bleeding or a normal frame. The exact techniques vary with the feature to be extracted and the algorithm employed. Dark pixel removal, which discards the black background of the WCE frame, is fairly common. Contour processing, specifically edge detection and masking, is also used by many methods, as edges often have characteristics similar to a bleeding region. Certain methods, especially the learning-based ones, employ some form of segmentation, ranging from dividing the image into simple arbitrary tiles to super-pixel segmentation, to better target the bleeding areas of a frame. Since bleeding regions mostly occur only in some portions of the frame, it is more sensible to look for bleeding segments in a frame than to look at the image as a whole, and a good segmentation algorithm helps in this regard. Noise removal is also a fairly common preprocessing step. Some methods additionally apply basic image processing techniques like contrast stretching, gray-scale equalization and morphological operations.
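As an illustration of the dark pixel removal step mentioned above, the following minimal Python sketch builds a mask that keeps only pixels brighter than a cutoff. The function name `valid_pixel_mask`, the mean-intensity criterion and the cutoff of 30 are assumptions made for this example; they are not taken from any of the cited methods.

```python
def valid_pixel_mask(image, min_intensity=30):
    """Return a mask that is True for pixels bright enough to keep.

    `image` is a list of rows, each row a list of (R, G, B) tuples in 0..255.
    The cutoff `min_intensity` is an assumed value for illustration only.
    """
    mask = []
    for row in image:
        # Mean of the three channels as a simple brightness measure.
        mask.append([(r + g + b) / 3 >= min_intensity for (r, g, b) in row])
    return mask


# Toy 2x3 frame: three near-black background pixels, three tissue pixels.
frame = [
    [(0, 0, 0), (5, 3, 2), (120, 40, 30)],
    [(2, 2, 2), (150, 60, 50), (140, 55, 45)],
]
mask = valid_pixel_mask(frame)
kept = sum(flag for row in mask for flag in row)  # number of pixels kept
```

Later stages would then compute features only over pixels where the mask is True.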
There are mainly two classes of features that help in differentiating a bleeding frame from a normal frame: color-based and texture-based features. The WCE frame is classified as bleeding or normal based on these features.
As bleeding regions are most often characterized by different shades of red, it is quite intuitive that this information can be used to detect the presence of a bleeding region. Different methods make use of different color models, such as the RGB, HSI, HSV and CIELab spaces. One common way of using the color information is to look for pixels within a specific range of values that corresponds to the desired shades of red in the chosen color space. The ranges can be determined by analyzing sample images in a training step or can be fixed manually. While some methods, like the Range Ratio method proposed by Al-Rahayfeh et al., use the color ranges as the sole criterion for making a decision, it is more sensible to use such an approach as an initial filtering step and combine it with a more sophisticated method. Apart from raw pixel values, a variety of color features discussed in the literature can be used to differentiate bleeding and normal regions. Multi-dimensional histograms and/or various statistical measures extracted from such histograms [4, 5, 6], Dominant Color, Chrominance Moment, the MPEG-7 Scalable Color descriptor, Red Ratios and the Color Coherence Vector are a few of the significant color features explored in the literature.
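To make the idea of a color-range test in a non-RGB space concrete, here is a small Python sketch that checks whether a pixel is "reddish" in HSV, using the standard library `colorsys` module. The function name `is_reddish` and all three thresholds (`hue_window`, `min_saturation`, `min_value`) are illustrative assumptions, not values from the literature; in practice they would be fixed manually or learned from sample images as described above.

```python
import colorsys


def is_reddish(r, g, b, hue_window=0.05, min_saturation=0.5, min_value=0.2):
    """Return True if an 8-bit RGB pixel falls in an assumed 'red' HSV range."""
    # colorsys expects channel values in [0, 1] and returns h, s, v in [0, 1].
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    # Red sits at both ends of the hue circle, so test near 0 and near 1.
    near_red = h <= hue_window or h >= 1 - hue_window
    return near_red and s >= min_saturation and v >= min_value


dark_red = is_reddish(150, 20, 15)      # saturated red
pale_tissue = is_reddish(200, 180, 160)  # low saturation, rejected
green = is_reddish(20, 150, 30)          # wrong hue, rejected
```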
Texture features are particularly suited to the detection of tumors and similar abnormalities, but can also be used for detecting bleeding regions. Local Binary Pattern (LBP) [4, 8] is one of the textural features most commonly used for bleeding detection. In the computer-aided bleeding detection method proposed by Li et al., various statistical measures are extracted from the uniform LBP of the I channel in the HSI model and are then used in a learning-based model to arrive at the decision. Yeh et al. extracted similar statistical measures from a gray-level co-occurrence matrix, which represents the spatial distribution and dependence of the gray levels, to build various texture models. Apart from common texture measures like LBP and co-occurrence matrices, some methods use lesser-known approaches. One example is the Reed-Xiaoli (RX) algorithm, based on the covariance matrix, which Penna et al. use for anomaly detection. While color features are better suited to detecting active bleeding regions, textural features capture statistical, structural and spectral properties of the image and hence can assist in detecting spots of blood and similar bleeding abnormalities that may not be identified using color features alone.
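The basic 8-neighbour LBP operator can be sketched in a few lines of Python. This is a generic textbook formulation, not the exact feature pipeline of Li et al.; the neighbour ordering, the function names and the toy patch are assumptions made for illustration. The `is_uniform` helper shows the usual definition of a uniform pattern (at most two 0/1 transitions around the circle).

```python
# Neighbour offsets (dy, dx), clockwise starting from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(gray, y, x):
    """8-bit LBP code of the interior pixel (y, x) of a 2-D gray-level list."""
    center = gray[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        # Set the bit when the neighbour is at least as bright as the center.
        if gray[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def is_uniform(code):
    """True if the circular 8-bit pattern has at most two 0/1 transitions."""
    bits = [(code >> i) & 1 for i in range(8)]
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    return transitions <= 2


def lbp_histogram(gray):
    """Histogram of LBP codes over all interior pixels (borders are skipped)."""
    hist = [0] * 256
    for y in range(1, len(gray) - 1):
        for x in range(1, len(gray[0]) - 1):
            hist[lbp_code(gray, y, x)] += 1
    return hist


patch = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
code = lbp_code(patch, 1, 1)
hist = lbp_histogram(patch)
```

Statistical measures such as the mean or variance of the histogram could then serve as entries of a texture feature vector.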
Once the features are extracted from the frame in question, a decision has to be made as to whether it shows bleeding or not. The most popular classification mechanisms are simple thresholding and machine learning based techniques. Thresholding is simple, but the results depend heavily on the threshold chosen, and the thresholds are often not robust to variations in the input images. Well-known classifiers like Neural Networks and Support Vector Machines (SVMs) offer a more sophisticated mechanism for arriving at the decision. Using a classifier necessitates a training phase and a proper database of training images. Various configurations, including multilayer perceptron neural networks [8, 13], probabilistic neural networks and different kernels for SVMs [8, 11], are used in the literature. Some algorithms also use K-Nearest Neighbor classification with good results. Where the feature vector is large, feature selection or dimensionality reduction can also be used; methods like Principal Component Analysis (PCA) are often applied to assist classification by first removing redundant data from the feature vector. It should be noted that C4.5 Decision Trees are considered by some to be the best decision mechanism for bleeding detection, compared to SVMs and Neural Networks.
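As a small illustration of a learning-based decision mechanism, the following Python sketch implements plain K-Nearest Neighbor voting over feature vectors. The two-dimensional feature values and the labels are made-up toy data, and `knn_predict` is a hypothetical helper name; a real system would use features extracted from annotated WCE frames.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    # Pair every training sample's Euclidean distance with its label.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy training set (values are invented for illustration).
train_features = [
    (3.0, 4.0), (2.8, 3.5), (3.2, 4.2),   # "bleeding" samples
    (1.2, 1.5), (1.1, 1.3), (1.3, 1.6),   # "normal" samples
]
train_labels = ["bleeding"] * 3 + ["normal"] * 3

label_a = knn_predict(train_features, train_labels, (2.9, 3.9))
label_b = knn_predict(train_features, train_labels, (1.0, 1.4))
```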
Figure 3: Flow chart of Training phase of a learning-based method
Figure 4: Flow chart of Detection phase of a learning-based method
Learning-based mechanisms typically have two phases – a training phase and a detection phase, as shown in Figure 3 and Figure 4. In the training phase, a suitable feature set capable of discriminating between bleeding and non-bleeding regions is extracted from each of the preprocessed images of the training database and based on this feature set, a properly chosen classifier is trained. A WCE video will serve as the input to the bleeding detection phase. Individual frames will be extracted from this video and each of the frames will undergo a preprocessing step similar to the training phase. A suitable set of features, as chosen in the training phase, will then be extracted from the preprocessed image. Based on the feature set extracted, the decision will be made whether the image contains bleeding regions or not, by feeding it to the chosen classifier along with the previously trained model. The decision mechanism may involve some other conditions and parameters along with the chosen classifier.
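The two phases described above can be sketched end to end in Python. To keep the example self-contained, a nearest-centroid rule stands in for the classifier (it is not one of the classifiers discussed in this paper), the preprocessing is reduced to dark pixel removal, and the features are two simple red ratios. All function names, thresholds and pixel values are assumptions for illustration.

```python
import math


def preprocess(frame, min_intensity=30):
    """Drop dark background pixels (an assumed, simplified preprocessing step)."""
    return [p for p in frame if sum(p) / 3 >= min_intensity]


def extract_features(pixels, eps=1e-6):
    """Return (R/G, R/B) computed from the mean channel values of the pixels."""
    n = len(pixels)
    mean_r = sum(p[0] for p in pixels) / n
    mean_g = sum(p[1] for p in pixels) / n
    mean_b = sum(p[2] for p in pixels) / n
    return (mean_r / (mean_g + eps), mean_r / (mean_b + eps))


def train(frames, labels):
    """Training phase: compute one feature centroid per label."""
    sums, counts = {}, {}
    for frame, label in zip(frames, labels):
        pixels = preprocess(frame)
        if not pixels:  # skip frames that are entirely dark
            continue
        feats = extract_features(pixels)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}


def detect(video_frames, model):
    """Detection phase: return indices of frames classified as bleeding."""
    flagged = []
    for index, frame in enumerate(video_frames):
        pixels = preprocess(frame)
        if not pixels:
            continue
        feats = extract_features(pixels)
        # Assign the label whose centroid is closest to the frame's features.
        label = min(model, key=lambda name: math.dist(model[name], feats))
        if label == "bleeding":
            flagged.append(index)
    return flagged


# Each toy frame is a flat list of (R, G, B) pixels.
bleeding_frame = [(120, 30, 20), (110, 25, 15), (0, 0, 0)]
normal_frame = [(180, 140, 120), (170, 130, 110), (0, 0, 0)]
model = train([bleeding_frame, normal_frame], ["bleeding", "normal"])

video = [normal_frame, bleeding_frame, [(115, 28, 18), (1, 1, 1)]]
flagged = detect(video, model)
```

The same preprocessing and feature extraction functions are shared by both phases, which mirrors the requirement that detection uses the feature set chosen during training.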
In the next section, a simple method based on thresholding and an SVM-based method are examined in some detail. We choose SVM over other learning-based decision mechanisms because of its various advantages and ease of implementation.
Practical implementations of bleeding detection algorithms
In this section, two basic bleeding detection methods are looked at in detail. The performance of automated bleeding detection methods is quantified using three commonly used measures: sensitivity, specificity and accuracy, as defined below:
Sensitivity = TP/(TP+FN)
Specificity = TN/(TN+FP)
Accuracy = (TP+TN)/(TP+TN+FP+FN)
where
TP = number of True Positives (bleeding frames identified by the algorithm as bleeding)
TN = number of True Negatives (normal frames identified by the algorithm as normal)
FP = number of False Positives (normal frames identified by the algorithm as bleeding)
FN = number of False Negatives (bleeding frames identified by the algorithm as normal)
A sensitivity of 100% implies that all the bleeding frames are detected as bleeding, and a specificity of 100% implies that all the normal frames are marked as normal. Thus, the core aim of automated bleeding detection in endoscopic videos can be formulated as follows: reduce the number of frames to be analyzed by the physician (i.e., achieve high specificity) without missing any bleeding cases (i.e., without compromising high sensitivity).
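The three measures translate directly into code. The following Python sketch computes them from the four counts; the counts in the usage example are invented for illustration and do not come from any experiment.

```python
def detection_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) from the four frame counts.

    Assumes at least one bleeding frame and one normal frame, so that the
    denominators are non-zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 50 bleeding frames and 950 normal frames in total.
sens, spec, acc = detection_metrics(tp=45, tn=900, fp=50, fn=5)
```

With these made-up counts, 5 of the 50 bleeding frames are missed (sensitivity 0.9), while 50 normal frames would still be shown to the physician as suspicious.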
Thresholding on color ranges is one of the simplest bleeding detection methods. It operates on each image pixel by pixel: if the R, G and B values of a pixel fall within specified ranges, the pixel is classified as a red pixel. Such a method is proposed by Al-Rahayfeh et al., where the specified ranges are 75 ≤ R < 128, 14 ≤ G ≤ 25 and 0 ≤ B ≤ 15; if the RGB values of a pixel fall within these ranges, it is labeled as bleeding. Such methods can detect bleeding regions with sensitivity close to 100%, but can sometimes miss cases where blood occurs in blackish-red shades. The ranges need to be broad enough to include different shades of red so that sensitivity is high, but this also produces many false positives and hence reduced specificity. Conversely, the ranges need to be narrow enough to discard normal regions (thereby reducing the number of images the physician must analyze manually, which is the primary aim of automated analysis), but this might cause genuine bleeding cases to be missed, reducing sensitivity, which is highly undesirable in practice. Thus, there is a trade-off between sensitivity and specificity with respect to the ranges used for thresholding. Hence, it is often advisable to use broad ranges and to apply this method as an initial filtering step before more sophisticated algorithms are used. As with RGB, ranges may be specified in other color domains too.
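A pixel-wise range test of this kind can be sketched in Python as follows. The RGB ranges are the ones quoted above; the frame-level rule (flag a frame when at least `min_pixels` pixels match) and the function names are assumptions added for illustration, since the decision rule at frame level is not specified here.

```python
def is_bleeding_pixel(r, g, b):
    """Range test with the RGB ranges quoted in the text."""
    return 75 <= r < 128 and 14 <= g <= 25 and 0 <= b <= 15


def bleeding_pixel_count(image):
    """Count pixels of a frame (list of rows of (R, G, B)) inside the ranges."""
    return sum(is_bleeding_pixel(r, g, b) for row in image for (r, g, b) in row)


def flag_frame(image, min_pixels=1):
    """Assumed frame-level rule: flag when enough pixels match the ranges."""
    return bleeding_pixel_count(image) >= min_pixels


frame = [
    [(100, 20, 10), (200, 180, 160)],   # in range, pale tissue
    [(90, 14, 0), (128, 20, 10)],       # in range, R just above the upper limit
]
count = bleeding_pixel_count(frame)
flagged = flag_frame(frame)
```

Raising `min_pixels` trades sensitivity for specificity in the same way as narrowing the color ranges.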
Fu et al. describe a basic SVM-based method that operates on segments rather than on individual pixels or on the image as a whole. Here, the image is first converted to CIELab space, and the preprocessing involves edge detection in the L channel. The detected edge regions are dilated and then masked. The images are then smoothed slightly by Gaussian filtering and divided into meaningful segments using a super-pixel segmentation algorithm, which takes into account the closeness of adjacent pixels in both the spatial and the color domain. The segments are then manually marked as bleeding or normal. The feature vector consists of red ratios: R/G, R/B and R/(R+G+B), where R, G and B are the mean R, G and B values of the pixels in a segment. The feature vectors extracted from these segments are fed to SVM learning, and the learned model is then used for classification. In the testing phase, the feature sets extracted from the super-pixels of a preprocessed frame are classified as bleeding or normal using the learned SVM model.
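The red-ratio feature vector for one segment can be computed as in the following Python sketch. Only the feature extraction is shown; segmentation and SVM training are omitted. The function name `red_ratio_features` and the small constant `eps`, added to avoid division by zero, are assumptions of this example and not part of the method as described.

```python
def red_ratio_features(segment_pixels, eps=1e-6):
    """Return (R/G, R/B, R/(R+G+B)) from the mean channel values of a segment.

    `segment_pixels` is a non-empty list of (R, G, B) tuples belonging to one
    super-pixel. `eps` is an assumed guard against division by zero.
    """
    n = len(segment_pixels)
    mean_r = sum(p[0] for p in segment_pixels) / n
    mean_g = sum(p[1] for p in segment_pixels) / n
    mean_b = sum(p[2] for p in segment_pixels) / n
    return (
        mean_r / (mean_g + eps),
        mean_r / (mean_b + eps),
        mean_r / (mean_r + mean_g + mean_b + eps),
    )


# Toy segment with mean values R = 110, G = 30, B = 20.
segment = [(120, 30, 20), (100, 30, 20)]
features = red_ratio_features(segment)
```

Each labeled segment would contribute one such three-element vector to the training set of the classifier.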
Such a method can be very robust as it does not depend on any arbitrary parameter that was pre-determined but at the same time, it will depend heavily on the quality of the training database used. The images used for training the SVM model should in practice be similar to the ones being tested and the annotation of bleeding and normal segments in an image should also be very accurate to get a good training model capable of discriminating between bleeding and normal samples. Thus, the training phase might often require expert knowledge. But once the model is learnt, predictions can be made easily. The quality of segmentation can also affect the training phase. Segmentation should be done ideally in such a way that each segment is either fully a bleeding region or a normal region without any partial cases. This might not be so in practice and hence the labeling of such regions can also affect the training model and subsequent prediction. Lack of sufficient relevant samples to model bleeding and other abnormalities is another significant problem for learning based methods. Often the training samples are skewed in favor of normal samples and some of the literature suggests this asymmetry might also affect the quality of the classification.
Another thing to note is that Fu et al. use a feature vector of size 3, which may not be able to capture the distinguishing characteristics of bleeding and normal regions accurately. Methods that use a larger feature vector have also been explored in the literature and have been shown to have the potential to capture the distinguishing characteristics more reliably. However, such methods might need some kind of feature selection in order to learn the training model efficiently.
One of the promising areas in WCE is real-time bleeding detection, carried out within the capsule. Typically, bleeding detection happens offline, i.e., on a remote computer after all the images have been downloaded from the data recorder attached to the patient's body. Real-time bleeding detection, on the other hand, occurs inside the capsule itself. This reduces the time and storage required to transfer the images from the data recorder to the computer and makes the process faster. However, with the current level of technology, it may be advisable not to eliminate any frames completely but only to flag the suspected frames, so that the inspecting physicians do not lose any relevant or contextual information. Unlike offline detection algorithms, real-time bleeding detection algorithms should target not only high detection accuracy but also high efficiency, in order to work in real time. Even though this poses serious challenges in terms of hardware and algorithm complexity, power requirements and so on, a future where most of the automated analysis is carried out within the capsule itself is not difficult to imagine. Apart from real-time analysis, future research may also examine the scope of using temporal features for analyzing a WCE video.
Compared to traditional endoscopy, WCE technology suffers from certain disadvantages, such as the inability to carry out controlled inspection, biopsy or targeted load delivery, but its ability to potentially image the GI tract in its entirety and detect obscure symptoms ensures that WCE technology is here to stay. Among the various methods proposed in the literature for the automated analysis of WCE videos, classifier-based methods promise better specificity and robustness without much compromise in sensitivity, compared to simple thresholding-based methods. Preprocessing the WCE images using various image processing techniques generally improves detection accuracy, but it comes with its own share of problems. Techniques like edge masking can mask genuine bleeding regions, since changes in color are also often detected as edges. Similarly, some endoscopy capsules overlay text and other data on the images, and this may not be completely removed in preprocessing. Such pixels can cause problems in segmentation and may affect the classification. The lack of standard, publicly available annotated WCE databases is another impediment to developing learning-based methods and reliably comparing their performance.
"Capsule Endoscopy - State of the Technology and Computer Vision Tools after the First Decade" - Michal Mackiewicz (2011)
"An Overview of Image Analysis Techniques in Endoscopic Bleeding Detection" - Adam Brzeski, Adam Blokus and Jan Cychnerski (2013)
"Detection of Bleeding in Wireless Capsule Endoscopy Images Using Range Ratio Color" - Amer A. Al-Rahayfeh and Abdelshakour A. Abuzneid (2010)
"A technique for blood detection in wireless capsule endoscopy images" - B Penna, T Tillo, M Grangetto, E Magli (2009)
"Bleeding detection in wireless capsule endoscopy using adaptive colour histogram model and support vector classification" - Mackiewicz, Fisher & Jamieson (2008)
"A Histogram Based Scheme in YIQ Domain for Automatic Bleeding Image Detection from Wireless Capsule Endoscopy" - A. K. Kundu, M. N. Rizve, T. Ghosh, S. A. Fattah and C. Shahnaz (2015)
"Bleeding detection in wireless capsule endoscopy based on color features from histogram probability" - S Sainju, FM Bui, K Wahid (2013)
"Bleeding detection in wireless capsule endoscopy based on MST clustering and SVM" - Y Xiong, Y Zhu, Z Pang, Y Ma, D Chen (2015)
"Computer-Aided Detection of Bleeding Regions for Capsule Endoscopy Images" - Baopu Li and Max Q.-H. Meng (2009)
"MPEG-7 visual part of experimentation model version 8.0" - A. Yamada, M. Pickering, S. Jeannin, L. Cieplinsky, J. R. Ohm, and M. Kim (2001)
"Computer-aided bleeding detection in WCE video" - Y Fu, W Zhang, M Mandal (2014)
"Bleeding and Ulcer Detection Using Wireless Capsule Endoscopy Images" - Jinn-Yi Yeh, Tai-Hsi Wu, Wei-Jun Tsai (2014)
"Automated Bleeding Detection in Capsule Endoscopy Videos Using Statistical Features and Region Growing" - Sonu Sainju, Francis M. Bui and Khan A. Wahid (2014)
"Bleeding Detection in Wireless Capsule Endoscopy Based on Probabilistic Neural Network" - Guobing Pan, Guozheng Yan, Xiangling Qiu and Jiehao Cui (2011)
"Real-Time Bleeding Detection in Gastrointestinal Tract Endoscopic Examinations Video" - Adam Blokus, Adam Brzeski, Jan Cychnerski (2013)
Rohit Unnimadhavan is a Software Engineer at PathPartner Technology. The company, based in California and Bangalore, is a leading provider of products and services for multimedia-centric embedded devices. PathPartner has extensive experience in audio and video codecs, video analytics and vision, imaging, multimedia middleware and application development. Contact him at firstname.lastname@example.org.