PROTEIN CLASSIFICATION BY ANALYSIS OF
CONFOCAL MICROSCOPIC IMAGES OF SINGLE CELLS
Tanja Steckling a, Olaf Hellwich a, Stephanie Wälter b, Erich Wanker b
a Technical University Berlin, Computer Vision and Remote Sensing, Sekr. FR 3-1, Franklinstr. 28/29, 10587 Berlin,
Germany, Phone: +49-30-314-22796, Fax: +49-30-314-21104, e-mail: hellwich@fpk.tu-berlin.de
b Max-Delbrück-Centrum für Molekulare Medizin (MDC), Berlin-Buch, Robert-Rössle-Str. 10, 13092 Berlin,
Germany
Commission WG V/3
KEY WORDS: image analysis, feature selection, classification, medical image processing, microscopic imagery
ABSTRACT:
Every protein present in a living cell fulfils a specific task in the cell. As a consequence of its function, a protein is located in
certain parts of the cell. If it is made visible, the resulting patterns can help to identify the protein, as the spatial distribution of the
visible structures depends on the function of the protein inside the cell and therefore characterises the protein. The cells used
for the experiments were COS-1 cells, which are well suited to microscopic data acquisition because the cells are much larger than their nuclei.
With the help of a suitable parameterisation, the proteins can be identified automatically. In order to derive such a parameterisation,
features describing the spatial structure of the protein are extracted. The stochastic behaviour of the features is of major importance
for the performance of the method.
1. INTRODUCTION
A protein present in a cell can be made visible by a chemical
treatment with antibodies. The spatial distribution of the visible
structures depends on the function of the protein inside the
cell and characterises the protein. Therefore, it allows or at least
helps to identify the protein. In this work a method to
automatically classify proteins on the basis of single cell images
is described.
The imagery of COS-1 cells used here has been acquired by
fluorescence confocal microscopy. From a data take, i.e. a focus
series of images, the image optimally showing the spatial
distribution of the protein has been selected. A single cell
extracted from such an image constitutes the input to the
algorithm described.
In order to derive a parameterisation identifying proteins,
features describing the spatial structure of the protein have to be
extracted. An interactive classification of proteins by a human
operator has shown that a classification accuracy of 95 to 100
% is possible. Similar classification accuracy can be achieved
by an automatic analysis when suitable features are selected.
The subsequent steps of the procedure and their foundations,
such as probability density distributions, their
derivation from training data, the choice of a classification
method, and the derivation of a classification decision, are well
known. Feature selection or feature reduction is therefore the
crucial step of the procedure. The importance of feature reduction
corresponds to the fact that in human vision, particularly in
deriving decisions from visual information, the large amount of
data/information in images based on high spatial and
radiometric resolution is first severely reduced before being
extended again by associating knowledge, e.g. about objects
and context, in order to derive new knowledge or decisions in a
process of thinking (BECKER-CARUS, 1981).
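The well-known steps named above (class-conditional probability densities, their derivation from training data, and the classification decision) can be illustrated with a minimal sketch. It assumes Gaussian densities and a maximum a posteriori decision, which is one common choice and not necessarily the one used in this work. The function names fit_gaussians, log_posterior and classify are hypothetical, and the feature vectors stand for any reduced set of features extracted per cell.

```python
import numpy as np


def fit_gaussians(features, labels):
    """Estimate mean, covariance and prior per class from training data.

    features: array of shape (n_cells, n_features), one row per training cell.
    labels:   array of shape (n_cells,), the protein class of each cell.
    """
    model = {}
    for c in np.unique(labels):
        x = features[labels == c]
        mean = x.mean(axis=0)
        # Assumption: a small ridge term keeps the covariance invertible
        # when only few training cells per class are available.
        cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])
        prior = len(x) / len(features)
        model[c] = (mean, cov, prior)
    return model


def log_posterior(x, mean, cov, prior):
    """Log of Gaussian density times prior, up to a class-independent constant."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d @ np.linalg.solve(cov, d) + logdet) + np.log(prior)


def classify(model, x):
    """Return the class with the largest posterior for feature vector x."""
    return max(model, key=lambda c: log_posterior(x, *model[c]))
```

With such a classifier the remaining design question is which features enter the vector x, which is why feature selection dominates the performance of the overall procedure.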
Using our method, previously unknown proteins can be
identified as long as the protein shows an individual spatial
structure inside the cell. With an automatic procedure,
conclusions about the chemical role of the protein could be
drawn from its specific spatial structure, as the molecules
appear where they are chemically active. This means that image
analysis can provide a new method for proteomics research,
possibly with an efficiency not achieved before. It is our long-term
goal to derive and test such a method.
2. PREVIOUS WORK
BOLAND et al. (1997) describe a method to classify cellular
protein localization patterns based on their appearance in
fluorescence light microscope images. Numeric features were
used as input values to either a classification tree or a neural
network (BOLAND et al., 1998). MARKEY et al. (1999)
developed methods for objectively choosing a typical image
from a set of images, emphasizing cell biology. The methods
include calculation of numerical features to describe protein
patterns, calculation of similarity between patterns as a distance
in feature space, and ranking of patterns by distance from the
center of the distribution in feature space. The images chosen as
most typical were in good agreement with the conventional
understanding of organelle morphologies. MURPHY et al. (2000)
describe an approach to quantitatively describe protein
localization patterns and to develop classifiers able to recognize
all major subcellular structures in fluorescence microscope
images. Since fluorescence microscope images are a primary
source of information about the location of proteins within
cells, MURPHY et al. (2001) strive to build a knowledge-based
system which can interpret such images in online journals. They
developed a robot searching online journals to find fluorescence
microscope images of individual cells. BOLAND & MURPHY
(2001) used images of ten different subcellular patterns to train
a neural network classifier. The classifier was able to correctly
recognize an average of 83 % of the patterns. Fluorescence
microscopy is the most common method used to determine
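The ranking of patterns by their distance from the center of the distribution in feature space, as summarised above for MARKEY et al. (1999), can be sketched as follows. This is an illustration under stated assumptions and not the published implementation: it uses the Euclidean distance on z-scored features, and the function name rank_by_typicality is hypothetical.

```python
import numpy as np


def rank_by_typicality(features):
    """Return image indices ordered from most to least typical.

    features: array-like of shape (n_images, n_features).
    """
    features = np.asarray(features, dtype=float)
    # Assumption: z-scoring makes features with different units comparable.
    std = features.std(axis=0)
    std[std == 0] = 1.0  # constant features carry no ranking information
    z = (features - features.mean(axis=0)) / std
    # After z-scoring, the center of the distribution is the origin.
    dist = np.linalg.norm(z, axis=1)
    return np.argsort(dist)
```

The first index returned points to the image closest to the center, that is, the candidate for the most typical image of the set.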