Abstract : A vision-based robot localization system must be robust: able to keep track of the position of the robot at any time even if illumination conditions change and, in the extreme case of a failure, able to efficiently recover the correct position of the robot. With this objective in mind, we enhance the existing appearance-based robot localization framework in two directions by exploiting the use of a stereo camera mounted on a pan-and-tilt device. First, we move from the classical passive appearance-based localization framework to an active one where the robot sometimes executes actions with the only purpose of gaining information about its location in the environment. Along this line, we introduce an entropy-based criterion for action selection that can be efficiently evaluated in our probabilistic localization system. The execution of the actions selected using this criterion allows the robot to quickly find out its position in case it gets lost. Secondly, we introduce the use of depth maps obtained with the stereo cameras. The information provided by depth maps is less sensitive to changes of illumination than that provided by plain images. The main drawback of depth maps is that they include missing values: points for which it is not possible to reliably determine depth information. The presence of missing values makes Principal Component Analysis (the standard method used to compress images in the appearance-based framework) unfeasible. We describe a novel Expectation-Maximization algorithm to determine the principal components of a data set including missing values and we apply it to depth maps. The experiments we present show that the combination of the active localization with the use of depth maps gives an efficient and robust appearance-based robot localization system.