**Abstract**: We consider feedforward neural networks such as multi-layer perceptrons as non-orthogonal bases in a function space, bases that span submanifolds of that space. The basis functions of such a basis are the functions computed by the neurons of the hidden layer. A function to be approximated is then a vector in the function space, and the projection of that vector onto the submanifold spanned by the basis is the function approximated by the neural network. The approximation is optimal when the distance between the function to be approximated and its projection onto the submanifold is minimal under some metric. We compute this distance in sample space, i.e., the subspace of function space whose dimensions correspond to the input samples we have of the function to be approximated. The objective of learning in such a network is thus to minimize the distance between the function to be approximated and its projection onto the submanifold. This is achieved by dynamically rotating and shifting the basis so that this distance becomes minimal; the rotation and shifting are carried out by modifying the parameters of the basis functions of the network. A convenient way of computing the projection is with the help of metric tensors, a tool from differential geometry. We call this new approach to learning projection learning. The basis functions may be arbitrary (Gaussian, Gabor, sigmoid, etc.), provided they are differentiable in some sense with respect to their parameters/weights. We present the application of the paradigm and learning rule to multi-layer perceptrons as well as to bases of multivariate Gaussians, discuss some other potential applications, and present alternatives to the use of the metric tensor.
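The projection described above can be sketched numerically. In the following minimal illustration (our own, not taken from the paper; the Gaussian basis, the sample values, and the Euclidean metric are all assumptions for concreteness), the hidden-layer basis functions evaluated at the input samples form the columns of a matrix `Phi`; the metric tensor of this non-orthogonal basis is then the Gram matrix `G = Phi.T @ Phi`, and the projection of the sampled target function onto the span of the basis follows from the normal equations:

```python
import numpy as np

def gaussian_basis(x, centers, width):
    """Evaluate M univariate Gaussian basis functions at the N samples x."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

# Sample space: each of the N input samples is one dimension.
x = np.linspace(-3.0, 3.0, 50)           # input samples (assumed)
y = np.sin(x)                            # target function, seen only via samples

# Basis-function parameters (assumed); learning would adapt these.
centers = np.linspace(-3.0, 3.0, 8)
Phi = gaussian_basis(x, centers, width=0.8)   # N x M basis matrix

G = Phi.T @ Phi                          # metric tensor (Gram matrix) of the basis
c = np.linalg.solve(G, Phi.T @ y)        # coordinates of the projection
projection = Phi @ c                     # projection of y onto span of the basis
distance = np.linalg.norm(y - projection)

# "Rotating and shifting the basis" would mean adjusting centers/width
# (the parameters of the basis functions) so that this residual shrinks.
print(distance)
```

The residual `y - projection` is orthogonal to every basis vector, which is exactly the optimality condition the abstract states: the projection is the closest point of the spanned submanifold to the target, under the chosen metric.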