**Abstract**: The purpose of this paper is to present a parallel domain decomposition algorithm for a Lagrangian-based discrete particle method (DPM). The DPM deals with the thermal decomposition of solid fuel particles, e.g. biomass, and their interactions. The thermal conversion of solid-fuel particles in a combustion chamber involves a large number of particles with different shapes and a complex three-dimensional geometry. The simulation requires large computational resources and hence cannot be performed on a single processor within a reasonable time. While the computation time can be shortened by parallelizing the DPM, the method used for partitioning, or decomposing, the DPM is crucial for the efficiency of the parallelization [1]. The particle interactions in the combustion chamber are short-range, so the thermal conversion of a particle is affected only by its nearest neighbor particles. Therefore, a geometric parallel domain decomposition algorithm for the DPM parallelization is proposed. The algorithm is based on Orthogonal Recursive Bisection (ORB). Geometric algorithms are very fast and well suited to repartitioning, also called dynamic load balancing. Moreover, they make handling boundary particles easy. To apply the algorithm, the simulation space is divided into cubical cells. The cell size is chosen so that the largest particle fits into a single cell. With this cell approach, the problem of finding all particles interacting with a given particle is reduced to checking the particle's own cell and its nearest-neighboring cells, which significantly reduces the time spent in the neighbor search. For handling boundary particles, each subdomain is extended by a so-called halo or ghost cell layer [2]. The ghost cells hold copies of the corresponding active cells on the neighboring processors.
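The ORB decomposition described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: it assumes particles are plain (x, y, z) position tuples, splits each level at the median of the axis with the largest extent, and produces `2**depth` subdomains with nearly equal particle counts.

```python
def orb_partition(particles, depth):
    """Illustrative Orthogonal Recursive Bisection (ORB) sketch.

    Recursively bisects a list of (x, y, z) positions into 2**depth
    subdomains, splitting each level at the median of the longest axis
    so both halves carry (nearly) equal particle counts.
    """
    if depth == 0 or len(particles) <= 1:
        return [particles]
    # Choose the axis with the largest spatial extent.
    axis = max(range(3), key=lambda a: max(p[a] for p in particles)
                                      - min(p[a] for p in particles))
    # Splitting at the median balances the particle count per half.
    ordered = sorted(particles, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (orb_partition(ordered[:mid], depth - 1)
            + orb_partition(ordered[mid:], depth - 1))
```

For example, `orb_partition(points, 2)` yields 4 subdomains. In the full method, each subdomain produced this way would then be extended by the halo of ghost cells described above, so that boundary interactions can be computed locally.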
Each processor therefore has all the necessary data locally and proceeds with its computation independently of the other processors. Parametric studies conducted to evaluate the performance of the algorithm are presented, where performance means both the effectiveness of the algorithm in achieving load balance and its execution time. Different numbers of particles are considered in the measurements. The results show that the algorithm achieves less than 7% (2.5%–6.3%) load imbalance for uniformly distributed particles. For 4096 non-uniformly distributed particles, the algorithm produces only 3.8% load imbalance. The parallel implementation of the algorithm shows that using 2 processors (to create 4 subdomains) reduces the execution time by 20% compared to the sequential execution of the algorithm for large numbers of particles, such as 32768. Using more processors (e.g. 6) does not give better performance than 2 processors, because of the increased inter-processor communication.
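The load-imbalance percentages quoted above can be computed with a metric like the one below. Note that the abstract does not state the paper's exact formula; this sketch assumes the common definition of imbalance as the excess of the most loaded subdomain over the average load.

```python
def load_imbalance(loads):
    """Assumed imbalance metric: percentage by which the most loaded
    subdomain exceeds the average load (0% means perfect balance)."""
    avg = sum(loads) / len(loads)
    return (max(loads) - avg) / avg * 100.0

# Perfectly balanced subdomains give 0% imbalance.
print(load_imbalance([1024, 1024, 1024, 1024]))  # → 0.0
```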