DESIGNING OPTIMIZED PARALLEL INTERLEAVER ARCHITECTURES FOR TURBO AND LDPC DECODERS
Abstract
We live in an era of high data rate wireless applications (smartphones, netbooks,
digital television, mobile broadband devices…) that rely on advanced technologies
such as OFDM, MIMO and advanced error correction techniques to transfer data
reliably at high rates over wireless networks. Turbo and LDPC codes are two families of codes
extensively used in current communication standards due to their excellent error correction
capabilities. To achieve high throughput, decoders are implemented on parallel
architectures in which several processing elements decode the received data concurrently. However,
parallel architectures suffer from the memory conflict problem: conflicts increase the latency of memory
accesses, because conflict management mechanisms must be added to the communication network,
which in turn decreases system throughput and increases system cost.
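As a minimal sketch of the problem (our illustration, not taken from the thesis), the following code checks whether a given interleaver causes bank conflicts under the naive mapping bank(i) = i mod P, assuming P processing elements read the interleaved addresses of one window per cycle. The QPP coefficients (f1 = 3, f2 = 10 for block length 40) are those of the LTE turbo interleaver, which is designed to be contention-free for this kind of access; an arbitrary permutation generally is not.

```python
# Sketch: detecting memory-bank conflicts in a parallel decoder.
# Assumed model: N soft values stored across P single-port banks with
# bank(i) = i mod P; in cycle t, processing element p reads address
# pi[t*P + p]. Two accesses hitting one bank in a cycle must serialize.

def count_conflicts(pi, P):
    """Count the cycles in which at least two parallel accesses collide."""
    N = len(pi)
    conflicts = 0
    for t in range(N // P):
        banks = [pi[t * P + p] % P for p in range(P)]
        if len(set(banks)) < P:      # fewer distinct banks than accesses
            conflicts += 1
    return conflicts

N, P = 40, 4
# "Architecture friendly" case: LTE QPP interleaver for K = 40.
qpp = [(3 * i + 10 * i * i) % N for i in range(N)]
print(count_conflicts(qpp, P))       # → 0 (contention-free)

# An arbitrary permutation, by contrast, collides.
bad = list(range(N))
bad[1], bad[4] = bad[4], bad[1]
print(count_conflicts(bad, P))       # → 2
```

Such conflicting cycles are exactly what forces either serialization hardware in the network or a smarter data-to-bank mapping.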
To tackle the memory conflict problem, three types of approaches are used in the
literature. In the first type, "architecture friendly" codes with good
error correction capabilities are constructed in order to reduce hardware cost. However, these codes create
new problems at the channel interleaver. In the second type, flexible and scalable
interconnection networks are introduced to handle memory conflicts at run time. However,
flexible networks suffer from large silicon area and increased latency. The third type
consists of design-time memory mapping approaches, in which the resulting architectures
use ROM blocks to store configuration bits. ROM blocks may be
sufficient for a parallel architecture that supports a single codeword or a single application.
However, for a hardware architecture that supports a complete standard or several
applications, ROM-based approaches result in huge hardware cost. Reducing this cost
requires optimizations that use as few ROMs as possible while supporting different applications.
In this thesis, we aim to design optimized parallel architectures. For this purpose, we
propose two categories of approaches. In the first category, we propose
two optimized design-time off-chip approaches that limit the cost of the final decoder
architecture through customization of the interconnection network and the use of an in-place memory
architecture.
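The in-place idea can be illustrated with a deliberately simplified sketch (our assumption-laden toy, not the thesis design): each half-iteration of a turbo decoder reads the extrinsic memory in some access order and writes the updated value back to the very cell it just read, so a single buffer of N cells replaces the usual double buffering of input and output.

```python
# Toy illustration of in-place memory updating. Assumed model: one
# buffer of N extrinsic values; every half-iteration visits each cell
# exactly once (natural or interleaved order) and overwrites it, so no
# second output buffer is needed.

def half_iteration_in_place(mem, order, update):
    """Traverse mem in the given access order, overwriting each cell."""
    for addr in order:
        mem[addr] = update(mem[addr])
    return mem

N = 8
mem = [0.0] * N                              # extrinsic values
pi = [(3 * i) % N for i in range(N)]         # toy interleaver (3 coprime to 8)
half_iteration_in_place(mem, range(N), lambda x: x + 1.0)   # natural order
half_iteration_in_place(mem, pi, lambda x: x / 2.0)         # interleaved order
print(mem)                                   # → [0.5, 0.5, ..., 0.5]
```

The difficulty addressed at design time is choosing the data-to-bank placement so that these in-place accesses stay conflict-free in both orders.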
In the second category, we introduce a new method in which runtime and
design-time approaches are merged to design a flexible decoder architecture. For this purpose,
we embed memory mapping algorithms on-chip so that they can be executed at runtime
to solve the conflict problem. The on-chip implementation replaces the multiple ROM blocks with a single RAM block in order to support multiple block lengths and/or multiple
applications. Different experiments are performed by executing the memory mapping approaches
on several embedded processors.
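A memory-mapping step of the kind that can run at design time or, as proposed here, be executed at runtime on an embedded processor can be sketched as follows. This is our illustrative stand-in, not the thesis algorithm: it builds the conflict graph (data read in the same cycle, in natural or interleaved order, must sit in different banks) and colors it with a first-fit greedy pass; real mapping algorithms use stronger techniques to guarantee exactly P banks.

```python
# Sketch of a conflict-graph-based memory mapping. Assumed access
# model: N data, P parallel accesses per cycle, read once in natural
# order and once through the permutation pi.

def map_to_banks(pi, P):
    """Return (bank assignment, number of banks used by the greedy pass)."""
    N = len(pi)
    cycles = [list(range(t * P, (t + 1) * P)) for t in range(N // P)]
    cycles += [[pi[t * P + p] for p in range(P)] for t in range(N // P)]
    adj = [set() for _ in range(N)]           # conflict graph
    for cyc in cycles:
        for a in cyc:
            adj[a].update(e for e in cyc if e != a)
    bank = [None] * N
    for e in range(N):                        # first-fit greedy coloring
        taken = {bank[n] for n in adj[e] if bank[n] is not None}
        bank[e] = next(b for b in range(N) if b not in taken)
    return bank, max(bank) + 1

N, P = 40, 4
pi = [(7 * i + 13) % N for i in range(N)]     # illustrative permutation
bank, n_banks = map_to_banks(pi, P)
print(n_banks)
```

Storing the resulting bank (and address) tables in a RAM, and recomputing them on-chip for each new block length, is what lets a single RAM block replace the many per-configuration ROMs.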
Résumé

Error correcting codes are widely used in domains ranging from automotive to wireless communications. The growing complexity of the implemented algorithms and the continuous increase of application data rates impose strong constraints on the design of hardware architectures. Such a component uses (1) processing elements, and (2) memories and data shuffling modules (turbo code interleaver/deinterleaver, space-time redundancy blocks of OFDM/MIMO systems…). The complexity and cost of these systems are very high; designers must nevertheless minimize the power consumption and the total area of the circuit while guaranteeing the required timing performance. In this context, we focus on optimizing the architectures of the data shuffling modules.
Various solutions have been proposed in the literature; our work focuses on defining memory mapping approaches that optimize the hardware cost of these architectures. We present two methodological approaches. First, we propose two memory mapping solutions applied at system design time: (1) memory mapping with network customization (called network relaxation); and (2) memory mapping guaranteeing an in-place placement of the data, in order to generate an optimized architecture.
Second, we present an approach based on executing the memory mapping algorithms directly in the system, through the integration of a dedicated hardware component.