next up previous
Next: About this document

large4

Christof Paar, ``Efficient VLSI Architectures for Bit Parallel Computation in Galois Fields,'' Ph.D., Institute for Experimental Mathematics, University of Essen, Germany, November 1994.

Advisor: Han Vinck

The thesis describes various efficient architectures for arithmetic in Galois fields of the type tex2html_wrap_inline22 . ``Efficient'' refers to the fact that the architectures require a small number of elementary gates that are logical AND and exclusive OR. All architectures are bit parallel. The work focuses on the basic operations: multiplication, constant multiplication, and inversion. The architectures are based on algorithms which make extensive use of the decomposition of fields tex2html_wrap_inline24 into tex2html_wrap_inline26 , the latter of which will be called composite fields.

Two efficient algorithms which are related to composite fields are developed. One algorithm finds matrices which map binary field representations to composite field representations. The second algorithms performs a fast test to determine whether a polynomial over tex2html_wrap_inline28 is primitive.

First, previous bit parallel architectures for fields tex2html_wrap_inline30 and composite fields are reviewed. We comment on some of the previous architectures. A suboptimum algorithm for constant multiplication with a reduced number of gates is introduced.

A general architecture for multiplication in composite fields is developed based on the Karatsuba-Ofman algorithm. The algorithm is closely investigated with respect to a parallel hardware implementation. It is shown that multiplication of two polynomials of degree less than m over tex2html_wrap_inline34 is of order tex2html_wrap_inline36 . Through an exhaustive search, primitive polynomials are determined which perform modulo reduction with low complexity. Detailed descriptions of efficient parallel multipliers for field orders tex2html_wrap_inline38 are given.

It is shown that for certain field orders a combined optimization of the polynomial multiplication and modulo reduction further improves the gate count and the delay of multiplier architectures. The gate counts achieved for some field orders are the lowest ones reported in technical literature.

A comparative synthesis maps four different types of multiplier architectures to the gate-array library of the TC 160G family. It is found that the improved theoretical gate count of composite field multipliers results in netlists with an improved number of gate equivalences if gate-arrays are used. A performance estimation of the composite field multipliers results in a data throughput of up to 3.8 Gbit/sec.

An algorithm from Itoh and Tsujii for inversion over composite fields is applied to elements in standard base representation. A relationship between this algorithm and an architecture over tower fields proposed by Morii and Kasahara is developed. For the fields tex2html_wrap_inline40 and tex2html_wrap_inline42 architectures for parallel inverters with moderate gate count are provided.

A new combined software/hardware approach for systems involving finite field arithmetic is proposed. As an application, a composite field multiplier over tex2html_wrap_inline44 is attached to a 16 bit DSP. We implemented a shortened (10,8) Reed-Solomon code which allows decoding at a speed of up to 1.9 Mbit/sec.

For more information contact the author at: ECE Department, Worcester Polytechnic Institute, Worcester, MA 01609, USA, email: christof@ece.wpi.edu





next up previous
Next: About this document



Ramesh Rao
Mon Jan 22 14:41:29 PST 1996