

# **Technical Backgrounder**



| Contents                                     |
|----------------------------------------------|
| Introduction                                 |
| What is DSP?                                 |
| The Broadband Revolution – DSP Challenges    |
| Using FPGAs for High-Performance DSP         |
| The Xilinx XtremeDSP <sup>™</sup> Initiative |
| The Xilinx Commitment to DSP                 |
| Further Information                          |
| DSP Glossary                                 |

## Introduction

This technical backgrounder explains the industry dynamics that are driving the need for high-performance Digital Signal Processors (DSPs). It starts by explaining why digital signal processing (DSP) has become widespread and highlights some typical applications. The document then outlines some of the challenges that lie ahead for DSPs and explains why Xilinx is uniquely positioned to address these challenges through its XtremeDSP Initiative.

## What is DSP?

DSP essentially involves the processing of signals by digital means. Signals can be either digital or analog. In the latter case, before the signal can be processed digitally, it must be converted into digital form using an analog-to-digital converter (ADC) of some kind. After processing, the signal is converted back into analog form using a digital-to-analog converter (DAC).
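As a rough illustration of the conversion step, the sketch below quantizes an analog sample to a signed integer code and maps it back. The function names `adc` and `dac`, the 8-bit resolution and the ±1.0 full-scale range are assumptions chosen for illustration, not properties of any particular converter.

```python
def adc(voltage, bits=8, full_scale=1.0):
    """Map a voltage in [-full_scale, +full_scale] to a signed integer code."""
    levels = 2 ** (bits - 1)                  # 128 steps per polarity for 8 bits
    code = round(voltage / full_scale * levels)
    # Clip to the range a two's-complement converter can represent.
    return max(-levels, min(levels - 1, code))


def dac(code, bits=8, full_scale=1.0):
    """Map a signed integer code back to a voltage."""
    levels = 2 ** (bits - 1)
    return code / levels * full_scale


sample = 0.3
code = adc(sample)
restored = dac(code)
print(code, restored)
```

The restored value differs from the original by at most half a quantization step, which is the price of representing the signal digitally.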

There are a number of reasons to process signals digitally. One is to minimize the "electrical noise" that a signal may have acquired as a result of interference.

A very common DSP function is a filter. A signal is passed through a filter to select certain frequencies of the signal, while rejecting others. Filtering was traditionally done using analog filters but over the years, DSPs have demonstrated a superior ability to perform this task, because of better control, higher quality and lower production cost. A typical digital implementation of a filter is a Finite Impulse Response (FIR) filter.

Another common DSP example is the Fast Fourier Transform (FFT). The FFT is a fast algorithm that translates a time-domain signal to its frequency-domain representation and vice versa. A time-domain signal can be plotted with time on the x-axis and amplitude on the y-axis, while a frequency-domain signal can be plotted with frequency on the x-axis and amplitude on the y-axis.
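To make the time-to-frequency translation concrete, here is a minimal sketch of a recursive radix-2 FFT using only the Python standard library. The function name `fft` and the eight-sample test signal are illustrative assumptions; production designs would use an optimized library or hardware core.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return list(samples)
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


# Time-domain signal: 8 samples of a sine wave completing 2 cycles.
n = 8
signal = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
magnitudes = [abs(x) for x in fft(signal)]
# Search only the first half of the spectrum (the second half mirrors it).
peak_bin = max(range(n // 2), key=lambda k: magnitudes[k])
print(peak_bin)
```

Because the sine wave completes two cycles across the eight samples, the largest magnitude falls in frequency bin 2.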

Important uses of DSP also include signal compression and decompression. In DVD players, for example, the video signal on the DVD is typically compressed (to increase storage capacity) and must be decompressed before the recorded signal can be reproduced.

DSP technology has gained considerable acceptance in the electronics industry over the last ten years and today it is widely used in applications such as mobile phones, multimedia computers, CD players, hard disc drive controllers and DSL modems.

DSPs are high-performance processors. Although DSP architectures have some similarities with microprocessor architectures, DSPs are specifically designed to carry out mathematical operations such as multiply and accumulate. As a result, DSPs can process millions of samples per second.

The worldwide market for DSPs (i.e. general purpose DSPs, DSP core-based ASSPs/ASICs etc.) has grown to reach over \$11 billion and is projected, by leading industry analysts, to grow at a CAGR of 34% between now and 2004. However, the industry is expected to face new challenges that will present opportunities for new products and new suppliers.

## The Broadband Revolution - DSP Challenges

Changes in the market environment are projected to alter the way DSP is implemented over the next few years. Most notably, it is the broadband revolution that is expected to present the greatest challenge.

The broadband revolution has been initiated by the convergence of many technology disciplines that were traditionally separate. Examples include computing, telecom/wireless, video/imaging and networking. Figure 1 highlights some of the new applications that are emerging as a result of this convergence.



The impact of this convergence is that the amount of analog and digital data that is transceived by these emerging applications is likely to increase exponentially. This in turn will amplify the need for DSPs that can process signals at rates far beyond the capability of the conventional DSPs in mass production today. Even if Moore's Law is applied to the fastest DSPs available today, there will clearly be an ever-widening gap between the level of performance that will be needed and the performance capability of DSPs (see Figure 2). It is therefore evident that a new way of processing data will be needed if DSPs are to meet the challenge laid down by the broadband revolution.



Additionally, in today's fast changing market place, products take less time to get to market, and as competing products become available at a quicker pace, products tend to stay in volume for a shorter period. Hence time-to-market, a message that FPGA vendors have evangelised, has become even more crucial. Whereas in the past being late to market may have meant lower profits, today it often means no profits and in some cases business failure!

What this means is that DSPs need to become flexible enough to meet ever-shortening time-to-market demands.

## Using FPGAs for High-Performance DSP

Traditionally, DSPs from mainstream manufacturers such as Texas Instruments, Analog Devices and Motorola have been available as general-purpose products or as application specific standard products (ASSPs). General-purpose DSPs have the benefit of being usable in a wide range of products and applications by virtue of their programmability. Application specific products, however, are targeted at a particular application such as an ADSL modem (ASSP with DSP core) or at specific customers (ASIC with DSP core). In some applications where only a low level of performance is required, even microprocessors have been used for DSP.

Conventional general-purpose DSPs utilize a common architecture, known as the Von Neumann architecture, or extensions thereof. This type of architecture is serial by nature and performance is restricted as a result. Multiply and Accumulate (MAC) units within conventional DSPs are typically shared resources.

An FIR filter processes an incoming signal by multiplying each data sample of the signal by several constant values (coefficients) and then accumulating the results. The more MAC operations that are applied to the incoming signal, the higher the accuracy of the filter. For example, a 256-tap FIR filter requires 256 MAC operations on each data sample before the next sample can be processed (see Figure 3).
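The multiply-and-accumulate loop described above can be sketched in a few lines. This is a minimal software model, not a hardware implementation; the function name `fir_filter` and the 4-tap moving-average coefficients are hypothetical values chosen to keep the example short.

```python
def fir_filter(samples, coefficients):
    """Direct-form FIR filter: one MAC per tap for every input sample."""
    delay_line = [0.0] * len(coefficients)
    output = []
    for sample in samples:
        # Shift the new sample into the delay line.
        delay_line = [sample] + delay_line[:-1]
        accumulator = 0.0
        for coeff, delayed in zip(coefficients, delay_line):
            accumulator += coeff * delayed    # one multiply-accumulate
        output.append(accumulator)
    return output


# Hypothetical 4-tap moving-average filter smoothing a step input.
taps = [0.25, 0.25, 0.25, 0.25]
print(fir_filter([1.0, 1.0, 1.0, 1.0, 1.0], taps))
# prints [0.25, 0.5, 0.75, 1.0, 1.0]
```

The inner loop runs once per tap, so a 256-tap filter executes it 256 times for every sample when only one MAC unit is available.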



FPGA vendors have also offered DSP functionality on their products. For example, many cellular base stations around the world use Xilinx Virtex-E FPGAs. Cellular base stations process tremendous amounts of data in order to connect people, most of which is done using DSP of some kind.

The performance of FPGAs already exceeds 128 billion MACs per second, which is significantly higher than that of conventional DSPs available from mainstream suppliers. With Virtex-II, performance will reach 600 billion MACs per second. FPGAs achieve this by utilizing a more parallel architecture in order to process the incoming signal (see Figure 4).

Using this architecture, each sample in the 256 Tap FIR filter example can be processed in a single clock cycle, hence significantly improving the performance and efficiency of the DSP.
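The difference between the serial and parallel approaches can be put in rough numbers. The sketch below assumes that each MAC unit completes one multiply-accumulate per clock cycle; the helper name `max_sample_rate` and the 100 MHz clock are illustrative assumptions, not datasheet figures.

```python
def max_sample_rate(clock_hz, taps, mac_units):
    """Samples per second when `taps` MACs per sample share `mac_units` units.

    Assumes each MAC unit completes one multiply-accumulate per clock cycle.
    """
    cycles_per_sample = -(-taps // mac_units)   # ceiling division
    return clock_hz / cycles_per_sample


clock = 100e6   # assumed 100 MHz clock for both cases
serial = max_sample_rate(clock, taps=256, mac_units=1)
parallel = max_sample_rate(clock, taps=256, mac_units=256)
print(serial, parallel)
```

Under these assumptions the fully parallel filter sustains 256 times the sample rate of the single shared MAC unit at the same clock frequency.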



With access to both state of the art FPGA and DSP technologies and leading edge development tools, Xilinx, the worldwide leader in programmable logic, is uniquely positioned to meet the broadband challenge head on. Additionally, Xilinx is already an entrenched supplier to leading electronics OEMs in each of the technology disciplines identified in Figure 1.

In November 2000, Xilinx unveiled a worldwide DSP initiative to meet the performance requirements of tomorrow's products.

## The Xilinx XtremeDSP<sup>™</sup> Initiative

The Xilinx XtremeDSP initiative is geared to meet the high-performance challenge of the broadband revolution. However, performance is only one aspect of the initiative. Other attributes address the need to have flexibility to optimize DSP designs for important constraints such as area and system frequency. The XtremeDSP Initiative also includes the announcement of development tools that bridge the gap that has traditionally existed between DSP and FPGA design methodologies. The next three sections describe the main components of the Xilinx XtremeDSP initiative.

#### Xtreme Performance

There are a number of unique features that are offered on the Xilinx Virtex-II family that enhance DSP performance. These include:

- A 10 million gate FPGA fabric that enables tremendous amounts of parallel processing.
- Up to 192 (18x18) embedded multipliers that operate at up to 250 MHz depending on word size, with zero latency for fast multiplication in minimal silicon area.
- Up to 3.5 Mbit of true dual-port block RAM for optimized data buffering and storage.

Architectural enhancements in the new Xilinx FPGA family, Virtex-II, will give Xilinx a unique advantage in performing computationally intensive algorithms. Benchmarks are available that show Xilinx FPGAs running over 100 times faster than the industry's fastest DSPs (see Figure 5). As a result, a single FPGA will often replace so-called DSP farms, which are arrays of DSP processors.

Figure 5

| Function                                                            | Industry's Fastest<br>DSP Processor<br>Core | Xilinx<br>Virtex-E -08 | Xilinx<br>Virtex-II   |
|---------------------------------------------------------------------|---------------------------------------------|------------------------|-----------------------|
| MACs per second<br>- Multiply and accumulate<br>- 8x8 bit           | 4.4 Billion                                 | 128 Billion            | 600 Billion           |
| FIR Filter<br>- 256-tap, linear phase<br>- 16-bit data/coefficients | 17 MSPS<br>@1.1 GHz                         | 160 MSPS<br>@160 MHz   | 180 MSPS<br>@180 MHz  |
| FFT<br>- 1024 point, complex data<br>- 16-bit real & imag. comp.    | 7.7 μs<br>@800 MHz                          | 41 μs<br>@100 MHz      | &lt;1 μs<br>@140 MHz  |

#### **Xtreme Flexibility**

In addition to the traditional flexibility benefits offered by SRAM-based FPGAs such as the ability to be changed easily and upgraded whilst in the field, more features have been added as part of the XtremeDSP initiative to enhance architectural flexibility.

Two such features are distributed resources and segmented routing (see Figure 6).

- **Distributed DSP Resources** Enable designers to tailor a DSP architecture based on algorithm and performance requirements in order to optimize the design for area/cost or performance/system frequency.
- **Segmented Routing** (using ActiveInterconnect<sup>™</sup>) Consistently delivers the highest performance independent of device size and other system functions integrated onto the FPGA.

Figure 6



Because of their flexibility and time-to-market advantages over ASSPs/ASICs, FPGAs will inevitably replace these components in many high-performance DSP applications.

#### **Xtreme Productivity**

Perhaps the most exciting aspects of the Xilinx XtremeDSP initiative are those that enhance productivity. As part of the Xilinx XtremeDSP initiative, Xilinx and its partners have made significant advances to dramatically shorten the FPGA DSP design cycle. Two of the most interesting advances include the MathWorks System Generator for Simulink and the Filter Generator.

System Generator for Simulink – Traditionally the design flows for FPGAs and DSPs have been very different. Utilizing its partnership with The MathWorks, Xilinx has developed a System Generator that bridges the gap between FPGA and conventional DSP design flows (see Figure 7).





Filter Generator – As previously mentioned, digital filtering is one of the most common uses for DSPs, and a number of DSP suppliers offer filter generators that speed up the otherwise lengthy process of filter design. However, the FPGA filter generators on the market today can only generate filters that are either fully serial (the most area-efficient but lowest-performance implementations) or fully parallel (the highest-performance but least area-efficient implementations). Most DSP designs require neither a fully serial nor a fully parallel implementation but lie somewhere in between. The Xilinx Filter Generator is constraint driven and can generate the most area-efficient implementation for a given performance level, or degree of parallelism (see Figure 8).
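The idea of choosing a degree of parallelism from a performance constraint can be illustrated with simple arithmetic. The sketch below is not the Xilinx Filter Generator algorithm; it is a hypothetical helper, `mac_units_needed`, that assumes one MAC per unit per clock cycle and ignores pipeline overhead.

```python
import math


def mac_units_needed(taps, sample_rate_hz, clock_hz):
    """Smallest number of MAC units that sustains the target sample rate.

    Assumes one MAC per unit per clock cycle and ignores pipeline overhead.
    """
    macs_per_second = taps * sample_rate_hz
    return math.ceil(macs_per_second / clock_hz)


# Hypothetical 256-tap filter on a 100 MHz clock at three sample rates.
for rate in (390_625, 10e6, 100e6):
    print(rate, mac_units_needed(256, rate, 100e6))
```

Under these assumptions the lowest rate needs a single MAC unit (fully serial), the highest needs 256 (fully parallel), and the 10 MSPS case needs 26, which is the kind of in-between design the paragraph above describes.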



To enhance productivity, Xilinx has also added eleven new optimized DSP algorithms (see Figure 9) and a power estimator tool (Xpower<sup>™</sup>) that makes low-power implementations easier, and has added new features to the ChipScope ILA tool to reduce debugging time.

In addition to the eleven new algorithms, Xilinx provides a large library of DSP building blocks to enhance engineering productivity. A complete list can be found on the Xilinx IP Center at www.xilinx.com/ipcenter.

Figure 9: Algorithms

| Product                           | Application                       | Part Number                   | P    |
|-----------------------------------|-----------------------------------|-------------------------------|------|
| Filter Generator                  | General DSP                       | Included with Xilinx SW tools |      |
| Multiplier Generator              | General DSP                       | Included with Xilinx SW tools |      |
| Parameterized MAC                 | General DSP                       | Included with Xilinx SW tools |      |
| DCT, iDCT                         | Image Processing                  | DO-DI-FIDCT                   | \$1! |
| G.711 PCM Codec                   | Compression & Expansion of Speech | DO-DI-G711                    | \$7  |
| G.711 PCM Compressor              | Compression & Expansion of Speech | DO-DI-G711C                   | \$4  |
| G.711 PCM Expander                | Compression & Expansion of Speech | DO-DI-G711E                   | \$4  |
| Color Space Converter - RGB2YCrCb | Image Processing                  | DO-DI-RGB2YCRCB               | \$1  |
| Color Space Converter - YCrCb2RGB | Image Processing                  | DO-DI-YCRCB2RGB               | \$1  |
| Color Space Converter - RGB2YUV   | Image Processing                  | DO-DI-RGB2YUV                 | \$1  |
| Color Space Converter - YUV2RGB   | Image Processing                  | DO-DI-YUV2RGB                 | \$1  |

## The Xilinx Commitment to DSP

Xilinx is uniquely positioned to meet the DSP performance requirements of emerging applications in the broadband revolution. The XtremeDSP Initiative paves the way for a major company-wide, long-term commitment to the DSP market. In addition to embarking on a roadmap towards high-level language FPGA DSP design (see Figure 10), the company will shortly unveil new products. It is also developing a University DSP program and a DSP design services program, and is enhancing its unique position through key acquisitions.





## Further Information

For more information on the Xilinx XtremeDSP Initiative or products, please visit the Xilinx DSP website or contact your local Xilinx representative/distributor.

Website: www.xilinx.com/dsp

## DSP Glossary

#### Algorithm

A fixed set of mathematical calculations and procedures incorporated in software, which programs a device to perform a specific function.

#### ASIC

Application Specific Integrated Circuit (usually designed for one customer)

#### ASSP

Application Specific Standard Product (usually used by many customers)

#### **Base Station**

Central transmitter in a communications system that acts as the cell hub for communicating with mobile terminals.

#### Bit

Binary Digit represented as a 1 or 0.

#### Byte

A group of 8 bits.

#### DSP

Digital Signal Processing – The method of processing an electronic signal digitally.

#### DSPs

Digital Signal Processors – ICs that are optimized for DSP. Often based on a Von Neumann architecture.

#### FFT

Fast Fourier Transform. A computationally efficient mathematical technique that converts digital information from the time domain to the frequency domain for rapid spectral analysis.

#### Filter

A circuit that passes frequencies within a specified frequency range and attenuates frequencies outside that frequency range.

#### **FIR Filter**

Finite Impulse Response Filter. The impulse response is "finite" because there is no feedback in the filter. If a single impulse is entered into the filter (that is, a single "1" sample followed by many "0" samples), zeroes will eventually come out after the "1" sample has made its way in the delay line past all the coefficients.
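The finite response is easy to observe in a small software model. The sketch below is an illustration only; the function name `fir_response` and the three coefficient values are hypothetical.

```python
def fir_response(samples, coefficients):
    """Output y[n] = sum over k of coefficients[k] * x[n - k] (no feedback)."""
    return [
        sum(c * samples[n - k] for k, c in enumerate(coefficients) if n - k >= 0)
        for n in range(len(samples))
    ]


# Hypothetical 3-tap filter fed a single "1" followed by zeroes.
coefficients = [0.5, 0.3, 0.2]
impulse = [1, 0, 0, 0, 0, 0]
print(fir_response(impulse, coefficients))
# prints [0.5, 0.3, 0.2, 0.0, 0.0, 0.0]
```

The output reproduces the coefficients one by one and then settles to zero once the impulse has passed the last tap.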

#### FPGA

Field Programmable Gate Array

#### HDL

Hardware Description Language used for describing IC designs (e.g. Verilog HDL).

#### HLL

High Level Language used for developing code to program DSPs (e.g. C++).

#### MAC

A Multiply and ACcumulate is the operation of multiplying a coefficient by the corresponding delayed data sample and accumulating the result. Most DSP microprocessors implement the MAC operation in a single instruction cycle.

#### TAP

A digital filter processes a signal by multiplying each data sample by several coefficients. Each multiplication of a data sample and a coefficient is called a TAP. The coefficient values, the bit width and the number of TAPs determine the characteristics of the filter.