A Video Optical Interface Architecture for Neural Network Image Processing

Christopher L. Forthman University Undergraduate Research Fellow, 1994-95 Texas A&M University Department of Electrical Engineering

APPROVED: Undergraduate Advisor; Exec. Dir., Honors Program

To the memory of my friends,

Joel Aaron Johnson (7/22/73 - 3/16/95) & Geneva JoAnne Peltier Johnson (9/6/72 - 3/16/95)

Texas Aggies - Class of 1995

## Contents

| Section    | Title |
|------------|-------|
| I.         | Introduction |
| II.        | Optical Imaging<br>A. Charge-Coupled Devices<br>B. Discrete Photodiodes |
| III.       | Neural Network Image Processing<br>A. History<br>B. Theory<br>C. Applications<br>D. The Universal Chip<br>E. Contrast Sensitivity using an Auto-zero Scheme<br>F. Contrast Sensitivity using a Silicon Retina<br>G. Summary |
| IV.        | A Video Interface Architecture<br>A. Overview<br>B. IC Design<br>C. Simulations<br>D. Layout<br>E. Control Interface<br>F. Experimental Results<br>G. Conclusions<br>H. Acknowledgments |
| References |       |

# List of Figures

| No. | Figure                                        |
|-----|-----------------------------------------------|
| 1.  | Charge generating and storage device of a CCD |
| 2.  | Charge transfer in a CCD                      |
| 3.  | Profile of a p-n junction                     |
| 4.  | Two-dimensional CNN                           |
| 5.  | CNN cell schematic                            |
| 6.  | CNN image processing scenario                 |
| 7.  | CNN universal chip cell schematic             |
| 8.  | CMOS compatible Darlington photosensor        |
| 9.  | Photosensor with autozero scheme              |
| 10. | Retina model and VLSI implementation          |
| 11. | Video architecture components                 |
| 12. | 3 X 3 optoarray video architecture            |
| 13. | 3 X 3 chip image with output                  |
| 14. | 3 X 3 optoarray simulation                    |
| 15. | Pixel layout                                  |
| 16. | Cell layout                                   |
| 17. | Optoarray layout                              |
| 18. | Optoarray interface block diagram             |
| 19. | External interface schematics                 |
| 20. | 15 X 21 optoarray timing diagram              |
| 21. | TTL state machine implementation              |
| 22. | Circuit board mask                            |

## I. Introduction

Charge-coupled device (CCD) image sensors have evolved quickly over the past twenty years. In the process, they have transformed the world of video technology [19]. These analog integrated circuits rapidly convert spatial distributions of radiation (i.e. optical images) into electronic output. The output is in the form of a time-distributed voltage signal which can be digitally processed, modulated, and broadcast. Likewise, the signal can be demodulated and reconstructed on a television screen. As a result, CCD's have seen remarkable success in practically every type of TV camera as well as in single-field image acquisition systems [18].

Nevertheless, the need for real-time image processing cannot be met by CCD's. Standard CCD's rely on external digital processing engines to filter out the relative illumination level (i.e. contrast) and to extract edges, detect motion, etc. [18]. Consequently, valuable time is wasted transferring the image from the CCD to the DSP. In the early eighties the need for high-speed real-time processing of images led to the implementation of neural networks and cellular automata, two analog information-processing systems [6]. Neural networks provided the advantage of processing signals in real time, and cellular automata introduced the concept of interconnected circuit clones. From the start these architectures demonstrated the real-time parallel processing capabilities of networks, an ability which has recently seen remarkable success in processing images [7,8,12,13,14].

Within the last five years a vast breadth of literature has been published on advances in analog image processing tasks such as real-time machine vision, robotics, motion detection, range finding, etc. In fact, a specific neural network architecture, the CNN (cellular neural network), has been used effectively as an analog image processing computer [5]. Other image processing architectures, such as the silicon controlled retina, have also been successfully implemented [1,2,3]. Still, the CCD's used in current image capturing devices perform faster and more reliably than the experimental neural networks. Indeed, no CNN image processor to date can function at the high frequencies needed for video acquisition. But as feature sizes shrink and manufacturing technologies improve, the exploitation of CNN processors for video applications is inevitable.

In the near future the prospect of using CNN processors in conjunction with CCD video imagers is good. CCD's may soon have on-chip neural networks capable of performing quick and valuable processing of the image before it is scanned. In spite of that, it is unlikely that the first dual imager/processor chips will use charge coupling. Instead, the first video imaging/processing chip will probably be a pin-for-pin neural network replacement for an existing CCD product. Rather than integrate the charge storage and transfer methods of CCD's, the prototype chip will likely retain the isolated photoconducting pixels associated with current neural network architectures. This scenario will allow the network to be thoroughly tested as a video imager/processor. Afterwards, a scheme will be designed to integrate neural network architecture with the channel MOS structure associated with CCD's. Thus CCD charge transfer methods will replace the discrete pixels and allow the chip to be fully compatible with raster scanning.

The incorporation of analog processors into video technologies will require an interface which is compatible with existing CCD chip designs. This paper describes an initial video interface architecture which has been designed for current neural network processing systems. The on-chip interface translates a processed image from a network array to a serial digital data stream. The data is then carried off chip to a computer screen for display. This paper details the image collecting and processing features and notes the utility of this video interface as an important step towards a dual imaging/processing IC.

The next two sections give the reader the relevant image processing background. Section II describes optical imaging by looking at both the charge-coupled device and CMOS photodetecting pixels. Section III describes image processing with "smart-pixel" neural networks. In section IV the video interface is explained, including the CMOS chip design and the hardware and software necessary for test and operation. Conclusions and acknowledgments follow.

## **II. Optical Imaging**

#### Overview

The evolution of monolithic systems for image processing has seen the success of two important building blocks. First, charge-coupled devices have been widely used as video imagers [19]. Second, neural network algorithms have been successfully implemented as VLSI image processors [4]. This section discusses the imaging techniques used in CCD's as well as those used in neural network applications. Together these devices have the potential to perform video acquisition and processing simultaneously. First, however, we must fully understand their operation.

#### **Charge-Coupled Devices**

The most widely used method of gathering optical data is the charge-coupled device. In essence, a CCD is made up of metal-insulator-semiconductor junctions operating in the deep depletion mode. This regime is a special case of semiconductor inversion which never reaches thermal equilibrium. A voltage applied to the metal electrode will repel electrons and create a depletion region similar to that in a reverse-biased p-n junction. The time that it takes electrons to leave the semiconductor region is much less than the time it takes thermally generated holes to congregate in the potential well. As a result, the well exists long enough for external charge to be introduced to the region. Since the MOS junctions are placed adjacent to one another, the charge can be transferred in an analog shift register fashion.

Consider the case of several adjacent electrodes like that shown in Fig. 1. Each electrode is a MOS junction which can be independently biased. Additionally, each MOS junction acts as a photodiode generating charge in proportion to the incident light energy.



Figure 1 The charge generating and storage device of a CCD

In Fig. 2 a three-electrode charge transfer scheme is shown. Suppose at t=0 there is a potential applied to the first electrode. At this time photogenerated holes will migrate to the potential well. In order to transfer this charge to the second MOS device, a potential well must be induced under the second electrode while the original potential well is being reduced. This is done by applying a potential to the second electrode while gradually decreasing the potential on the first electrode. The same mechanism is used to pass the charge on to the third electrode and eventually to an output stage. The miracle of video is that a large array of photogenerated charge can be quickly transferred off of the CCD by applying appropriate voltages to adjacent electrodes.

Figure 2 Charge transfer in a CCD



Realistically, a typical CCD contains thousands of adjacent pixels aligned in a compact 2-D array. The structure is highly efficient and can even be made immune to noise by using buried channels (i.e. P+ implant). Unfortunately, the continuous transfer scheme used in CCD's coupled with the need for high density CCD arrays will not accommodate analog processing circuitry at each pixel. Until this problem is solved the discrete pixel imaging technique of CNN's must be used in analog image processors.
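
The analog shift register behavior of the three-electrode transfer scheme can be illustrated with a short sketch. It assumes ideal transfer (no charge is lost between electrodes) and represents each potential well by a number in a list; the function names `transfer_step` and `read_out` and the packet values are hypothetical and serve only as an illustration.

```python
def transfer_step(wells):
    """Move every charge packet one electrode toward the output stage.

    wells[i] is the charge held under electrode i. The packet under the
    last electrode leaves the register and is returned as the output sample.
    Ideal (lossless) transfer is assumed.
    """
    out = wells[-1]
    shifted = [0.0] + wells[:-1]
    return shifted, out


def read_out(wells):
    """Clock the register until it is empty; return samples in arrival order."""
    samples = []
    for _ in range(len(wells)):
        wells, out = transfer_step(wells)
        samples.append(out)
    return samples


# Two pixels, three electrodes per pixel: charge sits under every third electrode.
packets = [3.0, 0.0, 0.0, 1.5, 0.0, 0.0]
print(read_out(packets))  # [0.0, 0.0, 1.5, 0.0, 0.0, 3.0]
```

The packet nearest the output arrives first, and the spatial charge pattern becomes a time-distributed signal, which is the property that makes the CCD suitable for video.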

#### **Discrete Photodiodes**

Neural network image processors are a special type of optoelectronic integrated circuit. Their hybrid nature provides the advantages of both technologies including high-speed parallel optical computing and data transmission with reduced crosstalk, high-density logic, and standard interfaces. Recently, optoelectronic CMOS memory circuits and neural networks have been successfully tested. In both cases the optical detection was accomplished by discrete photodetectors. These pixels are placed in repeated cells as light dependent current sources. Fig. 3 shows a typical photodevice. As with the CCD, light detection occurs by photoconduction. Photons incident on the semiconductor generate extra carriers whose density is proportional to the input light.

The photogenerated carriers of these photosensors have many practical applications. Often they are used in interface circuitry with optical cables or in optical memory devices [15]. For image processing purposes, however, these devices provide the inputs to the interconnected cells of neural network imagers. The network can then process the image.



Figure 3 Profile of a p-n junction photodiode in a p-well CMOS fabrication process

## **III.** Neural Network Image Processing

#### History

In 1988 Leon O. Chua, an IEEE Fellow, combined the best features of neural networks and cellular automata into a new architecture called the cellular neural network (CNN). His groundbreaking papers, "Cellular neural networks: Theory" and "Cellular neural networks: Applications," suggested several general uses for CNN's including pattern recognition and computer vision. Two years later, Chua along with several other researchers elaborated on the applications of CNN's in "CNN Cloning Template: Connected Component Detector" [12], "CNN Cloning Template: Hole Filler" [13], and "Image Thinning with a Cellular Neural Network" [14]. Soon after, Chua and J. M. Cruz introduced the first monolithic CNN chip for connected component detection.

#### Theory

The fundamental unit of most neural networks is the cell, a self-contained circuit connected via inputs and outputs to its neighbors. Fig. 4a shows a 2-D 5x5 CNN. Each cell in the CNN is represented by a square. The r-neighborhood of a cell C(i,j) consists of all the cells within a distance 'r' of C(i,j). The simplest neighborhood, and the one used most frequently in CNN development, is the r=1 neighborhood shown in Fig. 4b. Three variables are associated with each cell: u is the input, x is the cell state, and y is the cell output. The summation of the inputs and outputs of interconnected neighbor cells (see Fig. 4c) and a cell's own positive feedback determine the input and output variables. The state variable follows from circuit analysis. Additionally, the weights associated with the input control and feedback summation are programmable.
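
As a small illustration of the r-neighborhood, the sketch below lists the cells that belong to it. It assumes that "distance" is measured as the larger of the row and column offsets, so that the r=1 neighborhood of an interior cell is the 3x3 block around it, and that cells are indexed from 1 as in Fig. 4. The function name `r_neighborhood` is hypothetical.

```python
def r_neighborhood(i, j, r, rows, cols):
    """Return the (row, column) indices of all cells within distance r of C(i,j).

    Distance is taken as max(|k - i|, |l - j|) (an assumption), indices are
    1-based, and cells outside the array are simply left out.
    """
    return [(k, l)
            for k in range(max(1, i - r), min(rows, i + r) + 1)
            for l in range(max(1, j - r), min(cols, j + r) + 1)]


# r=1 neighborhood of C(3,3) in a 5x5 array: the cell itself plus 8 neighbors.
print(len(r_neighborhood(3, 3, 1, 5, 5)))  # 9
# A corner cell has a truncated neighborhood.
print(len(r_neighborhood(1, 1, 1, 5, 5)))  # 4
```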



**Figure 4** A 2-D 5X5 CNN is shown in (a) with C(i,j) denoted in each cell. The r=1 neighborhood of C(3,3) can be seen in (b). The input control and feedback interactions are indicated in a 3X3 CNN in (c).

The cell is implemented with capacitors, resistors, independent sources, and linear and nonlinear voltage controlled current sources. Through simple analysis a circuit equation can be derived for a cell. For example, the state equation for the circuit in Fig. 5 is:

$$C\frac{dv_{xij}(t)}{dt} = -\frac{1}{R_x}v_{xij}(t) + \sum_{C(k,l)\in N_r(i,j)}A(i,j;k,l)\,v_{ykl}(t) + \sum_{C(k,l)\in N_r(i,j)}B(i,j;k,l)\,v_{ukl} + I$$

where A and B are the feedback and input control operators determined by $I_{xy}$ and $I_{xu}$ respectively, I is the independent current source, and $N_r(i,j)$ is the r-neighborhood of C(i,j). The set of all such nonlinear differential equations, one per cell, characterizes the CNN. These equations can be modified by changing the feedback and control operators, A and B. Thus there is a certain programmability to the CNN which allows for the various applications that Chua and others have explored [4,5,7,8,11,12,13,14].
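
To make the role of A, B, and I concrete, the state equation can be integrated numerically. The sketch below is an illustration under several assumptions that are not stated in the text: forward-Euler integration, a space-invariant 3x3 template (A and B depend only on the offset between cells), zero contribution from cells outside the array, and the piecewise-linear saturating output $y = \frac{1}{2}(|x+1| - |x-1|)$ commonly used for CNN cells. All function names and parameter values are hypothetical.

```python
def output(x):
    """Assumed piecewise-linear cell output: y = 0.5 * (|x + 1| - |x - 1|)."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def cnn_step(x, u, A, B, I, C=1.0, Rx=1.0, dt=0.05):
    """Advance every cell state by one forward-Euler step of the state equation."""
    rows, cols = len(x), len(x[0])
    y = [[output(v) for v in row] for row in x]
    new_x = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = -x[i][j] / Rx + I
            # Sum feedback (A) and control (B) terms over the r=1 neighborhood.
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    k, l = i + di, j + dj
                    if 0 <= k < rows and 0 <= l < cols:
                        total += A[di + 1][dj + 1] * y[k][l]
                        total += B[di + 1][dj + 1] * u[k][l]
            new_x[i][j] = x[i][j] + dt * total / C
    return new_x


def simulate(x0, u, A, B, I, steps=400):
    """Integrate from initial state x0 and return the final cell outputs."""
    x = [row[:] for row in x0]
    for _ in range(steps):
        x = cnn_step(x, u, A, B, I)
    return [[output(v) for v in row] for row in x]


# Self-feedback only: each cell saturates to the sign of its initial state.
A = [[0, 0, 0], [0, 2, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
x0 = [[0.1, -0.1], [-0.3, 0.2]]
u = [[0.0, 0.0], [0.0, 0.0]]
print(simulate(x0, u, A, B, I=0.0))
```

With this template each cell settles to a binary output of +1 or -1, which illustrates how an analog input is mapped to a binary output image. Changing the entries of A and B changes the dynamic rule without changing the circuit.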



Figure 5 Chua's original CNN cell schematic including an input voltage, linear circuit components, and a nonlinear output voltage-controlled current source, I<sub>yx</sub>.

#### Applications

Chua published a paper on CNN applications simultaneously with his theoretical description of the neural networks [7]. Whereas the theoretical description emphasized the steady-state behavior of CNN's, the paper on applications highlighted their transient behavior. The fact that CNN's settle to equilibrium in a dynamic fashion makes it possible to extract features from a picture. This behavior also allows for various other image processing tasks.

In essence, any analog input image to a CNN can be mapped into a specific output image with binary values. The output image will vary depending upon the dynamic rule of the particular CNN. For example, a CNN may be implemented with a dynamic rule which gives the circuit the ability to recognize and extract certain patterns from input images.

Chua originally suggested the use of CNN's for noise filtering and feature extraction. He showed that CNN's are effective for removing noise in image processing as long as the objects are relatively large and contain few corners. Unfortunately, CNN's suffer from the same problem experienced by two-dimensional low-pass filters (i.e. the high frequency components which delineate sharp corners are filtered out with the noise). In feature extraction applications the first simulated CNN's performed extremely well. They performed the same task that digital processors do but in less time, usually about 1µs independent of array size. Thus, as VLSI techniques improve, it will be possible to implement larger cellular neural networks capable of processing very large images at high speeds.

Within the last several years Chua and others have fabricated CNN chips which perform tasks vital to image processing. Chips have been implemented for connected component detection, image thinning, hole filling, etc. Consider the simplified tree and road scene in Fig. 6a. An intelligent motor vehicle of the future would need to have this image processed. Each of the CNN chips above could perform one step in the image processing. Of course, a chip which performs all three steps (and more) is needed. This concept will be discussed later. Fig. 6b shows the output of an image thinning CNN. Similarly, Fig. 6c shows the output of a hole filling CNN. Finally, Fig. 6d reveals the completely processed image. Each task is performed by analog parallel processing. This technique differs from the digital and sequential methods used in the past. Most importantly, the experimental CNN chips implemented these global image processing tasks by using simple local interconnection patterns. Again, it is clear that CNN chips are ideally suited for VLSI implementation.



**Figure 6** CNN image processing scenario. The original image (a) is processed by a CNN image thinner (b) or a hole filler (c) or both (d).

#### The Universal Chip

Within the last year a CNN chip has been designed and tested which implements multiple dynamic rules [5]. The chip, built by a team of researchers at the University of Sevilla, is basically an analog ALU where a CNN invokes the instruction set. The "CNN Universal Chip" consists of an array of 32 x 32 completely programmable CNN cells capable of realizing any CNN application. A density of 33 cells/mm<sup>2</sup> has been implemented in 1 $\mu$m technology.

Fig. 7 shows the cell architecture of the CNN universal chip [5]. Every cell in the programmable CNN incorporates a photosensitive device. Thus the chip can be initialized optically or electrically. In either case, the processed image is downloaded via 32 I/O bonding pads on a row by row basis.

The external chip control is accomplished by digital circuitry. Likewise, analog weights (i.e. the cell coefficients) are stored in on-chip digital memory. As a result, cell parameters for each template value are insensitive to process variations. Furthermore, each cell has a four-bit static memory (LLM), a programmable two-input digital gate (LLU), and initialization and control circuitry (LCCU). The memory allows the chip to store four complete images. These images can be used as input, U, or as initial conditions, X, of the network. The network, in turn, performs the eight instructions stored in the analog and logic program registers (APR and LPR). Therefore the array can perform multiple processing tasks on any given image. Once a CNN universal *video* machine can be implemented it will have numerous applications in the fields of robotics, control systems, prosthetic devices for the blind, smart-vehicle navigation, etc.



Figure 7 Schematic cell architecture of a CNN universal chip

In summary, we have seen that cellular neural networks have the ability to carry out significant image processing at each photosensor cell. This is because each pixel in a CNN is accompanied by an analog computing unit which interacts with the cells of nearby pixels. Of course, such "smart-pixels" are not currently possible in CCD technology. Nevertheless, the goal is to eventually incorporate the intelligence of CNN universal machine pixels into charge-coupled devices. In the meantime, however, the discrete photosensors of current CNN's must be capable of driving networks faster and faster, ideally into the realm of video.

#### Contrast Sensitivity using an Auto-zero Scheme

To reduce processing time in CNN's, Dr. E. Sanchez-Sinencio of Texas A&M University has collaborated with the design team at the University of Sevilla to produce a contrast-sensitive analog design technique for smart-pixel CMOS chips [9]. The photosensors in their current-mode CNN are simple photodiodes which have been made more light sensitive by using a vertical CMOS-compatible BJT. This technique provides a $\beta$+1 current gain compared to the typical well-substrate photodiode. A Darlington phototransistor further amplifies the current to a level in the range of 10µA. Fig. 8 shows the CMOS compatible Darlington photosensor cross-section and schematic.
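
The gain figures above can be written out explicitly. The relations below are an idealized sketch that assumes the photogenerated current $I_{ph}$ acts as the base current of the vertical BJT and that both Darlington stages operate in the forward-active region; the symbols $I_{ph}$, $\beta_1$, and $\beta_2$ are introduced here for illustration only.

$$I_E = (\beta + 1)\,I_{ph}, \qquad I_{out} \approx (\beta_1 + 1)(\beta_2 + 1)\,I_{ph}$$

The first relation is the single-transistor case; the second follows because the emitter current of the first stage becomes the base current of the second.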



Figure 8 CMOS compatible Darlington photosensor (a) cross-section and (b) schematic

In order to ensure proper behavior under different illumination conditions, Sanchez-Sinencio et al. used a simple auto-zero scheme to set an average photosensor current. By replicating the photosensor current twice and routing one replica to a global-node SUM, the CNN cell input current is made relative to the average current $I_{TH}$. Thus the light threshold is automatically adjusted to the average illumination. For example, in Fig. 9 suppose the photogenerated current, $I_s$, is 10µA. This current is mirrored twice in the PMOS devices. Since $M_{N1}$ is diode connected to the global-node SUM, the average current, $I_{TH}$, will flow through $M_{N1}$. If $I_{TH}$ = 7.5µA then a resultant 2.5µA must flow to the SUM node and to the CNN cell input, $I_o$. Clearly, the cell's input is a current relative to the image average. This property, termed contrast sensitivity, prevents the CNN from having to waste time deciphering contrast levels. Thus more time is available for edge detection, etc.
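
The arithmetic of the auto-zero scheme can be checked with a few lines of code. This is a behavioral sketch only: it assumes ideal current mirrors and treats $I_{TH}$ as the arithmetic mean of all photosensor currents. The function name `auto_zero` and the two-pixel array are hypothetical, chosen so that the average matches the 7.5µA of the example above.

```python
def auto_zero(photocurrents):
    """Return the average current I_TH and each cell's input I_o = I_s - I_TH.

    Currents are in microamps. Ideal mirrors are assumed, so every cell
    sees exactly its own photocurrent minus the array average.
    """
    i_th = sum(photocurrents) / len(photocurrents)
    return i_th, [i_s - i_th for i_s in photocurrents]


# Hypothetical two-pixel array: one bright pixel (10 uA), one dim pixel (5 uA).
i_th, cell_inputs = auto_zero([10.0, 5.0])
print(i_th, cell_inputs)  # 7.5 [2.5, -2.5]
```

The bright pixel delivers +2.5µA to its cell and the dim pixel delivers -2.5µA, so the sign of the cell input already encodes whether a pixel is above or below the image average.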



Figure 9 Photosensor with auto-zero scheme

#### Contrast Sensitivity using a Silicon Retina

Ironically, nature solved the contrast sensitivity issue long ago. The biological processing algorithm in the outer-plexiform layer of the vertebrate retina performs contrast sensitivity and edge detection. The process involves two non-spiking neurons and electrical gap junctions, all of which can be modeled with analog VLSI. Fig. 10a is a one-dimensional model of neurons and synapses in the outer-plexiform layer. The photoreceptors, like CMOS photoconductors, produce currents in proportion to light intensity. In biological systems the current is carried via excitatory chemical synapses to horizontal cells. The horizontal cell/photoreceptor pairs are interconnected by electrical synapses. As a result, currents can flow from one cell to another. CNN cells communicate via similar local connectivity (see Fig. 4c).

In contrast to the global auto-zero scheme previously discussed, the vertebrate retina produces a local average light intensity. The horizontal cells compute this value and adjust the cone membrane conductance proportionately. The net result is that the cone's response to input light changes depending on the ratio of the photoreceptor current to the local average current. This "local-automatic gain control" provides contrast sensitivity.
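
The difference between this local scheme and the global auto-zero scheme can be sketched in one dimension. The model below is a simplification and not the retina circuit itself: it assumes the local average is a plain mean over a fixed window of neighbors and that the response equals the ratio of each photoreceptor current to that local average. The function name `local_gain_control` and the `radius` parameter are hypothetical.

```python
def local_gain_control(currents, radius=1):
    """Return each photoreceptor current divided by its local average.

    The local average is the mean over the cells within `radius` positions
    (a simplifying assumption). Currents must be positive.
    """
    responses = []
    for i, current in enumerate(currents):
        lo = max(0, i - radius)
        hi = min(len(currents), i + radius + 1)
        window = currents[lo:hi]
        local_avg = sum(window) / len(window)
        responses.append(current / local_avg)
    return responses


# An edge between a dim region and a bright region.
dim_scene = local_gain_control([1.0, 1.0, 4.0, 4.0])
# The same edge under ten times more illumination.
bright_scene = local_gain_control([10.0, 10.0, 40.0, 40.0])
print(dim_scene)
print(bright_scene)
```

Both scenes produce the same response: flat regions map to 1 and the edge stands out, regardless of the absolute light level. This is the sense in which a ratio to the local average provides contrast sensitivity.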

Andreas Andreou of Johns Hopkins University mapped this biological algorithm onto an analog Silicon Controlled Retina (SCR) [1]. Fig. 10b shows the neurocircuitry as reported in [1,2,3]. Chemical synapses are modeled by presynaptic voltage-controlled current sources and the cone is implemented by a light-sensitive BJT in saturation. Finally, MOS diffusors model the porous gap junction membranes. In [2,3] Andreou reports a fully functional 48,000 pixel, 590,000 transistor SCR. The contrast sensitivity and local-automatic gain control of his CMOS imager improved upon the gray level images from a standard CCD camera.



Figure 10 One dimensional model of neurons and synapses in the outer-plexiform layer (a) and its analog VLSI implementation (b)

#### Summary

The current trend in feature size reduction coupled with the successful implementation of cellular neural networks could soon yield optoarrays with programmable on-chip CNN architectures. Photosensors compatible with CMOS technology are allowing us to link the process of video acquisition with neural network image processing. The successful implementation of local and global contrast sensitivity on imaging chips has paved the way for complex real-time video processing. A single universal video chip will incorporate the image processing algorithms of CNN's as well as the imaging techniques of current CCD arrays. To reach this goal a functional imager/processor chip must be built. Initially, this CNN array will retain the isolated photoconducting pixels associated with current neural network architectures. Later, charge-coupled technology may be added to the device.

## IV. A Video Interface Architecture

#### Overview

The current attempts to link optical imagers with neural network processors have yielded remarkable image processing devices. Nevertheless, little focus has been placed on integrating neural network image processors into existing video technology. In recognition of this problem, a simple video interface architecture has been designed. The interface will allow further investigation of video/neural network issues and will allow TAMU to pursue research in this area.

The first step in implementing a CNN video processor is the design of an interface compatible with existing CNN's and with video protocols. On standard CCD chips the image is sent to a serial readout register and then to a DSP for processing. The first video CNN's will need to be pin-for-pin compatible with current CCD's so that the devices can be tested in off-the-shelf video cameras, etc. The video interface architecture described here allows video to be directly generated from the CNN. Unlike other CNN output architectures, this prototype video design sends data off-chip via a serial readout register.

As seen in Fig. 11, the CNN video interface architecture consists of three components: an optical chip, external control hardware, and a personal computer. An image projected onto the optical chip is captured by a 15 X 21 array of pixels. Then, under management of the external control hardware, the array converts the image to a one-dimensional digital output stream. Standard read signals from the PC pipe the data to the EISA bus on the computer. Software then reassembles the two-dimensional image and displays it on the CRT.



Figure 11 Video architecture components (a) the optical chip w/image, (b) the external control hardware, and (c) the PC.

#### **IC Design**

The IC mimics a 15 X 21 CNN capable of running at video speeds. Each cell has a CMOS compatible Darlington photosensor [9] as seen in Fig. 8 as well as digital video interface logic. Fig. 12 shows the video architecture as it would appear for a small 3 X 3 array. The standard cell has been outlined. It consists of several logic gates and a D-type flip-flop. The CNN analog computing portion of each cell has been omitted for this first generation prototype. The output shift register row is also made up of D flip-flops.

Pixel information flow is handled by four control inputs: I\_CLK, D\_CLK, DP\_CLK, and O\_CLK. D\_CLK clocks the D flip-flops in each cell and O\_CLK clocks the shift register at the bottom of the chip. The other inputs are used as masks to select the input to the flip-flops. Each of these control inputs is generated by the external control hardware.

Essentially, chip operation consists of three phases. First, the optoarray captures an image. During this phase, I\_CLK is high and each pixel is latched into a flip-flop. During phase two the image is shifted down a row. This is accomplished by latching the output of each flip-flop into the input of the flip-flop below. I\_CLK must be low during this phase. Also during phase two the bottom row of the image moves into the output shift register. Phase three involves transferring this image row off-chip. DP\_CLK must be high and O\_CLK active during phase three. Phases two and three can then be repeated until each image row has been shifted off-chip. Then another image can be captured and off-loaded in the same manner. The faster this process is accomplished, the closer we will be to real-time video. Unfortunately, the time required by an on-chip CNN to process the image bottlenecks the procedure. Furthermore, the complexity and size of each CNN cell decreases the image resolution drastically. We must overcome both of these issues as we build the first neural network video processors.
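
The three phases can be summarized in a behavioral sketch. It models only the data movement, not the clock signals or gate-level logic, and it makes two assumptions: an empty (all-zero) row enters at the top whenever the image shifts down, and each row leaves the output register left to right with the bottom row first. The function names `read_frame` and `reassemble` are hypothetical; `reassemble` stands in for the host software that rebuilds the picture.

```python
def read_frame(image):
    """Return the serial bit stream produced by one capture-and-readout cycle."""
    # Phase 1: capture. Every pixel is latched into its cell flip-flop.
    cells = [row[:] for row in image]
    rows, cols = len(cells), len(cells[0])
    stream = []
    for _ in range(rows):
        # Phase 2: every row moves down one position; the bottom row enters
        # the output shift register and an empty row enters at the top.
        out_reg = cells[-1]
        cells = [[0] * cols] + cells[:-1]
        # Phase 3: the output register is clocked off-chip one bit at a time.
        stream.extend(out_reg)
    return stream


def reassemble(stream, rows, cols):
    """Rebuild the 2-D image on the host from the serial stream."""
    received = [stream[r * cols:(r + 1) * cols] for r in range(rows)]
    return received[::-1]  # the first row received is the bottom image row


image = [[1, 0, 1],
         [0, 1, 0],
         [1, 1, 0]]
stream = read_frame(image)
print(stream)                    # [1, 1, 0, 0, 1, 0, 1, 0, 1]
print(reassemble(stream, 3, 3))  # [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
```

Because the host only needs the array dimensions to undo the serialization, the same reassembly applies to the 15 X 21 array once the row and column counts are changed.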



Figure 12 3 X 3 optoarray video architecture

#### Simulations

The large number of transistors in the optoarray made an entire chip simulation infeasible. However, the repeated cell nature of the device allowed the chip to be scaled down to a practical size for HSpice analysis. A transient simulation was performed on a 3 X 3 optoarray like that in Fig. 12. Preliminary results revealed a problem with the on-chip D flip-flops. Rather than shift the information down row by row, the flip-flops latched the data. This caused an unpredictable progression of data through the array. Since the device was already in the fabrication process, an external solution was implemented. The correction involved changing D\_CLK and O\_CLK to very narrow glitches several nanoseconds long.
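
One way to see why narrow clock pulses help is the toy model below. It rests on an interpretation that the text does not confirm: that the flip-flops behaved like level-sensitive (transparent) latches, so data could pass through more than one stage while the clock was high. Time is counted in units of one latch propagation delay, and the function name `clock_pulse` is hypothetical.

```python
def clock_pulse(stages, width):
    """Apply one clock pulse to a chain of transparent latches.

    `stages` holds the bit stored in each latch, top to bottom. `width` is
    the pulse length in units of one latch propagation delay (assumed model).
    While the clock is high, data advances one stage per delay.
    """
    for _ in range(width):
        stages = [0] + stages[:-1]
    return stages


column = [1, 0, 0, 0]
print(clock_pulse(column, 1))  # [0, 1, 0, 0]  narrow pulse: one clean shift
print(clock_pulse(column, 3))  # [0, 0, 0, 1]  wide pulse: data races ahead
```

Under this model a pulse no longer than one latch delay moves the data exactly one row, which is consistent with the reported fix of shortening D\_CLK and O\_CLK to a few nanoseconds.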

The image of Fig. 13 was used to test the 3 X 3 array. The expected results are given in Fig. 13 and the actual results are in Fig. 14. The figure below shows that the bottom left should be shifted out first. Five clock pulses later the top three bits follow. This occurs because the image is shifted off the chip from the bottom to the top. Simulations verified that image data was time distributed left to right and bottom to top as expected.



Figure 13 3 X 3 Chip Image with Output

#### Layout

The 15 X 21 array was laid out using the Berkeley Magic software. A standard CMOS digital technology was used with a 1.2 $\mu$m minimum feature width. Metal and polysilicon wires were used for interconnects. Second metal covers the entire array, shielding the substrate from undesirable photo-generated carriers. All digital transistors are minimum size (2.7 $\mu$m / 1.8 $\mu$m). Fig. 15 shows the pixel layout. Each pixel, including the Darlington amplification and auto-zero scheme, is 54 X 57 $\mu$m. Fig. 16 and Fig. 17 give the cell layout and the chip layout respectively. The cells occupy 114 X 78 $\mu$m and the entire optoarray is 1475 X 1715 $\mu$m.



Figure 14 3 X 3 optoarray simulation






#### **Control interface**

The external management of the chip is entirely digital. It is represented by the block diagram in Fig. 18 and given schematically in Fig. 19. Software on a personal computer generates an I/O read signal (/IOR), which is fed from the EISA bus to a PLD. The PLD monitors other standard EISA signals (ALE, AEN, etc.) and generates the appropriate control signals with each /IOR pulse. The control signals are then routed through flip-flop correction circuitry, which generates glitches for two of the signals. The corrected signals cause the optoarray to capture images and to output the data in a one-dimensional fashion. The serial data stream is sent back to the EISA bus, where software reassembles the image into its two-dimensional form. It is then displayed on a CRT.
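The software reassembly step can be sketched as follows. This is a minimal illustration, not the program used with the board: the function name is hypothetical, and the ordering (bottom row first, each row left to right) is assumed from the simulation results reported above.

```python
def reassemble(stream, rows, cols):
    """Rebuild a two-dimensional image from the serial pixel stream.

    Assumption: the chip sends the bottom row first and each row
    left to right, so the rows must be reversed for display.
    """
    if len(stream) != rows * cols:
        raise ValueError("stream length does not match image size")
    # Cut the stream into rows in arrival order (bottom row first).
    arrived = [stream[r * cols:(r + 1) * cols] for r in range(rows)]
    # Reverse so that index 0 is the top row of the image.
    return arrived[::-1]


if __name__ == "__main__":
    serial = [7, 8, 9, 4, 5, 6, 1, 2, 3]
    for row in reassemble(serial, rows=3, cols=3):
        print(row)
```

For the fabricated array, `rows` and `cols` would be set to the dimensions of the 15 X 21 optoarray; which of the two numbers counts rows is not stated in the text.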

Fig. 20 shows the timing diagram for the four input signals. The PLD state machine implements these signals. For reference, however, the original TTL state machine implementation is given in Fig. 21. Note also that a flip-flop correction block is included in the figure below. It consists of simple feedback-connected D flip-flops which generate the short glitches for each input clock signal. The feedback path uses inverters to provide multiple delay times.



Figure 18 Optoarray interface block diagram









#### **Experimental Results**

A circuit board has been built to realize the schematics of Fig. 20. It is two-sided, as seen in Fig. 22a and 22b, and includes the flip-flop correction circuit. The board fits into the standard EISA slot on the XT. An image is focused onto the chip by an 8 mm video camera lens. Preliminary test results are expected to be available for presentation at the time of the honors symposium.

#### Conclusions

The video optical interface is expected to capture and display images at high speeds. The design will then be modified to operate at video frequencies. Afterwards, analog circuitry will be added into each cell to form a functional CNN. The microelectronics group at Texas A&M University will pursue this work in conjunction with previous neural network research [9, 10, 16, 17]. The end result will be a fully functional CNN video imager/processor.

#### Acknowledgments

I would like to thank the MOSIS foundry for fabricating the optoarray and R.C. Waits of TAMU for building the circuit board. Finally, I am grateful for the guidance and encouragement that Joe Varrientos has provided during my endeavor.





## References

- [1] A.G. Andreou and K.A. Boahen. "Neural Information Processing (II)," Chapter 8, in *Analog VLSI Information Processing*, M. Ismail and T. Fiez eds., McGraw-Hill, 1994.
- [2] A.G. Andreou and K.A. Boahen. "A 48,000 pixel, 590,000 transistor contrast sensitive, edge enhancing, CMOS imager," Submitted to *ISSCC '95*.
- [3] A.G. Andreou and K.A. Boahen. "A 48,000 pixel, 590,000 transistor silicon retina in current mode subthreshold CMOS," Presented at the *Midwest Sym. on Circuits and Syst.*, Aug. 3-5, 1994.
- [4] G.F. Dalla Betta, S. Graffi, G. Masetti, Z.M. Kovacs. "Design of a CMOS Analog Programmable Cellular Neural Network," Presented at the *Second International Workshop on Cellular Neural Networks and their Applications*, Oct., 1992.
- [5] R. Dominguez-Castro, S. Espejo, A. Rodriguez-Vazquez, and R. Carmona. "A CNN Universal Chip in CMOS Technology," *Third IEEE International Workshop on Cellular Neural Networks and their Applications*, Rome, Italy, Dec. 1994.
- [6] L.O. Chua and L. Yang. "Cellular Neural Networks: Theory," *IEEE Trans. Circuits Syst.*, vol. 35, pp. 1257-1272, Oct. 1988.
- [7] L.O. Chua and L. Yang. "Cellular Neural Networks: Applications," *IEEE Trans. Circuits Syst.*, vol. 35, pp. 1273-1290, Oct. 1988.
- [8] J.M. Cruz and L.O. Chua. "A CNN Chip for Connected Component Detection," *IEEE Trans. Circuits Syst.*, vol. 38, pp. 812-817, July 1991.
- [9] S. Espejo, A. Rodriguez-Vazquez, R. Dominguez-Castro, J.L. Huertas and E. Sanchez-Sinencio. "An Analog Design Technique for Smart-Pixel CMOS Chips," *Proc. of the 1993 European Solid-State Circuits Conference*, Sevilla, Sept. 1993.
- [10] S. Espejo, A. Rodriguez-Vazquez, R. Dominguez-Castro, J.L. Huertas and E. Sanchez-Sinencio. "Smart-Pixel Cellular Neural Networks in Analog Current-Mode CMOS Technology," *IEEE Journal of Solid-State Circuits*, vol. 29, pp. 895-905, Aug. 1994.
- [11] S. Espejo, R. Dominguez-Castro, and A. Rodriguez-Vazquez. "Electrically & Optically Driven 16 X 16 Pixels Cellular Neural Networks for Connected Component Detection," Testing Report from the National Center of Microelectronics, University of Sevilla, July 1993.
- [12] T. Matsumoto, L.O. Chua and H. Suzuki. "CNN Cloning Template: Connected Component Detector," *IEEE Trans. Circuits Syst.*, vol. 37, pp. 633-635, May 1990.
- [13] T. Matsumoto, L.O. Chua and R. Furukawa. "CNN Cloning Template: Hole-Filler," *IEEE Trans. Circuits Syst.*, vol. 37, pp. 635-638, May 1990.
- [14] T. Matsumoto, L.O. Chua and T. Yokohama. "Image Thinning with a Cellular Neural Network," *IEEE Trans. Circuits Syst.*, vol. 37, pp. 638-640, May 1990.
- [15] A.H. Sayles and J.P. Uyemura. "An Optoelectronic CMOS Memory Circuit for Parallel Detection and Storage of Optical Data," *IEEE Journal of Solid-State Circuits*, vol. 26, pp. 1110-1115, Aug. 1991.
- [16] J.E. Varrientos, E. Sanchez-Sinencio, and Jaime Ramirez-Angulo. "A Current-Mode Cellular Neural Network Implementation," *IEEE Trans. Circuits Syst. II*, vol. 40, pp. 147-155, March 1993.
- [17] A. Rodriguez-Vazquez, S. Espejo, R. Dominguez-Castro, J.L. Huertas and E. Sanchez-Sinencio. "Current-Mode Techniques for the Implementation of Continuous- and Discrete-Time Cellular Neural Networks," *IEEE Trans. Circuits Syst. II*, vol. 40, pp. 132-146, March 1993.
- [18] K. Yonemoto, T. Iizuku, S. Nakamura. "A 2 million pixel FIT-CCD image sensor for HDTV camera systems," *IEEE International Solid-State Circuits Conference*, pp. 214-225, 299, 1990.
- [19] B. Zovko-Cihlar, I. Matanic, B. Kisani. "Aliasing effect on CCD professional TV cameras," Presented at the *Fourth International Conference on Television Measurements*, pp. 20-23, June 1991.