The design of a Xilinx FPGA-based processor
Yoav Freund and Rene Dorta, CMPE202 FALL 1990
The REGIS CPU was implemented in hardware using two XILINX 3020 LCA chips.
These very flexible and powerful Logic Cell Array chips can be
programmed, using an external ROM chip, to perform general logic manipulations.
Each chip contains 64 CLB units, each capable of performing a simple
binary function (3 to 2 or 4 to 1) and holding two flip-flops. By programming
the CLBs and their connections, any sequential logic can be implemented,
provided there are enough CLBs and provided the routing between CLBs is
The final hardware on which REGIS was implemented included two LCA chips,
two EPROMs for programming these chips, a RAM that was emulated by an emulator
running on a PC, and several pull-up resistors, anti-noise
capacitors and the like. The connections between the chips were made on
a development breadboard used for general design projects.
The implementation of the REGIS CPU involved the extensive use of computer
aided design tools. This section reports the exact tools used and the tips
that we gathered while using them, which might be of use to others
who follow a similar route.
The design is separated into two parts: the control and the data-path.
The data-path was designed on the circuit level using the FUTURENET
schematic editor. The control's general design was done on the schematic
level, but most of the actual logic was entered as text (ASCII) files
that describe the state machine and the decoder. These design entries can
be viewed as the "source code" for the design of the hardware. From these
source files, after many compilation, checking and translation steps,
a "bit-stream" is generated. This bit-stream is stored in the EPROMs and
programs the LCA chips after power-up. This bit-stream is the equivalent of
the machine code that is the final result of software development.
The many steps between the source files and the final EPROM chips that are
used in the circuit are summarized in Figure 1. As most of the steps
are automatic, they were executed by a MAKE file that is also in the report.
In the following we give a brief description of each step.
All the schematic design tools run on PCs.
This is the schematic editor, the tool on which most of the time is
spent. It is a good tool, although it takes some time to get used to.
- It is worthwhile to use the command lines
(things like /l;left 5;/l;/l;right 5; down 3):
they generate a very regular schematic, which is easier to correct.
- The attributes cause most of the errors;
the important ones to understand
are SIG, PINI, PINO and FILE.
It also seems nice to use PART and LOC.
- It is worthwhile to learn to use BUSes.
- It is important to name all signals;
otherwise many signals in the final design
get arbitrary names, and then the simulator,
SUSIE, is very hard to use.
These programs work on the individual drawings in the
the errors they find are usually easy to locate and correct,
most of the warnings can be ignored as they arise from the fact
that they look at a partial design (one file at a time).
- This is the main linker. In this stage most of the
errors between the parts of the hierarchical design will be reported.
- This is a link program that links the .xnf
files generated by the
schematic tools to the .xnf files generated by the logic design tools and
imported from the unix stations on which they run.
- These are the tools that translate
the functional design into a design using the CLBs in the LCA chips.
In this step problems of incorrect buffering and routing are detected:
- the clock should be driven by a special buffer
and its net should be declared as long and critical.
- internal tristate buffers and external tristate buffers
are not the same!
- defining internal busses as ``long''
lines aids the mapper/router in choosing a good mapping.
- The router takes far more time than all the other
steps combined (about 45 minutes per chip), and it might fail for
complex designs even if they use only a small part of the chip.
We were lucky, as the APR managed to route
our data path although it used 54 out of the 64 CLBs.
- This is the tool that shows you what your chip actually looks
like internally after it has been routed.
It can be used to program the LCA directly on the CLB level,
and it has many aids for doing that. We used it
only for a final check (DRC) of our designs and for generating the
bitstreams to be loaded into the EPROMs.
- TOPC and PAL:
are programs that run on the digital lab's dlab2PC and program
EPROMs. They are lousy but do the job.
- the simulator is an important design tool in this methodology
in which most of your design is inaccessible - in the LCA chip internal
Given its importance, it is definitely a great disappointment:
- Its user interface is bad; the way menus
and selections are done is very inconvenient.
- The mouse driver it uses is non-standard, and unloading the driver
in order to use another program requires
turning the machine off!
- The signal names accepted by the simulator
are only 9 characters long, highly incompatible
with the other design tools, which generate very long names from
the hierarchical design (especially in the .xnf files).
The only workaround we found is to run a stream editor
(sed) with a command script
over the .xnf files, replacing the names with shorter names.
- The simulator has a bug: it does not correctly read
signals that are tied to ground (constant low).
- The simulator can run on only one final LCA chip design
(DPBUFFf.LCA) at a time, so full timing simulation of
designs involving more than one LCA, such as ours,
is not possible.
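The renaming workaround for the 9-character limit can be sketched as follows. This is only an illustration in Python, not the sed script we used: the function names, the S0, S1, ... naming scheme, and the rule that a token is any run of characters other than whitespace and commas are all assumptions made for the example.

```python
import re

MAX_LEN = 9  # name-length limit of the simulator, as reported above


def build_rename_map(names, max_len=MAX_LEN, prefix="S"):
    """Give every over-long name a unique short replacement (S0, S1, ...)."""
    taken = {n for n in names if len(n) <= max_len}  # short names stay as-is
    mapping = {}
    counter = 0
    for name in sorted(set(names)):
        if len(name) <= max_len:
            continue
        candidate = f"{prefix}{counter}"
        while candidate in taken:  # never collide with a name already in use
            counter += 1
            candidate = f"{prefix}{counter}"
        if len(candidate) > max_len:
            raise ValueError("ran out of short names")
        taken.add(candidate)
        mapping[name] = candidate
    return mapping


def shorten_names(text, mapping):
    """Replace whole tokens only, so a name never matches inside a longer one.

    Assumption: a token is a run of characters other than whitespace and commas.
    """
    return re.sub(r"[^\s,]+", lambda m: mapping.get(m.group(0), m.group(0)), text)


if __name__ == "__main__":
    # Made-up hierarchical names, not taken from the actual design.
    names = ["CLK", "DATAPATH/ALU/CARRY_OUT", "DATAPATH/ALU/CARRY_OUT_N"]
    mapping = build_rename_map(names)
    print(mapping)
    print(shorten_names("net DATAPATH/ALU/CARRY_OUT_N, CLK", mapping))
```

Keeping the mapping around is useful in practice, since the short names seen in the simulator have to be translated back to the hierarchical names by hand.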
The design of the control logic was done on the SUN-stations (the beans
connected to DAIZU).
The state machine was entered in a special format (see clogic.fsm).
In this format, each line associates a specified previous state and a
partly specified input (possibly with some ``don't cares'') with an output condition
and a next state.
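The idea of such a table, with don't-care input bits, can be sketched as follows. The sketch is in Python and does not reproduce the syntax of clogic.fsm: the state names, the column order, the bit widths and the use of '-' for a don't-care bit are all assumptions made for the illustration.

```python
def matches(pattern, bits):
    """True if the input bits fit the pattern; '-' marks a don't-care bit."""
    return len(pattern) == len(bits) and all(
        p in ("-", b) for p, b in zip(pattern, bits)
    )


# (previous state, input pattern, next state, output) -- hypothetical rows
TABLE = [
    ("FETCH",  "--", "DECODE", "10"),
    ("DECODE", "0-", "EXEC",   "01"),
    ("DECODE", "1-", "FETCH",  "00"),
    ("EXEC",   "-1", "FETCH",  "11"),
    ("EXEC",   "-0", "EXEC",   "01"),
]


def step(state, bits, table=TABLE):
    """Return (next state, output) for the first row matching state and input."""
    for prev, pattern, nxt, out in table:
        if prev == state and matches(pattern, bits):
            return nxt, out
    raise KeyError(f"no row for state {state!r} with input {bits!r}")


if __name__ == "__main__":
    print(step("DECODE", "01"))
    print(step("EXEC", "10"))
```

Each row covers every input that agrees with its specified bits, which is why a few lines are enough to describe a state with many possible inputs.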
The state machine file is first run through MUSTANG, which assigns each state a binary
encoding by which it is stored in the state-register. Using the -l
option, as we did, results in an encoding in which for each state one
of the bits is one and all the others are zero.
Since all the flip-flops in the LCA chip are at zero after RESET,
we had to edit the .PLA file generated by
MUSTANG and assign the all-zero encoding to the reset state in which the control unit starts
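The encoding issue can be illustrated with a small Python sketch. It is not MUSTANG's algorithm and does not touch the .PLA format; the state names and function names are hypothetical, and the sketch assumes the register width is left unchanged when the reset state is re-assigned.

```python
def one_hot_codes(states):
    """One bit per state: state i gets a code with only bit i set."""
    width = len(states)
    return {s: format(1 << i, f"0{width}b") for i, s in enumerate(states)}


def with_zero_reset(codes, reset_state):
    """Copy of `codes` in which `reset_state` is re-assigned the all-zero code."""
    width = len(next(iter(codes.values())))
    patched = dict(codes)
    patched[reset_state] = "0" * width
    return patched


if __name__ == "__main__":
    codes = one_hot_codes(["RESET", "FETCH", "EXEC"])  # hypothetical states
    print(codes)
    print(with_zero_reset(codes, "RESET"))
```

With a pure one-hot assignment no state has the all-zero code, so a register that powers up with every flip-flop at zero would not be in any defined state. Giving that code to the reset state keeps all codes distinct while matching the power-up condition.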
- was used to derive and minimize
equations for implementing the logic required to
execute the state machine's function.
- translates the results of misII to the .xnf format that can then be
transported to the PC and linked into the schematics.
The final hardware was built on a design kit. This is a good kit for fast
breadboarding and testing, but it is sometimes unreliable. It also resulted in
an unavoidable spaghetti of wires, which luckily did not cause us too many errors
but is probably partly responsible for the high noise level on our lines.
It seems that the debounced toggle button electronics was not able to drive some
of the signals that we needed (the PGM/DN open drain signal) and we had to use
Another tool used in the final stage is the RAM emulator. This tool
is more or less convenient (once you can get access to it!); two notes might
- The emulator simply maps the onboard pin-compatible socket into address
c000:0 and up (LOW-mem socket) or c200:0 (high-mem socket), so using DEBUG,
the DOS debugger (or better yet, one of those full-screen low-level
debuggers for the PC, if you have one), you can easily monitor and change the memory. This is
a much better solution than using the READMEMH, WRITMEMH programs etc. (you can
also load a program from a file using the DEBUG program LOAD command).
- The RAM emulator (at least the one we used) pays no attention to the
signal cs1, and is activated/deactivated only according to cs2.
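For readers unfamiliar with the PC's segment:offset notation, the following Python sketch shows how the two socket addresses above translate into linear addresses. It relies only on the standard real-mode rule (linear address = segment * 16 + offset); the function and constant names are made up for the illustration.

```python
def linear_address(segment, offset):
    """Real-mode PC address: the segment is shifted left 4 bits, then the offset is added."""
    return (segment << 4) + offset


LOW_SOCKET = linear_address(0xC000, 0x0000)   # c000:0, LOW-mem socket
HIGH_SOCKET = linear_address(0xC200, 0x0000)  # c200:0, high-mem socket


def pc_address(socket_base, ram_address):
    """Linear PC address of a byte at `ram_address` within an emulated socket."""
    return socket_base + ram_address


if __name__ == "__main__":
    print(hex(LOW_SOCKET), hex(HIGH_SOCKET))
    print(HIGH_SOCKET - LOW_SOCKET)  # distance between the two windows, in bytes
```

The two windows start 0x2000 bytes (8 KB) apart, so a byte at a given address in the low socket appears in DEBUG at c000 plus that offset, and the same address in the high socket at c200 plus that offset.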
The hardware ran exactly as planned when the clock was stepped manually
or run at rates up to about 1 kHz; at higher rates errors would result. This is most
probably a result of extremely noisy signals, caused by the very long
lines and the poor connections.
Pak K. Chan
Wed Oct 18 10:35:29 PDT 2000