Reimplementing Transputers

Newsgroups: comp.sys.transputer,comp.arch.fpga
Subject: Re: Emulating a transputer on FPGA
Date: Tue, 10 Aug 1999 10:19:54 -0700

Ram Meenakshisundaram wrote in message <37B04CFB.DA77A8F0@olf.com>...
>What would it take to emulate say a T225 or T425 transputer on a FPGA
> (without any external links).  Since I am new at this, what would be an
>ideal FPGA to do this.  Can I accomplish this on a XC4003E??  How many
>gates would I need on the FPGA to do this.  Thanks.

I have thought about this before.  (If you configure a big array of FPGAs as
CPUs, you might as well have them talk to each other.)

The answer: it depends.  By "emulate" do you mean drop-in-replace?  If so, I
can't say, because I never studied the transputer instruction set
architecture, and because it depends upon whether you are willing to handle
divide, floating point, etc., in software.

If you mean "build a new machine in the transputer *style*", with high
integration of on-chip components and instruction set support for message
sends and fast task switching, etc., then here are a few back-of-the-envelope
estimates for you.

I recall the inmos T414 (1985), which had a 32-bit processor, 2 KB of on-chip
RAM, four 10(?) Mb/s links, and an integrated DRAM controller.  A T414-20 had
a 20 MHz clock and did about 10 million transputer instructions per second.
(I think.  I am not a transputer expert.)

A pipelined 16-bit datapath for a two-cycle 32-bit RISC requires about
8x9=72 CLBs, and its control unit could be ~50 CLBs.  A full 32-bit datapath
adds another 72 CLBs.

Consider the 2 KB of on-chip RAM.  In an XC4000, each CLB can store 32 bits.
2 KB * 8 b/B * 1 CLB/32 bits = 512 CLBs, or about half of an XC4025E.  Or
just 4 of the 512 B embedded RAM blocks on a Virtex part.  I assume you were
hoping to target an XC4000, so let's reduce our on-chip RAM requirements to
128 B (32 CLBs) and move on.
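
The RAM arithmetic above can be checked with a few lines of Python.  This is
only a sketch: the function name clbs_for_ram is made up for illustration,
and the 32 bits per CLB figure is the XC4000 number quoted above.

```python
def clbs_for_ram(num_bytes, bits_per_clb=32):
    """Return how many CLBs are needed to hold num_bytes of RAM.

    bits_per_clb defaults to 32, the XC4000 figure used in the text.
    The result is rounded up to a whole number of CLBs.
    """
    bits = num_bytes * 8
    return -(-bits // bits_per_clb)  # ceiling division


print(clbs_for_ram(2048))  # the full 2 KB of on-chip RAM
print(clbs_for_ram(128))   # the reduced 128 B budget
```

The two calls reproduce the 512 CLB and 32 CLB figures.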

An EDO DRAM controller w/ page mode support requires a 12-bit register and
comparator, a 12-bit mux, and some state machine logic.  Call it 20 CLBs.
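
To make the register/comparator/mux roles concrete, here is a small
behavioral model in Python (not HDL).  It is a sketch under assumptions: the
class name, the 12-bit column width, and the step names are mine, and real
DRAM timing (refresh, RAS/CAS delays) is ignored.

```python
ROW_BITS = 12  # width of the row register and comparator from the text
COL_BITS = 12  # assumed column width; not stated in the text


class PageModeController:
    """Behavioral model of page-hit detection in a page-mode DRAM controller."""

    def __init__(self):
        self.open_row = None  # the row register; None means no page is open

    def access(self, addr):
        """Return the list of DRAM steps needed to access addr."""
        row = (addr >> COL_BITS) & ((1 << ROW_BITS) - 1)
        col = addr & ((1 << COL_BITS) - 1)
        if row == self.open_row:
            # Comparator says page hit: the mux drives only the column address.
            return [("CAS", col)]
        steps = []
        if self.open_row is not None:
            steps.append(("PRECHARGE", None))  # close the old page first
        self.open_row = row  # latch the new row in the row register
        # Page miss: the mux drives the row address, then the column address.
        steps += [("RAS", row), ("CAS", col)]
        return steps


ctrl = PageModeController()
for address in (0x001004, 0x001008, 0x002000):
    print(hex(address), ctrl.access(address))
```

The second access hits the open page and needs only a column strobe; the
third changes rows and pays for a precharge and a new row strobe.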

The four serial links (10 MHz is slow and easy) are each ~12-16 CLBs,
although if you time multiplex them you may be able to make four links from
one implementation plus a register file.

Totals:

CLBs    What
72-144  CPU datapath
50      CPU control
32      128 B on-chip RAM
20      DRAM controller
20-64   4 serial links
----
194-310 CLBs

An expert might fit it in a 196-CLB XC4005E/XL; more likely you would
require 1) an XC4008E or XC4010XL and 2) hand-mapping and floorplanning
experience.

(The XC4003E, with 10x10 CLBs, will probably prove too small.  You could
build a minimalist 16-bit wide datapath in about 4x9 CLBs and could build a
CPU there, but you probably won't have room for the links or a memory
controller.)
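
The budget and the device comparison can be tallied mechanically.  The sketch
below uses the CLB estimates from the totals table; the XC4003E and XC4005E
capacities come from the text, while the XC4008E (18x18) and XC4010XL (20x20)
capacities are data-sheet figures I have added and should be double-checked.
The names BUDGET, DEVICES, totals, and fit are made up for illustration.

```python
# (block, low estimate, high estimate) in CLBs, from the totals table
BUDGET = [
    ("CPU datapath", 72, 144),
    ("CPU control", 50, 50),
    ("128 B on-chip RAM", 32, 32),
    ("DRAM controller", 20, 20),
    ("4 serial links", 20, 64),
]

# CLB capacity per device; the last two are assumptions added here
DEVICES = {"XC4003E": 100, "XC4005E": 196, "XC4008E": 324, "XC4010XL": 400}


def totals(budget):
    """Sum the low and high estimates over all blocks."""
    low = sum(lo for _, lo, _ in budget)
    high = sum(hi for _, _, hi in budget)
    return low, high


def fit(devices, low, high):
    """Classify each device against the low and high CLB totals."""
    result = {}
    for name, clbs in devices.items():
        if clbs >= high:
            result[name] = "fits"
        elif clbs >= low:
            result[name] = "tight"
        else:
            result[name] = "too small"
    return result


low, high = totals(BUDGET)
print(low, high)
for name, verdict in fit(DEVICES, low, high).items():
    print(name, verdict)
```

"Tight" here means only the low estimate fits, which is the case where
hand-mapping and floorplanning would be needed.  The tally ignores routing
and packing overhead, so treat it as optimistic.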

If you are more interested in exactly implementing the transputer
instruction set architecture, and IIRC it is a stack machine, then consider
the example of the MSL16 microprocessor.  See Leong, P.H.W., P.K. Tsang, and
T.K. Lee, "A FPGA based Forth microprocessor", pp. 254-255, in Proc. IEEE
Symp. on FPGAs for Custom Computing Machines 1998, and at
http://www.cse.cuhk.edu.hk/~phwl/msl16/msl16.html, and Paul Lee Wai Lun's
thesis and presentation, "FPGA Implementation of a Forth Processor", at
http://home.hkstar.com/~wail/project-7260/project.htm.

To implement a legacy ISA in an FPGA, I advise building a simplified
implementation hidden behind a binary rewriting system!

Jan Gray

Copyright © 2000, Gray Research LLC. All rights reserved.
Last updated: Feb 03 2001