fpgacpu.org - Java Processors

Java Processors

Subject: Re: New Reconfigurable Computing Threads. -- Java machines
Date: 16 Feb 1996 00:00:00 GMT
newsgroups: comp.arch.fpga

In <fliptronDMtsD6.HGq@netcom.com> fliptron-@netcom.com (Philip Freidin)
writes: 

>>> What will it take to get reconfigurable computing off the ground?
>>The reconfigurable FPGA JAVA processor.  Say, what about modifying
>>Phil Friedin's small RISC into a JAVA interpreter?

(Well, an FPGA based Java machine might not be dynamically
reconfigured, so I'm not sure how this helps, except to glamourize
processor implementations in FPGAs.)

>Where do I get a spec so I can start on this. (at least half serious).
>Maybe just a sw interpretor running on the existing R16 would be fine.
>Do Java interpretors tend to be big or small (i.e. lines of C).

The Java VM spec is at //java.sun.com/doc/programmer.html/...  A good
Java microprocessor should have a 32-bit datapath.  Think XC4013Es, not
XC4005s...

Java bytecodes are a cross between the Smalltalk-80 virtual machine
bytecodes and Microsoft C compiler pcode.  That is, 32-bit oriented,
stack oriented, with a locals frame, constant pool, generically typed
object instructions, plus several varieties of explicitly typed numeric
opcodes (e.g. explicit iadd vs. ladd vs. fadd etc.).  They require a
runtime object system providing object typing and garbage collection.

A simple Java interpreter plus object system would be a few thousand
lines of code.  A good one would be much more sophisticated.
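
To make the stack-plus-locals model concrete, here is a minimal sketch
of an interpreter loop for three of the integer bytecodes.  It is an
illustration only: the (mnemonic, operand) tuple encoding and the names
run and wrap_i32 are assumptions made for this example, not the real
class-file encoding, and there is no object system, constant pool, or
garbage collector here.

```python
def wrap_i32(x):
    """Wrap a Python int to a signed 32-bit value, as iadd does on overflow."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def run(code, frame):
    """Interpret (mnemonic, operand) pairs against a locals frame.

    The tuple encoding is hypothetical; real bytecodes are packed bytes.
    """
    stack = []  # operand stack, separate from the locals frame
    for op, arg in code:
        if op == "iload":       # push local #arg
            stack.append(frame[arg])
        elif op == "istore":    # pop into local #arg
            frame[arg] = stack.pop()
        elif op == "iadd":      # explicitly typed 32-bit integer add
            b = stack.pop()
            a = stack.pop()
            stack.append(wrap_i32(a + b))
        else:
            raise NotImplementedError(op)
    return frame
```

Running the four-instruction sequence iload 1, iload 2, iadd, istore 1
on a frame of [0, 3, 4] leaves [0, 7, 4].  Everything that makes a real
interpreter large (objects, method invocation, exceptions) is missing
from this loop.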

(Folks interested in implementing Java VMs should go read the last
third of "Smalltalk-80: The Language and Its Implementation", and most
of "Smalltalk-80: Bits of History, Words of Advice".  Plus look at
Deutsch and Schiffman's Smalltalk-80 interpreter, Self, SOAR, and
possibly the various hardware LISP implementations for ideas.  And
don't forget the ACM Architectural Support for Programming Languages
and Operating Systems conferences' proceedings!)

Many of the instructions are easy to do in hardware "at speed".  On the
other hand, many of the instructions, such as new, invokevirtual,
putfield, athrow, or floating point, are more involved, and for those
you would want to drop down to "microcode" to emulate.  Which could
mean that a microarchitecture with unaligned ifetch hardware, stack
orientation, and possibly type tags (I'm still trying to understand if
tags help) together with a fast underlying RISC datapath could be a
good start.  Or it might be exactly the wrong way to go!

The fundamental design issue is this: how sophisticated should your
download-time translation pass over the bytecodes be, in order to
canonicalize or regularize them?

For example, a register file can emulate a frame + stack, so a
translator which tracks stack contents can translate
  "iload local #1"
  "iload local #2"
  "iadd"
  "istore local #1"
into
  "add r29, r29, r30".
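
A translator of this kind can be sketched in a few lines.  The version
below is an assumption-laden illustration, not a description of any
existing tool: the mapping of local #n to register r(28+n), the temp
registers r1..r4, the "mov" mnemonic, and the function names are all
made up for the example.  It keeps a compile-time model of the stack,
defers each add until its destination is known, and handles only
straight-line iload/iadd/istore code.

```python
FIRST_LOCAL_REG = 28  # assumption: local #n lives in r(28+n), so #1 -> r29


def reg_for_local(n):
    return f"r{FIRST_LOCAL_REG + n}"


def translate(code, temps=("r1", "r2", "r3", "r4")):
    """Translate (mnemonic, operand) pairs into three-operand register code."""
    stack = []          # compile-time stack: register names or pending adds
    out = []            # emitted instructions
    free = list(temps)  # temps are never recycled in this sketch

    def materialize(v):
        """Force a pending ("add", a, b) into a temp; pass registers through."""
        if isinstance(v, tuple):
            t = free.pop(0)
            out.append(f"{v[0]} {t}, {v[1]}, {v[2]}")
            return t
        return v

    for op, arg in code:
        if op == "iload":
            stack.append(reg_for_local(arg))   # no code: just track the name
        elif op == "iadd":
            b = materialize(stack.pop())
            a = materialize(stack.pop())
            stack.append(("add", a, b))        # defer until dest is known
        elif op == "istore":
            v = stack.pop()
            dst = reg_for_local(arg)
            # Save any remaining stack entries that still read dst.
            for i, e in enumerate(stack):
                if e == dst:
                    t = free.pop(0)
                    out.append(f"mov {t}, {dst}")
                    stack[i] = t
                elif isinstance(e, tuple) and dst in e[1:]:
                    stack[i] = materialize(e)
            if isinstance(v, tuple):
                out.append(f"{v[0]} {dst}, {v[1]}, {v[2]}")
            elif v != dst:
                out.append(f"mov {dst}, {v}")
        else:
            raise NotImplementedError(op)
    return out
```

Fed the four bytecodes above, this returns the single instruction
"add r29, r29, r30".  Branches, calls, and anything that makes the
stack depth differ between paths are exactly where such a translator
stops being simple.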

Too little canonicalization, and your microarchitecture is too complex.
Too much, and congratulations, you have just written a Java-to-RISC
optimizing compiler on top of a simple pipelined RISC.

For Java, the SOAR approach seems pretty good.  Translate to a
RISC-like instruction set, probably once, at code download time.  (I'm
not sure there is much value to *dynamic* Java bytecode translation,
and therefore it probably doesn't justify much/any hardware assist.  A
little dynamic in-line self-modifying code might be appropriate,
however!)  Then add additional hardware to accelerate tag checks,
putfield GC space testing (e.g. for remembered sets), frame management,
or whatever else takes up the time.
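
As a rough picture of what the putfield space test involves, here is a
sketch of a generational store check in software.  The Obj class, its
old flag, and the remembered set are hypothetical names for this
example; a real system would test address ranges or tag bits rather
than a Python attribute, and that test is the part hardware might
accelerate.

```python
class Obj:
    """Toy heap object; 'old' marks residence in old space (an assumption)."""

    def __init__(self, old=False):
        self.old = old
        self.fields = {}


remembered = set()  # old objects that may hold pointers into new space


def putfield(holder, name, value):
    """Store a field, recording old-to-new pointers in the remembered set."""
    holder.fields[name] = value
    # The space test: only a store of a new-space reference into an
    # old-space object needs to be remembered for the next scavenge.
    if isinstance(value, Obj) and holder.old and not value.old:
        remembered.add(holder)
```

Most stores fail the test and pay only for the check, which is why the
check itself is the interesting candidate for hardware assist.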

I used to worship the Alto and Dorado architects.  Now FPGAs put their
tools (and more) into my hands, our hands.  Who needs ECL and multiwire
PCBs [ed] when we have LUTs and PIPs?  Ha ha ha ha ha ha ha!

Jan Gray