Remembering Pierre Elliott Trudeau, 1919-2000.
I recall a bitterly cold New Year's Eve speech at Nathan Phillips Square,
and being unexpectedly and deeply moved by this immensely charismatic
man and his vision for Canada.
[updated 00/08/04] Justin Trudeau's
eulogy for his father
Here at ESC, one station in the Lineo
booth is demonstrating uCLinux
running on a LEON SPARC
implemented in a Xilinx Virtex-800 in an XESS XSV-800.
Lineo announces Linux on FPGA Cores.
To my knowledge, this marks the first time that any flavor of Linux has
run on a monolithic FPGA CPU. A milestone. From Why FPGA CPUs?,
"a number of these designs will be published under GPL or put in the public
domain. There will be communities of users of certain free CPU designs,
similar to the open software movement. There will be GCC tools chains,
lunatic fringe Linux ports, etc."
Congratulations to Jiri Gaisler (ESA) and Jeff Dionne (Lineo) and co.! By the way, sorry about that friendly
pejorative "lunatic fringe" -- this work is certainly very relevant
to mainstream embedded systems developers.
This week I'll be at
Embedded Systems Conference 2000. On Thursday, Sept. 28, 10:30-12:00,
I will join Tom Cantrell in presenting class #524, "Roll Your Own RISC".
See you there.
This being ESC week, expect lots of FPGA CPU and SoC news.
Xilinx Virtex-II 3.125 Gbps serial links
Xilinx Establishes Leadership Direction for High-Bandwidth I/O Technologies.
"On tap are LDT, POS-PHY4, 3.125Gbit, Infiniband, XAUI, RapidIO, and
Fibre Channel for FPGAs; Virtex-II architecture to break 10 Gb/sec
barrier next year." Plus support for gig and 10-gig ethernet.
'"The integration of very high bandwidth interconnects and the previously
announced IBM PowerPC processor core within our Virtex-II architecture
will offer customers the most flexible, fastest time-to-market development
platform in the industry," said Dennis Segers, senior vice president
and general manager of the Xilinx Advanced Products Group.'
Xilinx and Conexant Announce Licensing Agreement of SkyRail 3.125 Gbps Serial Transceiver Technology.
'"In creating our next-generation Virtex-II FPGAs, we felt it was critical
to include high-speed serial interconnect technology, as leading system
architects are increasingly employing gigabit serial interconnect
technologies as the means to connect high-bandwidth elements within
their systems. ... "
said Erich Goetting, vice president of Product Development for the
Xilinx Advanced Products Group.'
Murray Disman, ChipCenter:
Xilinx Releases High-Speed I/O Plans.
Altera Excalibur hard processor cores
Altera Showcases Excalibur Embedded Processor Solutions at Embedded Systems Conference.
Altera Unveils Technical Details of Its ARM- and MIPS-Based Excalibur Product Offerings.
Altera Extends Excalibur Design Flow to Include Industry-Standard Processor Cores.
Altera reveals the hard core aspects of their SoPC strategy.
According to the press releases,
the new XA family is based on the ARM922T core, and provides an MMU, 8 KB
I- and D-caches, and the Thumb small footprint instruction set extensions.
The XM family is based on the MIPS32 4Kc, with 16 KB I- and D-caches,
and a multiply/divide unit. Both families run at 200 MHz. Both share
a "stripe" of hard cores of memory interfaces (SDRAM, SRAM, FLASH),
embedded memory, and peripherals (UART, counters/timers, interrupt controller,
trace/debug, and PLLs). Both use AMBA AHB bus interfaces to soft cores
implemented in programmable logic.
Now the on-chip bus landscape comes into focus. Altera with
Of course, differing processor architectures + differing on-chip buses = customer lock-in.
Murray Disman, ChipCenter: Altera Unveils Excalibur Details.
Craig Matsumoto, EE Times:
Altera, Xilinx hop diverging buses in SoC plans.
Jayant Mathew, EDN:
Altera Unveils Excalibur Technical Details.
Richard Goering, EE Times:
Verisity, ARM employ Amba bus to tackle SoC verification.
Bryan Hoyer and Martin Won of Altera, in
Embedded Systems Development 9/00:
Marrying Processors and Programmable Logic.
Here are some other interesting FPGA SoC articles by Martin Won:
The Future of System Design: Configurable Cores and CPLDs.
Programmable Logic and the Challenges of System-on-a-Chip Design.
New debug strategies combine on-chip and off-chip tools.
Tom Williams, Embedded Systems Development: [Triscend A7]
SoC Houses ARM Core and PLD Matrix.
Xilinx Student Edition 2.1i is shipping. ISBN: 0-13-028907-8.
FatBrain has it
"in stock", as does XESS,
whereas Amazon.com says
The significantly-cheaper $55 Student Ed. 2.1i appears to target the XCV50
(and therefore the bit-compatible XC2S50 (Spartan-II-50), about $13 q1), making Virtex
development much more accessible to budget conscious designers. And it's
perfectly adequate for developing and testing reusable Virtex IP cores,
including processors, for subsequent integration into larger Virtex/E/II
designs (which require the professional tools).
We're in for lots of fun -- and some astonishing student projects. See
Teaching. Hats off to Xilinx and Prentice-Hall.
Thanks to products like this, I think we will witness more new processor,
signal processing, and SoC designs in the next two years than in the whole
history of computing.
Alas, unlike the 1.3 and 1.5 editions, the 2.1i edition does not seem to
bundle Vanden Bout's
Practical Xilinx Designer Lab Book.
That book (together with an XS40 board and XSTOOLS) is a great,
Xilinx tools-focused, hands-on introduction to digital design with FPGAs --
which is why XSOC targets that platform. XESS:
"Prentice Hall has lowered the price of the XSE-2.1i package,
and we have passed the price reduction on to you.
But Prentice Hall no longer includes
The Practical Xilinx
Designer Lab Book in the package! To correct this shortcoming,
XESS Corp. will make a sequence of tutorials and labs available online
We welcome autumn with a wee little bit of site tweaking.
Anthony Cataldo, EE Times:
Embedded CPUs break out baseband functions for 3G apps and
Intel proposes modular design for XScale systems.
Marketing the concept of an additional general purpose "application
processor" to run alongside the radio/comms DSP(s) in 3G phones.
"The applications and client should be developed separately
from the communications stack. We want to free them so that they can
develop at their own pace."
Perhaps the most fascinating spectator sport of the next two years will
be watching and handicapping the many, many big companies (software,
hardware, telcos, etc.) scheming and jockeying for a position
to influence or control the all important emerging 3G wireless
phone computing platform.
Richard Goering, EE Times:
Xilinx acquires formal verification tools.
"Xilinx plans to use the technology with the 10-million gate Virtex II
FPGA architecture. The company will work with EDA vendors to incorporate
the technology into high-level FPGA design flows."
I don't understand. I thought the purpose of formal equivalence checking
tools is to prove (sans simulation) that your specification model
matches your implementation, in flows where the implementation is not
mechanically derived from the specification model. But in FPGA designs
where the flow is HDL --> (synthesis) --> EDIF --> (PAR) --> config bits,
how would equivalence checking help? It seems to me that only in the
case where there is a failure in the synthesis or place-and-route tools
would it uncover a discrepancy. Can someone explain?
[update 00/09/24] Xilinx
press release. Perhaps its external use has more to do with future
FPGA/hard core hybrids.
Green Mountain Computing Systems
releases the GM HC11,
a free HC11 CPU core.
This is the third FPGA CPU core announcement this week.
"The GM HC11 CPU Core package includes the synthesizable core, projects,
self-checking testbenches and a debugger."
"We have synthesised the CPU core for both Xilinx and Altera FPGAs using FPGA Express from Synopsys. The design used 1076 slices and runs at 31MHz on the Xilinx Virtex 400E part. On the Altera APEX 20K100 part, the design ran at 32MHz and used 2142 slices."
There must be considerable interest in FPGA CPU and SoC development,
for in the past six months fpgacpu.org has received 62,000 page views from
9,000 unique visitors. (Even discounting spider robot visitors and
repeat dial-up visitors, that's a lot of interest.)
There have been 650 downloads of the latest beta of XSOC,
and there are currently 110 subscribers to the
Thank you for visiting. "Let's have fun."
(Yet Another RISC Design) FPGA processor. Sixteen 16- or 32-bit
registers, 16-bit instructions (two operands), two stage
pipeline, single cycle execution, one branch delay slot,
hardware return stack. Written in VHDL, it "push-button"
synthesizes under Synplify to run at 40 MHz in an XC2S100-5.
Tested on an Insight Spartan-II demo board.
Xilinx Virtex Power Estimate Worksheet. Cool... :-)
Speaking of power estimation, yesterday I wrote "total energy per cycle"
but Philip Freidin points out that a better figure of merit for energy
consumption is "Joules per task" or perhaps Joules per standard benchmark.
I agree. This is much better than quoting some apples-to-oranges
number like mW/MHz, since one clock cycle on one implementation may accomplish
quite a bit more work than one cycle on another implementation.
on comparing and benchmarking FPGA CPU cores.
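To make the "Joules per task" point concrete, here is a minimal sketch in Python. The two cores and all of their figures are hypothetical, invented only to show how a core with the worse mW/MHz number can still use less energy on the same task. It relies on the identity that 1 mW/MHz equals 1 nJ per clock cycle.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """A hypothetical CPU core; the numbers are illustrative, not measured."""
    name: str
    mw_per_mhz: float     # power per MHz of clock rate, in milliwatts
    cycles_per_task: int  # clock cycles needed to finish the same benchmark task

    def joules_per_task(self) -> float:
        # 1 mW/MHz = (1e-3 J/s) / (1e6 cycles/s) = 1e-9 J per cycle.
        joules_per_cycle = self.mw_per_mhz * 1e-9
        return joules_per_cycle * self.cycles_per_task


# Core B looks worse in mW/MHz but does more work per cycle (assumed figures).
core_a = Core("A", mw_per_mhz=2.0, cycles_per_task=1_000_000)
core_b = Core("B", mw_per_mhz=3.0, cycles_per_task=500_000)

for core in (core_a, core_b):
    print(f"core {core.name}: {core.mw_per_mhz} mW/MHz, "
          f"{core.joules_per_task() * 1e3:.2f} mJ per task")
```

With these assumed numbers, core B draws 50% more power per MHz yet finishes the task on less energy, which is exactly the comparison that mW/MHz hides.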
A theme of my work: for FPGA CPU cores, simple is beautiful.
In my experience,
- simpler is smaller
- smaller is cheaper
- smaller is faster
- smaller is more power frugal
- simpler is easier to test
Simpler is smaller: The simpler the processor design, the fewer
gates and wires are required to implement it. Programmable interconnect,
even local wiring, is usually slower than the logic it connects,
and 2-input multiplexers are as expensive as register files and adders.
An easy way to gauge the complexity of a processor design
is to simply count the multiplexers in the datapath.
Another way is to count the number of lines in its HDL model.
One can write a perfectly adequate RISC processor (for integer C code)
in a couple of hundred lines of Verilog.
A three-stage (fetch, decode, execute) pipelined RISC CPU really needs
only 300-400 logic cells (existence proof: xr16).
Smaller is cheaper: A small core allows the SoC designer to ship an SoC in a smaller,
cheaper part. A small core fills less than half of an XCS20XL or
an XC2S30, leaving more than half free for the rest of your system design.
A small core also opens up the delicious possibility of building
a 4- or 8-way multiprocessor SoC in a Spartan-II.
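The line-count gauge of complexity mentioned above is easy to automate. The following is a rough sketch, not a real Verilog parser: the function name `count_hdl_lines` and the toy multiplexer module are made up for illustration, and the regular expressions would miscount comment markers that appear inside string literals.

```python
import re


def count_hdl_lines(source: str) -> int:
    """Count non-blank lines of Verilog source after stripping comments.

    A crude complexity gauge only; it does not understand string literals.
    """
    # Drop /* ... */ block comments, which may span several lines.
    without_block = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    # Drop // comments up to the end of each line.
    without_line = re.sub(r"//[^\n]*", "", without_block)
    return sum(1 for line in without_line.splitlines() if line.strip())


# Hypothetical example input: a toy 2-input multiplexer.
example = """\
// toy 2-input multiplexer
module mux2(input a, input b, input sel, output y);
  /* select b when sel is high,
     otherwise a */
  assign y = sel ? b : a;

endmodule
"""

print(count_hdl_lines(example))  # 3
```

Run over the source files of two cores, this gives a quick first-order comparison, though coding style affects the count, so treat it as a hint and not a measurement.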
Smaller is faster: In the slow-interconnect FPGA world, the fewer
columns of logic that a pipeline clock enable or similar signal must traverse,
the "closer" the I-cache to the datapath, the faster the minimum cycle time.
An FPGA-optimized RISC processor core should target a minimum cycle
time of approximately the execution stage recurrence cycle time (e.g.
operand-register clock-to-out delay plus adder delay plus result mux delay
plus forwarding mux delay plus operand-register setup time), which is
less than 15 ns in a slow Virtex device.
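The execution stage recurrence described above is a simple sum of delays. Here is a small sketch of that arithmetic in Python. Every delay value is a hypothetical placeholder (with routing delay assumed to be folded into each term), not a figure from a data sheet or from timing analysis of any real core.

```python
# Hypothetical delays in nanoseconds along the execute-stage recurrence path.
# Interconnect delay is assumed to be folded into each entry.
delays_ns = {
    "operand register clock-to-out": 1.5,
    "adder": 6.0,
    "result mux": 2.5,
    "forwarding mux": 2.5,
    "operand register setup": 1.0,
}


def min_cycle_time_ns(delays: dict) -> float:
    """Minimum cycle time is the sum of the delays around the recurrence."""
    return sum(delays.values())


def max_clock_mhz(cycle_time_ns: float) -> float:
    """Convert a cycle time in ns to a clock frequency in MHz."""
    return 1000.0 / cycle_time_ns


cycle = min_cycle_time_ns(delays_ns)
print(f"min cycle time: {cycle:.1f} ns, max clock: {max_clock_mhz(cycle):.1f} MHz")
```

Any logic added to this loop (another mux level, a longer route to a distant I-cache) adds directly to the sum, which is why the simpler datapath is also the faster one.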
Smaller is more power frugal: The fewer wires' capacitance you
charge and discharge each cycle, the less total energy per cycle
(proportional to CV^2) you waste.
A small design is easier to optimize via retiming, floorplanning, and
explicit technology mapping of critical paths.
A smaller design, in a smaller device, will require less power.
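As a back-of-the-envelope illustration of the CV^2 relationship, the sketch below sums switched capacitance over a set of nets. The model is an assumption: each net is taken to complete `activity` charge/discharge pairs per clock on average, each pair costing C*V^2. The net counts, the 1 pF per net, the activity factor, and the 2.5 V supply are all made-up example values.

```python
def energy_per_cycle_joules(net_capacitances_farads, activity, vdd_volts):
    """Estimate switching energy per clock cycle.

    Assumed model: each net completes `activity` charge/discharge pairs
    per cycle on average, and each pair dissipates C * V^2.
    """
    total_capacitance = sum(net_capacitances_farads)
    return activity * total_capacitance * vdd_volts ** 2


# Hypothetical designs: the larger one toggles twice as many 1 pF nets.
small_design = [1e-12] * 200
large_design = [1e-12] * 400

e_small = energy_per_cycle_joules(small_design, activity=0.25, vdd_volts=2.5)
e_large = energy_per_cycle_joules(large_design, activity=0.25, vdd_volts=2.5)

print(f"small: {e_small * 1e12:.1f} pJ/cycle, large: {e_large * 1e12:.1f} pJ/cycle")
```

Under this model, halving the switched wiring halves the energy per cycle, while halving the supply voltage would cut it to a quarter.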
Simpler is easier to test: A simpler design will exhibit
fewer cases and paths; therefore the test bench
required to exercise them will be smaller and simpler, and full
test coverage will be easier to achieve.
Craig Matsumoto, EE Times:
Startup puts a fresh spin on programmable cores.
eASIC is building fast dense programmable
logic technology for ASICs. The logic cells are programmable via lookup
tables and the interconnect is configurable via "jumper" connections in the
top metal layer.
Today's Scripting News:
"Companies. Why does the software conversation always revolve around
companies? What other artform forces its artists to become CEOs to
be taken seriously (and even then not). Stop everything and
read this section
of a very early DaveNet piece, written long before anyone had
heard of open source. This has been bothering me ever since I got started
in the software business. A huge disconnect. I don't want to work for
a company. I like making software. Now what?"
Why Companies. Classic DaveNet.
"It's always been frustrating to me to have my products evaluated based
on the size of the company it comes from."
On the other hand, you have to marvel at the army Altera marshaled and mobilized
for the Nios launch: marketing, PR, development, testing, documentation,
legal, business development, FAEs, distributors, reps, seminars,
trade show presence, Cygnus support, development board design
and fabrication, kit fulfilment, billing, tech support, etc.
Jiri Gaisler's LEON SPARC
implementation is now running at 25 MHz in
80% of an XCV300-4.
Rob Finch announces
FPGA CPU, with 24-bit instructions and data, and an on-chip MMU and I-cache.
We also have a new 24-bit instruction word 16/32-bit RISC core
in the works.
Markus Levy, EDN:
Processors drive (or dive) into programmable-logic devices.
"Will more and more processor core-based systems turn to PLDs? Quicker
time to market and design flexibility, along with better process
technology to lower prices and increase density would indicate that they
will. Nevertheless, it will be interesting to see whether this trend
toward PLD-friendly processors can meet the increasingly difficult
demands of embedded systems." Alas, no mention of us.
Craig Matsumoto, EE Times:
Actel aims anti-fuse FPGAs at handheld appliances.
"Actel ... found its groove in MP3 players and has sold chips into
90 percent of the players on the market ..."
Murray Disman, ChipCenter:
Actel Targets e-Appliance Consumer Applications.
Alexander Wolfe, EE Times:
Startup rekindles Java core race with Moon IP.
"Vulcan ASIC Ltd. ... unveils a Java hardware core called Moon. ...
Vulcan built Moon as a member of Altera's consultants alliance program,
implementing first silicon on Altera's Apex 20K technology."
working on many fronts
I attended an Altera Nios Hands-On Design Workshop, and I have a
report that I'll polish up and post here sooner or later.
The evening following the seminar, I designed a new even
simpler 16/32-bit RISC core for students, very similar to the xr16,
but with an integrated I-cache memory and no pipelining.
Static timing analysis says it should run at about 40 MHz in a slow Virtex.
Less than 200 lines of Verilog.
I've been implementing test designs on my XESS XSV-300,
including an 1152x864 color frame buffer.
I've been back to work on the xr32 test suite and tools and finished
the new "100% synchronous interface" async SRAM memory controller.
I also polished CNets to the point that it could emit EDIF that
the Xilinx tools could place and route. Then I went back and redesigned
everything for wide buses and registers, so at the moment it's all
"taken apart" and non-functional again.
I also learned the Python programming language. It is simple, clean,
and you get good results fast. I may rewrite the xr assembler in
Python for fun.
FPGA CPU News, Vol. 1, No. 3
Back issues: Aug, Apr.
Opinions expressed herein are those of Jan Gray, President, Gray Research LLC.