Stierlitz, the Fearless, Driver-Less Bus Analyzer.
The tool described in this post may be helpful to other ab initio machine-architecture developers. If any exist. The rest of Loper will remain in my private code repository, because it is not a collaborative project.
Meet Stierlitz [1], perhaps the world's strangest bus analyzer. For basic use, it requires no software at all on the PC end, except for a reasonably-recent version of Linux.
Are you designing a new machine architecture from scratch? Got a bus 31 or fewer bits wide, on which you would like to perform arbitrary reads and writes?
Stierlitz will sit on your bus and imitate a USB Mass Storage device. The latter will appear to have a FAT16 file system on it, containing a single binary "memory image" file. (Remember, the maximum file size under FAT16 is 2GB. Hence 31 bits.)
Mount it (under Linux, using "mount -o sync ...") and perform arbitrary block-sized reads and writes by reading or writing blocks within the image "file." If you have an ill-behaved kernel which does not fully respect "-o sync", you may need to perform read-modify-write cycles to get the intended results.
Make sure to use the O_DIRECT flag (or the "raw" block device, if your kernel still has one) to disable ALL caching of this "disk" on the PC end. AFAIK, you cannot do this at all on Windows or Mac OS. Therefore I did not bother testing Stierlitz on those poor, crippled systems. Stierlitz contains a reasonably-complete implementation of a USB Mass Storage stack, and, in principle, ought to work on PCs other than my own.
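For readers who would rather poke at the image from Python, here is a minimal sketch of an uncached block read. It is not part of Stierlitz: the function name `read_block` and the `direct` switch are made up for illustration, and it assumes a Linux host and 512-byte blocks. O_DIRECT generally wants the buffer, offset and length aligned to the device's logical block size, so the buffer comes from an anonymous mmap (which is page-aligned).

```python
import mmap
import os

BLOCK_SIZE = 512  # assumption: 512-byte blocks, as in the Lisp demo further down


def read_block(path, block_no, direct=True):
    """Read one block of the image file; with direct=True, bypass the page cache."""
    flags = os.O_RDONLY
    if direct:
        # Linux-only flag; requires aligned buffer, offset and length.
        flags |= os.O_DIRECT
    fd = os.open(path, flags)
    try:
        # Anonymous mmap memory is page-aligned, which satisfies O_DIRECT.
        buf = mmap.mmap(-1, BLOCK_SIZE)
        try:
            n = os.preadv(fd, [buf], block_no * BLOCK_SIZE)
            return bytes(buf[:n])  # copy out before the mapping is released
        finally:
            buf.close()
    finally:
        os.close(fd)
```

Something like `read_block("/mnt/usb/LOPERIMG.BIN", 1)` would then fetch the second block. The `direct=False` path exists only so the function can be tried on an ordinary file; some filesystems (tmpfs, for one) refuse O_DIRECT outright.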
If you snip out the FAT16 emulation, you can get 41 bits of address space.
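The two figures can be checked with a few lines of arithmetic. The 31-bit number follows directly from the 2GB FAT16 file limit. For the 41-bit number, the sketch below assumes 32-bit block addresses (as in the SCSI READ(10)/WRITE(10) commands commonly used by USB Mass Storage) and 512-byte blocks; that is one plausible reading, not something stated above.

```python
FAT16_MAX_FILE_BYTES = 2 * 1024**3  # the 2GB limit quoted above
BLOCK_SIZE = 512                    # assumption: 512-byte blocks
LBA_BITS = 32                       # assumption: 32-bit block addresses


def address_bits(n_bytes):
    """Bits needed to address n_bytes individually (n_bytes a power of two)."""
    return (n_bytes - 1).bit_length()


fat16_bits = address_bits(FAT16_MAX_FILE_BYTES)
raw_bits = address_bits((2 ** LBA_BITS) * BLOCK_SIZE)
print(fat16_bits, raw_bits)  # prints "31 41"
```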
You will need an FPGA development board with a Cypress CY7C67300 USB controller, such as the Xilinx ML501, and a copy of my "EZOTGDBG" utility. The only part of Cypress's SDK which you will need in order to build Stierlitz is "QTASM.EXE", the assembler.
On the FPGA end, you need this reasonably-compact (20 or so Virtex slices) state machine. The price paid for this compactness is that the interface does not use the USB controller's DMA capability. This results in I/O speeds somewhat slower than the theoretical maximum. (The CY7C67300 is a USB 1.1 device.) My tests currently show ~500KB/s reads and ~200KB/s writes. This is adequate for most of the purposes I have in mind.
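To put those rates in perspective, here is a back-of-the-envelope sketch. It assumes "KB" means 1000 bytes and uses the 12 Mbit/s raw signalling rate of full-speed USB as the ceiling (real bulk throughput is lower still); the constant and function names are mine.

```python
READ_RATE = 500e3           # bytes/s, from the "~500KB/s" figure (KB = 1000 bytes assumed)
WRITE_RATE = 200e3          # bytes/s, from the "~200KB/s" figure
USB_FS_RAW_RATE = 12e6 / 8  # bytes/s, raw full-speed USB signalling rate

DEMO_IMAGE_BYTES = 128 * 1024    # a 128K image
FULL_IMAGE_BYTES = 2 * 1024**3   # the largest FAT16 file, 2GB


def transfer_seconds(n_bytes, rate):
    """Time to move n_bytes at a sustained rate given in bytes per second."""
    return n_bytes / rate


for name, size in (("128K", DEMO_IMAGE_BYTES), ("2GB", FULL_IMAGE_BYTES)):
    print(name,
          "read: %.2f s" % transfer_seconds(size, READ_RATE),
          "write: %.2f s" % transfer_seconds(size, WRITE_RATE))
print("read rate as fraction of raw bus rate: %.2f" % (READ_RATE / USB_FS_RAW_RATE))
```

Under these assumptions a 128K image reads back in roughly a quarter of a second, while dumping a full 2GB image would take over an hour.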
To use Stierlitz in your Verilog project, instantiate something like this:
    stierlitz foo(.clk(hpi_clock),   /* 16MHz clock */
                  .reset(usbreset),  /* Reset (active-high) */
                  .enable(1'b1),     /* Hardwired ON for now */
                  /* Control wiring */
                  .bus_ready(...),
                  /* "Short Bus" - user interface */
                  .bus_address(...),
                  .bus_data(...),
                  .bus_rw(...),
                  .bus_start_op(...),
                  /* CY7C67300 connections */
                  .cy_hpi_address(...),
                  .cy_hpi_data(...),
                  .cy_hpi_oen(...),
                  .cy_hpi_wen(...),
                  .cy_hpi_csn(...),
                  .cy_hpi_irq(...),
                  .cy_hpi_resetn(...)
                  );
There is a basic demo. It gives you a 128K "file" mapped to SRAM (you need an FPGA with at least this much "Block RAM.")
But wait, there's more!
Here is a very basic set of demo routines in Common Lisp (needs an SBCL-only POSIX plugin):
Open a Stierlitz image:
    (require :sb-posix) ; provides sb-posix:o-rdwr and sb-posix:o-sync

    ;; O_DIRECT as defined on x86 Linux; the value differs on some other architectures.
    (defconstant +O_DIRECT+ #x4000)

    (defun open-stierlitz-image (pathname)
      "Open Stierlitz image using sb-unix with blocking I/O."
      (multiple-value-bind (fd errno)
          (sb-unix:unix-open pathname
                             (logior sb-posix:o-rdwr
                                     sb-posix:o-sync
                                     +O_DIRECT+)
                             0)
        (unless fd
          (error "Could not open Stierlitz image: ~A!~%Errno: ~a~%"
                 pathname
                 (sb-int:strerror errno)))
        (sb-sys:make-fd-stream fd
                               :input t
                               :output t
                               :element-type '(unsigned-byte 8)
                               :buffering :none
                               :pathname (make-pathname :name pathname))))

    (defvar *stierlitz* (open-stierlitz-image "/mnt/usb/LOPERIMG.BIN"))
Seek within the image:
    (defun stierlitz-seek (pos)
      (file-position *stierlitz* pos))

    (stierlitz-seek 512) ;; Seek to start of second block...
Read a block:
    (let ((buf (make-array 512 :element-type '(unsigned-byte 8))))
      (read-sequence buf *stierlitz*)
      ... )
Modify a block:
    (let ((buf (make-array 512
                           :element-type '(unsigned-byte 8)
                           :initial-element #xAA)))
      (write-sequence buf *stierlitz*)
      ... )
And so forth.
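Since all access is block-sized, changing a handful of bytes means a read-modify-write cycle on the enclosing block(s). A rough Python sketch of that idea follows. The helper name `patch_bytes` is hypothetical, and the sketch assumes a descriptor opened with `os.O_RDWR | os.O_SYNC` but without O_DIRECT, because `os.pread` makes no promise about buffer alignment.

```python
import os

BLOCK_SIZE = 512  # assumption: 512-byte blocks, matching the demo above


def patch_bytes(fd, offset, payload):
    """Write payload at any byte offset via whole-block read-modify-write."""
    if not payload:
        return
    first = offset // BLOCK_SIZE
    last = (offset + len(payload) - 1) // BLOCK_SIZE
    start = first * BLOCK_SIZE
    length = (last - first + 1) * BLOCK_SIZE

    # Read: fetch every block that the payload touches.
    blocks = bytearray(os.pread(fd, length, start))
    if len(blocks) != length:
        raise OSError("short read: image is smaller than the requested range")

    # Modify: splice the payload into the in-memory copy.
    rel = offset - start
    blocks[rel:rel + len(payload)] = payload

    # Write: put the whole block range back.
    if os.pwrite(fd, bytes(blocks), start) != length:
        raise OSError("short write")
```

A call such as `patch_bytes(fd, 510, b"\x01\x02\x03\x04")` straddles the first two blocks, so both are read, patched and written back.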
In the unlikely event that some other person actually tries to make use of any of the above, said person is invited to comment here...
So, what's next?
1) Logic to drive other peripherals found on the ML501 (Video, Ethernet, etc.) Coming soon.
2) My own, proper logic synthesis system (Goodbye, Verilog. Goodbye, hundred-line FSM boilerplate and other such abominations. And, eventually, goodbye, Xilinx toolchain.) Coming soon...
3) And, last but not least, my CPU architecture, rationally-designed and, incidentally, entirely and meaningfully-unlike any existing one (including the Lisp Machines.)
Don't expect too many posts like this one in the near future; they are quite labor-intensive to write. And, judging by my server stats, most of my readers would much prefer to see ever-hotter hot air. Yes, there will certainly be more of that.
[1] Named in honor of an appropriate (but admittedly obscure in the English-speaking world) fictional hero.
I find this type of post much more useful than the others, so I'm sad to hear that not many will be posted.
A replacement for the Xilinx toolchain would be fantastic. I have no idea how people can do more than toy projects with that; the amount of erratic behavior that I encountered was quite alarming* and the interface was a mess.
*- maybe they changed that in the past year, but when I used it, the result of every command/button was determined by a random coin toss, with the probability of the coin landing on neither side increasing exponentially with the amount of time the IDE stayed open
Dear Milton,
The Xilinx toolchain is usable (at least under Linux) if you avoid the IDE (and all other GUI elements) entirely. See this project for an example of a complete makefile.
Yours,
-Stanislav