Saturday, November 28, 2009

My broken SS-5 is just too fast

A few days ago I wrote that I have the world's fastest broken SS-5. The problem is that it is so fast that the speed alone breaks it.

It looks like at least some PromDiag/POST/OBP tests fail simply because qemu is not a cycle-exact CPU emulator. Possibly they wait for an irq to arrive while executing, say, 100 nops; a qemu nop is much faster than a real one, so the irq comes too late. "nop" is just an example here, since I haven't disassembled the tests yet, but the evidence points that way: the timer test passes if I make the timer tick 256 times faster.
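A toy model of that suspicion, in Python. Every number here is invented for illustration (the nop costs, the timer period, even the loop length), and the function name is hypothetical; nothing is taken from the OBP tests themselves.

```python
def irq_seen_in_window(nop_count, nop_cost_ns, timer_period_ns):
    """Return True if the timer irq fires before the busy-wait loop ends."""
    window_ns = nop_count * nop_cost_ns  # how long the test keeps waiting
    return timer_period_ns <= window_ns


# All values below are assumptions, not measurements.
REAL_NOP_NS = 10.0       # hypothetical cost of one nop on real hardware
EMULATED_NOP_NS = 0.1    # hypothetical cost of the same nop under emulation
TIMER_PERIOD_NS = 500.0  # hypothetical time until the timer raises its irq

on_real_hw = irq_seen_in_window(100, REAL_NOP_NS, TIMER_PERIOD_NS)
on_emulator = irq_seen_in_window(100, EMULATED_NOP_NS, TIMER_PERIOD_NS)
# Same emulator, but with the timer ticking 256 times faster.
on_emulator_fast_timer = irq_seen_in_window(
    100, EMULATED_NOP_NS, TIMER_PERIOD_NS / 256
)

print(on_real_hw, on_emulator, on_emulator_fast_timer)
```

With these made-up numbers the wait window shrinks from 1000 ns to 10 ns under emulation, so the irq is missed until the timer period shrinks as well.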

The other tests probably fail for the same reason. So the OBP timer/irq tests are probably useless for checking the emulation.

Sunday, November 22, 2009

Hidden OBP feature found

While debugging the initial Power-On Self-Test of OBP 2.29 I found a secret level, er, a cool undocumented feature: PromDiag. Whenever I turn it on, instead of getting the usual OBP "OK" prompt I get:

PromDiag
NOK>

I wonder what "NOK" is. Does it mean "Not OK"? Anyway, I played with it a little. It turned out that it can launch single POST tests, and there are some more features which are yet to be discovered. All in all it accepts just a few symbols: numbers, dot, comma, c, h, l, q, r, s.

Saturday, November 21, 2009

IRQ/Timer puzzles

I've got a puzzle (originally two) concerning slavio irq/timer behavior:
  • [Resolved: my test was just wrong.] qemu doesn't seem to behave as specified in the slavio documentation: I get an irq when I expect none.
  • A real SS-20 doesn't seem to behave as specified in the slavio documentation: I don't get an irq when I expect one.


I already found some places where the documentation is not precise. For instance, it claims that reserved bits "read as 0, write has no effect", but they don't always read as 0 (maybe they aren't really reserved?).
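A check for the "read as 0" claim could look roughly like this. It is only a sketch: the mask and the register values are made up, and the helper name is hypothetical rather than anything from the slavio documentation or qemu.

```python
def nonzero_reserved_bits(value, reserved_mask):
    """Return the reserved bits that read back as 1 (0 means the claim holds)."""
    return value & reserved_mask


RESERVED_MASK = 0x000000FF  # hypothetical: pretend the low 8 bits are reserved

# Two invented register readings, not values seen on real hardware.
clean_read = 0x80000000
suspicious_read = 0x80000013

print(hex(nonzero_reserved_bits(clean_read, RESERVED_MASK)))
print(hex(nonzero_reserved_bits(suspicious_read, RESERVED_MASK)))
```

Any non-zero result points at bits that the documentation calls reserved but the hardware apparently uses.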

I miss my oscilloscope and direct access to the hw. If someone has a sun4m machine and an oscilloscope, please get in touch!

Sunday, November 15, 2009

Lucky bug

After submitting the performance/irq fix upstream, it turned out the fix should never have worked! I missed a logical "not" in the expression and did exactly the opposite of what I intended: clearing all the irqs that should not have been cleared, and not clearing the irqs that should have been.
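This kind of slip is easy to reproduce with plain bit masks. The sketch below is not the actual qemu code; the function names and bit patterns are hypothetical and only show how one missing "not" inverts the result.

```python
def clear_irqs_intended(pending, to_clear):
    """Drop exactly the bits named in to_clear."""
    return pending & ~to_clear


def clear_irqs_buggy(pending, to_clear):
    """Same expression with the 'not' missing: keeps only the bits in to_clear."""
    return pending & to_clear


pending = 0b1011   # hypothetical pending irq bits
to_clear = 0b0010  # hypothetical irq that should be acknowledged

print(bin(clear_irqs_intended(pending, to_clear)))
print(bin(clear_irqs_buggy(pending, to_clear)))
```

The buggy variant leaves pending precisely the irq that was supposed to go away and drops all the others.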

The fact that this wrong code works means that, for some unknown reason, the interrupts are additionally raised and cleared somewhere else. For the timer it's 99.5% of interrupts: without the improper fix I get ~100 spurious interrupt complaints per second, with the improper fix it is 1 complaint every 2 seconds.
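The 99.5% figure follows from the two complaint rates quoted above; the variable names are just for illustration.

```python
complaints_per_sec_before = 100.0    # ~100 complaints per second
complaints_per_sec_after = 1.0 / 2   # 1 complaint every 2 seconds

# Fraction of spurious-interrupt complaints that disappeared.
reduction = 1.0 - complaints_per_sec_after / complaints_per_sec_before
print(f"{reduction:.1%}")
```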

And the fact that the wrong code improves the emulation (NetBSD 1.3.x-1.5.x is working) means there are some counterpart bugs in the code...

Saturday, November 14, 2009

The World's fastest broken SS-5

Fixed a bug in the IRQ routing and now I have a machine gun, ho-ho-ho... I mean, the World's fastest [broken] SparcStation-5! According to the Solaris 2.6 and Solaris 7 output, it's faster than 1 GHz:

cpu0: FMI,MB86907 (mid 0 impl 0x0 ver 0x4 clock 1083 MHz)


Remember, last week I said that after fixing the performance problems I was going to get back to the XXI century? Well, I lied. I made another quick stop in the past:

WARNING: clock gained 3987 days -- CHECK AND RESET THE DATE!


Guess which OS it is?

Thursday, November 12, 2009

sparc64's name is Legion

Recently I've been getting a lot of questions about sparc64 emulation in qemu. The only answer I can give is the same one "The Zombies" sang in the 1960s: "She's not there".

But there is another Open Source project that targets emulating Sparcs (the project's page claims it is CDDL; in the sources I've seen GPL). Actually, OpenSparc. So, if you are interested in Solaris 10+ emulation, take a look at Project Kenai's Legion Sparc Simulator.

If you already have a 64 bit Solaris machine, you can download a pre-built all-in-one (including the Solaris 10 image) package here.

The bad news is that there is no network card emulation, and the build currently doesn't work under Linux. It should work under x86 Solaris though, so it is not completely useless. It should also be possible to port it to Linux, since SunStudio is available there too.

But for now I'll be sticking to 32 bits and qemu.

Saturday, November 7, 2009

Another week - another Solaris version (tm)

I'm still in the 20th century, but making progress.

SunOS Release 5.7 Version Generic_106541-08 [UNIX(R) System V Release 4.0]
Copyright (c) 1983-1999, Sun Microsystems, Inc.

# uname -a
SunOS 5.7 Generic_106541-08 sun4m sparc SUNW,SPARCstation-5
# ls -l /
total 122
drwxr-xr-x 2 root sys 512 Oct 15 1999 a

The next stop is going to be the 21st century. But I'm going to look at the performance problems first. Waiting 6 hours for the '#' is a bit boring (and the problem is definitely not the CPU speed).

Thanks to Sergey Dionidis (a.k.a sdio @ LOR) for helping to test it.

Friday, November 6, 2009

Things missing in the vanilla qemu

Things which can be fixed in the vanilla qemu:

For OBP:

- Floppy. Instead of fixing it, I broke it completely, so that OBP doesn't try to initialize it and hang. Actually it may not be the fdc itself, but the irq handling. There are OBP tests which may help to understand what is currently going wrong. I didn't need it; does it actually work with OpenBIOS?

- [SparcStation-5] 0x6e000000 AFX. OBP tries to access it and fails with "unassigned address exception".
- [SparcStation-20] 0xef8010000 DBRI, 0x9000X00X FCode SIMMs. Same problem here.

AFX, DBRI and FCode SIMMs can be implemented as stubs. Better yet, SBus probing should raise a proper fault. These devices are optional.

Solaris 2.5.1 - 7 have problems with

- Interrupt handling. Due to errors in irq handling, the boot takes ~7 hours. Working on it.
- MMU (?). Solaris tries to access memory after translation failed. Actually Debian/linux has similar problems, but it ignores traps, while Solaris doesn't.
- MMU (?). The message "hsfs_putpage: dirty HSFS page" means that a page was modified although it wasn't supposed to be. It may have to do with the cacheability tweaking.
- [SparcStation-20] PAC. Solaris hangs where it would normally say that physical address cache is enabled.

Additionally Solaris 8-9 have problems with

- Spurious interrupts.

Nice to haves:
- The ability to send STOP-A to the serial console. It would make it much easier to use the Solaris kernel debugger (kadb) when the kernel hangs.

- Network boot. Looks like something which can easily be fixed. Currently it fails with the message
Internal loopback test -- Wrong packet length; expected 36, observed 64

Last updated on 15.12.2009.

Sunday, November 1, 2009

Another week - another Solaris version

After re-fixing the bug I fixed before, and fixing a third one in the Sparc CPU emulation, I got Solaris 2.6 going. This version doesn't say how many days the clock has gained since the release, so I cannot estimate how well I am doing in comparison to the reference 4900 days. It was probably released on July 18, 1997.

SunOS Release 5.6 Version Generic [UNIX(R) System V Release 4.0]
Copyright (c) 1983-1997, Sun Microsystems, Inc.

NOTICE: SBus clock frequency out of range.
# ls -ld /a
drwxr-xr-x 2 root sys 512 Jul 18 1997 a

It also complains that

NOTICE: hsfs_putpage: dirty HSFS page

This may mean that the current qemu workaround for not emulating the CPU cache is not good enough for Solaris. On the other hand, who needs the hsfs module :).

Again, thanks Carey for the Solaris 2.6 disk!