
1. What State Is Linux/Alpha In?

Linux/Alpha is for real: pretty much everything is up and running. There is XFree86, LaTeX, ghostview, Netscape, Emacs, gcc, C++, NFS, automounter, all sorts of shells, perl, python, Java, Tcl/Tk, scheme, the Apache HTTP server, and pretty much anything else that's freely available. X11 works well on several video cards (see below). Thanks to Dave Taylor and Linus Torvalds, there is now even a Quake binary for Linux/Alpha! Since April 1997, it has also been possible to run many Linux/x86 binaries through the em86 emulator (see section em86). The list of applications supported by em86 includes gems such as Applix, Netscape, and Acrobat. The emulator has been made available free of charge by Jim Paradis of Digital Semiconductor.

Linux/Alpha presently runs on most of the Alpha boxes that come with a PCI or EISA bus. This excludes the old TURBOchannel-based DEC 3000 series of workstations.

1.1 Supported Drivers

Drivers that are known to work (let us know if there is something new):

1.2 Known Bugs And Workarounds

This section lists known bugs in Linux/Alpha and discusses how they can be avoided or worked around. As things are under constant development, this section is rather volatile. Just because a problem isn't listed here doesn't mean it isn't known already. On the other hand, if you run a recent distribution, it's likely that most of the problems have been addressed already. In any case, before sending mail off to axp-list, be sure to check this section first. If you discover a new problem/workaround, we would appreciate it if you could send us a report (preferably in linuxdoc SGML format).

Kernel hangs or panics when trying to mount root file system:

The Linux kernel currently has /dev/sda2 hard coded as its default root file system. Thus, if your root file system is on any other disk or partition, you will have to specify the boot option root=/dev/root-partition. For example, if the root file system is on /dev/hda1, you'd specify root=/dev/hda1.

ELF gdb behaves oddly w.r.t. shared functions.

When using gdb on a dynamically linked binary, it is best to force eager resolution of dynamic symbols. To do this, simply issue the command set env LD_BIND_NOW=1 from within gdb. Otherwise, you may see unexpected behavior when trying to step into or over a shared function. The source of this problem is known, but nobody has had time yet to fix the problem.

Kernel reports 2.88MB floppy drive:

On the Alphas, the kernel always reports floppy drives as having 2.88MB capacity even if a smaller capacity drive is installed. This is nothing to worry about: normally, the floppy driver automatically detects and selects the correct capacity so everything will work fine. The only exception to this rule is when formatting a new floppy disk. To do this, you'll need to select the device name with the correct capacity. For example, if the system has a 1.44MB drive, format /dev/fd0H1440 instead of /dev/fd0.

Unaligned accesses:

The Alpha, like all real RISC CPUs, requires that memory accesses are naturally aligned. For example, reading a 4 byte integer from memory requires that the address of the integer be a multiple of 4. Similarly, 8 byte integers need to start at an address that is a multiple of 8. If the CPU attempts to access a word that is not properly aligned, the CPU will trap into the kernel and issue a warning message. The kernel will then go ahead and emulate the unaligned access so that the user-level process executes as if nothing had happened (except for a substantial slow-down due to the fault).

Typically, an unaligned fault message looks like this:

X(26738): unaligned trap at 000000012004b6f0: 00000001401b20ca 28 1
        
What this means is that the process executing command X (the X11 server) with process id 26738 caused an unaligned fault accessing address 0x1401b20ca. This access was performed by the instruction located at address 0x12004b6f0. The other numbers are less important, but if you check the kernel sources, you'll find that they tell you more info on what kind of instruction caused the fault (e.g., a load vs. a store).
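If you want to collect these messages from a log, the fields described above can be pulled apart with sscanf(). The following is only a sketch: the struct and function names are made up for illustration, and it extracts just the command, process id, instruction address and data address, ignoring the trailing numbers.

```c
#include <stdio.h>

/* Hypothetical container for the fields of one unaligned trap message. */
struct unaligned_trap {
    char command[64];          /* name of the command, e.g. "X" */
    int pid;                   /* process id */
    unsigned long long pc;     /* address of the faulting instruction */
    unsigned long long addr;   /* unaligned data address that was accessed */
};

/*
 * Parse a line such as
 *   X(26738): unaligned trap at 000000012004b6f0: 00000001401b20ca 28 1
 * Returns 0 on success, -1 if the line does not match the expected layout.
 */
int parse_unaligned_trap(const char *line, struct unaligned_trap *out)
{
    int n = sscanf(line, "%63[^(](%d): unaligned trap at %llx: %llx",
                   out->command, &out->pid, &out->pc, &out->addr);
    return n == 4 ? 0 : -1;
}
```

The instruction address can then be looked up in the program's symbol table (for example with gdb) to find the code that needs fixing.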

You do not need to be overly alarmed when seeing such a message. The program causing the faults will work correctly. Eventually, all unaligned accesses will be fixed, but in the meantime, just ignore these messages (if you're a programmer, please take a minute and fix the source of the unaligned access instead...).
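For programmers, a common source of these faults is casting a byte pointer into the middle of a buffer to an integer pointer and dereferencing it. A minimal sketch of one portable fix follows; the function name is made up for illustration, and memcpy() is just one way to do it (assembling the value byte by byte works too).

```c
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit value from a position that may not be 4-byte aligned.
 * Writing *(const uint32_t *)p instead is what can trigger the trap.
 */
uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t value;

    /* memcpy() makes no alignment assumption about p. */
    memcpy(&value, p, sizeof value);
    return value;
}
```

Compilers typically turn a fixed-size memcpy() like this into a handful of inline instructions, so the fix should cost far less than a trap into the kernel.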

Linker issues warning: using multiple gp values message:

This is a warning message that is often issued by the linker when building a large program. Unless you're into low-level hacking, you don't want to know what it means. The good news is: you can safely ignore this message and this warning will be optional in the future.

IDE driver causes time to run slow:

The default configuration of the IDE driver disables interrupts for extended periods of time. This causes the kernel to lose timer interrupts and, as a result, time runs slow. To avoid this, use the following command on all of the IDE drives in your system:

                hdparm -u 1 /dev/hd?
        
This reconfigures the IDE driver to re-enable interrupts as quickly as possible.

minlabel, fdisk fail to update kernel partition table:

Do not attempt to use a system after changing the partition table. Even if minlabel and/or fdisk show the correct values you will have to reboot the machine before the new values take effect.

tar xvMf /dev/fd0 hangs.

(This bug should not occur on GNU libc-based systems.) Due to a bug in the malloc package that comes with libc-0.43, multi-volume tar archives do not work. Recompile and link with the standalone gmalloc package, or get an updated libc.

Clock seems to be off by 20 years:

This is not really a bug, but many people seem to have problems with it. Here's Jay Estabrook's Definitive Solution.

ARC console and SRM console keep dates in the time-of-year (TOY) clock
in slightly different formats (actually, only the "year" field differs).

The "/sbin/clock" binary normally expects the format which SRM uses; you can,
however, tell it to expect ARC format instead, using the "-A" flag.
Thus, to read the clock when it's kept in ARC format, say "clock -r -A", and
to write it, "clock -w -A". If it's not written in the expected format, the
console (ARC or SRM) will prolly complain about it the next time it has a
chance... :-\

The best way to ensure you're using the correct format, is to set the date via
the console's date-setting facility; under ARC, it's a menu item some place,
under SRM it's a command (IIRC; try "help date").

Then you must ensure that the "clock" call in the RH script
/etc/rc.d/rc.sysconfig KNOWS WHAT FORMAT TO READ THE TOY IN!!!!

If you're using ARC console to boot MILO/kernel, do:

1. running RH 4.1, make sure /etc/sysconfig/clock contains:

        CLOCKMODE="ARC"

2. running RH 4.2, make sure /etc/sysconfig/clock contains (at least):

        ARC=true

Now, If you're using SRM console to directly boot a kernel, then:

1. RH 4.1, same file, set CLOCKMODE=""

2. RH 4.2, same file, set ARC=false

Refer to /etc/rc.d/rc.sysconfig for details about how the above are used to
call "clock" with the appropriate arguments.
        

Clock gets set to a random date and time

This occurs on the PC164/LX164/SX164 mainboards. This is due to a slightly different version of the TOY clock hardware on these boards. As seen above, your system clock gets set from the TOY clock during bootup, using clock. To test if your setup has this problem try the following command:

while true; do /sbin/clock -r [-A]; done
        
(use the -A option when your hardware TOY clock is in ARC format)
If you see any inconsistent results, you need to upgrade your /sbin/clock. Get one of:
gatekeeper.dec.com:/pub/DEC/Linux-Alpha/Kernels/clock-pc164-rh4.2
gatekeeper.dec.com:/pub/DEC/Linux-Alpha/Kernels/clock-pc164-rh50
        

0>0>0>0>0>0>

Standard MILO images are configured to talk to the first serial port as well as the screen. If a modem is connected to that port, it will talk back. To resolve this, either make sure your modem is turned off at boot time, connect it to a different port, or build your own MILO with serial port echo disabled.

fdisk doesn't recognize my disk's partitions.

This may occur when you're using BSD-style partitioning, e.g., when the disk was partitioned using Digital Unix's disklabel utility. Just go into fdisk's BSD mode and you will be all right.

vi handles keystrokes in batches of four

In fact, other apps will show the same behavior: it really is an ncurses problem. It may be related to the termio vs. termios programming error described in the section below. A workaround is to issue stty eof '^a' before starting vi.

X will not start on Ruffian (164UX), UP1000 or UP1100

Starting X fails with "Failed to set IOPL for I/O". Cause: the stock 5.1 GLIBC doesn't recognize the RUFFIAN system type. This is fixed in most recent distributions. If you still encounter it, do exactly the following (as root):

ln -s EB164 /etc/alpha_systype
        
For UP1000 and UP1100 systems, change the system type from above to Tsunami:
ln -s Tsunami /etc/alpha_systype
        
The following distributions are known to have this problem for the UP1000, UP1100: RedHat < 6.2, Debian < Woody.

ipfwadm fails.

RedHat 5.1 and 5.2 for Alpha shipped with a buggy ipfwadm. The common workaround is to use ipfwadm from RedHat 5.0. (Note: when you're running a 2.1.* or 2.2.* kernel you'll be using ipchains instead.)

Unstable configurations with Adaptec SCSI controllers.

This seems to occur with Adaptec 2940 on PC164 in particular. Improvement has been reported after turning off the autodetection of device speed, width and termination. Get the utility from:

http://www.windowsnt.digital.com/support/drivers/drivers.asp/
        
Put it on a floppy, or any FAT partition, and select "Run a program" from the ARCBIOS menu. (Newer systems will allow you to configure SCSI controllers by running the onboard utility through the i386 emulator in the firmware.)

XL266 refuses to boot after setting time.

If you forget to use the -A option when setting the hardware clock on an XL266, the ARCBIOS may see an invalid time and refuse to boot any OS until this has been corrected. Unfortunately, when the setting is sufficiently invalid, it will not allow you to do so. (This is definitely a bug...) To recover you need a modified version of linload.exe. (Thanks to Juergen Schroeder this is available from ftp://ftp.ub.uni-marburg.de/pub/unix/linux/alpha/linload_auto.exe.) Put it on a floppy, together with your favorite MILO, and use the "Run a Program" option to start it. Once in MILO you can boot Linux and set the clock again. Be sure to use -A this time...
(I believe the modification to linload.exe is that the location of MILO is hardwired into the program.)

1.3 Porting to Alpha: the long and short of it

Here is a somewhat random collection of popular ways of shooting yourself in the foot on Unix when programming in C. This has practically nothing to do with Linux or Alpha, but since Linux/Alpha is among the pioneers in 64-bit land, these errors are more likely to show on such systems.

sizeof(long)!=32

Many programs assume a long is 32 bits wide. This is nonsense. The ANSI C standard does not specify anything like that. For example, on an Alpha running a grown up operating system such as DEC Unix or Linux, the fundamental C types have the following sizes and alignment restrictions:

Note that the above implies that you cannot cast a pointer to an integer without losing bits. In fact, Alpha binaries by default are purposely arranged in such a way that if you try to do this, they'll dump core; it is much better to learn about such program errors via a core dump than through some subtle errors.
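To see what your own system does, the sizes and alignments can simply be printed. The sketch below assumes a C11 compiler (for alignof) and the C99 type uintptr_t from stdint.h, neither of which this document relies on elsewhere; the function names are made up for illustration. On Linux/Alpha you would expect an int to come out at 4 bytes and long and pointers at 8.

```c
#include <stdalign.h>
#include <stdint.h>
#include <stdio.h>

/* Print the size and alignment of one type, both in bytes. */
#define SHOW(type) \
    printf("%-10s size %zu  align %zu\n", #type, sizeof(type), alignof(type))

/* Report the layout of the fundamental C types on the host system. */
void show_type_layout(void)
{
    SHOW(char);
    SHOW(short);
    SHOW(int);
    SHOW(long);
    SHOW(float);
    SHOW(double);
    SHOW(void *);
}

/* Nonzero if a plain int is wide enough to hold a pointer on this system. */
int int_can_hold_pointer(void)
{
    return sizeof(int) >= sizeof(void *);
}

/* Round-trip a pointer through uintptr_t, an integer type made for this. */
int pointer_survives_uintptr(void *p)
{
    uintptr_t bits = (uintptr_t)p;

    return (void *)bits == p;
}
```

If a pointer really has to be stored in an integer, uintptr_t (or a long on Linux/Alpha) is the kind of type to reach for, never an int.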

If you need a variable with exactly n bits in it, you can use the following types in Linux applications (and most other systems that are based on GNU libc):

In the kernel, use the following types instead:

However, the availability of these types is somewhat system dependent. In particular, on a 32 bit machine, the 64 bit integers are typically available only when using GNU C. Also, keep in mind that there are still machines out there that have odd word sizes, such as 36 bits. So, for the sake of portability, these types should be used sparingly.
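As an illustration of where exact-width types earn their keep, here is a small sketch using the C99 header stdint.h. That header is an assumption on our part and not necessarily what your system provides; the struct and function names are made up.

```c
#include <stdint.h>

/* Fail at compile time if the exact-width types are not what we expect. */
_Static_assert(sizeof(int32_t) == 4, "int32_t must be exactly 32 bits");
_Static_assert(sizeof(int64_t) == 8, "int64_t must be exactly 64 bits");

/*
 * Hypothetical file header. With uint32_t the layout is the same on
 * 32-bit and 64-bit machines; with "unsigned long" it would grow from
 * 8 to 16 bytes on an Alpha.
 */
struct record_header {
    uint32_t magic;
    uint32_t length;
};

/* 32-bit addition that wraps around at 2^32 regardless of word size. */
uint32_t add_wrap32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}
```

In line with the advice above, such types are best confined to places where the width really matters (file formats, network packets, hardware registers); plain int and long remain the better choice for ordinary counters and indices.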

Error return value of inet_addr()

It is a common myth that inet_addr() returns -1 in case of error. In fact, even the Linux man-page propagates this superstitious belief. But don't be misguided: in truth, inet_addr() returns INADDR_NONE in case of error. This manifest constant is defined in netinet/in.h. An even better solution is to avoid this function altogether. Reasonably modern libraries provide the inet_aton() function that has an unmistakable return value to indicate success or failure.
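A minimal sketch of the inet_aton() approach follows. The wrapper name parse_ipv4 is made up for illustration; inet_aton() itself is a BSD extension declared in arpa/inet.h that returns nonzero on success and zero on failure.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Convert a dotted-quad string to a network-order address.
 * Returns 1 on success and 0 if the text is not a valid address.
 */
int parse_ipv4(const char *text, struct in_addr *out)
{
    return inet_aton(text, out) != 0;
}
```

Because success is reported separately from the address, "255.255.255.255" can be told apart from an error, which inet_addr() cannot do: its error value INADDR_NONE has the same bit pattern as that address.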

struct termio does not equal struct termios

Many Linux programs incorrectly assume it is all right to mix and match struct termio and struct termios and their ioctl() calls. Well, not quite. The two interfaces are in fact incompatible on many systems (for historic reasons, this can't be fixed easily). Thus, if you use struct termio, then be sure to use the termio calls only (TCGETA, TCSETAF, TCSETAW, and TCSETA). In contrast, if you use the termios structure, be sure to use its calls only (TCGETS, TCSETSF, TCSETSW, and TCSETS).
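One way to avoid mixing the two interfaces by accident is to skip the raw ioctl() calls and use the POSIX functions tcgetattr() and tcsetattr(), which only ever deal with struct termios. This is a swapped-in alternative, not something the text above prescribes. A minimal sketch, with a made-up function name:

```c
#include <termios.h>

/*
 * Turn off echo on the terminal referred to by fd, using only the
 * termios interface. The previous settings are stored in *saved so the
 * caller can restore them later with tcsetattr().
 * Returns 0 on success, -1 on error (for example, fd is not a terminal).
 */
int disable_echo(int fd, struct termios *saved)
{
    struct termios t;

    if (tcgetattr(fd, saved) < 0)   /* plays the role of TCGETS */
        return -1;

    t = *saved;
    t.c_lflag &= ~(tcflag_t)ECHO;

    /* TCSAFLUSH plays the role of TCSETSF: drain output, flush input. */
    return tcsetattr(fd, TCSAFLUSH, &t);
}
```

A caller would typically invoke disable_echo(0, &saved) before reading a password and tcsetattr(0, TCSAFLUSH, &saved) afterwards.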

Atomicity of sub-word loads/stores

It is generally not safe to assume that reading or writing a quantity that is smaller than the machine's word size is atomic. In particular, none of the early Alpha chips has atomic instructions to read or write a byte or a short (16 bits). Unless you're into kernel hacking where you need to synchronize with devices and/or interrupts, you probably won't care. But even in user space this can cause problems if your program shares data with another process through shared memory, for example.
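If several writers really must update individual bytes that live in the same word, one option is to make the whole word the unit of atomicity. The sketch below uses C11 atomics from stdatomic.h, which is an assumption on our part (a compiler of that vintage is needed); the function names are made up. It replaces one byte inside a 32-bit word with a compare-and-exchange loop so that concurrent updates to neighbouring bytes are not lost.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Atomically replace byte number index (0 = least significant) of *word
 * with value, leaving the other three bytes untouched.
 */
void store_byte(_Atomic uint32_t *word, unsigned index, uint8_t value)
{
    unsigned shift = (index & 3u) * 8u;
    uint32_t mask = (uint32_t)0xffu << shift;
    uint32_t old = atomic_load(word);
    uint32_t desired;

    do {
        /* Rebuild the word from the latest value seen in memory. */
        desired = (old & ~mask) | ((uint32_t)value << shift);
        /* On failure, old is refreshed with the current contents. */
    } while (!atomic_compare_exchange_weak(word, &old, desired));
}

/* Read byte number index (0 = least significant) of *word. */
uint8_t load_byte(_Atomic uint32_t *word, unsigned index)
{
    return (uint8_t)(atomic_load(word) >> ((index & 3u) * 8u));
}
```

Often the simpler fix is to give each independently updated flag its own int-sized variable, or to protect the data with a lock.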

