PowerPC

PowerPC Issues

This page provides a repository for information about using PowerPC systems for anyone who is used to using the MC680x0 family on vxWorks/Tornado. If you find any other information that ought to go on this page, please let me know.

Online CPU Documentation

EPICS Target Architecture

The version of gcc that ships with Tornado doesn’t have explicit support for all of the different PowerPC CPU types, but you can easily find out what the correct settings to use are by looking at what is used to build your vxWorks image. For the Motorola MVME2700, which has a ppc750 CPU, the correct target architecture is ppc604. If you find that you need to create new EPICS configuration files for a different CPU such as the ppc603 or ppc860, please send copies to Andrew Johnson so they can be included in future EPICS releases.

Relocation error

If on trying to load the EPICS binaries you get the error message "Relocation value does not fit in 24 bits", you are using the wrong target architecture for your vxWorks IOC. Here’s why this happens and how we fixed it:

The PowerPC relative branch instruction is limited to jumps within +/- 32MB of the current instruction (a signed 24-bit offset = +/- 8M instructions, 4 bytes per instruction = +/- 32MB). Unfortunately the vxWorks kernel gets put into the bottom end of RAM, but it loads all application code at the top end. If the two are separated by more than 32MB (should you have 64MB or more on board) then when it tries to load the application code, those calls to vxWorks routines that use these relative branch instructions can’t be resolved within 24 bits, and the loader prints the message you’re seeing.
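The range arithmetic can be checked with a small C sketch. The helper name fits_rel24() and the addresses used to exercise it are illustrative assumptions; this is not code from vxWorks or its loader.

```c
#include <stdint.h>

/* Reach of a PowerPC relative branch: a signed 24-bit word offset,
 * i.e. 2^23 instructions * 4 bytes = 32MB in either direction. */
#define REL24_MAX_BYTES ((int64_t)1 << 25)   /* 32MB */

/* Hypothetical helper: returns 1 if a relative branch located at
 * address 'from' can reach address 'to', 0 otherwise. */
int fits_rel24(uint32_t from, uint32_t to)
{
    int64_t disp = (int64_t)to - (int64_t)from;

    return disp >= -REL24_MAX_BYTES && disp < REL24_MAX_BYTES;
}
```

On a board with 64MB of RAM, for example, application code loaded near 0x03F00000 that calls a kernel routine near 0x00100000 needs a displacement of about -62MB, which this check rejects.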

How do you solve this? I used to suggest several possible solutions for this, but there’s really only one that is of much use so I deleted the others to avoid confusion. In versions of Base from R3.14.2 onwards you should use the target architecture named ppcXXX_long for these IOCs, which adds -mlongcall to the flags passed to the vxWorks C compiler.

The -mlongcall flag tells gcc not to use relative branch instructions for calls to routines outside of the current object file. This means that each such call takes one or two more instructions than it would without the flag, but the calls can always be resolved.

Floating Point arguments in the vxWorks shell

It is not possible to pass a floating point (either float or double) argument to a function that is called from the vxWorks target shell. The shell won’t complain; you’ll just end up not getting the value you typed into the relevant variable. The PPC EABI requires that float and double values be passed in the floating point registers, whereas integers are passed in the normal registers. The shell assumes that all function arguments are integers. The relevant Wind River SPR just says:

spr 6201

From the shell level, calling functions with parameters of type float does not work on architectures where there is a different parameter passing mechanism for floats and integers, namely, PowerPC, MIPS, PA-RISC and ARM.

There is no fix available (nor likely given how old that SPR is).

I don’t know whether this also affects the host shell or not, although I suspect it will.

You can set variables to fp values and do fp math using the standard builtin operators +-*/, but math.h function calls such as fmod() and sin() are out. Passing a float variable as a function argument doesn’t work either, unless you arrange to pass a pointer to that variable instead.
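The pointer workaround might look like the following sketch. The names scaleFactor, setScaleFactor() and getScaleFactor() are hypothetical and only illustrate the idea of taking a pointer instead of a double.

```c
/* Hypothetical driver setting; the names here are illustrative only. */
static double scaleFactor = 1.0;

/* Takes a pointer rather than a double. A pointer is an integer-sized
 * value, so it survives being passed as an ordinary shell argument. */
int setScaleFactor(const double *pValue)
{
    if (pValue == 0)
        return -1;              /* reject a missing argument */
    scaleFactor = *pValue;
    return 0;
}

/* Returns the value most recently stored by setScaleFactor(). */
double getScaleFactor(void)
{
    return scaleFactor;
}
```

From the target shell this could be used by setting a variable first (myScale = 2.5) and then calling setScaleFactor &myScale, assuming the shell stores myScale as a double.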

Missing Floating-Point Functions

The following functions were provided for the 68K family but are not available in the Tornado 2.x libraries on PowerPC:

  • cbrt()
  • infinity()
  • irint()
  • iround()
  • log2()
  • round()
  • sincos()
  • trunc()

Floating Point Exceptions

By default vxWorks sets up Floating Point for the PowerPC to behave slightly differently than on the 680x0, which may crash software that runs without problems on the 68k. It is possible to change the exception behavior to match the 68k, but unfortunately this has to be done individually for each task. If the exception is occurring in a driver or device support that will be called by an EPICS scan task, it is usually simpler to change the software so it checks for and avoids doing the calculation that is causing the exception rather than disable it.

The differences are Underflow and Inexact exceptions, and possibly Division by Zero. These may occur under several circumstances:

  • Inexact: When storing a double value into a float variable, where the double value is too small to be represented in the float representation (e.g. 1e-40). The best solution here is to use the same type for both variables – don’t mix float and double, the overhead of using double everywhere is not usually large.
  • Underflow: When the result of a calculation is non-zero but too small to be represented in the variable. Dividing a very small number by a very large value causes this, as does multiplying two very small numbers. Often these are really the result of some other problem such as a loss of an input signal. Examine your expressions carefully to see if there is some limit that you can compare one or other subexpression against to discover the problem before doing the calculation that causes the exception.
  • Division by zero: Tests for this can often be merged into the Underflow check. Software should never rely on division by zero not causing an exception of some kind, even on 68k family CPUs.
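The kind of pre-check described in the list above can be sketched as follows. The helper safeDivide() and its fallback argument are hypothetical; the sketch guards only against Division by Zero and Underflow in a double division, and makes no attempt to handle overflow.

```c
#include <float.h>

/* Hypothetical helper: computes num/den only when the quotient can be
 * represented as a normalized double. Otherwise it returns 'fallback'
 * without performing the division. */
double safeDivide(double num, double den, double fallback)
{
    double absNum = (num < 0.0) ? -num : num;
    double absDen = (den < 0.0) ? -den : den;

    if (absDen == 0.0)
        return fallback;        /* would be a division by zero */
    if (absNum == 0.0)
        return 0.0;             /* exact zero result, nothing to avoid */

    /* The quotient underflows when |num| / |den| < DBL_MIN, which is
     * the same as |num| < DBL_MIN * |den|. The product is only formed
     * for |den| > 1, so it cannot itself underflow. */
    if (absDen > 1.0 && absNum < DBL_MIN * absDen)
        return fallback;

    return num / den;
}
```

The caller chooses the fallback value, which lets a driver substitute zero or the previous good reading when an input signal has been lost.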

If it proves impossible to modify the source to avoid the exception, the following code can be run at task startup to disable both Inexact and Underflow exceptions for code running in that task. There is also some code available in the tech-talk archives that will do this automatically for all tasks.

#if CPU_FAMILY == PPC
#include "arch/ppc/archPpc.h"
#include "arch/ppc/vxPpcLib.h"
#endif

/* ... */

#if CPU_FAMILY == PPC /* Disable underflow and inexact exceptions */
    vxFpscrSet(vxFpscrGet() & ~(_PPC_FPSCR_UE | _PPC_FPSCR_XE));
#endif

Signed/unsigned character variables

The ANSI C standard does not mandate whether a char is signed or unsigned by default. The particular implementation can take that decision depending which is simpler or more efficient on the specific hardware in question. On the 68K family, the standard definition makes characters into signed numbers, meaning that you can have chars whose value is negative. Software that has been written to rely on this behavior is likely to fail on the PowerPC as the PPC EABI specifies that a char is unsigned on this architecture. There are two ways to handle this.

The quick and dirty solution is to add the flag -fsigned-char to the compilation of the relevant code on the PowerPC. The original code is not portable however, and it is much better to change the source where this matters so it contains explicit declarations of signed char or unsigned char as necessary.
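As a small illustration of the explicit-declaration approach, the following sketch uses signed char wherever negative values matter. The function names are hypothetical.

```c
/* Sums a buffer of signed 8-bit samples. Because the element type is
 * 'signed char' rather than plain 'char', the result is the same on
 * the 68K and on the PowerPC. */
int sumSamples(const signed char *samples, int count)
{
    int sum = 0;
    int i;

    for (i = 0; i < count; i++)
        sum += samples[i];
    return sum;
}

/* Returns 1 if the sample is negative, 0 otherwise. With a plain
 * 'char' parameter this test could never succeed on a platform where
 * char is unsigned. */
int isNegativeSample(signed char c)
{
    return c < 0;
}
```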

Interrupts

The PowerPC CPU has no concept of an interrupt vector table, and only provides a single interrupt pin and interrupt vector (it does have a number of exception vectors for other purposes though). This means that a VMEbus interface has to do in software what the MC680x0 family does in hardware, namely read the vector number from the VMEbus interface chip, then look up and call the service routine connected to this vector number. This explains why PowerPC interrupt response times, unlike most other speed measurements, are not about a tenth of those of an MC68040 CPU.
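The software lookup can be pictured with a simplified sketch like the one below. It is not taken from any BSP; the table size and the names demoConnect(), demoDispatch() and demoHandler() are assumptions made for illustration.

```c
#define NUM_VECTORS 256

typedef void (*IsrFunc)(int arg);

/* One entry per vector number: the routine to call and its argument. */
static struct {
    IsrFunc func;
    int     arg;
} vectorTable[NUM_VECTORS];

/* Hypothetical stand-in for connecting a routine to a vector. */
int demoConnect(unsigned vector, IsrFunc func, int arg)
{
    if (vector >= NUM_VECTORS)
        return -1;
    vectorTable[vector].func = func;
    vectorTable[vector].arg  = arg;
    return 0;
}

/* Would be called from the single external interrupt handler, with
 * 'vector' being the number read from the VMEbus interface chip.
 * Returns 0 if a routine was called, -1 if none was connected. */
int demoDispatch(unsigned vector)
{
    if (vector >= NUM_VECTORS || vectorTable[vector].func == 0)
        return -1;
    vectorTable[vector].func(vectorTable[vector].arg);
    return 0;
}

/* Example service routine that just records its argument. */
static int demoLastArg;

void demoHandler(int arg)
{
    demoLastArg = arg;
}
```

Every step in demoDispatch() is work that the 680x0 hardware does as part of its interrupt acknowledge sequence, which is where the extra response time goes.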

In vxWorks for the PowerPC, the vector lookup (and the core functionality for the intConnect() routine) are the responsibility of the BSP. Motorola BSPs allow more than one interrupt routine to be connected to the same vector, calling them all one after the other. This means that the 68K technique of using intConnect() to disconnect an interrupt vector (by reconnecting the “unconnected” stub to it) doesn’t work, and driver writers are advised to find a different approach.
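One possible alternative to disconnecting is to leave the routine connected and gate it with a flag, as in this sketch. The structure and function names are hypothetical, and the void pointer argument is a simplification of the parameter that intConnect() passes to the routine.

```c
/* Hypothetical per-card driver state; all names are illustrative. */
typedef struct {
    volatile int           enabled; /* nonzero while interrupts are wanted */
    volatile unsigned long count;   /* interrupts actually serviced */
} CardIsrState;

/* Connected once at driver initialization and never disconnected. */
void cardIsr(void *arg)
{
    CardIsrState *state = (CardIsrState *)arg;

    if (!state->enabled)
        return;                 /* logically "disconnected": ignore */
    state->count++;
    /* ... service the card here ... */
}

void cardIsrEnable(CardIsrState *state)
{
    state->enabled = 1;
}

void cardIsrDisable(CardIsrState *state)
{
    state->enabled = 0;
}
```

A real driver would normally also stop the card itself from generating interrupts when it is disabled; the flag only protects against the routine being called anyway.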

You may also find that your BSP doesn’t provide a function for use by intVecGet(), although it is reasonably easy to do so. Without this function, the veclist() routine provided with EPICS is useless. I have a version for the mv2700 which I am happy to provide on request; it should work on other Motorola PPC boards as well. I also saw a definition for an intDisconnect() function on the comp.os.vxworks USENET newsgroup recently, although I don’t recommend relying on this as it’s not a standard vxWorks API.

VME Bus Errors

Experience on the MVME2700 has shown that VME bus errors are not handled very well by the PowerPC CPU family, as the architecture lacks the facilities to do so cleanly. The following discussion assumes that the CPU board uses a Tundra Universe-II chip for the interface to the VMEbus, although the problem does not occur only with this particular chip and is probably endemic to the whole PowerPC architecture.

The only way to flag a PowerPC read or write cycle as having failed is to generate a Machine Check Exception. According to section 4.5.2 of the IBM PowerPC 740/750 RISC Microprocessor User’s Manual “the resulting machine check exception is imprecise and unordered with respect to the instruction that originated the bus operation,” which means that it is impossible to accurately identify which instruction actually caused the bus error.

Without more detailed documentation on the internals of the CPU it is not possible to know how far out the indicated instruction could be – a subroutine call could take it into the vxWorks kernel. It seems unlikely that there could have been a task switch, but I can’t guarantee that based on current knowledge. In practice however we have noticed that for read cycles on an mv2700 the location indicated is usually correct. For a read cycle, the CPU needs the data that was read and can’t advance much further in the program without it. Write cycles are a different story however, as the CPU only needs to queue the write request details before moving to the next instruction. The write cycle can be executed at some later moment when the local bus is free.

The CPU is not the only place where a write queue exists either. The Universe VME interface chip contains a FIFO buffer which is usually enabled for the A24 and A32 master windows to decouple VMEbus write cycles from the associated PCIbus cycle. This allows the PCI bus to run much faster, as it doesn’t have to wait for the slower VMEbus cycle to complete when writing data. With this write posting enabled however a VME bus error will occur long after the CPU has moved on to following instructions, and it may have switched to a different task by then.

Sensibly the Universe chip won’t generate a machine check exception from a write posted cycle that bus errors, but this also means that software has no direct way of discovering that the problem occurred. The Universe can be programmed to cause an interrupt though, and it also saves the bus error location in a small buffer so this information can be logged. If the access is through the A16 window however the write posting is usually disabled, and the imprecise nature of the machine check exception on write is observed.

Unfortunately neither the machine check exception nor the write post interrupt is supported in the standard WRS board support package for the mv2700. I can provide some code for the former, but don’t have the interrupt support available yet – email me if you want a copy.

Bad vme interrupt 0

The above message appears to be caused by the write queueing discussed above in conjunction with the need for an interrupt service routine to have flushed any “clear interrupt” write cycle out to its VMEbus card before returning. Unfortunately it seems that even reading the register back after writing to it might not be sufficient to prevent this message from occurring. The error itself is benign though, so here at APS I replaced the logMsg() call in the BSP with a simple counter. We can monitor it if we ever decide we need to, but in the meantime it doesn’t fill up our logfiles.
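The two ideas in this section, reading a register back after clearing an interrupt and counting the message instead of logging it, could be sketched like this. The function names are hypothetical and this is not the APS BSP code; as noted above, the read-back may not always be enough to prevent the message.

```c
/* Counter used in place of a log message (illustrative names). */
static volatile unsigned long badVmeIntCount;

/* Called where the BSP would otherwise log "Bad vme interrupt 0". */
void badVmeIntNote(void)
{
    badVmeIntCount++;
}

/* Lets the count be inspected, e.g. from the shell. */
unsigned long badVmeIntGetCount(void)
{
    return badVmeIntCount;
}

/* Writes the "clear interrupt" value to a card register, then reads
 * the register back in an attempt to push the queued write out to
 * the card before the service routine returns. */
unsigned short clearAndReadBack(volatile unsigned short *csr,
                                unsigned short clearValue)
{
    *csr = clearValue;
    return *csr;
}
```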

DEC21x4x Network Drivers

UPDATE: The paragraph below no longer applies if you have installed the latest Tornado 2.0.2 drivers patch from WRS – they changed the meaning of the user flag bits. Look at the drv/dec21x40end.h header file for the new meanings, and if you work out how to tell the MII what it is and isn’t allowed to negotiate please let me know…

If your PPC CPU board has a DEC 21x4x chip on it, you may need to change the initialization flags to enable full duplex connections if your network switch supports them. Look for the setting of DEC_LOAD_STRING in the configNet.h file – the last section of this is a bitmap that controls various settings to do with the interface. These bits are documented in the relevant header file for your particular chip in the target/h/drv/end Tornado 2.0 directory. The chip that your board uses can be found from the DEC_LOAD_FUNC in the configNet.h file. For the MVME2700 we need the userFlag set to 0x80C00000.

VME A16 D32 Bus Cycles

The VMEbus was designed to be usable on both 3U single-height and 6U double-height Eurocard systems, thus all the signals necessary to perform D16 bus cycles in the A16 and A24 address spaces are found on the P1 connector. It was originally assumed that the A16 address space would mainly be used for small I/O modules which might only use the P1 connector, meaning they would be unable to support data widths wider than D16. In order to improve performance on such systems, it was common for a CPU’s VMEbus interface to automatically convert every A16 D32 cycle into a pair of A16 D16 cycles, and the WRS (and APS) BSPs for the mv167 and relatives are set up to do this. Unfortunately the use of this functionality became ingrained in several device drivers for cards that really only support D16 cycles.

In introducing the PowerPC BSPs at APS, I have turned off the conversion, so it is now possible to do real A16 D32 cycles on the VMEbus. There are a few VMEbus device drivers that break as a result – if you get bus errors from a driver running on a correctly configured PowerPC that don’t occur on an otherwise identically configured MVME167, this may be the cause. The driver should be fixed to perform all A16 accesses using volatile unsigned short pointers.
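A driver that needs a 32-bit value from a D16-only card might do the two 16-bit accesses explicitly, as in this sketch. The function names are hypothetical, and the register layout (high half at the lower address) is an assumption that must be checked against the card’s manual.

```c
/* Reads a 32-bit quantity from an A16 card as two D16 cycles.
 * Assumes the high 16 bits are at the lower address. */
unsigned long a16Read32(volatile unsigned short *reg)
{
    unsigned long hi = reg[0];
    unsigned long lo = reg[1];

    return (hi << 16) | lo;
}

/* Writes a 32-bit quantity to an A16 card as two D16 cycles, using
 * the same assumed layout as a16Read32(). */
void a16Write32(volatile unsigned short *reg, unsigned long value)
{
    reg[0] = (unsigned short)((value >> 16) & 0xFFFF);
    reg[1] = (unsigned short)(value & 0xFFFF);
}
```

Because the pointer type is volatile unsigned short, the compiler generates two separate 16-bit accesses and cannot merge them into a single 32-bit one.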