Month: June 2011

PIC18F Software Project Survival Guide 5

This is the fifth installment of a series of posts regarding software programming on PIC18F cpu family. Previous installments: first, second, third and fourth.
Lost in memories
The first problem you are faced with when designing your application is the PIC memory architecture. And saying that this is a problem is like saying that a multi-head nuke missile is a sort of firecracker. You could say that most of the problems you’ll find working with PIC18s can be classified in two sets: those who stem from memory and those who impact on memory.
First problem is size.
PIC 18F if programmed in C (and you want to use C and not assembly to keep you mental sanity for what is left of your life out of the working hours) tends to be quite memory hungry. Code density is low even if compared to the early CPUs developed by mankind.
To make things worse, Harvard Architecture isn’t going to help you – since pointers are implemented differently down into the assembly instructions level whether they point to Ram (file register) or to flash (Program Memory), you will need to code the same function twice (or more). Consider that the standard library strcpy function has four different implementations because of the four combination you get from its arguments (copy ram to ram, ram to flash, flash to ram and flash to flash).

I read about a C compiler that masks these differences away, (with a pointer wrapper, I presume) but according to the website is far from being production level completeness. Also such an approach penalizes execution time when you know where data are located and that their position is not going to change.
If you really need to handle data both in data memory and program memory you can write a wrapper. I need to sequentially read bytes from any source, so I wrote two wrapper classes ByteReader (and ByteWriter with limited functionalities). The additional benefit is that you can adapt the wrapper to read from an external flash memory as well.

When you let your hardware engineers decide which PIC18 to put on board of the device you are developing, take care about some flash memory subtleties.
All PICs 18F can program their own program memory. But there is a sub-family of PICs (18Fx(456)K2x) whose members have a dedicated small (usually 1k or less) data flash attached to the CPU core by a third memory bus.
You may wonder why the Microchip engineers went the chore of adding a third bus and differentiate on chip memory and addressing. Well, they had, indeed, very good reasons:

  1. program memory can be written one byte at time, but needs to be erased one 4k page at time. 4K on 128k is quite a fraction, but worse, you are forced to juggle with data you want to preserve and considering you don’t have 4k of data memory it is going to be quite a juggle.
  2. When you write the program memory the corresponding bus is stalled, since this is the bus where the program instructions are taken and there is no instruction cache, the CPU is stalled. Typical erase/write times for a flash memory causes the CPU to stall for 5-10ms, possibly more and that can be a showstopper for a real time application.

If you need persistent storage and there are no PICs with the feature list you need, you may resort to an external flash memory connected either by I2C or SPI.

Anyway, regardless of the application, always beg for the device with the largest memory (i.e. 128k in the current production)
Since you are going to sell billions of devices there will be some pressures about picking a small memory footprint device. Resist! You must not lose this battle! You can argue that PIC with different memory sizes are pin-to-pin compatible and there’s no need to add risks on the development when you can downsize the memory in pre-production or at the first technical review.
128k of program memory may seem a lot for an embedded system, but given the low code density and the optimizer naivete is not that much.
On some devices of the 18F family (the ones with a high pin count) you can extend program memory with an external memory.
For our application we managed to fit everything in the base memory and we used the extra pins to connect a LCD screen (the parallel port and memory bus share the same pins). Also we employed an external flash for data only connected either via an SPI or an I2C depending on the specific device we were developing.

Harvard Architecture
Although it could seem a good idea from a theoretical point of view, having two distinct addressing spaces for program and data, isn’t a good idea and having distinct instructions, with distinct addressing modes makes things ugly.
In fact you do want store data in non volatile memory – initialization data, constants, lookup tables – would be only for you have at most 4k of data memory.
The compiler trying to favor performance over conformance doesn’t help much.
Let’s start from the void pointer (void*). In C this kind of pointer is just a (temporary) container for any pointer. You take any typed pointer and you can convert it into void* and then, when you need it, you have to convert it back to the original type.
With MCC18 you have two main problems stemming from the default storage chosen for pointers. In facts pointers are data memory pointers by default. Void pointer makes no exception and it is two bytes long. The problem is that it cannot host a program memory pointer when you use the large memory model. Large memory model is needed only when you need to access more than 64k of program memory and implies that program memory pointers are 3 bytes long.
The other problem is that, by this convention, the pointed type it is not enough to tell apart program and data memory pointer. That is be it a uint8_t* or a uint8_t const* you cannot tell anything about the memory region the uint8_t is stored.
For this discrimination MCC18 provides two qualifiers: rom and ram; the latter being optional, since by default everything is stored in ram.
On one side I prefer the qualifier approach when compared to an “intelligent” approach where the compiler decides silently where to put what based on “const”-ness or other consideration. In fact I use const qualifier quite everywhere and not just for data stored in program memory.
On the other side having to explicitly provide several versions of the same function is a plain waste of space. I would like having a flexible approach where by default I have generic pointers, handled by the compiler through an adapter layer, and specialized pointers (rom/ram) on demand where performance matters.

Hardware and Software stack
First PICs were really simple processors featuring a couple of levels for the call stack. 18F architecture has 31levels of call stack, thus enabling the CPUs to power medium sized architecture.
At a first glance 31 levels may seem a lot… what the heck, even Windows stack trace rarely spans over such a distance.
That would be fine, but let’s take a closer look at what happens when you run low on memory and you enable all the optimizations.
One of such optimization is called Procedural Abstraction and does a fine work in trimming down the size of the code. This transformation examines the intermediate code (or maybe the assembly code directly) and creates subroutines for duplicated code snippets. Operating at a lower abstraction level than the C source, the optimization has far more opportunities of applying the transformation.
Although clever the optimization has a drawback – it takes out of programmer’s hands the call stack control. This is generally true for every optimizing compiler (e.g.: when the compiler moves a static function into the calling place), but to a much lesser extent. MCC18 is capable of factorizing every bunch of assembly instructions in the middle of every C statement building up C lines by composing several subroutines. A nightmare to debug, a hell to understand in the disassembly listing and a sure way to eat those 31 call stack entries really quick.
I already wrote about how to recover from a stack over/underflow and restart debugging without having to re-program the chip. Now let’s see how to avoid the overflow at all.
First you can decided the stack overflow handling by setting the STVREN configuration bit. Basically you can chose among the “ignore” and the “trap” policies. Unfortunately, as we’re going to see, they are both rather ineffective.
Ignoring stack overflow means that when the limit is trespassed the execution continues jumping at the requested address without pushing the return address into the call-stack.
This means that at the first return instruction, rather than returning to the caller, execution jumps back last non-overflowing call.
Ideally, since overflow flag STKFUL (in STKPTR register) is set on overflow you could think of a function stub that checks this flag on entry. The trouble is that once you are in the called function you have no way to recover the return address since it is lost.
Changing ignore to trapping may seem more promising. In fact when this mode is selected, on limit trespassing the execution jumps to address 0x0 and the STKFUL flag is set. This acts somewhat like a reset, but since the micro state is not reset you could think of it as a trap.
Shamefully, yet again you don’t have a way to recover a return address, so you can’t do a save/restore call-stack.
After headscraping far too long on this problem, I decided to simplify the problem making the assumption that if the stack overflow occurs, it occurs only within interrupts. That makes somewhat sense since for sure interrupts are going to impact on the stack. So I added some code in the low level interrupt for checking the stack pointer against a threshold and saving all the hardware stack in a data memory region. This allows to reset the stack pointer and continue to execute the interrupt code. Before leaving the interrupt the opposite operation is performed.
Ideally it would have been nice to have the C run time support to handle the issue by adding a prologue/epilogue to every function so that hardware stack could have been “virtualized”. It is not much of a pessimization since you have already to handle the software stack for parameters and you may optimize out prologue/epilogue for functions that do not perform subroutine calls.
As of today no industrial C compiler implements this.

Next time I’ll write about Extended Mode.

PIC18F Software Project Survival Guide 4

This is the fourth installment of a series of posts regarding software programming on PIC18F cpu family. Previous installments: firstsecond and third.
IDE
The IDE, named MPLAB, is not brilliant, but it could be worse. I saw on the Microchip website that they are going to have a new edition – namely MPLAB X – (now in Beta) that is based on Netbeans. That means that what I am writing here may become quite obsolescent in the next months.
At the time of writing MPLAB X lacks of a number of feature (extended mode just to name one) that prevent its use in a professional environment. If you do not need extended mode, then maybe the worse part is that compilation is overly sluggish.
Anyway I commend the choice of Netbeans, which I prefer over the most widespread Eclipse.
Back to the current MPLAB IDE. You can configure the IDE to get it less uncomfortable. For example you can right click on top left corner of the windows and make them dock-able. When you are satisfied with the layout remember to save it because it is not granted that MPLAB restores it at the next run (it is about two years I am using MPLAB, but I haven’t found yet the reason for which often windows and panels layout is lost from a run to another).
You can redefine keyboard shortcut, but there isn’t a keyboard equivalent for each GUI control. E.g. the pretty useful “compile this file” is missing.
There is a “Goto Locator” contextual option that (possibly) brings you to the definition/declaration of the symbol under the cursor. You have to explicitly enable this option and it works only after a successful compilation. You can also enable auto-completion, but it seldom works. Goto Locator option is hidden nearly as the legendary pin in the haystack. Just open any text file, right click in the edit area and select “Properties”. “Editor Properties” dialog box pops up. Click on “General” tab and select “Enable Tag Locators”. Now that you are there you can switch to the “C File Types” tab and add line numbers, set the tab size and have the tab key to insert spaces . You can also have a look at the “Tooltips” tab where you could enable the auto-complete (which is mostly wrong), and at the tab “Other” where you can enable the ruler marker for the 80th column.
One of the most annoying limitation we experienced is the limitation to 32 characters for the find/find in files commands.

About debugging
Debugging is not a nightmare provided you have the ICD3 hardware debugger AND you keep enabling the software breakpoints (not that MPLAB 8.66 doesn’t let you use software breakpoints because of a bug, so don’t install that version).
(Well actually it is not a nightmare until you get short on Program Memory and you need to enable space saving optimization, but that it is another story).
Hardware breakpoints have weird limitations – first and foremost they are just three, and, at least one, is used for single stepping. Next they “skid”, when the debugger stops on a hardware breakpoint it never halts exactly at the line where the breakpoint is set, but a few lines after. If the line after the breakpoint contain a function call, then the execution may stop in that function call, i.e. somewhere else with respect to where you placed the breakpoint.
In the same way single stepping is much like ice skating, especially if you are trying to step over function calls.
Software breakpoints are immune the plagues that pester hardware cousins. Therefore I cannot see any reason, beyond self punishment, to stay with hardware breakpoints.
The only point of attention with software breakpoints is to routinely check that they are enabled since MPLAB seems to forget this option.
In the same way MPLAB is likely to forget the event breakpoints. These are special conditions that can be programmed to halt the debugger. One of such conditions is the processing of a Sleep instruction and the other is the stack overflow/underflow condition.
When running the debug version of a program, I usually #define assert to halt the debugger, so that failed assertion directly bumps up on the PC screen and I am able to examine the whole machine state. So Sleep break event is very handy (when MPLAB remembers about it, otherwise is… surprising).
Halting on stack overflow is welcome since there is no other way to detect call stack overflow. According to the configuration bit you may chose whether to reset or to continue
If you stumble into a stack overflow you are left at address 0, apparently without any chance to run the application again.
This is because the code has to clear the stack overflow/underflow condition, but this is prevented by the debugger that is halted because of such condition. At the same time this is a very clear sign that the stack messed-up. The confirmation for stack over/underflow may be obtained by a look at the STKPTR register (search for it in the View/Special Register File window). If STKPTR has bit 7 or 6 set then the stack exploded.
To run the program, unless you want to re-program the chip, you simply change the STKPTR register value to 0x01.

If you are looking for ram or data memory bear in mind that it is called “File Registers”. While “Program Memory” is the flash memory. Unfortunately program memory can be displayed only word by word (2 bytes) (and, confusingly enough, in little endian format). Conversely ram can be display only byte by byte.
The other serious limitation is that program memory located variables are not displayed neither in the watch window nor in the local variable window. Your only chance is to note down the address of the variable and look its memory content up in the memory dump window.
The feature I miss most is the stack view with the run-to-return debugging option. You cannot peek neither at the hardware stack (return stack) nor at the software stack.
You will find plenty of options that are not disabled when they should be, for example you can always select the stack analysis report, but it works only when the extended mode is selected (and when you don’t employ function pointers which is quite a constraint). Hardware stack is always displayed as filled by zero, and I haven’t found (yet) the conditions under which it reports something meaningful.
MPLAB is so clumsy when compared to any modern IDE (Eclipse, Netbeans, Visual Studio) that you may find yourself more comfortable to develop and maintain the code with one of those IDE and then to resort to MPLAB only for building the firmware and debugging it.

Next time I’ll write about memory.