Mental illness in microcontrollers, or in the programmer?

A while ago I thought it was time to re-implement my clumsy attempt at some sort of communication between 3 microcontroller boards. It is half-duplex only, so that can’t be hard, can it? As the ATtiny24s don’t have a real UART and only one I/O line is available, I had opted to encode the information in the length of pulses sent over that line. The pulse length was measured in the main loop with a ‘time()’ function that reads start/stop times from the ‘system ticker’, which is not very accurate. It worked, but was rather limited. I had come across software-UART routines before, so I gave that approach a try.

The most advanced routines I’ve seen implement this by using a pin-change interrupt to capture the start-bit (mandatory) and then off-load the job of sending/receiving the individual bits to a timer interrupt. The latter seems tempting, but all of my timers are already used. It might have been possible to dynamically reconfigure one timer (system ticker) to switch between ‘ticker’ and ‘uart’ mode, but I didn’t feel like changing things that are known to work reliably.

Therefore I went this way: receiving the data at 9600bps is handled entirely by the pin-change interrupt. It captures the start bit and samples the data bits as well, which takes about 1ms per byte. After receiving a valid stop-bit, the ISR writes the data into a global variable and sets a global flag. The flag is later cleared in the main loop after the data has been processed.
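To put a number on the "about 1ms": a rough sketch of where the sample points would fall, assuming 8N1 framing (one start bit, 8 data bits, one stop bit) and sampling in the middle of each bit. The framing and the names `sample_offset_us` and `frame_time_us` are my assumptions for illustration, not taken from the actual receiver code.

```c
#include <stdint.h>

#define BAUD_RATE    9600UL
#define HALF_BIT_US  (1000000UL / (2UL * BAUD_RATE))  /* 52 us, rounded down */
#define DATA_BITS    8                                /* assumption: 8N1 framing */

/* Microseconds from the start-bit edge to the centre of sample `index`.
   index 0..7 are the data bits, index 8 is the stop bit. */
uint32_t sample_offset_us(uint8_t index)
{
    /* 1.5 bit times to the middle of data bit 0, then one full bit per step,
       i.e. (3 + 2 * index) half-bit times. */
    return (3UL + 2UL * index) * HALF_BIT_US;
}

/* Time the ISR is busy with one frame: up to the centre of the stop bit. */
uint32_t frame_time_us(void)
{
    return sample_offset_us(DATA_BITS);
}
```

With these assumptions the first data bit is sampled 156µs after the start-bit edge and the stop bit at 988µs, so the ISR is busy for roughly a millisecond per byte.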

Sending is done by a simple function, which is blocking as well. The inter-bit-delays are implemented with _delay_us(HALF_BIT_DELAY) using avr-libc’s ‘delay.h’ code. The pin-change interrupt is enabled in the main loop, instantly disabled in the ISR itself. The result of this looks something like this on a logic analyzer:

The bottom graph (#5) represents the status LED, which is toggled in main. Some sort of heartbeat. Graph (#4) shows activity on the soft-uart’s rx pin. Here it is flooded with data. Graph (#2) shows the activity of the pin-change interrupt, which captures the start-bit and the rest of the data. It is triggered by a state change of (#4). Graph (#3) shows spikes that represent the times of sampling the bits.

The two topmost graphs show the behaviour of the other ISR code, which takes care of the LED drivers. The compare match interrupt runs on timer1, which is in CTC mode. This is quite important btw ;-) As you can see, the LED driver interrupt suddenly stops doing its job. At first I suspected that timer1 had halted for some reason, but couldn’t quite believe it. I added another compare match interrupt that should run at OCR1B = OCR1A/2. This one quit working as well. In another attempt to see what might be happening I also made a trace of the ‘system-ticker’ interrupt (timer0), but that kept working.

Then I had a closer look at the timing of the interrupts. I had already expected (and seen in the traces above) that the pin-change interrupt would at some point have to delay the compare-match interrupt. Timer1 uses a prescaler of 256, which gives a tick rate of 31.25kHz at 8MHz system clock, or 256 clock cycles (32µs) between ticks. Therefore the compare match interrupt, which is triggered by a certain value of timer1, must be considerably shorter than those 256 cycles, or things can get upset. Was that the case? Yes! The total run-time was well below 32µs, which was verified by the logic analyzer as well.

Hrmph!

The code runs fast enough, but it still doesn’t work… What about the delays introduced by the soft-uart receiver interrupt… CTC mode… compare match… writing to OCR1X registers… hmmmmmmmmmm…

If everything works out as it should, timer1 counts up from 0 to OCR1A, resets the counter to 0 and sets the interrupt flag for the compare-match interrupt. The interrupt is then executed at the next possible slot. The hardware counter keeps running regardless of whether, or when, the interrupt actually executes. Plotted over time, the counter value looks like a sawtooth waveform.

At several spots in the ISR there is code like:
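The snippet itself is missing here. Judging from the surrounding text it was a plain write to one of timer1's compare registers, so the following is only a minimal reconstruction under that assumption, not the original code. The helper name `set_next_compare` is made up, and the host stand-in exists only so the sketch compiles off-target.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#else
/* Host stand-in so the sketch compiles off-target; on the AVR the real
   register definition from <avr/io.h> is used instead. */
static volatile uint16_t OCR1A;
#endif

/* Hypothetical helper: schedule the next compare match of timer1.
   In CTC mode OCR1A is also the value at which the counter resets to 0. */
void set_next_compare(uint16_t some_value)
{
    OCR1A = some_value;
}
```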

It is imperative that this write to the compare register happens before the timer has reached _some_value_. If this moment is missed (because of delays caused by the pin-change interrupt), the counter is already past the new compare value, so timer1 has to count all the way up to 65535 and wrap around to 0 before it can match again. For a 16-bit timer clocked at 31.25kHz, this takes about 2 seconds. For normal human beings this period of time is quite obvious. Especially if the lights should be turned ON.
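The effect can be checked on a PC with a small model of the counter. This is a simplified sketch, not a cycle-accurate simulation: it only asks how many timer ticks pass until the next compare match, given the current count and a freshly written compare value. The function names are hypothetical, and the case "compare value equals current count" is simply treated as missed.

```c
#include <stdint.h>

#define F_CPU_HZ   8000000UL
#define PRESCALER  256UL
#define TIMER_HZ   (F_CPU_HZ / PRESCALER)   /* 31250 ticks per second */

/* Timer ticks until the next compare match of a 16-bit counter.
   If the compare value is not ahead of the counter any more, the match is
   missed and the counter first has to run up to 0xFFFF and wrap to 0. */
uint32_t ticks_until_match(uint16_t count, uint16_t compare)
{
    if (compare > count) {
        return (uint32_t)compare - count;   /* written in time */
    }
    return 65536UL - count + compare;       /* missed: full wrap-around */
}

/* The same delay in milliseconds, rounded down. */
uint32_t delay_until_match_ms(uint16_t count, uint16_t compare)
{
    return ticks_until_match(count, compare) * 1000UL / TIMER_HZ;
}
```

Writing a compare value of 200 while the counter is at 100 gives a match after 100 ticks (about 3ms). Writing 100 while the counter is already at 200 gives 65436 ticks, which is a bit over 2 seconds of dark LEDs.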

What to do? Stop timer1 while the pin-change ISR is running? That would also interfere with the LED part. The compare match ISR takes about 5µs to complete, the soft-uart interrupt takes about 1ms (YES, that is terribly long… I know…). The half-bit-time at 9600bps is about 52µs. Shifting the center by 5µs is quite tolerable!

Therefore I chose to have the pin-change interrupt run as such:
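The listing is missing here as well. The numbers above point to the idea: let the short compare-match ISR interrupt the long soft-UART ISR. One way to get that behaviour, as a minimal sketch and not the original code, is to switch off the pin-change interrupt first and then re-enable interrupts globally inside the ISR. Using `PCINT0_vect`/`PCIE0` for the rx pin is an assumption, `soft_uart_read_frame`, `rx_data` and `rx_ready` are hypothetical names, and the receiver body is only a stub.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#include <avr/interrupt.h>
#else
/* Host stand-ins so the sketch also compiles off-target. */
static volatile uint8_t GIMSK;
static volatile uint8_t global_irq_enabled;
#define PCIE0 4
#define sei() (global_irq_enabled = 1)
#define ISR(vector) void vector(void)
#endif

volatile uint8_t rx_data;   /* hypothetical: last received byte */
volatile uint8_t rx_ready;  /* hypothetical: set here, cleared in the main loop */

/* Hypothetical blocking receiver. The real one would sample the data bits and
   the stop bit with _delay_us(); this stub only returns a placeholder byte.
   Returns 1 if the stop bit was valid. */
static uint8_t soft_uart_read_frame(uint8_t *out)
{
    *out = 0x55;
    return 1;
}

ISR(PCINT0_vect)
{
    uint8_t byte;

    GIMSK &= (uint8_t)~(1 << PCIE0); /* no more pin-change interrupts for this frame */
    sei();                           /* from here on the compare-match ISR may preempt us */

    if (soft_uart_read_frame(&byte)) {
        rx_data = byte;
        rx_ready = 1;
    }
    /* The main loop sets PCIE0 again once the data has been processed. */
}
```

Each nested LED interrupt shifts the following sample points by a few µs, which is small compared to the 52µs half-bit time. The edges of the data bits will probably leave a pending pin-change flag behind, so it seems wise to clear it before the main loop re-enables the interrupt.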

As long as the pin-change interrupt takes considerably longer than the LED driver ISR, this method is tolerable. However, at much higher serial speeds it will fail at some point. In that case it’s easier to get a chip with a real UART. In hindsight I should have used something like an ATmega48: almost the same size and many more features.

The corrected code looks like this on the logic analyzer:

Synopsis: think before you code ;-)

Now let’s see what the next pitfall will be. Many more to come, I guess.
