Arduino UNO R3: some limitations

Hi All

This post is rather lengthy so bear with me.

Firstly, this post is not meant to be critical of Arduino. It has been, and still is, an excellent product and does what it was designed for very well. BUT if some boundaries are exceeded, as in what I have been experimenting with, it falls a bit short.

The following is meant to help those who may have been experiencing some unexplained or funny results, and those who may be considering trying a bit much. All devices, no matter how good, have some limitations and it does more good than harm to be aware of them. Makes life much easier.

It all started with an idea for a pulse generator with separate control of period (thus frequency) and pulse width (thus duty cycle), displaying these 4 parameters on a suitable device.

The idea here is to have a unit suitable for testing various small motors:
50% duty cycle and variable frequency to provide a pulse train for stepper driver board.
Set frequency and variable duty cycle for brushed motor speed control.
50Hz (20mSec period) and variable pulse width for servo motor control.

Sounds easy ???

Using Pots or Rotary Encoders it is not difficult to generate the numbers for Period, Frequency, Pulse Width and Duty Cycle and display these. Also the numbers (microseconds) for pulse HIGH and pulse LOW.
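A minimal sketch of that arithmetic, assuming the two controls give period and pulse width in whole microseconds. The names PulseNumbers and calcPulseNumbers are made up for illustration and are not from the actual project code.

```cpp
#include <stdint.h>

// Hypothetical container for the values to be displayed.
struct PulseNumbers {
  uint32_t highUs;    // pulse HIGH time in microseconds
  uint32_t lowUs;     // pulse LOW time in microseconds
  float frequencyHz;  // 1 / period
  float dutyPercent;  // HIGH time as a percentage of the period
};

// Derive the display values from period and pulse width (both in microseconds).
// periodUs must be non-zero. The pulse width is clamped to the period.
PulseNumbers calcPulseNumbers(uint32_t periodUs, uint32_t widthUs) {
  if (widthUs > periodUs) widthUs = periodUs;
  PulseNumbers n;
  n.highUs = widthUs;
  n.lowUs = periodUs - widthUs;
  n.frequencyHz = 1000000.0f / periodUs;
  n.dutyPercent = 100.0f * widthUs / periodUs;
  return n;
}
```

For the servo case, calcPulseNumbers(20000, 1500) describes a 1.5mSec pulse in a 20mSec period: 50Hz, 7.5% duty, 1500µSec HIGH and 18500µSec LOW.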

Generating these pulses is another matter. I don’t think it can be done with 1 device. The processing and displaying time, which can be significant, is effectively added to the OFF time, with impossible results, so 2 devices are needed: one to generate the numbers and display them, and the other dedicated to generating the pulse stream. Or a device capable of true multitasking. I have yet to work out how best to do this. I have provisionally looked at the dual core Pico but I think you have only one memory position accessible to both cores, and I need 2.

I might have to go to plan B. Still work in progress.

Back to the subject of this topic.

Using Arduino UNO Ver 3 (Freetronics Eleven) which I am more familiar with.
Oscilloscope. Atten ADS1022C. Calibration status unknown. Checks OK against Function Generator and internal calibrator.
Function Generator. UNI-T UTG932E. New.

While experimenting I discovered that the “digitalWrite” command seems to add more than 4µSec to everything. This means that a 10µSec pulse effectively becomes over 14µSec. At 1kHz and 50% duty cycle this is more than 0.8%, and it gets worse as the frequency gets higher because the 4+µSec is constant. Not good. This time seems to change as well, causing a “jitter” as observed on the oscilloscope. I have not been able to get to the bottom of this “jitter” yet. It is a bit hard to pin down as it appears to be completely random and not cyclic, so it is difficult to catch. It could even possibly be the timers (micros() and millis()) running out and wrapping to start again. I am not sure how often this happens.
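To put numbers on how a constant overhead grows as a percentage, here is a small sketch. It assumes a round 4µSec overhead (the measured figures further down are a little higher), and the function names are made up for illustration.

```cpp
// HIGH time in microseconds for a given frequency and duty cycle.
float highTimeUs(float frequencyHz, float dutyPercent) {
  return (1000000.0f / frequencyHz) * (dutyPercent / 100.0f);
}

// Percentage by which a fixed overhead stretches a pulse of the requested width.
float overheadPercent(float overheadUs, float requestedUs) {
  return 100.0f * overheadUs / requestedUs;
}
```

With these, 1kHz at 50% duty gives a 500µSec pulse and a 0.8% error, 10kHz at 50% duty gives a 50µSec pulse and an 8% error, and a 10µSec pulse is stretched by 40%.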

This 4 point something µSec addition is described pretty simply in an article here.

It concerns some housekeeping carried out by “digitalWrite” before applying the command. The author describes a method to bypass this housekeeping which reduces the 4+µSec to 124nSec, which could be ignored for most practical purposes.

The thing I would not agree with here is that the author has attempted to measure this 4µSec by averaging 1000 readings using the timers in the device he is measuring. He came up with a figure of a bit over 3µSec. I prefer to use a separate instrument, as demonstrated below, because he later states that the internal timers have a resolution of 4µSec, which I haven’t been able to confirm either way.

Now to my measurements.
Pic 1 is just my very untidy workspace. Working real estate is almost non-existent these days.


Following is a short sketch to measure the delays due to “digitalWrite”.

void setup() {
  pinMode(6, OUTPUT);  // start-of-loop indicator (scope Ch 1)
  pinMode(5, OUTPUT);  // test pulses (scope Ch 2)
}

void loop() {
  // 50 microsecond marker pulse on pin 6 to trigger the scope
  digitalWrite(6, HIGH);
  delayMicroseconds(50);
  digitalWrite(6, LOW);

  // Two back-to-back pulses on pin 5 with no delays,
  // so any width seen is due to digitalWrite alone
  digitalWrite(5, HIGH);
  digitalWrite(5, LOW);
  digitalWrite(5, HIGH);
  digitalWrite(5, LOW);
}

The sketch generates a 50µSec pulse on pin 6 to indicate start of loop. In each of the following cases this is monitored by oscilloscope Ch 1 (yellow trace), with the scope triggered on Ch 1. Two pulses are generated on pin 5 with no delays, so the only delay will be due to “digitalWrite”. These are monitored on Ch 2 (blue trace).

The delays of 4.38µSec and 4.5µSec can be clearly seen, and close inspection shows the Pin 6 pulse is slightly greater than 50µSec.

Next Pic is an expansion of the last, where the delays may be clearer.

Next Pic. Further expansion shows the delay between the last pulse on Pin 5 and the start of loop pulse on Pin 6. Most of this delay will be the 4.38µSec delay for the “digitalWrite” on Pin 6 plus a little bit to start the next loop.

Next I uploaded the following sketch, which replaces the “digitalWrite” with direct pin manipulation as described in the linked article above. Note the pin changes, made so that both pins are on Port B.

void setup() {
  pinMode(8, OUTPUT);  // PB0: start-of-loop indicator
  pinMode(9, OUTPUT);  // PB1: test pulses
}

void loop() {
  // 10 microsecond marker pulse on pin 8
  PORTB = PORTB | B00000001;  // pin 8 HIGH
  delayMicroseconds(10);
  PORTB = PORTB & B11111110;  // pin 8 LOW

  // Two back-to-back pulses on pin 9, no delays
  PORTB = PORTB | B00000010;  // pin 9 HIGH
  PORTB = PORTB & B11111101;  // pin 9 LOW
  PORTB = PORTB | B00000010;  // pin 9 HIGH
  PORTB = PORTB & B11111101;  // pin 9 LOW
}

As you can see I have removed all the delays except the start of loop indicator pulse (10µSec).
The result is in the next Pic: much faster.

Next Pic is the expanded view of the relevant pulses. The delay is now 124nSec. The period of a 16MHz clock is 62.5nSec, so this looks suspiciously like 2 clock cycles to execute this command, which is what you would expect.
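The conversion from a scope reading to clock cycles is simple enough to sketch. This assumes a nominal 16MHz clock, and cyclesFor is a made-up helper name.

```cpp
// Clock cycles corresponding to a measured time, for a given clock frequency.
float cyclesFor(float timeNs, float clockHz) {
  return timeNs * clockHz / 1e9f;
}
```

cyclesFor(124.0f, 16e6f) comes out at about 1.98, i.e. 2 cycles within measurement error. For comparison, the 4.38µSec “digitalWrite” delay measured earlier works out at roughly 70 cycles.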

Next Pic is a further expansion of the same.

The next Pic is an attempt to confirm the scope measurements. The best comparison I have available (except for the built-in calibrator, which is spot on at 1kHz) is a 2 week old Uni-T function generator. I connected it to scope Ch 2, leaving Ch 1 on the Arduino start of loop pulse. As you can see I got very close: by adjusting the function generator pulse width to the same 8.88µSec and the FG frequency to get an almost stationary trace, the results are almost spot on. Ch 1 says 8.88µSec, Ch 2 says 8.88µSec and the FG says 8.88µSec. For frequency, Ch 1 says 102.5kHz and the FG says 102.5014kHz. I had trouble getting a stationary blue trace, probably due to the small difference.

Pic of Function generator screen.

For the next Pic I re-established the pulses and connected Ch 1 to the Function Generator outputting a pulse of 124nSec, triggering off Ch 2. As you can see the scope measures 124nSec on both Ch 1 and Ch 2, so from here on I will believe the scope.

Next a Pic of the FG screen for verification.

During this I noticed that with the “fast” digitalWrite the pulses are no longer long enough. The loop start indicator is now 8.88µSec instead of the requested 10µSec. I had previously noted that the 16MHz clock measured 16.13MHz (on the scope) and put that down to possible probe interference. BUT the period for 16.13MHz works out at 62nSec, which would explain the 124nSec delay measurements. However, if a fast clock were the cause of the short pulses, the error would appear as a percentage, not a consistent 1.12µSec across the board.
I thought this might have something to do with the stated (but unverified) resolution of 4µSec in the timers, so I loaded another sketch with a 10µSec start indicator, a pulse of 8µSec, a delay of 20µSec and another pulse of 12µSec. The 2 pulses have a direct relationship with 4µSec.

void setup() {
  pinMode(8, OUTPUT);  // PB0: start-of-loop indicator
  pinMode(9, OUTPUT);  // PB1: test pulses
}

void loop() {
  // 10 microsecond marker pulse on pin 8
  PORTB = PORTB | B00000001;
  delayMicroseconds(10);
  PORTB = PORTB & B11111110;

  // 8 microsecond pulse on pin 9
  PORTB = PORTB | B00000010;
  delayMicroseconds(8);
  PORTB = PORTB & B11111101;

  // 20 microsecond gap, then a 12 microsecond pulse on pin 9
  delayMicroseconds(20);
  PORTB = PORTB | B00000010;
  delayMicroseconds(12);
  PORTB = PORTB & B11111101;
}

Next Pic. Overall result. Note Ch 1 8.87µSec instead of 10µSec.

Next Pic. Expanded first pulse. 6.88µSec instead of 8µSec.

Next Pic. Expanded second pulse. 10.88µSec instead of 12µSec.

Next Pic. Expanded delay between pulses. 18.87µSec instead of 20µSec.

Just to prove this error is not a percentage I increased the delay between pulses to 100µSec.
Next Pic shows the result: 98.88µSec.
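Tabulating the readings above makes the point clearer. The sketch below only does arithmetic on the figures already quoted (held in nanoseconds to avoid rounding); the names Reading, kReadings, shortfallNs and shortfallPercent are made up for illustration.

```cpp
#include <stdint.h>

// Requested delayMicroseconds() value and the width measured on the scope,
// both in nanoseconds. Values are the scope readings quoted in this post.
struct Reading {
  uint32_t requestedNs;
  uint32_t measuredNs;
};

const Reading kReadings[] = {
  {   8000,  6880 },
  {  10000,  8870 },
  {  12000, 10880 },
  {  20000, 18870 },
  { 100000, 98880 },
};

// Absolute shortfall in nanoseconds.
uint32_t shortfallNs(const Reading& r) {
  return r.requestedNs - r.measuredNs;
}

// Shortfall as a percentage of the requested delay.
float shortfallPercent(const Reading& r) {
  return 100.0f * shortfallNs(r) / r.requestedNs;
}
```

The shortfall stays at 1120 to 1130nSec for every entry, while the percentage falls from 14% at 8µSec to about 1.1% at 100µSec. At 62.5nSec per cycle, 1120nSec is close to 18 clock cycles; that is only arithmetic and does not say where those cycles go.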

I realise this is a very very long post for which I apologise but I could not think of a way of making it any shorter. I think the pictures while taking up much space shorten up any explanatory text by much more.
I hope this might explain some past little errors and prevent future hair pulling.
I have yet to track down the so called 4µSec resolution of the timers, maybe some of the Arduino Gurus could help here.
Also the missing 1.12µSec in pulse width when using the “fast” digitalWrite could do with some explanation.

I have a NanoEvery available so I might quickly repeat some of this in case there is any difference between models with the same processor. I don’t really expect any.
Thanks for your patience.
Cheers Bob

Looks like an important write-up. Gotta read with more time in hand. Thanks.

Hi Robert,

Excellent write-up! I wonder how your results compare to using the timer peripheral to drive the pin directly.

It’d be interesting to see how this compares to faster 32bit parts too!

Thanks for sharing!
-James

Hi All
Add on:

Repeated some measurements using the Arduino NanoEvery.

First experiment, using digitalWrite:
The delay seems to be worse than the UNO by nearly 1.5µSec; it is now approx 5.8 – 5.9µSec.

Second measurement, using direct port control:
Failed to compile. Apparently the difference is that the NanoEvery uses a different controller IC. After a bit of research I discovered another library reputed to work with the Every, namely “digitalWriteFast.h”. I installed this and included it in the sketch.

#include <digitalWriteFast.h>

void setup() {
  pinMode(9, OUTPUT);   // start-of-loop indicator
  pinMode(10, OUTPUT);  // test pulses
}

void loop() {
  // Marker pulse on pin 9
  digitalWriteFast(9, HIGH);
  //delayMicroseconds(10);
  digitalWriteFast(9, LOW);

  // First test pulse on pin 10
  digitalWriteFast(10, HIGH);
  //delayMicroseconds(8);
  digitalWriteFast(10, LOW);

  // Gap, then second test pulse on pin 10
  //delayMicroseconds(100);
  digitalWriteFast(10, HIGH);
  //delayMicroseconds(12);
  digitalWriteFast(10, LOW);
}

Note the delays are commented out to try the “no jitter” experiment.

Result:
Delay now 62nSec. Noted the 10µSec loop start indicator pulse is 8.88µSec as with the UNO.

Uploaded sketch with delays of 10µSec on pin 9, 8µSec and 12µSec on pin 10 with 100µSec between. These pulses measured – 8µSec is 6.88µSec, 10µSec is 8.88µSec, 12µSec is 10.88µSec and 100µSec is 98.88µSec. Same as the Arduino UNO.

As a further experiment I uploaded the sketch using “digitalWriteFast” to the Arduino UNO to see if “digitalWriteFast” would compile. Compiled and uploaded OK. Delays still 124nSec as with direct port manipulation so it would seem this library is useable in both UNO and NanoEvery.

Of note was the amount of “jitter” present with all the delays operational. As best I could ascertain, this “jitter” seemed to be a change in delay times. Very hard to pin down. I could more or less verify this by removing all of the delays and just displaying the “digitalWriteFast” pulses. Under these conditions the “jitter” disappears. In practice this probably would not be noticeable until you wanted to display some numbers like frequency, duty cycle etc. Then it might not look too good unless these numbers are displayed at a lower resolution.

This still does not explain the missing 1.12µSec from the pulse widths set with “delayMicroseconds()” when using “digitalWriteFast”. Once again, in practice this probably does not matter unless you need a very accurate pulse width for timing purposes.

But as I stated earlier, if you know the limitations of the device you are using, a lot of things become explainable when all does not go quite as expected. You might also reach the point where the device just will not quite make it, and either a compromise has to be reached or another component tried.

Cheers Bob

I did not try to understand exactly what you are trying to achieve here. But I’m prepared to say the limitation is not in the device (ATmega328P) but the language. I programmed an ATmega328P to be a top octave generator (TOG), simultaneously generating 12 frequency outputs on 12 pins, from 4434Hz to 8368Hz. This is impossible (and I don’t use that word lightly) using any high level language, but can be achieved in assembler (with an estimated CPU use of around 95%). The transitions are on 500ns boundaries, and because it is not possible to write to two ports at the same time there is a 65ns offset between the two ports used.

I am not sure what mechanism you use. The TOG generates the output in an interrupt routine. The interrupt gets delayed by the time of the instruction being executed at the time of the interrupt. This could cause a variable delay and hence jitter. To alleviate this a timer runs continuously and the ISR is synced to it. The anti-jitter routine delays between 4 and 6 instruction times.

	in      R24,TCNT0 ; jitter reduction.
	sbrc    R24,1     ; Timer normally xxxxxx00
	rjmp    PC+3      ; if xxxxxx10 delay 4 cycles [in(1), sbrc(not taken: 1), rjmp(2)]
	sbrs    R24,0     ; if xxxxxx01 delay 5 cycles [in(1), sbrc(taken: 2), sbrs(taken: 2)]
	rjmp    PC+1      ; xxxxxx00 delay 6 cycles [in(1), sbrc(taken: 2), sbrs(not taken: 1), rjmp(2)]

Unfortunately this can’t handle CALL and RETURN in the mainline, they create longer delays. So where CALL/RETURN would normally be used it is synthesised by stowing a return address and the ‘called’ routine returns with an indirect jump.

I haven’t looked at threading on the ATmega328P but I do it on an 8 bit PIC. The technique is straightforward. When an interrupt occurs, save the relevant status and enable interrupts. When the task is completed, disable interrupts, reinstate status and do a return from interrupt.

On the PIC I have a software 9600 baud serial input. The bit shuffling can be done in the ISR but as soon as a character is complete it has to be dealt with so the next character can be received. So a thread is created to deal with it. The thread moves the character into one of two buffers, assembling a record. If the record is complete, the buffer pointer is swapped and a second thread is created to deal with the record just received. That thread checksums the record, extracts some data and signals the mainline which gets around to it eventually.

Hi Alan
Thank you for that in depth reply.

I started out trying to generate reasonably accurate and stable pulses with an Arduino. I did not seem to be able to achieve this hence all the measurements. I have not considered the Atmega chip in isolation but dealt with Arduino UNO R3 and NanoEvery as a complete device.

I basically did those measurements to satisfy myself and try to determine where the inaccuracies occurred and what, if anything, could be done about it. I posted the measured results as information for the wider Forum community which may or may not be of use and possibly help with hair pulling and sleepless nights.

I realised the problem would probably not be in the chip itself but rather in the way it is used, and that there would be machine language ways around it, which you have pointed out. I think that is what both the whole-port manipulation method and the digitalWriteFast library do.

I think that 1.12µSec reduction in a “delayMicroseconds” time might always be there and under “normal” situations it is masked by the several µSec delay in the “digitalWrite” operation and only shows up when the “fast” version is used. I haven’t worked out how to investigate this yet.

Unfortunately my knowledge of machine language is limited to knowing the IC eventually works with 0’s and 1’s and a compiler software does its magic to convert my humble efforts to a form that an IC can understand. So when it comes to all the 0’s and 1’s or even the short bits in your bit of code it is like trying to read Swahili or something. I am trying to say I know all this exists but how it actually works I have to leave to people like your good self. In other words I am basically an analog person.

But like I say. Knowing a device’s limitations is a good start. That way you don’t go mad trying the impossible. Maybe “impossible” is the wrong word. Improbable might be better.
Thanks again for your input.
Cheers Bob

OK I read a bit more of what you are trying to achieve. And looked at the specs of the ATmega4809 which I believe is the heart of the NanoEvery.

You have created a problem for yourself by trying to generate your pulses programmatically. This should be offloaded to a PWM (pulse width modulator). You program a timer to run for the period you want (being the inverse of frequency) and the high period into the PWM timer to get your duty cycle, and you are done. The hardware generates the required pulse train until you change it. The ATmega328P has a 16 bit timer which can be clocked between 1 and 1/1024 times the base frequency. This could give you pulses from microseconds to seconds. Once your pulse period is in the millisecond range, you should be able to generate pulses accurate to 0.1%, which is probably all you need.

The ATmega4809 does this on steroids, it has multiple timers and PWM outputs.

There’s a bit of a learning curve to learn the correct incantations for the various registers, but I see no reason you’d need to resort to assembler. It may not be portable, though. If one of the ATmega4809 timer/PWM units is backward compatible with the ATmega328P then it may work. It’s over 4 years since I looked at the ATmega328P in depth, but if you get stuck the documentation is still on my PC.

There’s little difference between looking at the microprocessor and at the board. All the board does is feed the microprocessor (power management, USB/serial connection) and bring the signals to the edge of the board. The TOG was an Arduino Nano.

Hi Alan
I thought I said in the post that I was intending to come up with a pulse generator suitable for trying / testing brushed motors, servo and stepper devices. To be any use the frequency and duty cycle has to be continuously variable. The method of using PWM is OK and works but means changing the sketch parameters every time one of these has to be changed. Variable adjustments usually means pots or rotary encoders. Either one requires significant processing time to read controls and manipulate the numbers to suit. This time is effectively added to the pulse LOW period and renders the whole idea almost impossible.

But… Another thought was to use 2 devices or a true multitasking one. Use one to generate the numbers, which is not difficult, and the second to somehow take these numbers and generate the pulse train. Two numbers are needed, pulse high and pulse low. I had provisionally looked at the RPi Pico with dual core capability but it seems only one memory location is available to both cores at once and my idea needs 2.

Anyway, this whole idea has got beyond the simple home constructed pulse generator so I will look at Plan B. My measurements and post have been an exercise to find out where and why the errors occurred.
They are also meant to provide this info for others who may be wondering why a particular arrangement does not go as expected. Actually that is what started me measuring.

Plan B… Generate the variable pulses with hardware and measure it. This measuring seems to have problems of its own but still possibly feasible. I am looking for and trying OP Amps at the moment. This generator only needs to go to 20 or 30kHz but a decent square wave will have all the odd harmonics to about 11, 13 or even 15 so the bandwidth has to be something above audio. Another story.
Cheers Bob

Did you take this into account?

The designers of the Timer/PWM have taken what you want into account. When you say “continuously variable” I take that to mean “generating according to a set of values until I rotate a pot to change it”. Not “sweeping from 10kHz to 5kHz in 3 seconds” which really is continuously variable. So your program sits looking at the pot, reading its value. When you read its value you do the calculation period = n microseconds. And you want a 50% duty cycle. So you put the appropriate values into the timers while the PWM is running - you don’t need to stop it. The timers are double buffered so the hardware will automatically load those values into the working registers at an appropriate time after you put the values into the buffer registers. There will be no delay caused by software. Believe me, it does what you want.

The documentation is Atmel-42735-8-bit-AVR-Microcontroller-ATmega328-328P_Datasheet, the section is TC1 - 16-bit Timer/Counter1 with PWM. It is not easy reading but there are sketches on the Internet that will get you started. Just search for “arduino pwm variable frequency”. Some of the hits are amateurish, some have good stuff. I couldn’t pick one and say “this is all you need”. Most talk about
analogWrite(pin, dutyCycle)
but I don’t think it has the flexibility you need, and you will have to write directly to the appropriate registers.

Hi Alan
Once again thank you for your in depth help.

That is exactly what I mean. I may have not worded my original statement clearly. You are quite correct.

I did not realise this. I will have another look at it. I thought the process stopped while the processor looped around reading pots and things. Because I thought that I didn’t look too closely at it (PWM).

To be of any real use I need to control both frequency and duty cycle but preferably frequency and pulse width and calculate duty cycle for display purposes. Pulse width would be more useful (read easier) than duty cycle for working with servos while duty cycle would be more relevant for brushed motors. I will probably have to dive a bit deeper than my current expertise but will certainly have a look at this possibility of using PWM as in “analogWrite” command. I think there are libraries around that will change frequency.

As stated I am basically an analog person and in the past dived far enough into the digital world to suit the occasion. Fortunately I have had access to the right people to question and ask for help and I have not been afraid to use this facility. In the commercial world one has to use the best and quickest means to a result, there is little time for much study. But this has not got down to the nitty gritty of processor data sheets etc but more into the realm of transporting these digital signals from point A to point B and some PLC programming etc.
I am afraid this puts me into the “knows enough to be dangerous” category as far as computer sciences go. BUT, I know this so realise when it is time to stop and start asking those who know for help. Unfortunately as I have been retired for 20 years those people I had access to are no longer available so I muddle along within my limitations.

With XMAS on the door step the next week or so is earmarked but I will explore the pointers you kindly provided and do some more experimenting with PWM as time allows.
Cheers and Merry XMAS Bob

There’s only semantics differentiating pulse width/duty cycle. Given a frequency, you can calculate one from the other. When you get around to it, you’ll find you need to calculate in periods. So if for instance you wanted 300Hz with a duty cycle of 30%, you need to translate that into multiples of the CPU clock period (from which all timing is derived). 16MHz/300Hz = 53,333 cycles. A duty cycle of 30% means high for 16,000 cycles (and by deduction low for 37,333).

So you would set up the timer 1 clock to have a 1:1 prescale (i.e. the timer clock is the CPU clock), select fast PWM mode, set 53333 in the top register (where the timer resets) and 16000 in the compare register (when the PWM transits from high to low).
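A minimal sketch of those steps, assuming an UNO R3 (ATmega328P) at exactly 16MHz and output on OC1A, which is pin 9. The helper names periodCycles, highCycles and startPwm are made up for illustration. One detail from the datasheet's fast PWM formula, f = f_clk / (N × (1 + TOP)): the counter runs from 0 to TOP inclusive, so the values written to the registers are the cycle counts minus one. The register part only compiles for the ATmega328P; the arithmetic can be checked anywhere.

```cpp
#include <stdint.h>

const uint32_t kCpuHz = 16000000UL;  // assumed exact 16MHz clock

// Timer clocks in one PWM period, rounded to nearest (prescale 1:1 assumed).
// frequencyHz must be non-zero and high enough that the result fits in 16 bits.
uint32_t periodCycles(uint32_t frequencyHz) {
  return (kCpuHz + frequencyHz / 2) / frequencyHz;
}

// Timer clocks the output stays HIGH, for a duty cycle in whole percent.
uint32_t highCycles(uint32_t cycles, uint8_t dutyPercent) {
  return (cycles * dutyPercent + 50) / 100;
}

#if defined(__AVR_ATmega328P__)
#include <avr/io.h>

// Start fast PWM (mode 14, TOP = ICR1) on OC1A with a 1:1 prescale.
// Assumes 1 <= high <= cycles <= 65536.
void startPwm(uint32_t cycles, uint32_t high) {
  DDRB |= _BV(DDB1);                             // OC1A (pin 9) as output
  TCCR1A = _BV(COM1A1) | _BV(WGM11);             // non-inverting output on OC1A
  TCCR1B = _BV(WGM13) | _BV(WGM12) | _BV(CS10);  // mode 14, timer clock = CPU clock
  ICR1 = (uint16_t)(cycles - 1);                 // TOP: sets the period
  OCR1A = (uint16_t)(high - 1);                  // compare: sets the HIGH time
}
#endif
```

For the 300Hz, 30% example, periodCycles(300) gives 53333 and highCycles(53333, 30) gives 16000, so startPwm(53333, 16000) would load 53332 and 15999. As far as I read the datasheet, OCR1A is double buffered in this mode but ICR1 is not, so changing the period on the fly may need a little more care than changing the pulse width.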

I don’t know what range of frequencies you want. For anything a little below 300Hz you need to select a prescale value, e.g. 1:8 (8 CPU clocks = 1 timer clock), which allows a PWM frequency down to about 30Hz. The highest prescale is 1024:1, which gives a frequency in fractional Hz. You pick the prescale that handles the lowest frequency you need; you do not want to change the prescale while running.

There’s a trade off between range and accuracy. If you are happy with 1% accuracy at the highest frequency then you are loading numbers larger than 100 into the top timer. If you want 0.1% then larger than 1000. A 16 bit timer runs up to 65535 so 100:65535 is more than 600:1 ratio highest to lowest frequency. I’m guessing that’s enough for your purpose.
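The prescale choice and the range/accuracy trade-off can be sketched as below, again assuming a 16MHz clock and the five Timer1 prescale options of the ATmega328P. The function names are made up for illustration.

```cpp
#include <stdint.h>

const uint32_t kCpuHz = 16000000UL;                   // assumed exact 16MHz clock
const uint16_t kPrescales[] = {1, 8, 64, 256, 1024};  // Timer1 prescale options

// Timer counts in one period at a given prescale (truncated).
// frequencyHz must be non-zero.
uint32_t countsPerPeriod(uint32_t frequencyHz, uint16_t prescale) {
  return kCpuHz / ((uint32_t)prescale * frequencyHz);
}

// Smallest prescale whose count at the lowest wanted frequency still fits
// the 16 bit timer (at most 65536 counts). Returns 0 if none is enough.
uint16_t choosePrescale(uint32_t lowestHz) {
  for (uint16_t p : kPrescales) {
    if (countsPerPeriod(lowestHz, p) <= 65536UL) return p;
  }
  return 0;
}
```

As an illustration, choosePrescale(300) returns 1 and choosePrescale(100) returns 8. A 1:8 prescale reaches down to about 31Hz, and at 30kHz it leaves countsPerPeriod(30000, 8) = 66 counts, so steps of roughly 1.5%. Going down to 30Hz needs 1:64, which leaves only 8 counts at 30kHz.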

Hi Alan

Definitely agree with that. I just thought it would be nice (and useful) to display both pulse width and duty cycle. I did your suggested search and so far have found this site

which has some useful info. Between that and your useful input I think I have enough to play with for the time being. The range needs to go down to at least 50Hz for servos (which operate at 50Hz), and from what I glean from other posts some of the micro stepping steppers can go to 30kHz, so I was thinking say 30Hz to 30kHz or 25Hz to 25kHz would be useful for the home experimenter.

There is a trade off with everything you do in this world. But for the purpose 1% I think would be plenty good enough.

I was not aware of this. Thanks for the tip.

I am really only doing this for a brain exercise and knowledge expansion. I have my second driving test next month (January) so am getting to the stage where I would have very little practical use for this project myself anyway. If I do, I have a nice little function generator (Uni-T UTG932E) which will do all of this and anything else I can think of at the moment. It goes really slow, down to 1µHz, which equates to 1 cycle in 11.57 days. I haven’t found a use for that sort of speed yet but it looks good as a marketing blurb.
Thanks once again for your info and between that and a bit of research I might come to grips with all this yet.
Cheers Bob