Skip to content

Improve _normalizeAngle speed.#530

Draft
jaguilar wants to merge 1 commit into
simplefoc:devfrom
jaguilar:opt_normalize_angle
Draft

Improve _normalizeAngle speed.#530
jaguilar wants to merge 1 commit into
simplefoc:devfrom
jaguilar:opt_normalize_angle

Conversation

@jaguilar
Copy link
Copy Markdown

Description

Previously _normalizeAngle used fmodf, which is expensive on Cortex-M series chips (40-100 cycles). For normalizing the angle, we can use a trick of just subtracking 2 pi times the number of full rotations from the angle.

This method does lose precision much faster than the fmodf approach. There will be significant error if the total number of radians passed in is greater than 10^5. However, we can counteract this by not passing in such a large number of radians -- basically making sure that we are not using an accumulating radian count in any location where the angle will eventually need to be normalized. This should typically be the case since accumulating radian use cases are definitionally not in need of normalization. We should consider adding an assert here to ensure that we catch any existing cases where we should be passing the mechanical or electrical angle and are instead passing an accumulator.

Performance test (on STM32G474, with certain other optimizations):

Before: foc_nanos:13737
After: foc_nanos:12550 (approx 9% reduction in loopFOC time)

Type of change

  • Optimization

How Has This Been Tested?

I built an arduino-foc-based test program based on this commit and observed that the motor still worked fine. I also benchmarked the code and observed it to be faster. I have not tested all possible control modes so it is not impossible that this approach is unacceptable in some configurations. Please review carefully and suggest anything further that I should check.

Test Configuration/Setup:

  • Hardware: STM32G474 with custom gate driver board
  • IDE: STM32CubeMX/VSCode/CMake
  • MCU package version (stm32duino/arduino-esp32/..): N/A, using my own custom Arduino shims

Previously _normalizeAngle used fmodf, which is expensive on Cortex-M
series chips (40-100 cycles). For normalizing the angle, we can use
a trick of just subtracking 2 pi times the number of full rotations from
the angle.

This method does lose precision much faster than the fmodf approach.
There will be significant error if the total number of radians passed
in is greater than 10^5. However, we can counteract this by not passing
in such a large number of radians -- basically making sure that we are
not using an accumulating radian count in any location where the angle
will eventually need to be normalized. This should typically be the case
since accumulating radian use cases are definitionally not in need of
normalization. We should consider adding an assert here to ensure
that we catch any existing cases where we should be passing the
mechanical or electrical angle and are instead passing an accumulator.

Performance test (on STM32G474, with certain other optimizations):

Before:   foc_nanos:13737
After:    foc_nanos:12550 (approx 9% reduction in loopFOC time)
}
// Normalize the shaft angle after each iteration to prevent it growing indefinitely
// and eventually losing precision.
shaft_angle = _normalizeAngle(shaft_angle);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you test this? It seems to me we’ll never reach the target as shaft-angle is used to track the current position in open loop angle mode?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part was not tested with angles greater than 2PI. I agree that if the user sets angles outside the range, it wouldn't work. We should probably also normalize the target angle beforehand. WDYT?

@jaguilar
Copy link
Copy Markdown
Author

jaguilar commented May 6, 2026

Okay, here is my suggestion. Let's drop all the parts of this commit that call _normalizeAngle in more places. Currently the only real place where _normalizeAngle is called on an accumulator is the shaft_angle of angle_openloop. I don't think it's very important to get high precision for very high angles here. If someone is willing to set a target angle above 50k, They will potentially lose some precision, but I don't think that is a typical use case, especially not in openloop mode.

My upcoming changes to _sincos will require normalizing some things that aren't currently normalized, but we will do that closer to the callsite. wdyt?

@dekutree64
Copy link
Copy Markdown
Contributor

dekutree64 commented May 27, 2026

Agreed. Changing to floorf is good, but eliminating the need to normalize electrical angle was one of the big wins of the current _sincos over the original non-interpolated 200-entry quarter-wave LUT. I'm curious to see what you have in mind if the added cost of normalizing would still come out ahead. If you're using the STM32's CORDIC, be aware that we already have some code for it here https://github.com/simplefoc/Arduino-FOC-drivers/tree/master/src/utilities/stm32math

For dealing with large angle precision, I still stand by my proposal here #302

@jaguilar
Copy link
Copy Markdown
Author

If you're using the STM32's CORDIC, be aware that we already have some code for it here https://github.com/simplefoc/Arduino-FOC-drivers/tree/master/src/utilities/stm32math

For dealing with large angle precision, I still stand by my proposal here #302

Yes, that's exactly the problem. That code uses fmod which is very very slow.

@dekutree64
Copy link
Copy Markdown
Contributor

dekutree64 commented May 27, 2026

My recommendation would be to use CORDIC->WDATA = (q31_t)(a * (32768.0f / _PI)) << 16;, which should give equal resolution and overflow range as the interpolated LUT _sincos, and effectively one CPU cycle to do the normalization.

But IIRC the CORDIC sincos with fmod was roughly the same speed as LUT before, so even if you do go with the floor trick it should be a nice speedup plus precision boost compared to the LUT. But only on STM32, so I think we should keep the normalization inside the CORDIC sincos rather than reintroducing it to the motor code.

For other places in the code where normalization is necessary, the floor trick is an excellent improvement. I never knew it was so much faster than fmod.

It's too bad we can't just use 16-bit fixed-point angles everywhere. The tricks you can do are so much fun. 32-bit for large angle values, cast to u16 for 0-2pi like _normalizeAngle, or cast to s16 for -pi,+pi representation. And sine/cosine are super fast without having to scale the range and cast to int.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants