Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge V23.12.16 #556

Merged
merged 87 commits into from
Dec 27, 2023
Merged

merge V23.12.16 #556

merged 87 commits into from
Dec 27, 2023

Conversation

helgeerbe
Copy link
Collaborator

@helgeerbe helgeerbe commented Dec 19, 2023

This is a big merge for

  • v23.12.16
  • v23.12.18
  • v23.12.19

from upstream

On my test system it seems to run, OK

But tbnobody separated the loop() functions of the components to run in separated tasks.

So please check if your components are still running:

  • Victron Shunt
  • JKM BMS
  • Victron Charger
  • Huawei

I saw that you sometimes used mutexes, please check if they are still needed, because the loops runs now in tasks.

I also used the original MessageOutput, since it seems to be thread safe.

tbnobody and others added 30 commits November 18, 2023 22:34
@helgeerbe helgeerbe self-assigned this Dec 19, 2023
@helgeerbe
Copy link
Collaborator Author

@schlimmchen can you do a review please

@schlimmchen
Copy link
Member

@helgeerbe I am humbled that you are asking for my opinion. I will gladly inspect the changeset and give feedback. I can't tell you when that will be...

I saw that you sometimes used mutexes, please check if they are still needed, because the loops runs now in tasks.

This statement confuses me. If all the loop() functions now run in their own tasks, we need more mutexes. Many more. If that's really the case that every loop() function can now be preempted at any moment, there will be lots and lots of issues. I am really eager to learn what tbnobody did there and how it might affect the OpenDTU-OnBattery-specific features.

@schlimmchen
Copy link
Member

https://github.com/arkhipenko/TaskScheduler

Oooooh, I see! This actually sounds awesome! And now it makes sense that you are asking to remove mutexes.

I still have questions, e.g., how these new non-preemptive tasks interface with preemptive threads that are not under our control, like the WiFi driver and so on?

Very interesting. I will look at this in more detail soon.

@helgeerbe
Copy link
Collaborator Author

helgeerbe commented Dec 20, 2023

@schlimmchen Thanks, I would really appreciate your support.

I just did a simple merge and applied @tbnobody patterns to our code extensions. It seems to run well (at least with my victron charger). But I don't own the complete supported hardware components.

I assume even if everything will work well, we could refactor some code in future due to the newly introduced TaskScheduler.

@MalteSchm
Copy link

MalteSchm commented Dec 24, 2023

Hi,

I just installed this PR. As far as I can tell everything is working.

  • DPL works
  • getting data from Victron MPPT
  • getting / setting data from / to the Huawei PSU works
  • getting data from the Pylontech battery

Merry Christmas and thank y'all for the energy here !

@MalteSchm
Copy link

Adding to this:
No reboots in the last 2 days

@stefan123t
Copy link

@helgeerbe first of all let me thank you for your great job of maintaining this fantastic fork of OpenDTU for a variety of additional purposes. Especially the swift and timely manner in which you are merging upstream changes into your extensions are outstanding! 🎉 😄

I just did a simple merge and applied @tbnobody patterns to our code extensions.

I have only seen the code changes from upstream being merged into OpenDTU-OnBattery in this PR #556.
Where are the changes to the code you have written, which are extending native OpenDTU for Victron, Pylontech, Shelly 3EM, etc. ?

I assume even if everything will work well, we could refactor some code in future due to the newly introduced TaskScheduler.

I would expect something similar as in SunPosition.c 12031ed for most of your classes or code extensions, too.

@MalteSchm
Copy link

I would expect something similar as in SunPosition.c 12031ed for most of your classes or code extensions, too.

I don't fully understand. This is done. Could it be that you only lock at one commit and not at all of the changes?

@helgeerbe
Copy link
Collaborator Author

OK, Thanks for the positiv feedback. So I will merge this PR.

@helgeerbe helgeerbe merged commit d494810 into development Dec 27, 2023
10 checks passed
@schlimmchen
Copy link
Member

I appreciate the changes introduced with this PR, i.e., the work done in the upstream project to use TaskScheduler in favor of extending the loop() paradigm. This has the potential to make the project more efficient and responsive.

Where are the changes to the code you have written

@stefan123t They are part of the merge commit 7c11a5a. @helgeerbe did all the work required to migrate all OpenDTU-OnBattery-specific components to TaskScheduler. Have a look at the changes to src/Battery.cpp in the aforementioned commit for an example.

He did it in the most backwards-compatible way, which makes perfect sense. This means the components have their loop() methods made private, but they are called periodically as often as possible, just like before. In the near future, we should update the respective tasks' frequency dynamically (which is explicitly supported) depending on the actually required frequency. The DPL uses a dynamic backoff time which should be used here to schedule the next iteration of the DPL's loop at the appropriate time instead of checking every time in the loop() whether or not the backoff elapsed. The same goes for the JK BMS Controller for sure. Also the VE.Direct loop() can be called at a reasonable frequency since we know only one message per second will arrive.

Regarding the mutex in BatteryClass: This protects _upProvider and must be part of the newly implemented updateSettings(). Currently that function accesses _upProvider without locking the mutex, which happens when the battery settings are changed from the WebUI. This is probably an oversight, as it was done the right way in VictronMppt.

And this is where my concers start: As far as I can tell, AsyncWebServer does indeed still run in a thread other than the main loop() thread, i.e., the TaskScheduler context.

As before, all `loop()´ functions that we care about run in the same context. Now they are part of the TaskScheduler. However, also as before, some callbacks run in the context of the respective caller, i.e., MQTT and WebServer callbacks in particular. Those are not part of the TaskScheduler context. Same with the WiFi driver (and potentially other drivers as well), which also may execute callbacks that access shared data structures.

As an example I looked closer at AsyncWebServer. I don't see that AsyncWebServer would have any knowledge about TaskScheduler such that it would ask TaskScheduler to execute callbacks as TaskScheduler tasks. Instead, the callbacks are (still) executed in the "async_tcp" FreeRTOS thread, which is scheduled pre-emptively alongside the TaskScheduler thread.

Hence I don't see how we could remove mutexes and locking mechanisms, as the concurrency issues are the same as before.

This is based on reading code as I am unable to conduct experiments right now. However, I am fairly confident that my assessment holds true.

I also used the original MessageOutput, since it seems to be thread safe.

It absolutely is not. Serial output will be garbled from time to time. Not within a single println() or printf() call, but in between words or formatting boundaries. And worse, we are back to the original limitation of the static buffer, which causes JK BMS and VE.Direct verbose logging messages to be cut (very) short in the web console. I will propose a new PR that restores the previous MessageOutput implementation by myself. It ran perfectly well for months now at all devices that used respectively recent releases.

For the GridProfileParser, the library "Frozen" was added. That's also very interesting and can serve performance improvements and code reduction for JK BMS and VE.Direct at least.

If this was not already released, I would argue to go ahead and do so. If users experience errors, we can address them as needed. This is a good time for experiments as sun is scarce anyways and people might have more time than usual to report, inspect and fix issues.

#350 was effectively reverted. Let's keep that in mind in case spontaneous reboots or data corruption are reported while using the MQTT interface. We might need to have a look at all MQTT callbacks within OpenDTU-OnBattery.

Copy link

github-actions bot commented Apr 4, 2024

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new discussion or issue for related concerns.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 4, 2024
@schlimmchen schlimmchen deleted the v23.12.16 branch April 16, 2024 16:56
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants