FRU-Device does not work well with 16bit eeproms #1
Comments
Adding @pstrinkle, @amithash and @vijaykhemka, as they are working or have worked on this issue.
The main issue is that it is hard to tell an 8-bit-addressed device from a 16-bit-addressed one just by reading it. The current implementation assumes the device comes up with its index pointer at offset 0. If the pointer is at a different offset or page, the header can't be read without first writing to the device.
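To make the difference concrete, here is a minimal sketch of the offset bytes a host would send before reading from each kind of part. The names `AddressWidth` and `offsetBytes` are hypothetical and not taken from FruDevice; the high-byte-first order is the usual convention for 16-bit-addressed EEPROMs, but it is an assumption for any specific part.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical type: width of the EEPROM's internal offset pointer.
enum class AddressWidth
{
    Bits8,
    Bits16
};

// Build the payload of the I2C write that sets the offset pointer
// ahead of a read. This only constructs the bytes; it does no bus I/O.
std::vector<uint8_t> offsetBytes(AddressWidth width, uint16_t offset)
{
    if (width == AddressWidth::Bits8)
    {
        // 8-bit parts take a single offset byte.
        return {static_cast<uint8_t>(offset & 0xFF)};
    }
    // 16-bit parts take two offset bytes, assumed high byte first.
    return {static_cast<uint8_t>(offset >> 8),
            static_cast<uint8_t>(offset & 0xFF)};
}
```

If the two-byte form is sent to an 8-bit part, the second byte is treated as data and written at the given offset, which is why guessing the width by trial is risky.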
Yup. That's the primary difficulty. I have a device that is 16-bit addressed, but every other boot of the BMC, FruDevice changes its mind. So I implemented a quick hint lookup that checks whether a device is "hard-coded" to be one or the other. However, this requires a lot of board knowledge, and we mix 8-bit and 16-bit devices at the same SMBus address. We do know, though, that if a device is on bus 6 or 7 (for example), it must be 16-bit, so I make those hints available to the code. With the hint in place, it always works for me.
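A hint lookup along these lines could be sketched as follows. This is not the code from the comment above; `EepromAddressing`, `AddressingHints`, and `lookupHint` are hypothetical names, and the sketch assumes C++17 for `std::optional`. A per-device entry takes priority over a whole-bus entry, and a miss returns no value so the caller can fall back to probing.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <set>
#include <utility>

// Hypothetical type: how the EEPROM expects its offset to be addressed.
enum class EepromAddressing
{
    EightBit,
    SixteenBit
};

// Hypothetical board-knowledge table.
struct AddressingHints
{
    // Buses on which every EEPROM is known to be 16-bit addressed.
    std::set<int> sixteenBitBuses;
    // Overrides for a specific (bus, 7-bit address) pair.
    std::map<std::pair<int, uint8_t>, EepromAddressing> perDevice;
};

// Return the hinted addressing mode, or std::nullopt if nothing is
// known about this device and the caller must detect it another way.
std::optional<EepromAddressing> lookupHint(const AddressingHints& hints,
                                           int bus, uint8_t address)
{
    auto it = hints.perDevice.find(std::make_pair(bus, address));
    if (it != hints.perDevice.end())
    {
        return it->second;
    }
    if (hints.sixteenBitBuses.count(bus) != 0)
    {
        return EepromAddressing::SixteenBit;
    }
    return std::nullopt;
}
```

With `sixteenBitBuses` set to `{6, 7}`, a device at an illustrative address such as 0x50 on bus 7 resolves to 16-bit, while the same address on an unlisted bus yields no hint.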
When using multiple dbus-probe types, we were seeing:

    Program received signal SIGBUS, Bus error.
    0x00475c6c in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() ()
    (gdb) bt
    #0 0x00475c6c in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() ()
    #1 0x00477820 in std::vector<std::shared_ptr<PerformProbe>, std::allocator<std::shared_ptr<PerformProbe> > >::clear() ()
    #2 0x0046d594 in ?? ()
    #3 0x0046e14c in ?? ()
    #4 0x76f60bd0 in ?? () from /lib/libsystemd.so.0
    Backtrace stopped: previous frame identical to this frame (corrupt stack?)

The logic in this was quite bad, by moving the storage of PerformProbe shared_ptrs into the captures, we don't need to worry about calling clear ever, so we won't run into this problem. This was reordered to fix the issue.

Tested: On system that frequently saw the crash, it went away, all sensors still available.

Change-Id: Icacb8861466816df64b24efe940e5a732102345a
Signed-off-by: James Feist <[email protected]>
Known issue. Unfortunately, I don't have any 16-bit EEPROMs in my system to test with.
Start of solution here: https://gerrit.openbmc-project.xyz/c/openbmc/entity-manager/+/18783