Describe the bug
Exactly at 00:00 on January 1, FreeSWITCH rolls over the CSeq values used in NOTIFY packets. This makes our Poly phones mad. When they get their first NOTIFY with a lower CSeq, they reply with "500 Internal Server Error" and start spamming the server with SUBSCRIBEs. The resulting server load is easily high enough to impact service. The phones are fine after a reboot.
Looking at the RFCs, the phones seem to be in the wrong for expecting an ever-increasing CSeq. But I'm trying to see whether anything better can be done within real-world limitations, including the obvious fact that we can't fix the phones' firmware.
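To make the failure mode concrete, here is a minimal sketch, not FreeSWITCH's actual code. It assumes the NOTIFY CSeq is derived from the time elapsed since 00:00 on January 1 of the current year, and it models a phone that rejects any NOTIFY whose CSeq fails to increase. The names `year_anchored_cseq` and `StrictSubscriber`, and the one-tick-per-second granularity, are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone


def year_anchored_cseq(now: datetime) -> int:
    """Hypothetical CSeq: whole seconds since 00:00 on January 1 of now's year."""
    epoch = datetime(now.year, 1, 1, tzinfo=now.tzinfo)
    return int((now - epoch).total_seconds())


class StrictSubscriber:
    """Hypothetical phone model: rejects a NOTIFY whose CSeq does not increase."""

    def __init__(self) -> None:
        self.last_cseq = None

    def on_notify(self, cseq: int) -> int:
        # A CSeq at or below the last accepted one is answered with 500.
        if self.last_cseq is not None and cseq <= self.last_cseq:
            return 500
        self.last_cseq = cseq
        return 200


# One NOTIFY just before the year boundary, one just after it.
before = datetime(2024, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
after = datetime(2025, 1, 1, 0, 0, 1, tzinfo=timezone.utc)

phone = StrictSubscriber()
status_before = phone.on_notify(year_anchored_cseq(before))  # large CSeq, accepted
status_after = phone.on_notify(year_anchored_cseq(after))    # CSeq dropped, rejected
```

Under these assumptions the value climbs all year and then falls back to near zero at midnight on January 1, which is the point at which a strict subscriber starts answering 500.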
To Reproduce
Steps to reproduce the behavior:
1. Use presence with Poly phones.
2. Wait until 00:00 January 1.
Expected behavior
This is the tricky part. Obviously I don't want the Poly phones to crash. But it's unclear to me what can be done better on the FreeSWITCH side.
There were some discussions about this in the past:
https://lists.freeswitch.org/pipermail/freeswitch-dev/2018-August/007873.html
https://freeswitch-users.freeswitch.narkive.com/TYeYlPf7/notify-s-cseq-too-high
I believe the whole reason FreeSWITCH derives the presence CSeq this way, including the presence epoch, is to avoid this problem when FreeSWITCH is restarted (which, if nothing special were done, would cause the CSeq to roll over).
In that first linked message, @anthmFS said, "Many phones break until you reset the whole registration if they are unhappy with the cseq." That seems to suggest that the phones might accept a CSeq rollover when they re-REGISTER. If that's the case (which is a big if), would it be reasonably possible to calculate the presence CSeq based on the timestamp of the last registration? If phones really do accept the rollover when REGISTERing, this would be better. If they do not, then this would make the problem massively worse, as the phones would freak out on every REGISTER rather than just once a year.
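The trade-off in this proposal can be sketched as follows. This is only an illustration under stated assumptions: `registration_anchored_cseq` is a hypothetical helper, not an existing FreeSWITCH function, and it assumes the stored registration timestamp is overwritten on every re-REGISTER.

```python
def registration_anchored_cseq(now: int, registered_at: int) -> int:
    """Hypothetical CSeq: seconds elapsed since the last REGISTER was stored."""
    return now - registered_at


# Times are plain integers in seconds, chosen only for illustration.
registered_at = 1000
first = registration_anchored_cseq(1010, registered_at)   # 10
second = registration_anchored_cseq(1020, registered_at)  # 20, still increasing

# The phone re-REGISTERs, so the anchor moves forward ...
registered_at = 1030
third = registration_anchored_cseq(1040, registered_at)   # 10 again

# ... and the next NOTIFY carries a lower CSeq than the previous one.
dropped_after_reregister = third < second
```

Within a single registration the value keeps increasing, even across January 1. The drop moves to each re-REGISTER instead, so the idea only helps if the phones really do tolerate a lower CSeq at that point.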
Package version or git hash
1.10.11~release~25~f24064f7c9~buster-1~buster+1 (i.e. f24064f) but the issue is obviously still present in the code in git master:
freeswitch/src/mod/endpoints/mod_sofia/sofia_presence.c, line 2113 in 4470f8d