If a harvest job is still running when `harvester run` is issued, it seems possible that a record currently being processed could be re-submitted to the fetch queue and processed again. This might lead to session rollbacks that detach the SQLAlchemy instance.
At least, that is what I'm seeing on my site, where harvest jobs can take hours to finish and `harvester run` may be invoked many times an hour during the harvest.
Maybe switching the order of database queries could fix this? Or is it possible to expire objects in the current session just before resubmitting jobs?
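
To make the second idea concrete, here is a minimal sketch of expiring a session's cached state right before deciding what to resubmit. It assumes SQLAlchemy 1.4 or later and uses a throwaway SQLite file. `HarvestRecord`, its `state` values, and `resubmit_waiting` are hypothetical stand-ins, not the harvester's actual model or code.

```python
import os
import tempfile

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class HarvestRecord(Base):
    """Hypothetical stand-in for a harvest object row; not CKAN's model."""
    __tablename__ = "harvest_record"
    id = Column(Integer, primary_key=True)
    state = Column(String, nullable=False)


def resubmit_waiting(session, records, queue):
    """Queue only the records the database still reports as WAITING."""
    # Discard cached attribute values so the next access re-reads each row.
    session.expire_all()
    for record in records:
        if record.state == "WAITING":
            queue.append(record.id)


# A file-backed SQLite database lets the two sessions use separate connections.
db_path = os.path.join(tempfile.mkdtemp(), "harvest.db")
engine = create_engine(f"sqlite:///{db_path}")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

run_session = Session()    # plays the role of the `harvester run` process
fetch_session = Session()  # plays the role of the fetch consumer

run_session.add(HarvestRecord(id=1, state="WAITING"))
run_session.commit()

# Loaded before the fetch consumer touches the record.
records = run_session.query(HarvestRecord).all()

# Meanwhile the fetch consumer picks the record up and marks it.
fetch_session.get(HarvestRecord, 1).state = "FETCHING"
fetch_session.commit()

stale_state = records[0].state  # served from the session's cache, no query issued

queue = []
resubmit_waiting(run_session, records, queue)
```

Here `stale_state` is read from the session's cache, while the check inside `resubmit_waiting` goes back to the database, so the record that is already being fetched is not queued a second time. Whether this would be enough for the harvester is an open question: under a stricter isolation level than PostgreSQL's default READ COMMITTED, the session would probably also need to end its transaction before it could see the other process's commit.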
I believe you could be correct. The problem I was seeing could have been a side effect of having the CKAN database on a remote machine, where aggressive firewall rules were closing the connection. CKAN harvesting does not appear to behave well when the client/server database connection is closed repeatedly, especially while a harvest is underway. I moved the database to the same machine as the CKAN application, and this problem of harvest object resubmission went away.
If someone else can confirm that proper harvesting behavior relies on a database connection remaining open indefinitely, I might suggest that CKAN administrators be made more aware of this.
@amercader
5ffe6d4
See #445, which shows the error I'm seeing.