Pyswip and thread #3

Open
BartholomewPanda opened this issue Aug 3, 2015 · 9 comments

@BartholomewPanda

Hello,

I think pyswip causes a segmentation fault when it is used in a thread. Here is a minimal script to reproduce the error:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import threading
import pyswip

class MyThread(threading.Thread):

    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        p = pyswip.Prolog()
        p.assertz('animal(dog)')
        p.query('animal(X)')


t = MyThread()
t.start()
t.join()

When I debug the program, the segmentation fault error occurs at this line (prolog.py, line 91):
swipl_fid = PL_open_foreign_frame()

Regards,
Bartholomew.

@chandralekhapoo

Hi,

I'm also facing the same issue when running a Prolog query in my code:

python analytics.py

(Thread-1 ) wait_for_reading_data from redis db
(Thread-2 ) wait_for_writing_data from redis db
(Thread-1 ) wait_for_reading_data from redis db
....first pwsip query start
Segmentation fault (core dumped)

Code Debug using gdb
#################################################################################

gdb -ex r --args python analytics.py

GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...Reading symbols from /usr/lib/debug//usr/bin/python2.7...done.
done.
Starting program: /usr/bin/python analytics.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
New Thread 0x7ffff46c0700 (LWP 19112) wait_for_reading_data from redis db
[New Thread 0x7ffff3ebf700 (LWP 19113)]
New Thread 0x7ffff36be700 (LWP 19114) wait_for_writing_data from redis db
(Thread-1 ) wait_for_reading_data from redis db
....first pwsip query start

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff36be700 (LWP 19114)]
0x00007ffff5050941 in ?? () from /usr/lib/libswipl.so.7.2.2
(gdb)

#################################################################################

Regards,
Chandralekha

@bbferka

bbferka commented Mar 30, 2017

Did anyone make any progress on this? I am having the exact same issue when using Pyswip in combination with Flask

@xpinguin

xpinguin commented Jan 5, 2018

pyswip doesn't support multithreading at all. IMHO, the current API (or at least the pyswip.prolog module) needs to be redesigned from scratch; see: http://www.swi-prolog.org/pldoc/man?section=foreignthread

I've devised an ad-hoc non-intrusive solution, just to test things out:

import pyswip, ctypes

class PrologMT(pyswip.Prolog):
    """Multi-threaded (one-to-one) pyswip.Prolog ad-hoc reimpl"""
    _swipl = pyswip.core._lib

    PL_thread_self = _swipl.PL_thread_self
    PL_thread_self.restype = ctypes.c_int

    PL_thread_attach_engine = _swipl.PL_thread_attach_engine
    PL_thread_attach_engine.argtypes = [ctypes.c_void_p]
    PL_thread_attach_engine.restype = ctypes.c_int

    @classmethod
    def _init_prolog_thread(cls):
        pengine_id = cls.PL_thread_self()
        if (pengine_id == -1):
            pengine_id = cls.PL_thread_attach_engine(None)
            print("{INFO} attach pengine to thread: %d" % pengine_id)
        if (pengine_id == -1):
            raise pyswip.prolog.PrologError("Unable to attach new Prolog engine to the thread")
        elif (pengine_id == -2):
            print("{WARN} Single-threaded swipl build, beware!")

    class _QueryWrapper(pyswip.Prolog._QueryWrapper):
        def __call__(self, *args, **kwargs):
            PrologMT._init_prolog_thread()
            return super().__call__(*args, **kwargs)

PrologMT is a drop-in replacement for pyswip.Prolog in your code. On each query, the native thread's local storage is checked for a Prolog engine (PL_thread_self). If no engine is associated with the current thread, a new one is created (PL_thread_attach_engine), without any further resource management.
You cannot use any term_t-related functions of libswipl from a thread that has no engine attached; doing so leads to a segfault. By default, the thread that imports pyswip first is implicitly associated with the single ("main") engine, because PL_initialise is called at the top level of the pyswip.prolog module.

@yuce yuce self-assigned this Jun 9, 2018
@CPStagg

CPStagg commented Dec 1, 2018

IMO when using Prolog's foreign language interface it is usually better to think of the PL as a service that runs on a worker thread: you send it jobs from other threads, then ask the service whether they have completed. This is particularly well-suited to having the PL execute a batch of tasks that you can assign for execution some time before you need to collect the results.

This is a little system I built a couple of days ago.

import threading
import time

# - - - - - - - -

class PlJob( object ):
    
    def execute( self, prolog ):
        raise NotImplementedError("Subclass must implement abstract method")

# - - - - - - - -

class PlQuery( PlJob ):
    def __init__( self, query ):
        self.m_Query = query
        self.m_Solutions = None

    def execute( self, prolog ):
        # Note that this is going to find ALL the solutions for your query. If you want interactive
        # nondeterminism, that will require a different approach/architecture. Have fun. 
        self.m_Solutions = list( prolog.query(self.m_Query) )

# - - - - - - - -

class PlConsult( PlJob ):
    def __init__( self, filename ):
        self.m_Filename = filename + ".pl"

    def execute( self, prolog ):
        prolog.consult(self.m_Filename)

# - - - - - - - -

class PlManager( threading.Thread ):
    def __init__( self ):   
        threading.Thread.__init__(self)   
        self.m_Done = False
        self.m_QueuedLock = threading.Lock()
        self.m_CompletedLock = threading.Lock()
        self.m_Event = threading.Event()
        self.m_NextTicketID = 0
        self.m_ContinueRunning = True
        self.m_QueuedJobs = []
        self.m_CompletedJobs = {}
        self.m_HighestTicketCompleted = -1
        self.start()

    def run( self ):
        from pyswip import Prolog
        self.m_Prolog = Prolog()

        while self.m_ContinueRunning:

            while len( self.m_QueuedJobs ) == 0 and self.m_ContinueRunning:
                self.m_Event.wait()
                self.m_Event.clear()

            if self.m_ContinueRunning:
                self.m_QueuedLock.acquire()
                [ticket, toExecute] = self.m_QueuedJobs.pop(0)
                self.m_QueuedLock.release()

                toExecute.execute(self.m_Prolog)

                self.m_CompletedLock.acquire()
                self.m_CompletedJobs[ticket] = toExecute 
                self.m_HighestTicketCompleted = ticket               
                self.m_CompletedLock.release()

                nMaxCompletedJobsToKeep = 16 # or whatever you want
                # note that this test isn't strictly thread-safe, but...would we care?
                if len( self.m_CompletedJobs ) > nMaxCompletedJobsToKeep:
                    # we can actually put this onto a subsidiary worker thread
                    threading.Thread(target=self.purgeOldestCompletedJobs).start() 
                    


    def halt( self ):
        self.m_ContinueRunning = False
        self.m_Event.set()

    def submitQuery( self, query ):
        job = PlQuery( query )
        return self.submitJob( job )

    def submitConsult( self, filename ):
        # job = PlConsult( filename )
        job = PlQuery( "consult( " + filename + " )" )
        return self.submitJob( job )

    def submitJob( self, job ):
        self.m_QueuedLock.acquire()
        ticketID = self.m_NextTicketID
        self.m_QueuedJobs.append( [ticketID, job] )
        self.m_NextTicketID += 1
        self.m_QueuedLock.release()
        self.m_Event.set()

        return ticketID

    def tryToGetCompletedQuery( self, ticket ):
        result = None
        # I think this test can go outside of the crit section unless writing to self.m_HighestTicketCompleted is not atomic
        if self.m_HighestTicketCompleted >= ticket: 
            self.m_CompletedLock.acquire()
            result = self.m_CompletedJobs.pop(ticket, None)
            self.m_CompletedLock.release()

        return result

    def purgeOldestCompletedJobs( self ):
        # oldest job will have lowest ticket ID
        with self.m_CompletedLock:
            # the dict may already have been emptied by tryToGetCompletedQuery
            if self.m_CompletedJobs:
                oldestTicket = min( self.m_CompletedJobs )
                del self.m_CompletedJobs[oldestTicket]

# - - - - - - - -

print( "Main thread turn on" )
manager = PlManager()

manager.submitConsult( "my_pl_file")
allEvents = "[[a,b,c],[d,e,f]]"
query = "nested_member( " + allEvents + ", Event )"
ticketID = 0
print( "Take off every 'job'!" )
# I actually submit the same job 32 times here just to show that the job queue (and the purge system) works
for i in range( 32 ):
    ticketID = manager.submitQuery( query )
completedQuery = None
while not completedQuery:
    completedQuery = manager.tryToGetCompletedQuery( ticketID ) 
    if not completedQuery:
        time.sleep(0.01)      
print( "We get solution!!!" )
for soln in completedQuery.m_Solutions:
    print( str( soln ) )

manager.halt()

With my_pl_file.pl containing:

nested_member( OuterList, X ) :-
	member( Innerlist, OuterList ),
	member( X, Innerlist ).

In this example the main thread sends a job (or a series of jobs) to the PlManager object, which has started a worker loop on another thread. Each job submitted returns a ticket number, and in this instance the main thread has nothing better to do than wait for the job to be completed. But it might be able to submit the jobs earlier in the frame and have basically zero latency when collecting results from SWI-Prolog.

Output of running this program:

Main thread turn on
Take off every 'job'!
We get solution!!!
{'Event': 'a'}
{'Event': 'b'}
{'Event': 'c'}
{'Event': 'd'}
{'Event': 'e'}
{'Event': 'f'}

(Note that only the results of the last of the 32 submitted jobs are printed; ticketID ends up holding the value returned from submitting the final job.)

There are some downsides to this specific approach. In particular, I think (?) PySwip returns its query results as a generator object which handles the interactive nondeterminism of Prolog, generating solutions on the fly as necessary. In my approach the worker thread simply evaluates all solutions pre-emptively and stores them. This could be problematic if the state space were large (or infinite), for reasons of both processing time and storage space.
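
One possible way around the eager evaluation is to let the worker thread stream solutions through a small bounded buffer, so that it only ever runs a few solutions ahead of the consumer. The following is a rough sketch under stated assumptions, not part of the system above: `stream_solutions` and `fake_query` are hypothetical names, and `fake_query` stands in for a real prolog.query(...) call with an infinite solution space. With real pyswip, the query would presumably still have to be opened and iterated on the thread that owns the Prolog engine, which here is the worker.

```python
import itertools
import queue
import threading

_DONE = object()  # sentinel marking the end of the solution stream


def stream_solutions(make_solutions, maxsize=4):
    """Yield solutions produced on a worker thread, at most `maxsize` ahead."""
    buf = queue.Queue(maxsize=maxsize)
    stop = threading.Event()

    def put(item):
        # Retry with a timeout so the worker notices `stop` while the buffer is full.
        while not stop.is_set():
            try:
                buf.put(item, timeout=0.05)
                return True
            except queue.Full:
                continue
        return False

    def worker():
        for solution in make_solutions():
            if not put(solution):
                return            # consumer went away; stop producing
        put(_DONE)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    try:
        while True:
            item = buf.get()
            if item is _DONE:
                break
            yield item
    finally:
        stop.set()                # consumer finished or abandoned the stream
        thread.join()


def fake_query():
    # Hypothetical stand-in for prolog.query(...): an infinite solution space.
    return ({"X": n} for n in itertools.count())


stream = stream_solutions(fake_query)
first_three = list(itertools.islice(stream, 3))
stream.close()  # runs the `finally` block, which stops and joins the worker
```

The bounded queue provides the back-pressure: once the buffer is full the worker blocks instead of enumerating the whole state space, and closing the generator tells it to give up.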

Having two separate thread locks might seem over-engineered but so far as I can tell there's no need to block a thread posting a new job just because the manager is in the middle of updating the "completed" dict.

Oh, and originally I used polymorphic PlJob objects that had subtypes for general queries and file consults, but realized I could just use the general query method for the consult. But the ability to have polymorphic job classes might prove useful.
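
For comparison, the same "Prolog as a service on one worker thread" idea can be sketched with the standard library's `queue.Queue` and `concurrent.futures.Future` in place of ticket numbers, two locks, and polling. This is a swapped-in technique, not the design above. `EngineService` and `FakeEngine` are hypothetical names, and `FakeEngine` is a stand-in for pyswip.Prolog so that the example runs without SWI-Prolog installed.

```python
import queue
import threading
from concurrent.futures import Future


class EngineService(threading.Thread):
    """Runs every job on one dedicated thread; callers get a Future per job."""

    def __init__(self, make_engine):
        super().__init__(daemon=True)
        self._make_engine = make_engine
        self._jobs = queue.Queue()
        self.start()

    def run(self):
        engine = self._make_engine()   # created on the worker thread itself
        while True:
            item = self._jobs.get()
            if item is None:           # shutdown marker
                break
            job, future = item
            if not future.set_running_or_notify_cancel():
                continue               # the caller cancelled before we started
            try:
                future.set_result(job(engine))
            except Exception as exc:
                future.set_exception(exc)

    def submit(self, job):
        future = Future()
        self._jobs.put((job, future))
        return future

    def halt(self):
        self._jobs.put(None)
        self.join()


class FakeEngine:
    """Hypothetical stand-in for pyswip.Prolog: records which thread uses it."""

    def query(self, text):
        return [{"query": text, "thread": threading.current_thread().name}]


service = EngineService(FakeEngine)
futures = [service.submit(lambda e, q=q: list(e.query(q))) for q in ("a(X)", "b(Y)")]
answers = [f.result(timeout=5) for f in futures]
service.halt()
```

Because each caller holds its own Future, there is no shared "completed" dict to lock or purge, and `f.result()` blocks instead of sleeping in a loop. Jobs are plain callables that receive the engine, which keeps the polymorphism mentioned above without needing a class hierarchy.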

@HansBambel

Did anyone make any progress on this? I am having the exact same issue when using Pyswip in combination with Flask

Is there any progress on this? Sadly the solution from @xpinguin didn't work for me.

@xpinguin

xpinguin commented Jan 17, 2020

Did anyone make any progress on this? I am having the exact same issue when using Pyswip in combination with Flask

Is there any progress on this? Sadly the solution from @xpinguin didn't work for me.

Could you please quote the error you encountered?

Although I've moved away from the Python "scene" towards C++ (something of a "mother tongue" for me), I still have a soft spot for Prolog, so I'll do my best to solve the problem you've been facing.

@HansBambel

Thank you, but I have since moved the whole project to Python. As I recall, there was no actual error; the Prolog file simply was not being used, so nothing was returned.

hashlash added a commit to hashlash/prolog-battleships that referenced this issue May 16, 2020
hashlash added a commit to hashlash/prolog-battleships that referenced this issue May 16, 2020
g-gemignani pushed a commit to magazino/pyswip that referenced this issue Jul 2, 2020
@ShubhamSharma1808

Is this multithreading issue in pyswip still present, or has it been solved? @yuce

@m00lecule

I faced a similar result with Flask.
