Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace use of sweet_pickle in naming, and delete sweet_pickle subpackage #199

Merged
merged 18 commits into from
Nov 25, 2020

Conversation

aaronayres35
Copy link
Contributor

@aaronayres35 aaronayres35 commented Nov 10, 2020

fixes #133

This PR very possibly is biting the bullet too early on this, so I am happy to hold off on merging or close if need be. (we might not want to kill sweet_pickle!)

sweet_pickle was previously only used in a very minor way in apptools.naming, and was not used anywhere else in apptools. Additionally, AFAIK it is not used in other projects currently (this might not be the case in which case we can simply close this PR). This PR replaces its use in apptools.naming and deletes the sweet_pickle subpackage.

For more detailed commentary on this see #133 and specifically these comments: #133 (comment) #133 (comment)

It was also mentioned in the EEP that in envisage "The single_project plugin uses sweet_pickle and naming.". However, single_project was removed by enthought/envisage#331

Checklist

  • Add a news fragment if this PR is news-worthy for end users. (see docs/releases/README.rst)

Also, we could do a quick scream test on slack like we did for apptools.lru_cache.

@kitchoi kitchoi self-requested a review November 16, 2020 14:26
Copy link
Contributor

@kitchoi kitchoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Persistence is tricky because we also need to think about files that have been created in the past. Although pickling is not meant for long-term persistence, we still need to think about them...

Looks like Mayavi uses of apptools.naming (which uses sweet_pickle) is mostly transient: It is about binding variables to the Python shell.

As for the changes in apptools.naming.object_serializer, it seems those lines are indirectly covered by tests exercising apptools.naming.pyfs_context or apptools.naming.context._bind, and I can't find a test to ensure that round-tripping is working correctly, i.e. what is saved by ObjectSerializer.save can be loaded by ObjectSerializer.load correctly.

So I added the following test, which passes on master:

New test for roundtripping
# (C) Copyright 2005-2020 Enthought, Inc., Austin, TX
# All rights reserved.
#
# This software is provided without warranty under the terms of the BSD
# license included in LICENSE.txt and may be redistributed only under
# the conditions described in the aforementioned license. The license
# is also available online at http://www.enthought.com/licenses/BSD.txt
#
# Thanks for using Enthought open source!
import os
import shutil
import tempfile
import unittest

from traits.api import cached_property, HasTraits, Property, Str, Event

from apptools.naming.api import ObjectSerializer


class FooWithTraits(HasTraits):
    """ Dummy HasTraits class for testing ObjectSerizalizer."""

    full_name = Str()

    last_name = Property(depends_on="full_name")

    event = Event()

    @cached_property
    def _get_last_name(self):
        return self.full_name.split(" ")[-1]

class TestObjectSerializer(unittest.TestCase):

    def setUp(self):
        self.tmpdir = tempfile.mkdtemp()
        self.addCleanup(shutil.rmtree, self.tmpdir)
        self.tmp_file = os.path.join(self.tmpdir, "tmp.pickle")

    def test_save_load_roundtrip(self):
        # Test HasTraits objects can be serialized and deserialized as expected
        obj = FooWithTraits(full_name="John Doe")

        serializer = ObjectSerializer()
        serializer.save(self.tmp_file, obj)

        self.assertTrue(serializer.can_load(self.tmp_file))
        deserialized = serializer.load(self.tmp_file)

        self.assertIsInstance(deserialized, FooWithTraits)
        self.assertEqual(deserialized.full_name, "John Doe")
        self.assertEqual(deserialized.last_name, "Doe")

But it fails on this branch:

======================================================================
ERROR: test_save_load_roundtrip (apptools.naming.tests.test_object_serializer.TestObjectSerializer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/kchoi/Work/ETS/apptools/apptools/naming/tests/test_object_serializer.py", line 48, in test_save_load_roundtrip
    deserialized = serializer.load(self.tmp_file)
  File "/Users/kchoi/Work/ETS/apptools/apptools/naming/object_serializer.py", line 58, in load
    obj = VersionedUnpickler(f).load()
  File "/Users/kchoi/Work/ETS/apptools/apptools/persistence/versioned_unpickler.py", line 37, in load
    ret = Unpickler.load(self)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/pickle.py", line 1210, in load
    dispatch[key[0]](self)
TypeError: 'classmethod' object is not callable

Something still needs fixing?

apptools/persistence/versioned_unpickler.py Show resolved Hide resolved
apptools/naming/object_serializer.py Show resolved Hide resolved
@aaronayres35
Copy link
Contributor Author

Note there is this integration test for VersionedUnpickler: https://github.com/enthought/apptools/blob/master/integrationtests/persistence/test_persistence.py

(I'm simply linking this because I saw it existed, it is very possibly irrelevant / we may just want to delete it)

@aaronayres35
Copy link
Contributor Author

So I added the following test, which passes on master:

New test for roundtripping
But it fails on this branch:

======================================================================
ERROR: test_save_load_roundtrip (apptools.naming.tests.test_object_serializer.TestObjectSerializer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/kchoi/Work/ETS/apptools/apptools/naming/tests/test_object_serializer.py", line 48, in test_save_load_roundtrip
    deserialized = serializer.load(self.tmp_file)
  File "/Users/kchoi/Work/ETS/apptools/apptools/naming/object_serializer.py", line 58, in load
    obj = VersionedUnpickler(f).load()
  File "/Users/kchoi/Work/ETS/apptools/apptools/persistence/versioned_unpickler.py", line 37, in load
    ret = Unpickler.load(self)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/pickle.py", line 1210, in load
    dispatch[key[0]](self)
TypeError: 'classmethod' object is not callable

Something still needs fixing?

So, I was able to get this test (and all the tests) to pass on this branch, by simply removing the @classmethod decorator I had previously added to the load_build method on VersionedUnpickler. I had originally added it to match the VersionedUnpickler from sweet_pickle and to follow what was said in the comment:

# Make this a class method since dispatch is a class variable.
# Otherwise, supposing the initial sweet_pickle.load call (which would
# have overloaded the load_build method) makes a pickle.load call at some
# point, we would have the dispatch still pointing to
# NewPickler.load_build whereas the object being passed in will be an
# Unpickler instance, causing a TypeError.
def load_build(cls, obj):
# Just save the instance in the list of objects.
if isinstance(obj, NewUnpickler):
obj.objects.append(obj.stack[-2])
Unpickler.load_build(obj)
load_build = classmethod(load_build)

It appears this code was copy pasted from sweet_pickle and then this decorator likely got removed because it was causing bugs somehow? In any case I am still quite confused as your test passes on master, where in sweet_pickle that @classmethod decorator is there, however, adding it here makes that test fail. I need to investigate more

@kitchoi
Copy link
Contributor

kitchoi commented Nov 23, 2020

It appears this code was copy pasted from sweet_pickle and then this decorator likely got removed because it was causing bugs somehow? In any case I am still quite confused as your test passes on master, where in sweet_pickle that @classmethod decorator is there, however, adding it here makes that test fail. I need to investigate more

I think that's because this line wasn't removed after @classmethod was added:

load_build = classmethod(load_build)

This line is equivalent to adding @classmethod to def load_build, but we don't need both.

@kitchoi
Copy link
Contributor

kitchoi commented Nov 23, 2020

@aaronayres35 The roundtripping test above is quite useful for making sure the changes in this PR has not broken object serializer (in the most trivial ways at least). I think it would be good to add the test as part of this PR, it will also serve as future reference as to what extent the object serializer remains to work.

I saw that you have removed apptools/sweet_pickle/tests/__init__.py. While the tests in that folder are not discovered and run by unittest discover any more, the test files are still in the folders. It would be a strong proof to the claim "users of sweet_pickle can transition to using apptools.persistance and pickle from the python standard library" if we can convert the existing tests in sweet_pickle to use persistence, what do you think?

@aaronayres35
Copy link
Contributor Author

@aaronayres35 The roundtripping test above is quite useful for making sure the changes in this PR has not broken object serializer (in the most trivial ways at least). I think it would be good to add the test as part of this PR, it will also serve as future reference as to what extent the object serializer remains to work.

Yeah that makes sense, I will add the test to this PR and have it sit in naming as it is intended for ObjectSerializer

I saw that you have removed apptools/sweet_pickle/tests/__init__.py. While the tests in that folder are not discovered and run by unittest discover any more, the test files are still in the folders. It would be a strong proof to the claim "users of sweet_pickle can transition to using apptools.persistance and pickle from the python standard library" if we can convert the existing tests in sweet_pickle to use persistence, what do you think?

Somehow when merging / addressing conflicts I must have made a mistake and readded sweet_pickle files. I had intended to remove the entire sub package in this PR. What you said does make sense though as far as supporting the "users of sweet_pickle can transition to using apptools.persistance and pickle from the python standard library" claim.
But in that case where should these tests sit? Should we leave a basically empty sweet_pickle package keeping the old tests? Should I move the tests into persistence after they have been modified (I think that would make sense)

@kitchoi
Copy link
Contributor

kitchoi commented Nov 23, 2020

Somehow when merging / addressing conflicts I must have made a mistake and readded sweet_pickle files. I had intended to remove the entire sub package in this PR.

I see! That's probably why I did not see them when I checked out the branch and now they reappear.

Should I move the tests into persistence after they have been modified (I think that would make sense)

Adding them to persistence would make sense to me too.

@aaronayres35
Copy link
Contributor Author

Note sweet_pickle is also used in codetools, but all that is used is sweet_pickle.load and sweet_pickle.dump which can be replaced as is done here in apptools.naming.

@aaronayres35
Copy link
Contributor Author

I saw that you have removed apptools/sweet_pickle/tests/__init__.py. While the tests in that folder are not discovered and run by unittest discover any more, the test files are still in the folders. It would be a strong proof to the claim "users of sweet_pickle can transition to using apptools.persistance and pickle from the python standard library" if we can convert the existing tests in sweet_pickle to use persistence, what do you think?

Some of the sweet_pickle tests are a bit sweet_pickle specific. For example, the tests in
https://github.com/enthought/apptools/blob/master/apptools/sweet_pickle/tests/test_updater.py
basically each just test each specific method on the Updater class. persistence also has an Updater class but it is definitely not the same. It seems they go about doing things in different ways. Some of these tests I'm not sure how to directly modify to use persistence.

In fact, I am not confident that persistence can replace all of the functionality provided by sweet_pickle. I am confident thought that it can replace all the functionality of sweet_pickle that I have seen used. Which at this point all I have really seen is calls to sweet_pickle.load, sweet_pickle.loads, sweet_pickle.dump, and sweet_pickle.dumps, which can be replaced like they are in this PR in apptools.naming

@kitchoi
Copy link
Contributor

kitchoi commented Nov 24, 2020

In fact, I am not confident that persistence can replace all of the functionality provided by sweet_pickle. I am confident thought that it can replace all the functionality of sweet_pickle that I have seen used. Which at this point all I have really seen is calls to sweet_pickle.load, sweet_pickle.loads, sweet_pickle.dump, and sweet_pickle.dumps, which can be replaced like they are in this PR in apptools.naming

I see. Let's try to convert the tests that can be converted. Then we can more confidently say which functionality can be migrated and which are being discarded.

Comment on lines -65 to -81
def test_infinite_loop_detection(self):
"""Validates that the class mapping framework detects infinite
loops of class mappings.
"""
# Add mappings to the registry
self.registry.add_mapping_to_class(Foo.__module__, Foo.__name__, Bar)
self.registry.add_mapping_to_class(Bar.__module__, Bar.__name__, Baz)
self.registry.add_mapping_to_class(Baz.__module__, Baz.__name__, Foo)

# Validate that an exception is raised when trying to unpickle an
# instance anywhere within the circular definition.
def fn(o):
sweet_pickle.loads(sweet_pickle.dumps(o))

self.assertRaises(sweet_pickle.UnpicklingError, fn, Foo())
self.assertRaises(sweet_pickle.UnpicklingError, fn, Bar())
self.assertRaises(sweet_pickle.UnpicklingError, fn, Baz())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe cycles aren't possible with how persistence is set up, so I deleted this test. The following test I modified slightly as there is only one transition that can occur. ie with persistence you can't go Foo->Bar->Baz in one roundtrip

Comment on lines -113 to -116
def test_unpickled_chain_functionality(self):
"""Validates that the registered state functions are used when
unpickling.
"""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it may be possible to modify this test, to get a similar flavor of test for apptools.persistence by using the setstates attribute on an Updater instance that is passed to a VersionedUnpickler. Unfortunately I was unable to get it to work atm.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into this briefly... IMO, it looks evil! I opened #238.

@aaronayres35
Copy link
Contributor Author

Orthogonal to this PR, but I think apptools could really benefit from improved docstrings (numpy style). Obviously documentation could be improved in general, but I find myself spending a lot of time just trying to understand what arguments are, etc. For example when workng on this PR I noticed that the apptools.persistence.updater.Updater class definition never mentions a setstates attribute. See here:

class Updater:
"""An abstract class to provide functionality common to the updaters."""
def get_latest(self, module, name):
"""The refactorings dictionary contains mappings between old and new
module names. Since we only bump the version number one increment
there is only one possible answer.
"""
if hasattr(self, "refactorings"):
module = self.strip(module)
name = self.strip(name)
# returns the new module and name if it exists otherwise defaults
# to using the original module and name
module, name = self.refactorings.get(
(module, name), (module, name)
)
return module, name
def strip(self, string):
# Who would have thought that pickle would pass us
# names with \013 on the end? Is this after the files have
# manually edited?
if ord(string[-1:]) == 13:
return string[:-1]
return string

It is however used here:
fn = self.updater.setstates.get((module, name), False)

I discovered it here:

self.setstates = {
("cplab.project", "Project"): update_project
}

docstrings would save a lot of time spent trying to follow the code

@aaronayres35 aaronayres35 requested a review from kitchoi November 24, 2020 20:11
@mdickinson mdickinson mentioned this pull request Nov 25, 2020
Copy link
Contributor

@kitchoi kitchoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for converting the tests, it really helps us to understand what can or cannot be migrated!
Some suggestions for the tests and the news fragment given the new information. The rest looks good to me.

docs/releases/upcoming/199.removal.rst Outdated Show resolved Hide resolved
apptools/persistence/tests/test_two_stage_unpickler.py Outdated Show resolved Hide resolved
apptools/persistence/tests/test_two_stage_unpickler.py Outdated Show resolved Hide resolved
apptools/persistence/tests/test_two_stage_unpickler.py Outdated Show resolved Hide resolved
apptools/persistence/tests/test_state_function.py Outdated Show resolved Hide resolved
Comment on lines -113 to -116
def test_unpickled_chain_functionality(self):
"""Validates that the registered state functions are used when
unpickling.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into this briefly... IMO, it looks evil! I opened #238.

Copy link
Contributor

@kitchoi kitchoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix multiple versioned pickle wrappers
2 participants