Versioned personal data stored by PaperTrail left behind #11

Open
ahukkanen opened this issue Mar 8, 2023 · 4 comments · Fixed by #24

Comments

@ahukkanen
Contributor

Decidim stores versions of the user data through PaperTrail as part of the Decidim::Traceable module as defined here:
https://github.com/decidim/decidim/blob/bfc862f2308c3215c52b324e16de8680ed64fe16/decidim-core/lib/decidim/traceable.rb#L19

The data is stored in the versions table in the database, in the object_changes column, in YAML format, e.g. as follows:

id:                                                                   
-                                                                     
- 221                                                                 
email:                                                                
- ''                                                                  
- [email protected]                   
encrypted_password:                                                   
- ''                                                                  
- "$2a$11$R4LAq...L690yl8Sy7VUw6vB.6"      
created_at:                                                           
-                                                                     
- !ruby/object:ActiveSupport::TimeWithZone                            
  utc: &1 2023-03-01 09:59:33.930443106 Z                             
  zone: &2 !ruby/object:ActiveSupport::TimeZone                       
    name: Etc/UTC
  time: 2023-03-01 09:59:33.930443106 Z
updated_at:
- 
- !ruby/object:ActiveSupport::TimeWithZone
  utc: *1
  zone: *2
  time: 2023-03-01 09:59:33.930443106 Z
decidim_organization_id:
- 
- 1
confirmed_at:
- 
- !ruby/object:ActiveSupport::TimeWithZone
  utc: 2023-03-01 09:59:33.830071364 Z
  zone: *2
  time: 2023-03-01 09:59:33.830071364 Z
name:
- 
- Lisabeth Schiller 4 4 endr4
nickname:
- ''
- coleman_gleichner
type:
- 
- Decidim::User

You can find this data e.g. with the following command from the Rails console:

Decidim::User.all.sample.versions[0].object_changes
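To make it easier to see which personal fields such a payload carries, here is a minimal plain-Ruby sketch that parses a simplified excerpt of the YAML above. The excerpt omits the timestamp entries (their `!ruby/object` tags would need different handling than `YAML.safe_load`), the values are placeholders, and `PERSONAL_ATTRIBUTES` and `personal_values` are hypothetical names used only for illustration.

```ruby
require "yaml"

# Simplified excerpt of an object_changes payload.
# Timestamps are left out and all values are placeholders.
SAMPLE_OBJECT_CHANGES = <<~PAYLOAD
  id:
  -
  - 221
  email:
  - ''
  - user@example.org
  name:
  -
  - Example Name
  nickname:
  - ''
  - example_nickname
PAYLOAD

# Hypothetical list of attributes treated as personal data in this sketch.
PERSONAL_ATTRIBUTES = %w[email encrypted_password name nickname].freeze

# Returns the personal attributes found in the payload, mapped to their
# new values (the second element of each [old, new] pair).
def personal_values(yaml)
  changes = YAML.safe_load(yaml)
  changes.slice(*PERSONAL_ATTRIBUTES).transform_values(&:last)
end
```

Calling `personal_values(SAMPLE_OBJECT_CHANGES)` returns only the email, name and nickname entries, which is the kind of data that stays in the versions table after the account itself is gone.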

I think this data should also be cleared for deleted users after a certain period of time, as it contains personal details, especially when applied to the user-related models.

Note that this data can sometimes be useful for tracing back changes in the user model, e.g. if an account is deleted accidentally or if we need to investigate an issue with the account.

I would suggest a defined (preferably configurable) "cutoff" period after which the versioned user data is also deleted for deleted accounts.

Note that the same issue also applies to the Decidim::Authorization model, which also holds personal data. Admins can already delete those records from the admin panel, but the versions table is not currently cleaned after the removal. A similar "cutoff" period should also apply to the versioned authorization data.

To fetch the version data for deleted user accounts:

PaperTrail::Version.joins(
  <<~SQL.squish
    INNER JOIN decidim_users ON decidim_users.id = versions.item_id
      AND versions.item_type IN ('Decidim::User', 'Decidim::UserBaseEntity')
  SQL
).where.not(decidim_users: { deleted_at: nil })

To fetch the version data for deleted authorizations:

PaperTrail::Version.joins(
  <<~SQL.squish
    LEFT JOIN decidim_authorizations ON decidim_authorizations.id = versions.item_id
      AND versions.item_type = 'Decidim::Authorization'
  SQL
).where(item_type: "Decidim::Authorization", decidim_authorizations: { id: nil })
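The selection logic of the two queries above can also be sketched in plain Ruby, without a database, which may help when reasoning about which rows would match. This is only an in-memory illustration: `UserRow`, `AuthorizationRow`, `VersionRow` and both helper methods are hypothetical stand-ins and not part of Decidim or PaperTrail.

```ruby
# Hypothetical in-memory stand-ins for the database rows.
UserRow = Struct.new(:id, :deleted_at)
AuthorizationRow = Struct.new(:id)
VersionRow = Struct.new(:item_type, :item_id)

USER_ITEM_TYPES = ["Decidim::User", "Decidim::UserBaseEntity"].freeze

# Mirrors the INNER JOIN query: versions whose user row still exists
# but has deleted_at set.
def versions_of_deleted_users(versions, users)
  deleted_ids = users.reject { |user| user.deleted_at.nil? }.map(&:id)
  versions.select do |version|
    USER_ITEM_TYPES.include?(version.item_type) && deleted_ids.include?(version.item_id)
  end
end

# Mirrors the LEFT JOIN query: authorization versions whose
# authorization row no longer exists.
def versions_of_deleted_authorizations(versions, authorizations)
  existing_ids = authorizations.map(&:id)
  versions.select do |version|
    version.item_type == "Decidim::Authorization" && !existing_ids.include?(version.item_id)
  end
end
```

The difference between the two cases is visible here: deleted users keep their row (with deleted_at set), while deleted authorizations have no row left, so their versions can only be found by the absence of a match.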
@Quentinchampenois
Collaborator

Thank you for raising this point, and for the resources and examples!

The task should cover versioned user data and authorizations as described, and check whether omniauth identities are also cleared after a predefined (and also configurable) period.

@Quentinchampenois
Collaborator

Hello @ahukkanen,

The account destroy is based on Decidim::DestroyAccount ( see )

We can implement the clearing of versioned user data and authorizations in this module. I wonder whether the community would be interested in having it directly in decidim-core, in Decidim::DestroyAccount. If so, we can make a contribution. What do you think?

@ahukkanen
Contributor Author

@Quentinchampenois IMO we should have some kind of retention period (i.e. the "cutoff" period I mentioned) before the versioned data is wiped out.

I would suggest making this configurable, with defaults such as:

  • For users, the versioned data would be destroyed 1 month after the deletion of the user account (based on deleted_at in Decidim::User)
  • For authorizations, the versioned data would be destroyed 1 month after the creation of the version data (based on created_at in PaperTrail::Version)
    • Note that this only needs to happen for DELETED authorizations as specified above

I would suggest making this period (1 month) configurable through the module's config_accessors.
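As a rough illustration of the retention idea, here is a minimal plain-Ruby sketch of a configurable period and the resulting cutoff date. The `VersionRetention` module and its method names are hypothetical; in the actual module this would more likely be exposed through a config_accessor and ActiveSupport durations, which are not reproduced here.

```ruby
require "date"

# Hypothetical configuration holder for the retention period.
module VersionRetention
  class << self
    attr_writer :months

    # Retention period in months, defaulting to the suggested 1 month.
    def months
      @months ||= 1
    end

    # Reference dates on or before this date are outside the retention period.
    def cutoff(today = Date.today)
      today.prev_month(months)
    end

    # For users the reference date would be deleted_at, for deleted
    # authorizations the created_at of the version record.
    def expired?(reference_date, today = Date.today)
      reference_date <= cutoff(today)
    end
  end
end
```

For example, with the default period and `today` set to 2023-04-08, the cutoff is 2023-03-08, so versioned data for a user deleted on that day or earlier would be eligible for removal, while data for a user deleted on 2023-03-09 would be kept for now.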

So I would not wipe out the data straight away when the user account is deleted.

Eventually, I think this whole module should be in the core, but the core is in feature freeze right now, so it won't happen straight away.

@Quentinchampenois
Collaborator

Thanks for your answer, that works for me. We will implement it in this module.
