Opened 6 weeks ago

Last modified 5 weeks ago

#3244 accepted defect

Data Protection and Bank Detail issues

Reported by: Hugo Finley
Owned by: iandunn
Milestone:
Priority: high
Component: WordCamp Site & Plugins
Keywords:
Cc:

Description

Within the Reimbursement back end of the WordCamp sites, personal details are being stored forever, and any organiser who has access can still see everyone's personal details.

  1. Scrub the financial bank details after the set auditing time or at time of reimbursement.

Solution: I am aware that WordCamp will have to store financial data for a while, but it is important to know that volunteers' bank details will not be stored after they are no longer needed. WordCamp can retain the amounts but scrub the bank details as soon as it is allowed to. I do generally believe that personal bank details should be scrubbed as soon as the claim is paid, mostly because WordCamp should have stored this information somewhere more secure when making payments, and also because you have receipts which are proof of payment.

  2. Currently any organiser continues to have access to the back end of any WordCamp site they were an organiser for, and all of these sites hold people's personal addresses and bank details too.

Solution: Deny access to all financial information apart from budgets once the camp has been signed off.

I am concerned about data protection and a little about financial conduct. I have a good understanding of data protection too, and feel that some of these changes need to be considered carefully. If WordCamp was hacked, it is potentially an identity theft goldmine, as it stores people's home addresses and bank details.

Change History (28)

#1 @iandunn
6 weeks ago

#3243 was marked as a duplicate.

#2 @iandunn
6 weeks ago

  • Keywords needs-patch good-first-bug added

I don't think the data is as sensitive as it seems (bank account numbers are printed on checks, for example), but I do agree that it'd be good to redact it once it's no longer needed.

I think we can safely do that 14 days after the item is marked as paid. I'd prefer to overwrite the fields with [redacted] rather than deleting them, to make it clear what happened.

I imagine the cron job will naturally redact old data, but if it doesn't apply retroactively for some reason, then we'll also need a one-time WP-CLI script to do that.
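
A minimal sketch of the redaction rule described above, written as plain PHP with no WordPress dependencies. The field names, the shape of the `$request` array, and the function names are hypothetical and are not taken from the WordCamp.org code base; the 14-day retention period is the one proposed in this comment.

```php
<?php
// Hypothetical sketch: field names and array layout are assumptions, not the real plugin schema.

const REDACTED_PLACEHOLDER = '[redacted]';
const RETENTION_DAYS       = 14;

/**
 * Decide whether a request's payment details are past the retention period.
 *
 * @param int|null $paid_timestamp Unix time the request was marked paid, or null if unpaid.
 * @param int      $now            Current Unix time.
 * @param int      $retention_days Number of days to keep the details after payment.
 */
function is_due_for_redaction( ?int $paid_timestamp, int $now, int $retention_days = RETENTION_DAYS ): bool {
	if ( null === $paid_timestamp ) {
		return false; // Unpaid requests still need their payment details.
	}

	return ( $now - $paid_timestamp ) >= $retention_days * 86400;
}

/**
 * Overwrite sensitive fields with a placeholder instead of deleting them,
 * so it stays obvious that data once existed and was removed on purpose.
 * Empty and missing fields are left alone.
 */
function redact_payment_fields( array $request, array $sensitive_keys ): array {
	foreach ( $sensitive_keys as $key ) {
		if ( array_key_exists( $key, $request ) && '' !== $request[ $key ] ) {
			$request[ $key ] = REDACTED_PLACEHOLDER;
		}
	}

	return $request;
}
```

Because the check only compares timestamps, a scheduled job built on it would presumably also catch requests that were paid long ago the first time it runs, which is the retroactive behaviour hoped for above.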

#3 @Hugo Finley
6 weeks ago

The data is mostly sensitive because everything is provided with the bank details, like home address and name on the account. You are right that bank account details are printed on cheques along with the name of the account holder, but personal addresses tend not to be. I am concerned by the amount of information stored in these forms and the number of people who have access to them.

Redacting the information after 14 days would make this safer. I assume that the payment details are stored somewhere safely in conjunction with the storage of personal data. In terms of EU law this will need to be looked at more when the GDPR comes into effect next May.

I also mention this because this type of sensitive information is currently still being stored on all previous WordCamp websites, for example I can still access the 2015 WordCamp sites which store these details too.

#4 @iandunn
6 weeks ago

#3246 was marked as a duplicate.

#5 follow-up: @TJNowell
6 weeks ago

Bank account information is considered personally identifiable information, and covered under numerous privacy and data protection laws.

Also keep in mind that if I submit a reimbursement, it's not just visible to me and central, but it's also visible to all of my organisers who now know my home address and bank details.

Redacting after 14 days sounds like a good approach. I'd suggest it be applied to phone numbers, bank details, and any addresses.

#6 follow-up: @TJNowell
6 weeks ago

Having thought about it further, as soon as the status is set to PAID the personal information should be immediately redacted or removed, a cron job shouldn't be necessary. This should simplify the technical side of things

#7 in reply to: ↑ 6 ; follow-up: @andreamiddleton
5 weeks ago

Replying to TJNowell:

Having thought about it further, as soon as the status is set to PAID the personal information should be immediately redacted or removed, a cron job shouldn't be necessary. This should simplify the technical side of things

We originally thought to keep the info for 14 days because sometimes after the bank tells us money has been sent, they come back and tell us that the money has not been sent, and we need to review what payment details were used to send the funds. That said, we can usually get that information from the bank (banking software can be really buggy, which is where the "usually" comes from). So if it's going to cause an issue with GDPR, then we could probably wipe banking info immediately after the request is marked Paid and research issues based on what the bank tells us.

#8 in reply to: ↑ 5 ; follow-up: @andreamiddleton
5 weeks ago

Replying to TJNowell:

Bank account information is considered personally identifiable information, and covered under numerous privacy and data protection laws.

Also keep in mind that if I submit a reimbursement, it's not just visible to me and central, but it's also visible to all of my organisers who now know my home address and bank details.

If we restricted user access for reimbursement requests -- making it possible for only the reimbursement requester and the super-admin who reviews the request to access it -- it seems like that would resolve this issue.

Is there any way that could backfire on us?

#9 follow-up: @TJNowell
5 weeks ago

If we restricted user access for reimbursement requests -- making it possible for only the reimbursement requester and the super-admin who reviews the request to access it -- it seems like that would resolve this issue.
Is there any way that could backfire on us?

Not to my knowledge, I would expect you'd need a list of who has access, but I also expect you already have that somewhere.

Another thing to note is that currently it's possible to export reimbursements in full via the WordPress exporter, but that also affects other post types such as vendor payments, and should probably be a separate ticket

#10 in reply to: ↑ 7 ; follow-up: @idea15
5 weeks ago

  • Priority changed from normal to high

Please don't frame this as an issue of something that needs to be done in order to achieve compliance with GDPR.

It should not take an incoming European data protection law to safeguard the WordPress community.

That being said, under GDPR unauthorised access to data constitutes a data breach, even if that data is accessed internally, and even if nothing malicious is done with it. Giving a volunteer a quick admin login at the desk during registration to sign people in on the spot, as I've seen happen, and thereby giving them access to the lead organisers' bank account details, will qualify as a data breach.

Also, I've changed this ticket's priority to High, as it's now been online for five days as a big flashing signpost to anyone who might be tempted to try out a proof of concept.

Replying to andreamiddleton:

So if it's going to cause an issue with GDPR, then we could probably wipe banking info immediately after the request is marked Paid and research issues based on what the bank tells us.

Last edited 5 weeks ago by idea15

#11 in reply to: ↑ 8 ; follow-up: @idea15
5 weeks ago

Replying to andreamiddleton:

Is there any way that could backfire on us?

It solves the issue of who has access to data going forward for future WordCamps and reimbursements.

It doesn't solve the issue of thousands of people's bank details (including mine) still being on American (?) servers from past WordCamps, available to any volunteer and his dog with an admin login.

It doesn't solve the problem of anyone who submits a reimbursement not being informed who will have access to their data, how long it will be retained for reimbursement and auditing purposes, and when it will be deleted.

And that's all assuming that all super admins use secure wifi in secure places.

For the record, I've screen capped the admin interface from the reimbursement area for my small expenses as a WordCamp organiser from the gig we put on in November 2015, which you personally processed. Still sitting there, inclusive of ALL my bank details, available to all seven organisers from that event. It's a good thing we were a solid and functional team.

Last edited 5 weeks ago by idea15

#12 in reply to: ↑ 10 @andreamiddleton
5 weeks ago

I'm sorry, I didn't mean to sound flippant about the issue. I agree that good data security practices are important, regardless of legal concerns.

Replying to idea15:

Please don't frame this as an issue of something that needs to be done in order to achieve compliance with GDPR.

It should not take an incoming European data protection law to safeguard the WordPress community.

#13 in reply to: ↑ 9 @iandunn
5 weeks ago

Replying to TJNowell:

Having thought about it further, as soon as the status is set to PAID the personal information should be immediately redacted or removed, a cron job shouldn't be necessary. This should simplify the technical side of things

I think that actually complicates the technical side, because it means that we'd have to write an additional script to go back and retroactively redact the existing data. A cron job would take care of that on the first run.

Replying to TJNowell:

Another thing to note is that currently it's possible to export reimbursements in full via the WordPress exporter

Tom opened #3253 for that particular issue. It has some more details, but the TL;DR is that the exported data is essentially meaningless, because it can't be decrypted outside of WordCamp.org.

Replying to idea15:

It doesn't solve the issue of thousands of people's bank details (including mine) still being on American (?) servers from past WordCamps, available to any volunteer and his dog with an admin login.

I don't think Andrea intended her comment to be applied to both of the problems that are being discussed in this ticket, only the one she was directly replying to.

Both of the issues are valid, and the patch should include resolutions for both of them.

Replying to idea15:

And that's all assuming that all super admins use secure wifi in secure places.

WordCamp.org requires HTTPS connections for wp-admin, so the data will still be encrypted even when sent over insecure wireless networks.


I think the best way forward here is to:

  1. Set up the cron job described in comment:2. This will resolve issue #1 (data being kept longer than necessary).
  2. Hide the meta box from everyone except network admins and the post's author. That will resolve issue #2 (other organizers having access to payment details). Something like current_user_can( 'manage_network' ) || get_current_user_id() === (int) $post->post_author. That probably needs to be applied both to displaying the meta box and to saving the corresponding data, to avoid removing the data if another organizer saves the post (because the fields would be missing from $_POST).
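
The check in step 2 could be sketched as below, in plain PHP so that it runs without WordPress. The function names, the `$allowed_keys` list, and the field names are hypothetical, and this is not the code that was later committed for this ticket. In WordPress the three arguments of the first function would presumably come from `current_user_can( 'manage_network' )`, `get_current_user_id()`, and the post's `post_author` property cast to an integer.

```php
<?php
// Hypothetical sketch: names are illustrative, not the plugin's real functions.

/**
 * Only network admins and the request's author may see or change payment details.
 */
function can_access_payment_details( bool $is_network_admin, int $current_user_id, int $post_author_id ): bool {
	return $is_network_admin || $current_user_id === $post_author_id;
}

/**
 * Build the payment fields to store when a request is saved.
 *
 * Users without access never receive the fields in their form, so the submitted
 * data lacks them. Returning the stored values untouched in that case avoids
 * wiping the author's details when another organizer saves the post.
 */
function payment_fields_to_save( array $stored, array $submitted, array $allowed_keys, bool $can_access ): array {
	if ( ! $can_access ) {
		return $stored;
	}

	foreach ( $allowed_keys as $key ) {
		if ( array_key_exists( $key, $submitted ) ) {
			$stored[ $key ] = $submitted[ $key ];
		}
	}

	return $stored;
}
```

Using the same access check for both display and save is the point of the second function: a save request from a user who could not see the fields leaves the stored values exactly as they were.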

@TJNowell, do you feel strongly enough about this to spend time contributing a patch?

#14 follow-up: @TJNowell
5 weeks ago

I would but I'm somewhat time constrained at the moment wrapping up a WordCamp and other personal things, but I don't think this should wait

re: 2, I'd have hidden the post entirely, as long as those meta values never become available via the REST API that should be effective

#15 in reply to: ↑ 14 @iandunn
5 weeks ago

Replying to TJNowell:

re: 2, I'd have hidden the post entirely, as long as those meta values never become available via the REST API that should be effective

Huh, that surprises me. I can't think of any reason why we'd need to hide reimbursement requests from other organizers (after PII is scrubbed).

(Actually, I think that all of the budget post types should be completely public (minus PII), for the sake of transparency, but that's a whole other discussion.)

Am I missing something?

I think meta fields are only included in REST API endpoints if they're explicitly registered and opt-in (i.e., register_meta( $type, $key, array( 'show_in_rest' => true ) )), so that shouldn't be an issue. If they did accidentally make it in, they'd still be encrypted, for the same reason as #3253.

#16 in reply to: ↑ 11 ; follow-ups: @iandunn
5 weeks ago

Replying to idea15:

It doesn't solve the problem of anyone who submits a reimbursement not being informed who will have access to their data, how long it will be retained for reimbursement and auditing purposes, and when it will be deleted.

That's a good point to bring up. To address that, we could add some text to the metabox that says something like, "Your financial data will be retained until 14 days after the payment has cleared. During that time, it will be displayed to you and a handful of trusted financial and technical administrators."

#17 in reply to: ↑ 16 @idea15
5 weeks ago

Good idea. You are going to need a disclaimer in any case for compliance reasons.

"A handful of trusted financial and technical administrators" is too ambiguous, though. Say something more specific like "members of the WordCamp Central finance team" etc.

Replying to iandunn:

Replying to idea15:

It doesn't solve the problem of anyone who submits a reimbursement not being informed who will have access to their data, how long it will be retained for reimbursement and auditing purposes, and when it will be deleted.

That's a good point to bring up. To address that, we could add some text to the metabox that says something like, "Your financial data will be retained until 14 days after the payment has cleared. During that time, it will be displayed to you and a handful of trusted financial and technical administrators."

#18 @Hugo Finley
5 weeks ago

Can I just point at #3246 which has been marked as a duplicate although it is not really. I was told to separate the issues but I still wish I had submitted them together as they are part of the conversation.

Access is important for the Lead Organiser/Financial organiser too, as they need to make sure the claims being made are not fraudulent. It is not a 'lack of trust', but when you are responsible for monies from sponsors and WC central you need to be aware of what is being spent and by whom. The receipt system is just as important as the other financial parts of the back end.

#19 @Hugo Finley
5 weeks ago

#3246 is not a duplicate of this ticket, although it has been marked as if it is. I was actively encouraged to separate the two things but wish I had not. There is a bit of overlap between them, but I do firmly believe that not just the super users (I assume this is a reference to the WC team) should have access to the claims. When I take on the role of financial volunteer I generally also check through the claims to make sure they are not fraudulent in any way. This is not a lack of trust in my team; it is just that you have to make sure all financial systems are transparent when dealing with monies, especially donated funds like sponsorship.

My first point in #3246 is really one of usability; my second point is what we have ended up talking about in terms of access to data. I do believe that more than just the WC central team should have access to the claim. As the financial volunteer I have been asked by people to check claims before they submitted them, and I have also helped talk people through making a claim. You take on responsibility when you take on any role that involves monies, and it should really be a defined role, which is I guess where the current system lets volunteers down.

#20 in reply to: ↑ 16 @danieltj
5 weeks ago

  • Keywords 2nd-opinion added

Replying to iandunn:

Replying to idea15:

It doesn't solve the problem of anyone who submits a reimbursement not being informed who will have access to their data, how long it will be retained for reimbursement and auditing purposes, and when it will be deleted.

That's a good point to bring up. To address that, we could add some text to the metabox that says something like, "Your financial data will be retained until 14 days after the payment has cleared. During that time, it will be displayed to you and a handful of trusted financial and technical administrators."

I'd argue that seven days is more than enough time. Additionally, if the text states that only trusted people can see the data, are these people vetted, and are they a closed team? I'm just cautious about who can access it and why. In an ideal world, once the payment information is used for its sole purpose, it should be gone for good and never retrievable by anyone, ever.

I also think that bank account details, whether they're on cheques or not, are very sensitive and are an easy way for someone to build up a profile for potential fraud. Any personal data needs to be stored once for its intended use and then permanently deleted. On top of all of this, we need people with the time to make these code changes, and it does need to happen sooner rather than later, otherwise it'll be forgotten about.

#21 @iandunn
5 weeks ago

In 6094:

WordCamp Budgets: Limit access to payment details to protect privacy.

Props hugo-finley, idea15, andreamiddleton
See #3244

#22 @iandunn
5 weeks ago

  • Keywords needs-patch good-first-bug 2nd-opinion removed
  • Owner set to iandunn
  • Status changed from new to accepted

r6094 takes care of #2 (preventing other organizers from viewing your payment details), and the notice about how long data will be retained and who has access to it.

I have the cron job to resolve #1 partially done, and will probably deploy it tomorrow.

#23 @iandunn
5 weeks ago

In 6108:

WordCamp Payments: Clarify that payment data is deleted, not just hidden.

See #3244

#24 @iandunn
5 weeks ago

In 6109:

WordCamp Payments: Lower the data retention period to 7 days after being paid

See #3244

#25 @iandunn
5 weeks ago

In 6110:

WordCamp Payments: Add mailing addresses to list of encrypted/deleted fields.

See #3244

#26 @iandunn
5 weeks ago

In 6111:

WordCamp Budgets: Delete old payment information to protect privacy.

See #3244

#27 @iandunn
5 weeks ago

It looks like all of the payment data for requests that were paid before 7 days ago has been deleted. The cron job is running twice per day, so that new requests will also have their data deleted once they leave the 7 day retention period.

I'm also planning on further restricting which network admins can view the data, limiting access to attachments, and adding more details about who has access to the data.

This ticket was mentioned in Slack in #meta-wordcamp by kcristiano.


5 weeks ago
