Open Source Asked by Dharmesh Tarapore on November 13, 2021
I have built a software tool that recognizes text from images of handwriting. I want to make the code public under a permissive license (like Apache 2.0), but with the stipulation that all transcriptions and images be shared to a (publicly accessible) datastore.
I intend to use these data to further improve the software tool for everyone.
Is there an open source license that includes such a stipulation (i.e. you must allow the tool to upload your inputs to a publicly accessible datastore), or does this necessitate an entirely new form of license? I understand that the GPL and similar licenses cover changes to code, but I don’t see any mention of a data sharing policy.
This has many philosophical, social, and practical problems:
This could not be free or open source software according to the Debian Free Software Guidelines, since it fails the desert island test, i.e., it is necessary to communicate with some particular third party to use the software. Aside from being annoying for those who are able to comply, it is a huge practical problem for people who cannot comply due to political firewalls or regional unavailability of the necessary network infrastructure.
Your code may be used in contexts you never anticipated. Imagine someone used a piece of your code (e.g., the pre-recognition image processing) to do something completely unrelated to handwriting recognition. Your requirement to send input to you would be onerous for them and could be quite bad for you as well, since you may be inundated with inputs that are not handwriting samples at all.
Generally, users of FLOSS (and to a lesser extent, users of software in general, when I'm feeling optimistic) expect submission of their own data to a vendor to be opt-in. A mandatory data-submission policy will not be popular. In addition to concerns about social norms, there may be legal issues as well: have you considered how such a submission policy complies with Europe's GDPR privacy requirements?
If your data store goes offline 50 years from now, can people still distribute your code? What if you change locations? What if you forget to renew your domain and someone else registers it? In other words, how will you specify the location where data must be sent in a way that is broad enough to be durable in the long term and specific enough to be legally meaningful?
I would say a better plan is to implement data sharing as a feature of the tool (preferably opt-in) rather than as a license condition. People who don't wish to use it (or for whom use would be impossible or mutually undesirable) may opt out or remove the feature from their version of the code. Yes, this is much weaker than your desired requirement, but it avoids the myriad problems listed above.
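To make the opt-in idea concrete, here is a minimal sketch of how such a feature might be gated. Everything in it is an assumption for illustration: the names `SharingConfig`, `share_samples`, `datastore_url`, and `maybe_share` are hypothetical, and the actual upload is passed in as a function so the sketch does not presume any particular datastore or network library.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SharingConfig:
    # Hypothetical setting: sharing stays off unless the user turns it on.
    share_samples: bool = False
    # Hypothetical setting: the datastore location is configurable rather
    # than hard-coded, so it can change if the datastore moves.
    datastore_url: Optional[str] = None


def maybe_share(
    image: bytes,
    transcription: str,
    config: SharingConfig,
    upload: Callable[[str, bytes, str], None],
) -> bool:
    """Upload one sample only if the user opted in. Returns True if uploaded."""
    if not config.share_samples or not config.datastore_url:
        # Default path: nothing leaves the user's machine.
        return False
    upload(config.datastore_url, image, transcription)
    return True
```

With the default `SharingConfig()`, `maybe_share` never calls `upload`, so the tool still works offline and downstream users who reuse only part of the code are unaffected.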
Answered by apsillers on November 13, 2021