Curious what the consensus is on how GH should have approached this to avoid such blowback.
Best case scenario: they explain in advance on the GH blog that they're going to do some work on ML and coding, and ask people to opt in to having their code read, via a profile flag or a file in the repo that grants permission, like robots.txt. Second best: the same, but opt-out instead of opt-in. Least ideal: do neither of the first two, but when announcing it, explain in detail how the model was trained, what data was used, why, and when. That kind of thing?
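For the robots.txt-style idea, a hypothetical permission file might look something like this (the filename and directives are entirely invented for illustration; no such standard exists):

```text
# .ml-training-policy (hypothetical filename, analogous to robots.txt)
# Placed at the repo root to signal what ML crawlers may use.
User-agent: copilot-trainer
Allow: /src/        # fine to train on
Disallow: /         # everything else is off limits
```

As with robots.txt, this would rely entirely on the crawler choosing to honor it.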
Code (co)created with Copilot has to follow all the licenses of the source (heh) code. At the very least, that generally means automatically including, in any project that gets help from Copilot, a copy of all the licenses involved, plus attribution for all the people whose code Copilot was trained on.
(Not sure about the cases where there is no license and therefore normal copyright applies, but AFAIK this isn't the case for any code on GitHub, which automatically gets an open source license?
EDIT: Code in public repositories seems to be "forkable" on GitHub itself but not copyable (to elsewhere). That's some nasty walled-garden stuff right there; I wonder how legal that ToS is? I could see how this could lead them to incentivize people to stop using other licenses on GitHub, so they don't have to deal with this license mess... EEE yet again?)
So I guess then, the first thing they should have done, is trained it to understand licenses, and used that as a first principle for how they built the system?
Seems like too much effort (is it even possible to link the source code to the end result?), and it might not be admissible, so just include a database with all of the relevant licenses and authors?
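The "database of licenses and authors" idea could be sketched like this. Everything here (the data, the function name, the license-to-author mapping) is invented for illustration; it just shows the shape of bundling owed attributions with a suggestion, not anything Copilot actually does.

```python
# Hypothetical: a database mapping trained-on licenses to their authors,
# so a generated snippet could ship the attributions it owes.
LICENSE_DB = {
    "MIT": ["alice", "bob"],
    "Apache-2.0": ["carol"],
}

def attribution_bundle(licenses_used):
    """Return, per license, the sorted list of authors a snippet would owe."""
    return {name: sorted(LICENSE_DB.get(name, [])) for name in licenses_used}

print(attribution_bundle(["MIT", "Apache-2.0"]))
```

The hard part the thread points at remains untouched by this sketch: populating `licenses_used` for a given suggestion, i.e. linking model output back to its training sources.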
Not really, consider for example repositories mirrored to Github.
It seems unclear who even has the right to grant this permission anyway (with free software licenses). Probably the copyright holder? Who that is might also be complicated.
In that hypothetical I wouldn’t think GitHub is responsible for determining if a repository is mirrored and what the implications of that are. They just need to look at what license is on the repo in GitHub.
Good point. I would have thought GH requires you to agree in some ToS that you have permission to put the code on GH (but I don't know)? If so, could that point be put aside? (I'm not a software engineer, so sorry if that made no sense. Super curious about the whole Copilot thing from a business and community perspective.)
This is the complicated bit: All open-source licenses grant you permission to redistribute the code (usually with stipulations like having to include the license), so you are almost always allowed to upload the code to Github.
What it doesn't mean however is that you're the copyright holder of that code, you're merely redistributing work that somebody else has ownership of.
So who gets to decide what Github is allowed to do with it?
I expect this will end up in courts and we won't get a definite answer before that.
If you'll entertain a hypothetical for a moment: suppose the many intelligent folks over at GH knew this would eventually end up in the courts, and expected that from the start. Would you suggest they messaged/rolled it out any differently? Did they do exactly what they needed to do so that it would end up in the courts? Should they have done anything differently to not piss folks off so much? Sorry for the million questions; you seem to know/have thought a bit about this. Thanks! :)
They should have only used code from projects with a license that allows commercial use, or made their model openly available and/or free to use.
> Best case scenario: they explain in advance on the GH blog that they're going to do some work on ML and coding, and ask people to opt in to having their code read, via a profile flag or a file in the repo that grants permission, like robots.txt. Second best: the same, but opt-out instead of opt-in. Least ideal: do neither of the first two, but when announcing it, explain in detail how the model was trained, what data was used, why, and when. That kind of thing?
Is that generally about right, or..?