Amazing, but instruction fine-tuning is still a huge challenge for businesses, since the instruction-tuned models that were released cannot be used for commercial purposes. And the instruction-tuned models are much more useful.
I have a feeling that there are probably some people who will look at the "commercial okay" license for the first part and in their mind that will somehow make it okay to use the instruction-tuned ones for commercial purposes.
Maybe we don't really need the Instruct stuff? It seems like it's a huge amount of work to redo. I wonder if the OpenAssistant people will start building off of these models.
I wonder what happens if you just feed that dataset back into another LLM to rewrite it and filter out the low-quality items? Is there still any connection to the original copyright? How would that even be proven?