conderoga@beehaw.org to Technology@beehaw.org • Update from Lemmy after the Reddit blackout (English)
2 years ago

A bunch of developers isn't going to accomplish anything, though, if the current maintainers retain control over everything. After learning about this background, and their weird claims surrounding it yesterday, the path forward I would prefer is for a strong fork of the original code to emerge, which instances deploy instead.
LLM-generated text can also be detected fairly easily, provided you can figure out which model it came from and have access to its weights. For people training models, that won't be hard to do.
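The intuition behind that kind of detection can be sketched with a toy example: text generated by a model tends to score a higher average log-probability under that model's own weights than text from another source. Below, a hand-rolled character-bigram model stands in for the LLM and its weights (this is a minimal illustration of likelihood scoring, not a real detector):

```python
import math
from collections import defaultdict

def train_bigram(corpus):
    """Toy stand-in for 'the model and its weights': a character-bigram
    language model estimated from a reference corpus. A real detector
    would score text with the actual LLM's token probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(corpus, corpus[1:]):
        counts[a][b] += 1
    return {a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
            for a, nxt in counts.items()}

def avg_log_prob(model, text, floor=1e-6):
    """Mean per-character log-probability of `text` under `model`.
    Unseen bigrams get a small floor probability. Text that matches the
    model's training distribution scores systematically higher."""
    lps = [math.log(model.get(a, {}).get(b, floor))
           for a, b in zip(text, text[1:])]
    return sum(lps) / len(lps)

model = train_bigram("the cat sat on the mat " * 50)
in_dist = avg_log_prob(model, "the cat sat on the mat")    # model-like text
out_dist = avg_log_prob(model, "zqxj vwkp qzzx")           # unrelated text
# in_dist comes out much higher (closer to 0) than out_dist
```

With a real LLM the same comparison is done over token log-probabilities (perplexity), which is why having the right model and weights matters so much for the detection to work.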
I agree with the take that building better and better training datasets is going to get easier over time, not harder. The story of AlphaZero is a good example of this too: the best chess AI quickly trounced any AI trained on human games, simply by playing against itself. To me, that suggests training on LLM output will lead to even better results, since you can generate so much more of it.