fennecfoxy 5 months ago, on: Meta's Llama 3.1 can recall 42 percent of the firs...

I disagree. For LLMs, I'd count it as overfitting when training creates unreasonably strong connections to individual training sequences. What's required is a good mix of those and of connections between chunks of those sequences.