
> probabilistic graphical models- of which transformers is an example

Having done my PhD in probabilistic programming... what?

I was talking about things inspired by (for example) hidden Markov models. See https://en.wikipedia.org/wiki/Graphical_model

In biology, PGMs were one of the first successful forms of "machine learning": given a large set of examples, train a graphical model's probabilities using EM, then pass many more examples through the model for classification. The HMM for proteins is pretty straightforward, basically just a probabilistic extension of using dynamic programming to do string alignment.
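To make the "pass examples through the model" step concrete, here's a toy sketch of the forward algorithm, which computes P(sequence | model) for a small discrete HMM. All the parameters here are made up for illustration (a real profile HMM for proteins would have match/insert/delete states learned by EM):

```python
import numpy as np

# Hypothetical toy HMM: 2 hidden states, 3 observable symbols.
# In practice these tables would be estimated from data via EM.
start = np.array([0.6, 0.4])             # initial state distribution
trans = np.array([[0.7, 0.3],
                  [0.4, 0.6]])           # state transition probabilities
emit  = np.array([[0.5, 0.4, 0.1],
                  [0.1, 0.3, 0.6]])      # emission probabilities per state

def forward_likelihood(obs):
    """P(obs | model) via the forward recursion (sum over all state paths)."""
    alpha = start * emit[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
    return alpha.sum()

# Classification then amounts to comparing this likelihood across
# models trained on different families.
print(forward_likelihood([0, 1, 2]))
```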

My perspective (a massive simplification) is that sequence models are a form of graphical model, although the graphs tend to be fairly "linear" and the predictions generate sequences (lists) rather than trees or graphs.


It's got nothing to do with PGMs. However, there is the flavor of describing graph structure by soft edge weights vs. hard/pruned edge connections. It's not that surprising that one does better than the other, and it's a very obvious and classical idea. For a time there were people working on NN structure learning, and this is a natural step. I don't think there is any breakthrough here, other than that computation power caught up to make it feasible.
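The soft-vs.-hard contrast can be sketched in a few lines. The scores below are made up; the point is just that a softmax keeps every edge with some weight, while top-k pruning commits to a fixed discrete structure:

```python
import numpy as np

# Hypothetical affinity scores from one node to 5 candidate neighbors.
scores = np.array([2.0, 0.5, -1.0, 1.5, 0.0])

# Soft edges: softmax assigns every edge a nonzero, differentiable weight
# (this is the attention-style formulation).
soft = np.exp(scores) / np.exp(scores).sum()

# Hard edges: keep only the top-k connections and prune the rest
# (the classical structure-learning formulation).
k = 2
hard = np.zeros_like(scores)
top_k = np.argsort(scores)[-k:]
hard[top_k] = 1.0 / k

print("soft:", soft)
print("hard:", hard)
```

The soft version stays end-to-end trainable by gradient descent, which is a big part of why it scaled once compute caught up.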




