LLMs are powerful, but they can be difficult to work with when it comes to post-processing or adding markup/annotations to text. For example, if you ask one to label terms in a sentence, the output format varies: sometimes it lists "word or clause: description of the meaning", which is hard to parse in downstream tasks, and sometimes the items come back out of order.
I've also seen LLMs label split infinitives with the correct meaning attached to the preposition rather than to the whole subclause. With an NLP pipeline you would label either the start and span of the expression (the common approach) or the root of the subclause, depending on the application. That approach also scales to more complex nested and interconnected expressions, where the structure isn't flat text.
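To make that concrete, here is a minimal sketch using spaCy (my choice for illustration; any dependency parser exposes the same information). Each token carries a character offset, a dependency label, and a pointer to its head, so you can label either the exact span or the root of a subclause, whichever the application needs:

```python
import spacy

# Assumes the small English model is installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("She decided to quickly finish the report.")

for token in doc:
    # Offsets, POS, dependency label and head for every token.
    print(token.text, token.idx, token.pos_, token.dep_, "->", token.head.text)

# The whole infinitive clause is simply the subtree of its root verb,
# so you can recover the full span from the root token.
finish = next(t for t in doc if t.text == "finish")
clause = doc[finish.left_edge.i : finish.right_edge.i + 1]
print("clause span:", clause.text, (clause.start_char, clause.end_char))
```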
If you ask it to generate XML or HTML, it will usually invent its own markup, or sometimes generate invalid output. E.g. I've seen it emit HTML with tags like `<Question Word>`.
Asking it to generate CoNLL-U (a format it clearly knew about, judging by its response), it only emitted three columns, e.g. "in _ PREP", which isn't valid: CoNLL-U has ten columns per token, and "PREP" isn't even a UPOS tag (that would be ADP).
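For reference, the ten CoNLL-U columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC) map straight onto a parser's token attributes. A rough sketch with spaCy, again assuming the small English model; spaCy's dependency labels don't match UD deprels exactly, so treat this as an illustration of where each column comes from rather than a conformant serializer (tools like Stanza can emit real CoNLL-U):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("She put the money in the bank.")

for sent in doc.sents:
    for i, token in enumerate(sent, start=1):
        # HEAD is 0 for the sentence root, otherwise a 1-based index.
        head = 0 if token.head == token else token.head.i - sent.start + 1
        print("\t".join([
            str(i),                   # ID
            token.text,               # FORM
            token.lemma_,             # LEMMA
            token.pos_,               # UPOS (e.g. ADP for "in", not "PREP")
            token.tag_,               # XPOS
            str(token.morph) or "_",  # FEATS
            str(head),                # HEAD
            token.dep_.lower(),       # DEPREL (spaCy labels, roughly UD)
            "_",                      # DEPS
            "_",                      # MISC
        ]))
    print()
```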
Asking `Can you lemmatize "She is going to the bank to get some money."` I got back `"She go(ing) to bank(er) for money(y)"`, which drops words, lemmatizes some of them incorrectly, and uses a format that's difficult to parse or interpret reliably.
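Compare that with a conventional lemmatizer, which returns exactly one lemma per token, with nothing dropped, in a stable machine-readable form (spaCy again, as an example; NLTK or Stanza would look much the same):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("She is going to the bank to get some money.")

# One (form, lemma) pair per token.
pairs = [(t.text, t.lemma_) for t in doc]
print(pairs)
# Roughly (exact lemmas depend on the model version):
# [('She', 'she'), ('is', 'be'), ('going', 'go'), ('to', 'to'),
#  ('the', 'the'), ('bank', 'bank'), ('to', 'to'), ('get', 'get'),
#  ('some', 'some'), ('money', 'money'), ('.', '.')]
```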
LLMs also have a limited context window, so if you are processing a large document, or have a large, complex set of instructions (e.g. on how to label, process, or format the data), the model can lose track of those instructions and drift from what you asked for.
LLMs are also susceptible to being steered by the input text, since the prompt and the input are taken together. If the text being processed talks about a different format, or about something else entirely, the model can easily switch to that instead.
--
While NLP pipelines require more work to get right, they are usually far more efficient computationally: the models are a lot smaller than a 7B-parameter LLM, or use techniques that don't involve ML/neural networks at all.
You can also build more custom pipelines by querying the different features (part of speech, lemma, lexical features, etc.) without having to reparse the data, and the annotations stay consistent across those pipelines: when labelling, extracting, and storing the data in a database for searching, for example, a user can see every place a term is referenced in a given text.
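As a sketch of that last point (the SQLite schema and the `index_document` helper here are made up for illustration): parse each document once, store the token-level annotations, and answer "where is this term referenced?" queries without ever reparsing:

```python
import sqlite3
import spacy

nlp = spacy.load("en_core_web_sm")

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE tokens (
        doc_id TEXT, sent_id INTEGER, token_id INTEGER,
        form TEXT, lemma TEXT, pos TEXT, dep TEXT,
        start_char INTEGER, end_char INTEGER
    )
""")

def index_document(doc_id: str, text: str) -> None:
    # Parse once; POS, lemma, dependencies and character offsets
    # all come from this single pass.
    doc = nlp(text)
    rows = [
        (doc_id, s, t.i, t.text, t.lemma_, t.pos_, t.dep_,
         t.idx, t.idx + len(t.text))
        for s, sent in enumerate(doc.sents)
        for t in sent
    ]
    con.executemany("INSERT INTO tokens VALUES (?,?,?,?,?,?,?,?,?)", rows)

index_document("doc1", "She is going to the bank to get some money. "
                       "The bank closes at five.")

# Every place a term is referenced, looked up by lemma, with exact offsets.
for row in con.execute(
        "SELECT doc_id, sent_id, start_char, end_char "
        "FROM tokens WHERE lemma = ?", ("bank",)):
    print(row)
```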