> This is my experience with how LLMs "draft" legal arguments: at first glance, it's plausible — but may be, and often is, invalid, unsound, and/or ill-advised.
Correct, and this of course extends past just laws, into the whole scope of rules and regulations described in human languages. It will by its nature imply things that aren't explicitly stated nor can be derived with certainty, just because they're very plausible. And those implications can be wrong.
Now I've had decent success with having LLMs then review these LLM-generated texts to flag such occurences where things aren't directly supported by the source material. But human review is still necessary.
The cases I've been dealing with are also based on relatively small sets of regulations compared the scope of the law involved with many legal cases. So I imagine that in the domain you're working on, much more needs flagging.