OpenAI CPO Kevin Weil says their o1 model can now write legal briefs that previously were the domain of $1000/hour associates: "what does it mean when you can suddenly do $8000 of work in 5 minutes for $3 of API credits?" pic.twitter.com/MotT9Oo9rv— Tsarathustra (@tsarnick) October 19, 2024 OpenAI's Chief Product…
Law = rules. Action A is legal. Action B isn’t. Doing X + Y + Z constitutes action A and so on. Legal arguments have to abide by all rules of logic. Law is one of the most logic heavy fields out there.
As for LLMs not being able to reason, it’s very debatable. Whether they reason or not depends upon your definition of “reasoning”. Debating definitions here is useless however, as the end result speaks for itself.
If LLMs can pass certification exams for lawyers, then it means either one of two things:
I disagree with your first statement. Law is about the application of rules, not the rules themselves. In a perfect world, it would be about determining which law takes precedence in the matter at hand, a task in itself outside of AI capabilities as it involves weighing moral and ethical principles against each other, but in reality it often comes down to why my interpretation of reality is the correct one.
AI, which lacks morality by definition, is as capable at morals as it is at describing smells. As for that human data, the question quickly becomes: which data? As expressed in literature, social media or UN politics? And also which century? It’s enough to compare today with pre-millennium conditions to see how widely it differs.
As for 2. You assume that there is an objective reality free from emotion? There might be, but I am unsure if it can be perceived by anything living. Or AI, for that matter. It is after all, like you said, trained on human data.
Anyways, time will tell if OpenAI is correct in their assessment, or if humans will want the human touch. As a tool for trained professionals to use, sure. As a substitute for one? I’m not convinced yet.
Law = rules. Action A is legal. Action B isn’t. Doing X + Y + Z constitutes action A and so on. Legal arguments have to abide by all rules of logic. Law is one of the most logic heavy fields out there.
You’re ignoring the whole job of a judge, where they put the actions and laws into a procedural, historical and social context (something which LLMs can’t emulate) to reach a verdict.
You know what’s way closer to “pure logic”? Programming. You know what’s the quality of the code LLMs shit out? It’s quite bad.
Debating definitions here is useless however, as the end result speaks for itself.
I also don’t agree with your assessment. If an LLM passes a perfect law exam (a thing that doesn’t really exist) and afterwards only invents laws and precedent cases, it’s still useless.
You’re ignoring the whole job of a judge, where they put the actions and laws into a procedural, historical and social context (something which LLMs can’t emulate) to reach a verdict.
LLMs would have no problem doing any of this. There’s a discernible pattern in any judge’s verdict. LLMs can easily pick this pattern up.
You know what’s the quality of the code LLMs shit out?
LLMs in their current form are “spitting out” code in a very literal way. Actual programmers never do that. No one is smart enough to code by intuition. We write code, take a look at it, run it, see warnings/errors if any, fix them and repeat. No programmer writes code and gets it correct on the first try.
LLMs till now have had their hands tied behind their backs. They haven’t been able to run the code by themselves at all. They haven’t been able to do recursive reasoning.
TILL NOW.
The new o1 model (I think) is able to do that. It’ll just get better from here. Look at the sudden increase in the quality of code output. There’s a very strong reason as to why I believe this as well.
I heavily use LLMs for my code. They seem to write shit code in the first pass. I give the model the output, the issues with the code, semantic errors if any and so on. By the third or fourth time I get back to it, the code it writes is perfect. I have stopped needing to manually type out comments and so on. LLMs do that for me now (of course, I supervise what they write and don’t blindly trust it). Using LLMs has sped up my coding by at least 4 times (and I’m not even using a fine-tuned model).
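For what it’s worth, the loop described above (generate, run, feed the error back, retry) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `fake_model` is a made-up stand-in for a real LLM call, and `run_candidate` and `refine` are hypothetical helper names, not any vendor’s API. It also uses a bare `exec`, which you would not want to point at untrusted output outside a sandbox.

```python
import traceback
from typing import Callable


def run_candidate(code: str) -> str:
    """Execute code in a scratch namespace.

    Returns an empty string on success, otherwise the traceback text.
    """
    try:
        exec(compile(code, "<candidate>", "exec"), {})
        return ""
    except Exception:
        return traceback.format_exc()


def refine(generate: Callable[[str], str], task: str, max_rounds: int = 4) -> tuple[str, int]:
    """Ask `generate` for code, run it, and feed any error back into the prompt."""
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        code = generate(task + feedback)
        error = run_candidate(code)
        if not error:
            return code, round_no
        # The next prompt carries the failure, like pasting the error back in by hand.
        feedback = "\n\nPrevious attempt failed with:\n" + error
    raise RuntimeError(f"no working code after {max_rounds} rounds")


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: buggy first, fixed once it sees the error."""
    if "NameError" in prompt:
        return "values = [1, 2, 3]\ntotal = sum(values)\nassert total == 6\n"
    return "total = sum(values)\n"  # bug: `values` is never defined


code, rounds = refine(fake_model, "Sum a list of numbers.")
print(f"working code after {rounds} round(s)")
```

The point of the sketch is only that the feedback step is mechanical; whether a real model converges in three or four rounds is the anecdotal part.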
I also don’t agree with your assessment. If an LLM passes a perfect law exam (a thing that doesn’t really exist) and afterwards only invents laws and precedent cases, it’s still useless.
There’s no reason as to why it would do that. The underlying function behind verdicts/legal arguments has been the same, and will remain the same, because it’s based on logic and human morals. Tackling morals is easy because LLMs have been trained on human data. Their morals are a reflection of ours. If we want to specify our morals explicitly, then we could make them law (and we already have for the ones that matter most), which makes stuff even easier.
LLMs would have no problem doing any of this. There’s a discernible pattern in any judge’s verdict. LLMs can easily pick this pattern up.
That’s worse! You do see how that’s worse right?!?
You are factually correct, but those are called biases. That doesn’t mean that LLMs would be good at that job. It means they can do the job with comparable results for all the reasons that people are terrible at it. You’re arguing to build a racism machine because judges are racist.
Ok, so you just ignore the reports and continue to coast on feels over reals. Cool.
I didn’t. I went through your links. Your links, however, pointed at a problem with the environment our LLMs are in, not with the LLMs themselves. The code one, where the LLM invents package names, is not the LLM’s fault. Can you accurately come up with package names just from memory? No. Neither can the LLM. Give the LLM the functionality to look up npm’s database. Give it the functionality to read the docs and then look at what it can do. I have done this myself (manually giving it this information), and it has been a beast.
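A minimal sketch of that idea, checking generated code against a package index before trusting it, might look like the following. Everything here is an assumption made for illustration: `KNOWN_PACKAGES` is a tiny hard-coded stand-in for a real registry lookup (no actual npm API is called), and `package_exists` and `unknown_packages` are hypothetical helper names.

```python
import re

# Hypothetical stand-in for a registry lookup; a real setup would query the
# package index instead of a hard-coded set.
KNOWN_PACKAGES = {"express", "lodash", "react"}

# Matches CommonJS-style require("name") calls in generated JavaScript.
REQUIRE_RE = re.compile(r"""require\(\s*['"]([^'"]+)['"]\s*\)""")


def package_exists(name: str) -> bool:
    """Return True if the name is in the (stand-in) package index."""
    return name in KNOWN_PACKAGES


def unknown_packages(js_source: str) -> list[str]:
    """List required package names that the index does not know about."""
    names = REQUIRE_RE.findall(js_source)
    return sorted(
        {n for n in names if not n.startswith(".") and not package_exists(n)}
    )


generated = (
    'const _ = require("lodash");\n'
    'const pad = require("left-padder-pro");\n'  # invented name for the demo
    'const util = require("./util");\n'          # relative import, skipped
)
print(unknown_packages(generated))
```

Any name that comes back from `unknown_packages` would be handed to the model as feedback, or simply rejected, instead of being installed blindly.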
As for the link in the reply, it’s a blog post about anecdotal evidence. Come on now… I literally have personal anecdotal evidence to parry this.
But whatever, you’re always going to go “AI bad AI bad AI bad” till it takes your job. I really don’t understand why AI denialism is so prevalent on Lemmy, a leftist platform, where we should be discussing seizing the new means of production instead of denying its existence.
Regardless, I won’t contribute to this thread any further, because I believe I’ve made my point.
The bar exam is just one part of a larger set of qualifications
The bar exam is just a (closed-book) proxy for the actual skills and knowledge being tested. While a reasonable proxy for humans, it is a poor proxy for computers
The short answer is being admitted by the bar; we already trust them to certify humans.
If for some reason I were arbiter, I would say a convincing record of doing actual legal work, vetted by existing lawyers. The legal profession already has a well-defined model of how non-lawyers can contribute to the work, so there is no need for a quantum leap up to being a lawyer.
I think you’re conflating formal and informal logic. Programmers are excellent at defining a formal logic system which the computer follows, but the computer itself isn’t particularly “logical”.
What you describe as:
Action A is legal. Action B isn’t. Doing X + Y + Z constitutes action A and so on.
Is a particularly nasty form of logic called abstract reasoning. Biological brains are very good at that! Computers a lot less so…
[Using a test designed to measure that](https://arxiv.org/abs/1911.01547) humans average ~80% accuracy. The current best algorithm (last I checked…) has a 31% accuracy. [LLMs can get up to ~17% accuracy.](https://arxiv.org/pdf/2403.11793) (With the addition of some prompt engineering and other fancy tricks). So they are technically capable… Just really bad at it…
Now law is marketed as a very logical profession but modern law, at least in the West, is more akin to combative theater. The law as written serves as the base worldbuilding and case law serves as additional canon. The goal of law is to put on a performance that tricks the audience (typically judge, jury, opposing counsel) into believing it is far more logical and internally consistent than it actually is.
That is essentially what LLMs are designed to do. Take some giant corpus of knowledge and return some permutation of it that maximizes the “believability” based on the input prompt. And it can do so with a shocking amount of internal logic and creativity. So it shouldn’t be shocking that they’re capable of passing bar exams, but that should not be conflated with them being rational, logical, fair, just, or accurate.
And neither should the law. Friendly reminder to fuck the police and the corrupt legal system they enforce.
The legal system does not revolve around logic, and even if it did: LLMs can’t reason, so they’d be useless anyway.
Yes, it does speak for itself: They can’t.
Yes, the exams are flawed. This podcast episode takes a look at these supposed AI lawyers.
Another report contradicting you
Stop believing the hype. Sam Altman is lying to you.
Look at what OpenAI, Google, Microsoft, etc. do and tell me once again that this is supposedly good for the workers. Jeez. 🙄
Ok. What test would an LLM need to pass to convince you that it is capable of being a lawyer?
Draw a pentagon