Hacker News

Kotlin creator's new language: talk to LLMs in specs, not English

https://codespeak.dev/
As far as I can tell it's not a new language, but rather an alternative workflow for LLM-based development along with a tool that implements it.

The idea, IIUC, seems to be that instead of directly telling an LLM agent how to change the code, you keep markdown "spec" files describing what the code does and then the "codespeak" tool runs a diff on the spec files and tells the agent to make those changes; then you check the code and commit both updated specs and code.

It has the advantage that the prompts are all saved along with the source rather than lost, and in a format that lets you also look at the whole current specification.

The limitation seems to be that you can't modify the code yourself if you want the spec to keep reflecting it (and can't make LLM-driven changes that refer to the actual code). More generally, there's no guarantee the spec captures everything important about the program, so the code itself still carries "source" information: for example, maybe you want the background of a GUI to be white, and it is white only because the LLM happened to choose that, but it's not written in the spec.

The latter can maybe be mitigated by doing multiple generations and checking them all, but that multiplies LLM and verification costs.

Also it seems that the tool severely limits the configurability of the agentic generation process, although that's just a limitation of the specific tool.
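For illustration, the diff-the-specs step of the described workflow is just ordinary version control; this is a hypothetical sketch with plain git, not the actual codespeak CLI (which isn't shown in the thread):

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
mkdir specs
# Specs live in the repo alongside the code.
echo "The background of the settings page is gray." > specs/ui.md
git add specs
git -c user.email=you@example.com -c user.name=you commit -qm "initial spec"
# To change the program, you edit the spec, not the code:
echo "The background of the settings page is white." > specs/ui.md
# This diff (-gray/+white) is what would be handed to the LLM agent
# as the change request; code and updated spec get committed together.
git diff -- specs/
```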

This doesn't make too much sense to me.

* This isn't a language, it's some tooling to map specs to code and re-generate

* Models aren't deterministic: every time you re-apply a spec you'd likely get different output (unless you feed the current code into the re-apply and let the model just recommend changes)

* Models are evolving rapidly; this month's flavour of Codex/Sonnet/etc. would very likely generate different code from last month's

* Text specifications are always under-specified, lossy, and tend to gloss over a huge number of details that the code has to make concrete. That's fine in a small example, but in a larger code base?

* Every non-trivial codebase would be made up of hundreds of specs that interact and influence each other; it's very hard (and context-heavy) to read all the specs that affect a piece of functionality and keep them coherent

I do think there are opportunities in this space, but what I'd like to see is:

1. write text specifications

2. a model transforms the text into a *formal* specification

3. the formal spec is translated into code, which can be verified against the spec

Steps 2 and 3 could be merged into one if there were practical/popular languages that also supported verification, in the vein of Ada/SPARK.

But you can also get there by generating tests from the formal specification that validate the implementation.
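A minimal sketch of that last idea, under stated assumptions: treat a postcondition as the "formal spec" and generate randomized tests that validate an implementation against it. All names here are hypothetical, and a real formal spec would of course be richer than a single predicate.

```python
import random

# Hypothetical formal spec for a sort routine, written as a postcondition:
# the output is ordered and is a permutation of the input.
def spec_sort(xs, out):
    return sorted(out) == out and sorted(xs) == sorted(out)

# Candidate implementation (e.g. LLM-generated) to validate against the spec.
def my_sort(xs):
    return sorted(xs)

# "Generated" tests: random inputs checked against the postcondition.
for _ in range(100):
    xs = [random.randint(-50, 50) for _ in range(random.randint(0, 10))]
    assert spec_sort(xs, my_sort(xs))
```

This only gives statistical confidence rather than a proof, which is why a verification-aware language in step 3 would be the stronger option.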

This concept assumes a formalized language would somehow make things easier for an LLM. That's making some big assumptions about the neuroanatomy of LLMs. This [1] from the other day suggests surprising things about how LLMs are internally structured; specifically, that encoding and decoding are distinct phases with other stuff in between, suggesting that once the model is trained, the surface language isn't that important.

[1] https://news.ycombinator.com/item?id=47322887

The problem with formal prompting languages is that they assume the bottleneck is ambiguity in the prompt. In my experience building agents, the bottleneck is actually the model's understanding of context: same precise prompt, wildly different results depending on what else is in the context window. Formalizing the prompt doesn't help if the model builds the wrong internal representation of your codebase. That said, curious to see where this goes.
this is really exciting and dovetails really closely with the project I'm working on.

I'm writing a language spec for an LLM runner that has the ability to chain prompts and hooks into workflows.

https://github.com/AlexChesser/ail

I'm writing the tool as proof of the spec. Still very much a pre-alpha phase, but I do have a working POC in that I can specify a series of prompts in my YAML language and execute the chain of commands in a local agent.

One of the "key steps" that I plan on designing is specifically an invocation interceptor. My underlying theory is that we would take whatever random series of prose that our human minds come up with and pass it through a prompt refinement engine:

> Clean up the following prompt in order to convert the user's intent
> into a structured prompt optimized for working with an LLM.
> Be sure to follow appropriate modern standards based on current
> prompt engineering research. For example, limit the use of persona
> assignment in order to reduce hallucinations.
> If the user is asking for multiple actions, break the prompt
> into appropriate steps (etc...)

That interceptor would then forward the well structured intent-parsed prompt to the LLM. I could really see a step where we say "take the crap I just said and turn it into CodeSpeak"
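A minimal shape for such an interceptor, with the model call stubbed out; `call_llm` and `REFINE_TEMPLATE` are placeholders for illustration, not any real API:

```python
# Hypothetical sketch of the "invocation interceptor" idea above.
REFINE_TEMPLATE = (
    "Clean up the following prompt to convert the user's intent into a "
    "structured prompt optimized for working with an LLM. If multiple "
    "actions are requested, break them into steps.\n\nRaw prompt:\n{raw}"
)

def intercept(raw_prompt: str, call_llm) -> str:
    # The refined prompt, not the raw prose, is what gets forwarded
    # to the downstream agent.
    return call_llm(REFINE_TEMPLATE.format(raw=raw_prompt))

# Stand-in "model" for demonstration only (just upper-cases its input).
refined = intercept("make the thing faster pls", lambda p: p.upper())
```

Swapping the lambda for a real client is the only change needed to plug this into an actual pipeline.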

What a fantastic tool. I'll definitely do a deep dive into this.

You can basically condense this entire "language" into a set of markdown rules and use it as a skill in your planning pipeline.

And whatever codespeak offers is like a weird VCS wrapper around this. I can already version and diff my skills and plans properly, and following that, my LLM-generated features should be scoped properly and worked on in their own branches. This imo will just give people a reason to make huge 8k-10k line changes in a commit.

This doesn't seem particularly formal. I remain unconvinced that reducing formality is really going to be valuable. Code is obviously as formal as it gets, but as you trend away from that you quickly introduce problems that arise from the lack of formality. I could see a world in which we're all just writing tests in the form of something like Gherkin, though.
Conceptually, this seems a good direction.

The other piece that has always struck me as a huge inefficiency with current usage of LLMs is the hoops they have to jump through to make sense of existing file formats - especially making sense of (or writing) complicated semi-proprietary formats like PDF, DOC(X), PPT(X), etc.

Long-term prediction: for text, we'll move away from these formats and towards alternatives that are designed to be optimal for LLMs to interact with. (This could look like variants of markdown or JSON, but could also be Base64 [0] or something we've not even imagined yet.)

[0] https://dnhkng.github.io/posts/rys/

So is it basically Markdown? The landing page does not articulate, unfortunately, what the key contribution is.
i've been doing this for a while: you create an extra file for every code file, sketch the code as you currently understand it (mostly function signatures and comments to fill in details), and ask the LLM to help identify discrepancies. i call it "overcoding".

i guess you can build a cli toolchain for it, but as a technique it's a bit early to crystallize into a product imo. i fully expect overcoding to be a standard technique in a few years; it's the only way i've been able to keep up with AI-coded files longer than 1500 lines
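An "overcoding" file for a module might look like the following: signatures plus intent comments only, with the LLM periodically asked to flag discrepancies against the real implementation. This is a hypothetical illustration (the file and function names are made up), not the commenter's actual format.

```python
# overcode_parser.py -- hand-maintained "overcoding" sketch for parser.py.
# Contains only the shape of the real module: signatures and intent notes.
# An LLM is asked: "compare this sketch to parser.py and list discrepancies".

def tokenize(source: str) -> list[str]:
    """Split source into tokens; string literals and comments stay intact."""
    ...

def parse(tokens: list[str]) -> dict:
    """Build a nested-dict AST; raise SyntaxError on unbalanced brackets."""
    ...
```

The stubs are deliberately empty: the point is a cheap, diffable statement of intent, not a second implementation.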

This raises a question --- how well do LLMs understand Loglan?

https://www.loglan.org/

Or Lojban?

https://mw.lojban.org/

> The spec is the source of truth

This feels wrong, as the spec doesn't consistently generate the same output.

But upon reflection, "source of truth" already refers to knowledge and intent, not machine code.

Getting so close to the idea. We will only have Englishscripts and won't need code anymore. No compiling. No vibe coding. No coding. https://jperla.com/blog/claude-electron-not-claudevm
Instead of using tabs, it would be much better to show the comparison side by side.

Also, the examples feel forced: if you use external libraries, you don't have to write your own "Decode RFC 2047" implementation.

I want to see an LLM combined with correctness preserving transforms.

So, for example, if you refactor a program, let the LLM do whatever it likes as long as the logic of the program stays intact.
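Short of true correctness-preserving transforms, one cheap approximation is randomized equivalence testing of the pre- and post-refactor versions. A minimal sketch with hypothetical function names (this gives statistical confidence only, not a proof):

```python
import random

# Original function and an LLM-refactored version of it.
def word_count(text: str) -> int:
    return len(text.split())

def word_count_refactored(text: str) -> int:
    # Count transitions from whitespace into a word.
    count, in_word = 0, False
    for ch in text:
        if ch.isspace():
            in_word = False
        elif not in_word:
            count, in_word = count + 1, True
    return count

# Randomized equivalence check over strings of letters and whitespace.
alphabet = "ab \t\n"
for _ in range(500):
    s = "".join(random.choice(alphabet) for _ in range(random.randint(0, 20)))
    assert word_count(s) == word_count_refactored(s), s
```

A genuinely correctness-preserving pipeline would replace the random check with a proof or an exhaustive/symbolic one, but this already catches most behaviour drift.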

"Coming soon: Turning Code into Specs"

There you have it: Code laundering as a service. I guess we have to avoid Kotlin, too.

Then of course we are going to ask LLMs to generate specifications in this new language
So, back to a programming language, albeit “simplified.”
I think stuff like Langflow and n8n are more likely to be adopted, alongside with some more formal specifications.
We created programming languages to direct programs. Then we created LLMs so we could use English to direct programs. Now we've created programming languages to direct LLMs. What is old is new again!
As someone who hates writing (and thus coding), this might be a good tool, but how is it different from doing the same in Claude? And I only see Python; what about other languages, are they also production-grade?
The intent of the idea is there, and I agree that there should be more precise syntax instead of colloquial English. However, it's difficult to take CodeSpeak seriously as it looks AI generated and misses key background knowledge.

I'm hoping for a framework that expands upon Behavior Driven Development (BDD) or a similar project-management concept. Here's a promising example that is ripe for an Agentic AI implementation, https://behave.readthedocs.io/en/stable/philosophy/#the-gher...

I for one can't wait to be a confident CodeSpeak programmer /sarc

Does this make it a 6th generation language?

So, just a markdown file?
I cannot read light on black. I don't know, maybe it's a condition, or simply just part of getting old. But my eyes physically hurt, and when I look up from reading a light-on-black screen, even when I looked at it only for a short moment, my eyes need seconds to adjust again.

I know dark mode is really popular with the youngens but I regularly have to reach for reader mode for dark web pages, or else I simply cannot stand reading the contents.

Unfortunately, this site does not have an obvious way of reading it black-on-white, short of looking at the HTML source (CTRL+U), which - in fact - I sometimes do.

We built LLMs so that you can express your ideas in English and no longer need to code.

Also, English is really too verbose and imprecise for coding, so we developed a programming language you can use instead.

Now, this gives me a business idea: are you tired of using CodeSpeak? Just explain your idea to our product in English and we'll generate CodeSpeak for you.
