For conversational use that may be too slow, but as a coding assistant this should work, especially if many tasks are batched so that they all progress simultaneously through a single pass over the SSD-resident weights.
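A toy sketch of that batching idea (the names, routing, and scalar "weights" here are made up for illustration; a real MoE serving stack is far more involved): each expert is read off the SSD once per layer, then applied to every pending request that routed to it, so the expensive sequential read is amortized across the whole batch.

```python
from dataclasses import dataclass

@dataclass
class Request:
    hidden: float          # stand-in for an activation vector
    routed_experts: set    # experts this token was routed to

def load_expert(expert_id):
    # In reality this would be a large sequential SSD read;
    # a scalar stands in for the expert's weight matrices.
    return 0.1 * (expert_id + 1)

def moe_layer_batched(requests, num_experts):
    # One pass over the on-disk experts: load each expert once,
    # then apply it to every request that needs it.
    for expert_id in range(num_experts):
        w = load_expert(expert_id)
        for req in requests:
            if expert_id in req.routed_experts:
                req.hidden += w * req.hidden

reqs = [Request(1.0, {0, 2}), Request(2.0, {1})]
moe_layer_batched(reqs, num_experts=4)
print([round(r.hidden, 3) for r in reqs])  # → [1.43, 2.4]
```

With many requests in flight, each expert load is shared by all of them, which is why throughput scales well even when a single request's latency is dominated by SSD reads.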
Three hour coffee break while the LLM prepares scaffolding for the project.
Like computing used to be. When I first compiled a Linux kernel it ran overnight on a Pentium-S. I had little idea what I was doing, probably compiled all the modules by mistake.
Rather, imagine you have 2-3 of these working 24/7 on top of what you're doing today. What does your backlog look like a month from now?
Batching many disparate tasks together is good for compute efficiency, but it makes it harder to keep the full KV-cache for each task in RAM. In a pinch you could spill some of that KV-cache to storage (this is how prompt caching works too, AIUI) and page it back in as needed, but that adds far more overhead than offloading sparsely-used experts, since the KV-cache is accessed much more heavily.
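A back-of-envelope comparison makes the asymmetry concrete. The model shapes below are assumptions chosen for illustration, not any specific LLM; the formulas themselves are standard (K and V tensors per layer for the cache, three projection matrices for a gated-FFN expert).

```python
def kv_cache_bytes_per_token(layers, kv_heads, head_dim, dtype_bytes=2):
    # One K and one V tensor per layer, fp16 by default.
    return 2 * layers * kv_heads * head_dim * dtype_bytes

def expert_bytes(hidden_dim, ffn_dim, dtype_bytes=2):
    # Gated FFN expert: up-, gate-, and down-projection matrices.
    return 3 * hidden_dim * ffn_dim * dtype_bytes

# Assumed shapes, purely illustrative.
layers, kv_heads, head_dim = 60, 8, 128
hidden_dim, ffn_dim = 4096, 1536

per_token = kv_cache_bytes_per_token(layers, kv_heads, head_dim)
print(f"KV cache per token:  {per_token / 2**10:.0f} KiB")    # 240 KiB
print(f"KV cache at 32k ctx: {32_768 * per_token / 2**30:.1f} GiB")  # 7.5 GiB
print(f"One expert:          {expert_bytes(hidden_dim, ffn_dim) / 2**20:.0f} MiB")  # 36 MiB
```

The point: a single long-context request's KV-cache can dwarf an individual expert, and it is touched on every decode step, whereas a cold expert is only needed when a token routes to it. That is why spilling KV-cache to storage hurts much more than spilling rarely-hit experts.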