Unit tests as documentation

https://www.thecoder.cafe/p/unit-tests-as-documentation

I share this ideal, but also have to gripe that "descriptive test name" is where this falls apart, every single time.

Getting all your teammates to quit giving all their tests names like "testTheThing" is darn near impossible. It's socially painful to be the one constantly nagging people about names, but it really does take constant nagging to keep the quality high. As soon as the nagging stops, someone invariably starts cutting corners on the test names, and after that everyone who isn't a pedantic weenie about these things will start to follow suit.

Which is honestly the sensible, well-adjusted decision. I'm the pedantic weenie on my team, and even I have to agree that I'd rather my team have a frustrating test suite than frustrating social dynamics.

Personally - and this absolutely echoes the article's last point - I've been increasingly moving toward Donald Knuth's literate style of programming. It helps me organize my thoughts even better than TDD does, and it's earned me far more compliments about the readability of my code than a squeaky-clean test suite ever does. So much so that I'm beginning to hold hope that if you can build enough team mass around working that way it might even develop into a stable equilibrium point as people start to see how it really does make the job more enjoyable.

loading story #41874705

loading story #41874655

loading story #41874662

loading story #41875392

lucianbr9 hours ago | parent | next

One - unit tests explain nothing. They show what the output should be for a given input, but not why, or how you get there. I'm surprised by the nonchalant claim that "unit tests explain code". Am I missing something about the meaning of the english word "explain"?

Two - so any input value outside of those in unit tests is undocumented / unspecified behavior? A documentation can contain an explanation in words, like what relation should hold between the inputs and outputs in all cases. Unit tests by their nature can only enumerate a finite number of cases.

This seems like such an obviously not great idea...

atoav8 hours ago | parent | next

Not sure about this, but I like it the way it is done in the Rust ecosystem.

In Rust, there are two types of comments. Regular ones (e.g. starting with //) and doc-comments (e.g. starting with ///). The latter will land in in the generated documentation when you run cargo doc.

And now the cool thing: If you have example code in these doc comments, e.g. to explain how a feature of your library can be used, that script will automatically become part of the tests per default. That means you are unlikey to forget to update these examples when your code changes and you can use them as tests at the same time by asserting something at the end (which also communicates the outcome to the reader).

loading story #41874298

loading story #41872805

loading story #41872703

loading story #41875148

monocasa8 hours ago | parent | next

Unit tests can explain nothing. But so can paragraphs of prose.

The benefit of explanations in tests is that running them gets you closer to knowing if any of the explanations have bit rotted.

loading story #41872921

loading story #41874890

loading story #41874910

worldsayshi8 hours ago | parent | next

One: Can we test the tests using some somewhat formal specification of the why?

Two: my intuition says that exhaustively specifying the intended input output pairs would only hold marginal utility compared to testing a few well selected input output pairs. It's more like attaching the corners of a sheet to the wall than gluing the whole sheet to the wall. And glue is potentially harder to remove. The sheet is n-dimensional though.

loading story #41872651

loading story #41872973

loading story #41873690

__MatrixMan__8 hours ago | parent | next

Often, tests are parameterized over lists of cases such that you can document the general case near the code and document the specific cases near each parameter. I've even seen test frameworks that consume an excel spreadsheet provided by product so that the test results are literally a function of the requirements.

Would we prefer better docs than some comments sprinkled in strategic places in test files? Yes. Is having them with the tests maybe the best we can do for a certain level of effort? Maybe.

If the alternative is an entirely standalone repository of docs which will probably not be up to date, I'll take the comments near the tests. (Although I don't think this approach lends itself to unit tests.)

loading story #41873888

loading story #41874566

simonw9 hours ago | parent | next

A trick I use a lot these days is to take the unit tests from an under-documented library, dump them into an LLM and ask it to write me detailed usage documentation.

This works REALLY well. I've even occasionally done some of my own reviewing and editing of those docs and submitted them back to the project. Here's an example: https://github.com/pydantic/jiter/pull/143 - Claude transcript here: https://gist.github.com/simonw/264d487db1a18f8585c2ca0c68e50...

8 hours ago | parent

{"deleted":true,"id":41872536,"parent":41871877,"time":1729190746,"type":"comment"}

loading story #41875719

benrutter8 hours ago | parent | next

I've heard "tests are documentation" a lot, and even said it without thinkibg much myself. It sounds good, and I definitely like the idea of it, but I'm not sure it's true. Here's my thinking:

- I've never tried to understand a code base by looking at the tlunit tests first. They often require more in depth understanding (due to things like monkeypatching) than just reading the code. I haven't seen anyone else attempt this either.

- Good documentation is good as far as it aids understanding. This might be a side effect of tests, but I don't think it's their goal. A good test will catch breaks in behaviour, I'd never trade completeness for readability in tests, in docs it's the reverse.

So I think maybe, unit tests are just tests? They can be part of your documentation, but calling them documentation in and of themselves I think is maybe just a category error?

latchkey7 hours ago | parent | next

Nobody is mentioning this. Tests are for change over time, they are not just for testing the output is the same.

When you have a codebase sitting around rotting for years and you need to go back and refactor things to add a feature or change the behavior, how do you know you aren't breaking some dependent code down the line?

What happens when you upgrade a 3rd party dependency, how do you know it isn't breaking your code? The javascript ecosystem is rife with this. You can't upgrade anything years later or you have to start over again.

Tests are especially important when you've quit your company and someone else is stuck maintaining your code. The only way they can be sure to have all your ingrained knowledge is to have some sort of reliable way of knowing when things break.

Tests are for preventing the next developer from cursing you under their breath.

loading story #41875674

jaredcwhite8 hours ago | parent | next

I very much disagree with this.

Good code can be documentation, both in the way it's written and structured and obviously in the form of comments.

Good tests simply verify what the author of the test believes the behavior of what is being tested should be. That's it. It's not documentation, it rarely "explains" anything, and any time someone eschews actually writing documentation in the form of good code hygiene and actual docs in favor of just writing tests causes the codebase to suffer.

loading story #41874406

hombre_fatal3 hours ago | parent | next

I like how they couldn't be bothered to show examples of this ideal unit test code they think is just as good as documentation, just like people who can't be bothered to write docs.

In reality, except for the most trivial projects or vigilant test writers, tests are too complicated to act as a stand in for docs.

They are usually abstract in an effort to DRY things up such that you don't even get to see all the API in one place.

I'd rather keep tests optimized for testing rather than nerfing them to be readable to end users.

tln9 hours ago | parent | next

Extracting unit tests from your docs: great!

Somehow extracting your docs from unit tests: might be ok!

Pointing people at unit tests instead of writing docs: not even remotely ok.

Is that really what this guy is advocating??

bluefirebrand9 hours ago | parent | next

> Pointing people at unit tests instead of writing docs: not even remotely ok.

Couldn't agree more

I'm trying to integrate with a team at work that is doing this, and I'm finding it impossible to get a full picture of what their service can do.

I've brought it up with my boss, their boss, nothing happens

And then the person writing the service is angry that everyone is asking him questions about it all the time. "Just go read the tests! You'll see what it does if you read the tests!"

Incredibly frustrating to deal with when my questions are about the business rules for the service, not the functionality of the service

loading story #41873496

loading story #41875354

teivah8 hours ago | parent

No, not replacing documentation is a way to enrich documentation. That being said, that should have been clearer; I will update it.

Thanks, "This guy"

rglover9 hours ago | parent | next

Just write the docs. A simple template:

- What is it?

- What does it do?

- Why does it do that?

- What is the API?

- What does it return?

- What are some examples of proper, real world usage (that don't involve foo/bar but instead, real world inputs/outputs I'd likely see)?

MathMonkeyMan9 hours ago | parent | next

I was going to say that unit tests have the benefit of breaking when the truth changes.

But then I realized that a lot of what makes a set of tests good documentation is comments, and those rot, maybe worse than dedicated documentation.

Keeping documentation up to date is a hard problem that I haven't yet seen solved in my career.

rglover9 hours ago | root | parent | next

The only fix for that is discipline. You can't automate away quality. The best people/teams understand that and make good docs a feature requirement, not an afterthought.

My favorite example is Stripe. They've never skimped on docs and you can tell they've made it a core competency requirement for their team.

loading story #41872469

loading story #41872339

loading story #41874894

sbuttgereit8 hours ago | root | parent | next

Elixir's documentation (ExDoc) & unit testing framework (ExUnit) doesn't solve this problem but provides a facility to ease it a bit.

In the documentation, you can include code examples that, if written a certain way, not only looks good when rendered but can also be tested for their form and documented outputs as well. While this doesn't help with the descriptive text of documentation, at least it can flag you when the documented examples are no longer valid... which can in turn capture your attention enough to check out the descriptive elements of that same area of documentation.

This isn't to say these documentation tests are intended to replace regular unit tests: these documentation tests are really just testing what is easily testable to validate the documentation, the code examples.

Something can be better than nothing and I think that's true here.

Ygg28 hours ago | root | parent

> Keeping documentation up to date is a hard problem that I haven't yet seen solved in my career.

Rust doctests. They unite documentation and unit test. Basically documentation that's never so out of sync its assert fail.

loading story #41873752

croes9 hours ago | parent

Why is a hard question.

And what should be obvious or it’s still too complex.

rglover9 hours ago | root | parent

If why is hard it may not need to exist. For example:

"This function exists to generate PDFs for reports and customer documents."

"This endpoint exists to provide a means for pre-flight authorization of requests to other endpoints."

loading story #41872655

PaulHoule9 hours ago | parent | next

Actually every example in the documentation should be backed by a unit test, as in the example is transcluded from the unit test into the docs. Since you often want to show examples that don’t compile in docs you also should be able to write tests for compile errors.

red2awn9 hours ago | parent

Better yet, use doc test as featured in Python [1] or Rust [2]. This makes sure your documentation examples are always up-to-date and runnable.

[1]: https://docs.python.org/3/library/doctest.html

[2]: https://doc.rust-lang.org/rustdoc/write-documentation/docume...

danjl9 hours ago | parent | next

Why just unit tests? Integration tests seem much more valuable as documentation of what the users will do in the app. Unit tests have limited benefits overall, and add a bunch of support time, slowing down development. If you have good (90%+) coverage just from integration tests, you are likely doing 90%+ coverage of the unit tests at the same time, without the extra effort or support burden. You can use the same reasoning to describe the benefits for understanding the code, you get a clear understanding of the important usage cases, plus you get the unit-level "documentation" for free.

avensec8 hours ago | parent | next

Your point is valid, and some of the dialog in the replies to your comment is also valid. So, I'm just responding to the root of the dialog. What architectures are you working with that suggest higher integration test strategies?

I'd suggest that the balance between Unit Test(s) and Integration Test(s) is a trade-off and depends on the architecture/shape of the System Under Test.

Example: I agree with your assertion that I can get "90%+ coverage" of Units at an integration test layer. However, the underlying system would suggest if I would guide my teams to follow this pattern. In my current stack, the number of faulty service boundaries means that, while an integration test will provide good coverage, the overhead of debugging the root cause of an integration failure creates a significant burden. So, I recommend more unit testing, as the failing behaviors can be identified directly.

And, if I were working at a company with better underlying architecture and service boundaries, I'd be pointing them toward a higher rate of integration testing.

So, re: Kent Dodds "we write tests for confidence and understanding." What layer we write tests at for confidence and understanding really depends on the underlying architectures.

evil-olive8 hours ago | parent | next

unit vs integration tests is not an either/or. you need both, and in appropriate coverage amounts.

a common way to think about this is called the "test pyramid" - unit tests at the base, supporting integration tests that are farther up the pyramid. [0]

roughly speaking, the X-axis of the pyramid is number of test cases, the Y-axis is number of dependencies / things that can cause a test to fail.

as you travel up the Y-axis, you get more "lifelike" in your testing...but you also generally increase the time & complexity it takes to find the root-cause of a test failure.

many times I've had to troubleshoot a failure in an integration test that is trying to test subsystem A, and it turns out the failure was caused by unrelated flakiness in subsystem B. it's good to find that flakiness...but it's also important to be able to push that testing "down the pyramid" and add a unit test of subsystem B to prevent the flakiness from reoccurring, and to point directly at the problem if it does.

> Unit tests have limited benefits overall, and add a bunch of support time, slowing down development

unit tests, _when done poorly_, have limited benefits, require additional maintenance, and slow down development.

integration tests can also have limited benefits, require additional maintenance, and slow down development time, _when done poorly_.

testing in general, _when done well_, increases development velocity and improves product quality in a way that completely justifies the maintenance burden of the additional code.

0: https://martinfowler.com/articles/practical-test-pyramid.htm...

smrtinsert9 hours ago | parent | next

If you look for edge cases in integration tests, you will have a combinatorial explosion of integration tests and you will be adding much more work. Unit tests save time, not lose it.

I make this part of my filtering potential companies to work with now. I can't believe how often people avoid doing unit tests.

danjl9 hours ago | root | parent

That's funny, since I wouldn't code at a place that mandates unit tests. Sure, they have a very minor role, in very specific cases, but I'd say 90% of projects can get 90% of the benefits by writing only integration tests with 90% coverage. If you'd like a more in-depth discussion of why integration testing is better: https://kentcdodds.com/blog/write-tests

theLiminator9 hours ago | parent

I think unit testing if you're testing in a blackbox manner. Whitebox unit testing tends to be very fragile and nowhere near as valuable as an integration test.

janalsncm5 hours ago | parent | next

Code and tests tell you what. They don’t tell you why. And if there’s a bug not covered in the tests, neither code nor tests can help you figure that out.

1980phipsi2 hours ago | parent | next

D has documented unit tests.

https://dlang.org/spec/unittest.html#documented-unittests

Nice when combined with CI since you’ll know if you accidentally break your examples.

ssalka3 hours ago | parent | next

I forget where I heard this, but early in my career someone described unit tests to me as "a contract between you and your code." Which seems largely true – when I write a test, I'm saying "this is how a given function should behave, and that contract should hold true over time." If my future self wants the code to behave differently, so be it, but the contract needs to be amended so that the new code changes are also in agreement with it.

Conversely, if you fail to write a unit test, there is no contract, and the code can freely diverge over time from what you think it ought to be doing.

eschneider9 hours ago | parent | next

Unit tests are a _kind_ of documentation, but are rarely a complete solution to "documenting code". In general, the folks who don't do adequate code documentation are the same folks who don't do adequate unit tests. :/

meindnoch9 hours ago | parent | next

Is this "article" written by a LLM?

"Tomorrow, you will receive your weekly recap on unit tests."

Please, no.

teivah9 hours ago | parent

As the post's author, no, it's not written by an LLM.

The Coder Cafe is a daily newsletter for coders; we go over different topics from Monday to Thursday, and on Friday, there's a recap ;)

exabrial2 hours ago | parent | next

If you want to see how to do this right, go look at the CDI specification for Java.

Every statement in the spec has a corresponding unit test, and it’s unbelievably incredible. Hats of to everyone that worked on this.

zahlman9 hours ago | parent | next

This isn't at all a new idea, but it's the first time I've seen it presented with this textbook AI style.

teivah9 hours ago | parent

Is there something problematic you think about the style? It's a genuine question.

I wrote a book, and when I created my newsletter, I wanted to have a shift in terms of style because, on the Internet, people don't have time. You can't write a post the same way you write a book. So, I'm following some principles taken here and there. But happy to hear if you have some feedback about the style itself :)

hannasm3 hours ago | parent | next

I like the idea of this article but I would say that it's actually integration tests that are documentation.

When learning a new codebase, and I'm looking for an example of how to use feature X I would look in the tests first or shortly after a web search.

It seems to me like the second half of this article also undermines the main idea and goal of using unit tests in this way though.

  > Descriptive test name, Atomic, Keep tests simple, Keep tests independent

A unit test that is good at documenting the system needs to be comprehensive, clear and in many cases filled with complexity that a unit test would ignore or hide.

A test with a bunch of mocks, helpers, overrides and assumptions does not help anyone understand things like how to use feature X or the correct way to solve a problem with the software.

There are merits to both kinds of tests in their time and place but good integration tests are really the best ones for documenting and learning.

kubectl_h8 hours ago | parent | next

I am starting to notice more and more unit tests in my org are written by AI -- I'm guessing usually after the implementation. I know this because I have, guiltily, done it and can tell when someone else has done it as well. I don't think anything can be done about this technically so it probably needs to be something discussed socially within the team.

_thisdot8 hours ago | parent

What is wrong with this? Tests involve a lot of hardcoding and mocking. I see this as an excellent use case for AI.

loading story #41872801

loading story #41875694

tqi4 hours ago | parent | next

Without further documentation (beyond a descriptive test name), I fear that unit tests inevitably become a kind of Chesterton's Fence...

danielovichdk7 hours ago | parent | next

Unit tests is documentation of assertions. Hence it documents the result of how the code results to specification.

It's of course not documentation in the sense of a manual to the detail of code it exercises, but it definitely helps if tests are proper crafted.

Attummm9 hours ago | parent | next

Unit tests as documentation have proven their worth over the years.

For example this recent feature was added through unit test as documentation.

https://github.com/Attumm/redis-dict/blob/main/extend_types_...

kbbgl879 hours ago | parent | next

I believe that doctest is the best of both worlds, https://docs.python.org/3/library/doctest.html

byyll4 hours ago | parent | next

Write your unit tests all you want but they are not documentation.

mihaigalos9 hours ago | parent | next

In TDD, u-tests are called "spec". Pretty much sums it up.

lucianbr8 hours ago | parent | next

So again, any inputs outside of those exemplified in unit tests are unspecified behaviour? How would this work for mathematical operators for example?

loading story #41872555

loading story #41875288

advisedwang8 hours ago | parent

spec and documentation are things different though?

worik9 hours ago | parent | next

Unit tests are valuable

But they are also pricy

I am interested in how people prevent unit tests becoming a maintenance burden over time.

I have seen so many projects with legacy failing tests. Any proposal to invest time and money cleaning them up dies on the alter of investing limited resources in developing features that make money

timeon7 hours ago | parent | next

"// The Coder Cafe"

if it had "///" it could have test in docs: https://doc.rust-lang.org/stable/book/ch14-02-publishing-to-...

eesmith8 hours ago | parent | next

I did not agree with most of the advice. Here are some examples:

> Unit tests explain [expected] code behavior

Unit tests rarely evaluate performance, so can't explain why something is O(n) vs O(n^2), or if it was supposed to be one or the other.

And of course the unit tests might not cover the full range of behaviors.

> Unit tests are always in sync with the code

Until you find out that someone introduced a branch in the code, eg, for performance purposes (classic refactor step), and forgot to do coverage tests to ensure the unit tests exercised both branches.

> Unit tests cover edge cases

Note the True Scotsman fallacy there? 'Good unit tests should also cover these cases' means that if it didn't cover those cases, it wasn't good.

I've seen many unit tests which didn't cover all of the edge cases. My favorite example is a Java program which turned something like "filename.txt" into "filename_1.txt", where the "_1" was a sequence number to make it unique, and ".txt" was required.

Turns out, it accepted a user-defined filename from a web form, which could include a NUL character. "\x00.txt" put it in an infinite loop due to it's incorrect error handling of "", which is how the Java string got interpreted as a filename.

> Descriptive test name

With some test systems, like Python's unittest, you have both the test name and the docstring. The latter can be more descriptive. The former might be less descriptive, but easier to type or select.

> Keep tests simple

That should be 'Keep tests understandable'. Also, 'too many' doesn't contribute information as by definition it's beyond the point of being reasonable.

Etheryte9 hours ago | parent

This is functionally not different from saying your code is your documentation. If it builds, then it's valid, etc. In other words, nonsense. Code, tests and documentation each serve a useful purpose and crucially they each serve a purpose that's distinct from the other ones, but supports them. Code is there to do the thing, tests are there to make sure the thing is done correctly, documentation is for other humans to understand what the thing is and how it's done.

wubrr8 hours ago | parent

Code as documentation is not nonsense at all. I do think high quality documentation should exist on it's own, but cleanly written and organized, well-commented code that is easy to read and understand is extremely valuable for many reasons. It IS a huge part of the documentation for the technical people that have to maintain the code and/or use it in advanced/specialized ways.

loading story #41872984

#visit	10087721
#session	44449
#live-session	0