Story Detail of id 47423619 | Liveview Hacker News

cloudbonsai 18 hours ago | on: Python 3.15's JIT is now back on track

> There is no caching of a "utf-8 representation".

No, there certainly is. This is documented in the official API documentation:

    UTF-8 representation is created on demand and cached in the Unicode object.

    https://docs.python.org/3/c-api/unicode.html#unicode-objects

In particular, Python's Unicode object (PyUnicodeObject) contains a field named utf8. This field is populated when PyUnicode_AsUTF8AndSize() is first called and reused thereafter. You can check the exact code I'm talking about here:

https://github.com/python/cpython/blob/main/Objects/unicodeo...
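If you'd rather poke at it than read the source, here is a rough sketch that calls PyUnicode_AsUTF8AndSize() twice on the same string through ctypes.pythonapi and compares the returned addresses. It assumes CPython (ctypes.pythonapi does not exist elsewhere), and the helper name utf8_cache_probe is made up for this illustration. Getting the same address back twice is consistent with the buffer being cached in the object, though it is not a proof of how the field is managed internally.

```python
import ctypes

# CPython-only assumption: ctypes.pythonapi exposes the interpreter's own C API.
_as_utf8 = ctypes.pythonapi.PyUnicode_AsUTF8AndSize
_as_utf8.argtypes = [ctypes.py_object, ctypes.POINTER(ctypes.c_ssize_t)]
_as_utf8.restype = ctypes.c_void_p  # keep the raw address instead of copying a char*


def utf8_cache_probe(text):
    """Hypothetical helper: call the C API twice, return both addresses and the bytes."""
    size = ctypes.c_ssize_t()
    first = _as_utf8(text, ctypes.byref(size))
    second = _as_utf8(text, ctypes.byref(size))
    # Copy the buffer out while `text` is still alive and owns it.
    data = ctypes.string_at(first, size.value)
    return first, second, data


# A non-ASCII string built at run time, so its UTF-8 form needs a separate buffer.
sample = "".join(["caf", "\u00e9", " ", "\u2603"])
first, second, data = utf8_cache_probe(sample)
print("same buffer on both calls:", first == second)
print("matches str.encode:", data == sample.encode("utf-8"))
```

Pure-ASCII strings are less interesting here, since their UTF-8 form is the same bytes as the string's own storage.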

Is it clear enough?

zahlman 9 hours ago | parent

The C API may provide for it, but I'm not seeing a way to access that from Python. This sort of thing is provided for people writing C extensions who need to interface to other C code.

(And the code search seems to be broken; it can't find me the definition of `unicode_fill_utf8` although I'm sure it's obvious enough.)
