diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index e3585e55f60..bee6245d43a 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -540,7 +540,7 @@ peps/pep-0657.rst @pablogsal @isidentical @ammaraskar peps/pep-0658.rst @brettcannon peps/pep-0659.rst @markshannon peps/pep-0660.rst @pfmoore -peps/pep-0661.rst @taleinat +peps/pep-0661.rst @taleinat @JelleZijlstra peps/pep-0662.rst @brettcannon peps/pep-0662/ @brettcannon peps/pep-0663.rst @ethanfurman diff --git a/peps/pep-0661.rst b/peps/pep-0661.rst index a5a9cbe19df..bc0dbbb573c 100644 --- a/peps/pep-0661.rst +++ b/peps/pep-0661.rst @@ -1,16 +1,14 @@ PEP: 661 Title: Sentinel Values -Author: Tal Einat +Author: Tal Einat , Jelle Zijlstra Discussions-To: https://discuss.python.org/t/pep-661-sentinel-values/9126 -Status: Deferred +Status: Draft Type: Standards Track Created: 06-Jun-2021 +Python-Version: 3.15 Post-History: `20-May-2021 `__, `06-Jun-2021 `__ -TL;DR: See the `Specification`_ and `Reference Implementation`_. - - Abstract ======== @@ -39,8 +37,8 @@ uncommon enough that there hasn't been a clear need for standardization. However, the common implementations, including some in the stdlib, suffer from several significant drawbacks. -This PEP proposes adding a utility for defining sentinel values, to be used -in the stdlib and made publicly available as part of the stdlib. +This PEP proposes adding a built-in utility for defining sentinel values, to +be used in the stdlib and made publicly available to all Python code. Note: Changing all existing sentinels in the stdlib to be implemented this way is not deemed necessary, and whether to do so is left to the discretion @@ -72,8 +70,9 @@ in the discussion: 1. Some do not have a distinct type, hence it is impossible to define clear type signatures for functions with such sentinels as default values. -2. They behave unexpectedly after being copied or unpickled, due to a separate - instance being created and thus comparisons using ``is`` failing. +2. 
They behave unexpectedly after being copied, due to a separate instance + being created and thus comparisons using ``is`` failing. Some common + sentinel idioms have similar problems after being pickled and unpickled. In the ensuing discussion, Victor Stinner supplied a list of currently used sentinel values in the Python standard library [2]_. This showed that the @@ -118,8 +117,8 @@ The criteria guiding the chosen implementation were: 3. It should be simple to define as many distinct sentinel values as needed. 4. The sentinel objects should have a clear and short repr. 5. It should be possible to use clear type signatures for sentinels. -6. The sentinel objects should behave correctly after copying and/or - unpickling. +6. The sentinel objects should behave correctly after copying, and module-scope + sentinels should preserve identity when pickled and unpickled. 7. Such sentinels should work when using CPython 3.x and PyPy3, and ideally also with other implementations of Python. 8. As simple and straightforward as possible, in implementation and especially @@ -129,56 +128,83 @@ The criteria guiding the chosen implementation were: documentation. With so many uses in the Python standard library [2]_, it would be useful to -have an implementation in the standard library, since the stdlib cannot use -implementations of sentinel objects available elsewhere (such as the -``sentinels`` [5]_ or ``sentinel`` [6]_ PyPI packages). +have an implementation available to the standard library, since the stdlib +cannot use implementations of sentinel objects available elsewhere (such as +the ``sentinels`` [5]_ or ``sentinel`` [6]_ PyPI packages). After researching existing idioms and implementations, and going through many -different possible implementations, an implementation was written which meets -all of these criteria (see `Reference Implementation`_). 
+different possible implementations, the design below was chosen to meet these +criteria while keeping the API and implementation small (see +`Reference Implementation`_). Specification ============= -A new ``Sentinel`` class will be added to a new ``sentinellib`` module. +A new built-in callable named ``sentinel`` will be added. - >>> from sentinellib import Sentinel - >>> MISSING = Sentinel('MISSING') + >>> MISSING = sentinel('MISSING') >>> MISSING MISSING +``sentinel()`` takes a single positional-only argument, ``name``, which must +be a ``str``. Passing a non-string raises ``TypeError``. The name is used as +the sentinel's name and repr. + +Sentinel objects have two public attributes: + +* ``__name__`` is the sentinel's name. +* ``__module__`` is the name of the module where ``sentinel()`` was called. + +``sentinel`` may not be subclassed. + +Each call to ``sentinel(name)`` returns a new sentinel object. If a sentinel +is needed in more than one place, it should be assigned to a variable and that +same object should be reused explicitly, just as with the common +``MISSING = object()`` idiom:: + + MISSING = sentinel('MISSING') + + def read_value(default=MISSING): + ... + Checking if a value is such a sentinel *should* be done using the ``is`` operator, as is recommended for ``None``. Equality checks using ``==`` will also work as expected, returning ``True`` only when the object is compared with itself. Identity checks such as ``if value is MISSING:`` should usually be used rather than boolean checks such as ``if value:`` or ``if not value:``. -Sentinel instances are "truthy", i.e. boolean evaluation will result in +Sentinel objects are "truthy", i.e. boolean evaluation will result in ``True``. This parallels the default for arbitrary classes, as well as the boolean value of ``Ellipsis``. This is unlike ``None``, which is "falsy". -The names of sentinels are unique within each module. 
When calling -``Sentinel()`` in a module where a sentinel with that name was already -defined, the existing sentinel with that name will be returned. Sentinels -with the same name defined in different modules will be distinct from each -other. - Creating a copy of a sentinel object, such as by using ``copy.copy()`` or by -pickling and unpickling, will return the same object. +``copy.deepcopy()``, will return the same object. + +Sentinels importable from their defining module by name preserve their identity +when pickled and unpickled, using the standard pickle mechanism for named +singletons. When ``sentinel()`` creates a sentinel, it records the calling +module as the sentinel's ``__module__`` attribute. Pickling records the +sentinel by module and name. Unpickling then imports the module and retrieves +the sentinel by name, so the following round trip preserves identity:: + + MISSING = sentinel('MISSING') + assert pickle.loads(pickle.dumps(MISSING)) is MISSING -``Sentinel()`` will also accept a single optional argument, ``module_name``. -This should normally not need to be supplied, as ``Sentinel()`` will usually -be able to recognize the module in which it was called. ``module_name`` -should be supplied only in unusual cases when this automatic recognition does -not work as intended, such as perhaps when using Jython or IronPython. This -parallels the designs of ``Enum`` and ``namedtuple``. For more details, see -:pep:`435`. +Sentinels that are not importable by module and name, such as sentinels +created in a local scope and not assigned to a matching module global or class +attribute, are not picklable. -The ``Sentinel`` class may not be sub-classed, to avoid the greater complexity -of supporting subclassing. +The repr of a sentinel object is the ``name`` passed to ``sentinel()``. No +implicit module qualification is added. 
If a qualified repr is desired, the +qualified name should be passed explicitly:: -Ordering comparisons are undefined for sentinel objects. + >>> MyClass_NotGiven = sentinel('MyClass.NotGiven') + >>> MyClass_NotGiven + MyClass.NotGiven + +Ordering comparisons are undefined for sentinel objects. Sentinels do not +support weakrefs. Typing ------ @@ -191,16 +217,14 @@ Sentinel objects may be used in This is similar to how ``None`` is handled in the existing type system. For example:: - from sentinels import Sentinel - - MISSING = Sentinel('MISSING') + MISSING = sentinel('MISSING') def foo(value: int | MISSING = MISSING) -> int: ... More formally, type checkers should recognize sentinel creations of the form -``NAME = Sentinel('NAME')`` as creating a new sentinel object. If the name -passed to the ``Sentinel`` constructor does not match the name the object is +``NAME = sentinel('NAME')`` as creating a new sentinel object. If the name +passed to ``sentinel()`` does not match the name the object is assigned to, type checkers should emit an error. Sentinels defined using this syntax may be used in @@ -211,10 +235,9 @@ single member, the sentinel object itself. Type checkers should support narrowing union types involving sentinels using the ``is`` and ``is not`` operators:: - from sentinels import Sentinel from typing import assert_type - MISSING = Sentinel('MISSING') + MISSING = sentinel('MISSING') def foo(value: int | MISSING) -> None: if value is MISSING: @@ -223,20 +246,35 @@ using the ``is`` and ``is not`` operators:: assert_type(value, int) To support usage in type expressions, the runtime implementation -of the ``Sentinel`` class should have the ``__or__`` and ``__ror__`` +of sentinel objects should have the ``__or__`` and ``__ror__`` methods, returning :py:class:`typing.Union` objects. +C API +----- + +Sentinels can also be useful in C extensions. 
We propose two new +C API functions: + +* ``PyObject *PySentinel_New(const char *name, const char *module_name)`` creates a new sentinel object. +* ``bool PySentinel_Check(PyObject *obj)`` checks if an object is a sentinel. + +C code can use the ``==`` operator to check if an object is a +specific sentinel. + Backwards Compatibility ======================= -This proposal should have no backwards compatibility implications. +Adding a new builtin means that code which currently relies on the bare name +``sentinel`` raising ``NameError`` will instead see the new builtin. This is +the usual compatibility consideration for new builtins. Existing local, +global, and imported names called ``sentinel`` are unaffected. How to Teach This ================= -The normal types of documentation of new stdlib modules and features, namely -doc-strings, module docs and a section in "What's New", should suffice. +The normal types of documentation of new builtins and features, namely +docstrings, library docs and a section in "What's New", should suffice. Security Implications @@ -248,45 +286,45 @@ This proposal should have no security implications. Reference Implementation ======================== -The reference implementation is found in a dedicated GitHub repo [7]_. A -simplified version follows:: +A reference implementation is available as a CPython pull request [10]_. A +previous reference implementation is found in a dedicated GitHub repo [7]_. 
+A sketch of the intended behavior follows:: - _registry = {} - - class Sentinel: + class sentinel: """Unique sentinel values.""" - def __new__(cls, name, module_name=None): - name = str(name) - - if module_name is None: - module_name = sys._getframemodulename(1) - if module_name is None: - module_name = __name__ - - registry_key = f'{module_name}-{name}' + __slots__ = ("__name__", "_module_name") - sentinel = _registry.get(registry_key, None) - if sentinel is not None: - return sentinel + def __init_subclass__(cls): + raise TypeError("type 'sentinel' is not an acceptable base type") - sentinel = super().__new__(cls) - sentinel._name = name - sentinel._module_name = module_name + def __init__(self, name, /): + if not isinstance(name, str): + raise TypeError("sentinel name must be a string") + self.__name__ = name + self._module_name = sys._getframemodulename(1) - return _registry.setdefault(registry_key, sentinel) + @property + def __module__(self): + return self._module_name def __repr__(self): - return self._name + return self.__name__ def __reduce__(self): - return ( - self.__class__, - ( - self._name, - self._module_name, - ), - ) + return self.__name__ + + def __copy__(self): + return self + + def __deepcopy__(self, memo): + return self + + def __or__(self, other): + return typing.Union[self, other] + + def __ror__(self, other): + return typing.Union[other, self] Rejected Ideas @@ -391,6 +429,57 @@ idiom were unpopular, with the highest-voted option being voted for by only 25% of the voters. +Use a new standard library module +--------------------------------- + +Earlier drafts proposed adding a ``Sentinel`` class to a new ``sentinels`` or +``sentinellib`` module. However, adding a new module for a single public +callable is unnecessary, and using a module makes the feature less convenient +than the existing ``object()`` idiom. 
The Steering Council also specifically +encouraged making the feature a builtin so that it is at least as easy to use +as ``object()``. + +Using the name ``sentinels`` would also conflict with an existing, actively +used PyPI package. While other module names are possible, making the feature a +builtin avoids the naming problem entirely. + + +Use a registry of per-module sentinel names +------------------------------------------- + +Earlier drafts proposed making sentinel names unique within each module. Under +that design, repeated calls such as ``Sentinel("MISSING")`` from the same +module would return the same object, using a process-global registry keyed by +module name and sentinel name. + +This was rejected because the behavior is too implicit. Code that needs a +shared sentinel can define one explicitly and reuse it by name, just as code +already does with ``MISSING = object()``. Code in a local scope may also want +a fresh sentinel for each call or iteration, and repeated calls to +``sentinel(name)`` should behave like repeated calls to ``object()`` by +creating distinct objects. + +Removing the registry also keeps the implementation and mental model simpler: +``sentinel(name)`` creates a new unique object whose repr is ``name``. + + +Automatically discover or pass a module name +-------------------------------------------- + +Earlier drafts proposed an optional ``module_name`` argument, with frame +inspection used to discover the caller's module when the argument was omitted. +This followed the precedent of ``Enum`` and ``namedtuple``, and supported the +registry-based design. + +With the registry removed, a public ``module_name`` argument is no longer +needed for the core proposal. The implementation still records the calling +module internally, as ``TypeVar`` and similar helpers do, so that pickle can +serialize importable sentinels by module and name. This internal module name +does not affect the sentinel's repr. 
If users want a repr that includes a +module or class name, they can include it in the single ``name`` argument +explicitly, e.g. ``sentinel("mymodule.MISSING")``. + + Allowing customization of repr ------------------------------ @@ -399,6 +488,20 @@ changing their repr. However, this was eventually dropped as it wasn't considered worth the added complexity. +Allowing customization of boolean evaluation +-------------------------------------------- + +Discussions considered allowing sentinels to be explicitly truthy, falsy, or +not convertible to ``bool``. Some existing third-party sentinels expose falsy +behavior as part of their public API, and several participants argued that +raising in boolean contexts would better enforce identity checks. + +This PEP keeps the initial proposal simpler by giving sentinels the default +truthy behavior of ordinary objects and by recommending identity checks. +Custom boolean behavior may be considered later if the added API and typing +complexity is judged worthwhile. + + Using ``typing.Literal`` in type annotations -------------------------------------------- @@ -414,29 +517,22 @@ advantages of not requiring an import and being much shorter. Additional Notes ================ -* This PEP and the initial implementation are drafted in a dedicated GitHub - repo [7]_. +* This PEP and an initial implementation were drafted in a dedicated GitHub + repo [7]_. The implementation should be updated to match this simplified + proposal. * For sentinels defined in a class scope, to avoid potential name clashes, - one should use the fully-qualified name of the variable in the module. The - full name will be used as the repr. For example:: + or when a qualified repr would be clearer, one should pass the desired + qualified name explicitly. For example:: >>> class MyClass: ... 
NotGiven = sentinel('MyClass.NotGiven') >>> MyClass.NotGiven MyClass.NotGiven -* One should be careful when creating sentinels in a function or method, since - sentinels with the same name created by code in the same module will be - identical. If distinct sentinel objects are needed, make sure to use - distinct names. - -* There is no single desirable value for the "truthiness" of sentinels, i.e. - their boolean value. It is sometimes useful for the boolean value to be - ``True``, and sometimes ``False``. Of the built-in sentinels in Python, - ``None`` evaluates to ``False``, while ``Ellipsis`` (a.k.a. ``...``) - evaluates to ``True``. The desire for this to be set as needed came up in - discussions as well. +* Creating sentinels in a function or method is allowed. Each call to + ``sentinel()`` creates a distinct object, so a sentinel created in a local + scope behaves like one created by calling ``object()`` in that scope. * The boolean value of ``NotImplemented`` is ``True``, but using this is deprecated since Python 3.9 (doing so generates a deprecation warning.) @@ -450,15 +546,6 @@ Additional Notes for these sentinels, where different options were discussed. -Open Issues -=========== - -* **Is adding a new stdlib module the right way to go?** I could not find any - existing module which seems like a logical place for this. However, adding - new stdlib modules should be done judiciously, so perhaps choosing an - existing module would be preferable even if it is not a perfect fit? - - Footnotes ========= @@ -471,6 +558,7 @@ Footnotes .. [7] `Reference implementation at the taleinat/python-stdlib-sentinels GitHub repo `_ .. [8] `bpo-35712: Make NotImplemented unusable in boolean context `_ .. [9] `Discussion thread about type signatures for these sentinels on the typing-sig mailing list `_ +.. [10] `CPython reference implementation `_ Copyright
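Note on the copy and pickle behavior specified in this change: the sketch below is a pure-Python stand-in, not the proposed builtin, and is meant only to illustrate the "pickle by module and name" mechanism the new text describes. The class name ``Sentinel``, the explicit ``module_name`` parameter, and the throwaway module ``sentinel_demo`` are all assumptions made so the example runs on current Python versions. It relies on the documented pickle rule that a string returned from ``__reduce__`` is treated as the name of a global.

```python
import copy
import pickle
import sys
import types


class Sentinel:
    """Hypothetical stand-in for the proposed ``sentinel`` builtin."""

    def __init__(self, name, module_name):
        if not isinstance(name, str):
            raise TypeError("sentinel name must be a string")
        self.__name__ = name
        # Assumption: the module is passed explicitly here, whereas the
        # PEP's builtin records the calling module on its own.
        self.__module__ = module_name

    def __repr__(self):
        return self.__name__

    def __reduce__(self):
        # A string return value tells pickle to store this object as a
        # global, found again by importing __module__ and looking up the name.
        return self.__name__

    def __copy__(self):
        return self

    def __deepcopy__(self, memo):
        return self


# Register a throwaway module so the lookup by module and name can succeed.
demo = types.ModuleType("sentinel_demo")
sys.modules[demo.__name__] = demo
demo.MISSING = Sentinel("MISSING", demo.__name__)
MISSING = demo.MISSING

# Copying and deep-copying hand back the very same object.
copied_same = (
    copy.copy(MISSING) is MISSING
    and copy.deepcopy([MISSING])[0] is MISSING
)

# A sentinel reachable as sentinel_demo.MISSING keeps its identity.
round_trip_same = pickle.loads(pickle.dumps(MISSING)) is MISSING

# A sentinel that is not stored under its name in the module cannot be pickled.
local = Sentinel("LOCAL", demo.__name__)
try:
    pickle.dumps(local)
    local_picklable = True
except pickle.PicklingError:
    local_picklable = False
```

In this sketch ``copied_same`` and ``round_trip_same`` are both true, while ``local_picklable`` is false because ``sentinel_demo`` has no attribute named ``LOCAL``. That mirrors the distinction the specification draws between importable sentinels and those created in a local scope.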