I'm looking for an efficient way to check variables of a Python function. For example, I'd like to check arguments type and value. Is there a module for this? Or should I use something like decorators, or any specific idiom?
def my_function(a, b, c):
"""An example function I'd like to check the arguments of."""
# check that a is an int
# check that 0 < b < 10
# check that c is not an empty string
In this elongated answer, we implement a Python 3.x-specific type checking decorator based on PEP 484-style type hints in less than 275 lines of pure-Python (most of which is explanatory docstrings and comments) – heavily optimized for industrial-strength real-world use complete with a py.test
-driven test suite exercising all possible edge cases.
Feast on the unexpected awesome of bear typing:
>>> @beartype
... def spirit_bear(kermode: str, gitgaata: (str, int)) -> tuple:
... return (kermode, gitgaata, "Moksgm'ol", 'Ursus americanus kermodei')
>>> spirit_bear(0xdeadbeef, 'People of the Cane')
AssertionError: parameter kermode=0xdeadbeef not of <class "str">
As this example suggests, bear typing explicitly supports type checking of parameters and return values annotated as either simple types or tuples of such types. Golly!
O.K., that's actually unimpressive. @beartype
resembles every other Python 3.x-specific type checking decorator based on PEP 484-style type hints in less than 275 lines of pure-Python. So what's the rub, bub?
Bear typing is dramatically more efficient in both space and time than all existing implementations of type checking in Python to the best of my limited domain knowledge. (More on that later.)
Efficiency usually doesn't matter in Python, however. If it did, you wouldn't be using Python. Does type checking actually deviate from the well-established norm of avoiding premature optimization in Python? Yes. Yes, it does.
Consider profiling, which adds unavoidable overhead to each profiled metric of interest (e.g., function calls, lines). To ensure accurate results, this overhead is mitigated by leveraging optimized C extensions (e.g., the _lsprof
C extension leveraged by the cProfile
module) rather than unoptimized pure-Python (e.g., the profile
module). Efficiency really does matter when profiling.
Type checking is no different. Type checking adds overhead to each function call type checked by your application – ideally, all of them. To prevent well-meaning (but sadly small-minded) coworkers from removing the type checking you silently added after last Friday's caffeine-addled allnighter to your geriatric legacy Django web app, type checking must be fast. So fast that no one notices it's there when you add it without telling anyone. I do this all the time! Stop reading this if you are a coworker.
If even ludicrous speed isn't enough for your gluttonous application, however, bear typing may be globally disabled by enabling Python optimizations (e.g., by passing the -O
option to the Python interpreter):
$ python3 -O
# This succeeds only when type checking is optimized away. See above!
>>> spirit_bear(0xdeadbeef, 'People of the Cane')
(0xdeadbeef, 'People of the Cane', "Moksgm'ol", 'Ursus americanus kermodei')
Just because. Welcome to bear typing.
Bear typing is bare-metal type checking – that is, type checking as close to the manual approach of type checking in Python as feasible. Bear typing is intended to impose no performance penalties, compatibility constraints, or third-party dependencies (over and above that imposed by the manual approach, anyway). Bear typing may be seamlessly integrated into existing codebases and test suites without modification.
Everyone's probably familiar with the manual approach. You manually assert
each parameter passed to and/or return value returned from every function in your codebase. What boilerplate could be simpler or more banal? We've all seen it a hundred times a googleplex times, and vomited a little in our mouths everytime we did. Repetition gets old fast. DRY, yo.
Get your vomit bags ready. For brevity, let's assume a simplified easy_spirit_bear()
function accepting only a single str
parameter. Here's what the manual approach looks like:
def easy_spirit_bear(kermode: str) -> str:
assert isinstance(kermode, str), 'easy_spirit_bear() parameter kermode={} not of <class "str">'.format(kermode)
return_value = (kermode, "Moksgm'ol", 'Ursus americanus kermodei')
assert isinstance(return_value, str), 'easy_spirit_bear() return value {} not of <class "str">'.format(return_value)
return return_value
Python 101, right? Many of us passed that class.
Bear typing extracts the type checking manually performed by the above approach into a dynamically defined wrapper function automatically performing the same checks – with the added benefit of raising granular TypeError
rather than ambiguous AssertionError
exceptions. Here's what the automated approach looks like:
def easy_spirit_bear_wrapper(*args, __beartype_func=easy_spirit_bear, **kwargs):
if not (
isinstance(args[0], __beartype_func.__annotations__['kermode'])
if 0 < len(args) else
isinstance(kwargs['kermode'], __beartype_func.__annotations__['kermode'])
if 'kermode' in kwargs else True):
raise TypeError(
'easy_spirit_bear() parameter kermode={} not of {!r}'.format(
args[0] if 0 < len(args) else kwargs['kermode'],
__beartype_func.__annotations__['kermode']))
return_value = __beartype_func(*args, **kwargs)
if not isinstance(return_value, __beartype_func.__annotations__['return']):
raise TypeError(
'easy_spirit_bear() return value {} not of {!r}'.format(
return_value, __beartype_func.__annotations__['return']))
return return_value
It's long-winded. But it's also basically* as fast as the manual approach. * Squinting suggested.
Note the complete lack of function inspection or iteration in the wrapper function, which contains a similar number of tests as the original function – albeit with the additional (maybe negligible) costs of testing whether and how the parameters to be type checked are passed to the current function call. You can't win every battle.
Can such wrapper functions actually be reliably generated to type check arbitrary functions in less than 275 lines of pure Python? Snake Plisskin says, "True story. Got a smoke?"
And, yes. I may have a neckbeard.
Bear beats duck. Duck may fly, but bear may throw salmon at duck. In Canada, nature can surprise you.
Next question.
Existing solutions do not perform bare-metal type checking – at least, none I've grepped across. They all iteratively reinspect the signature of the type-checked function on each function call. While negligible for a single call, reinspection overhead is usually non-negligible when aggregated over all calls. Really, really non-negligible.
It's not simply efficiency concerns, however. Existing solutions also often fail to account for common edge cases. This includes most if not all toy decorators provided as stackoverflow answers here and elsewhere. Classic failures include:
@checkargs
decorator).isinstance()
builtin.AssertionError
exceptions rather than specific TypeError
exceptions on failed type checks. For granularity and sanity, type checking should never raise generic exceptions.Bear typing succeeds where non-bears fail. All one, all bear!
Bear typing shifts the space and time costs of inspecting function signatures from function call time to function definition time – that is, from the wrapper function returned by the @beartype
decorator into the decorator itself. Since the decorator is only called once per function definition, this optimization yields glee for all.
Bear typing is an attempt to have your type checking cake and eat it, too. To do so, @beartype
:
exec()
builtin.Shall we? Let's dive into the deep end.
# If the active Python interpreter is *NOT* optimized (e.g., option "-O" was
# *NOT* passed to this interpreter), enable type checking.
if __debug__:
import inspect
from functools import wraps
from inspect import Parameter, Signature
def beartype(func: callable) -> callable:
'''
Decorate the passed **callable** (e.g., function, method) to validate
both all annotated parameters passed to this callable _and_ the
annotated value returned by this callable if any.
This decorator performs rudimentary type checking based on Python 3.x
function annotations, as officially documented by PEP 484 ("Type
Hints"). While PEP 484 supports arbitrarily complex type composition,
this decorator requires _all_ parameter and return value annotations to
be either:
* Classes (e.g., `int`, `OrderedDict`).
* Tuples of classes (e.g., `(int, OrderedDict)`).
If optimizations are enabled by the active Python interpreter (e.g., due
to option `-O` passed to this interpreter), this decorator is a noop.
Raises
----------
NameError
If any parameter has the reserved name `__beartype_func`.
TypeError
If either:
* Any parameter or return value annotation is neither:
* A type.
* A tuple of types.
* The kind of any parameter is unrecognized. This should _never_
happen, assuming no significant changes to Python semantics.
'''
# Raw string of Python statements comprising the body of this wrapper,
# including (in order):
#
# * A "@wraps" decorator propagating the name, docstring, and other
# identifying metadata of the original function to this wrapper.
# * A private "__beartype_func" parameter initialized to this function.
# In theory, the "func" parameter passed to this decorator should be
# accessible as a closure-style local in this wrapper. For unknown
# reasons (presumably, a subtle bug in the exec() builtin), this is
# not the case. Instead, a closure-style local must be simulated by
# passing the "func" parameter to this function at function
# definition time as the default value of an arbitrary parameter. To
# ensure this default is *NOT* overwritten by a function accepting a
# parameter of the same name, this edge case is tested for below.
# * Assert statements type checking parameters passed to this callable.
# * A call to this callable.
# * An assert statement type checking the value returned by this
# callable.
#
# While there exist numerous alternatives (e.g., appending to a list or
# bytearray before joining the elements of that iterable into a string),
# these alternatives are either slower (as in the case of a list, due to
# the high up-front cost of list construction) or substantially more
# cumbersome (as in the case of a bytearray). Since string concatenation
# is heavily optimized by the official CPython interpreter, the simplest
# approach is (curiously) the most ideal.
func_body = '''
@wraps(__beartype_func)
def func_beartyped(*args, __beartype_func=__beartype_func, **kwargs):
'''
# "inspect.Signature" instance encapsulating this callable's signature.
func_sig = inspect.signature(func)
# Human-readable name of this function for use in exceptions.
func_name = func.__name__ + '()'
# For the name of each parameter passed to this callable and the
# "inspect.Parameter" instance encapsulating this parameter (in the
# passed order)...
for func_arg_index, func_arg in enumerate(func_sig.parameters.values()):
# If this callable redefines a parameter initialized to a default
# value by this wrapper, raise an exception. Permitting this
# unlikely edge case would permit unsuspecting users to
# "accidentally" override these defaults.
if func_arg.name == '__beartype_func':
raise NameError(
'Parameter {} reserved for use by @beartype.'.format(
func_arg.name))
# If this parameter is both annotated and non-ignorable for purposes
# of type checking, type check this parameter.
if (func_arg.annotation is not Parameter.empty and
func_arg.kind not in _PARAMETER_KIND_IGNORED):
# Validate this annotation.
_check_type_annotation(
annotation=func_arg.annotation,
label='{} parameter {} type'.format(
func_name, func_arg.name))
# String evaluating to this parameter's annotated type.
func_arg_type_expr = (
'__beartype_func.__annotations__[{!r}]'.format(
func_arg.name))
# String evaluating to this parameter's current value when
# passed as a keyword.
func_arg_value_key_expr = 'kwargs[{!r}]'.format(func_arg.name)
# If this parameter is keyword-only, type check this parameter
# only by lookup in the variadic "**kwargs" dictionary.
if func_arg.kind is Parameter.KEYWORD_ONLY:
func_body += '''
if {arg_name!r} in kwargs and not isinstance(
{arg_value_key_expr}, {arg_type_expr}):
raise TypeError(
'{func_name} keyword-only parameter '
'{arg_name}={{}} not a {{!r}}'.format(
{arg_value_key_expr}, {arg_type_expr}))
'''.format(
func_name=func_name,
arg_name=func_arg.name,
arg_type_expr=func_arg_type_expr,
arg_value_key_expr=func_arg_value_key_expr,
)
# Else, this parameter may be passed either positionally or as
# a keyword. Type check this parameter both by lookup in the
# variadic "**kwargs" dictionary *AND* by index into the
# variadic "*args" tuple.
else:
# String evaluating to this parameter's current value when
# passed positionally.
func_arg_value_pos_expr = 'args[{!r}]'.format(
func_arg_index)
func_body += '''
if not (
isinstance({arg_value_pos_expr}, {arg_type_expr})
if {arg_index} < len(args) else
isinstance({arg_value_key_expr}, {arg_type_expr})
if {arg_name!r} in kwargs else True):
raise TypeError(
'{func_name} parameter {arg_name}={{}} not of {{!r}}'.format(
{arg_value_pos_expr} if {arg_index} < len(args) else {arg_value_key_expr},
{arg_type_expr}))
'''.format(
func_name=func_name,
arg_name=func_arg.name,
arg_index=func_arg_index,
arg_type_expr=func_arg_type_expr,
arg_value_key_expr=func_arg_value_key_expr,
arg_value_pos_expr=func_arg_value_pos_expr,
)
# If this callable's return value is both annotated and non-ignorable
# for purposes of type checking, type check this value.
if func_sig.return_annotation not in _RETURN_ANNOTATION_IGNORED:
# Validate this annotation.
_check_type_annotation(
annotation=func_sig.return_annotation,
label='{} return type'.format(func_name))
# Strings evaluating to this parameter's annotated type and
# currently passed value, as above.
func_return_type_expr = (
"__beartype_func.__annotations__['return']")
# Call this callable, type check the returned value, and return this
# value from this wrapper.
func_body += '''
return_value = __beartype_func(*args, **kwargs)
if not isinstance(return_value, {return_type}):
raise TypeError(
'{func_name} return value {{}} not of {{!r}}'.format(
return_value, {return_type}))
return return_value
'''.format(func_name=func_name, return_type=func_return_type_expr)
# Else, call this callable and return this value from this wrapper.
else:
func_body += '''
return __beartype_func(*args, **kwargs)
'''
# Dictionary mapping from local attribute name to value. For efficiency,
# only those local attributes explicitly required in the body of this
# wrapper are copied from the current namespace. (See below.)
local_attrs = {'__beartype_func': func}
# Dynamically define this wrapper as a closure of this decorator. For
# obscure and presumably uninteresting reasons, Python fails to locally
# declare this closure when the locals() dictionary is passed; to
# capture this closure, a local dictionary must be passed instead.
exec(func_body, globals(), local_attrs)
# Return this wrapper.
return local_attrs['func_beartyped']
_PARAMETER_KIND_IGNORED = {
Parameter.POSITIONAL_ONLY, Parameter.VAR_POSITIONAL, Parameter.VAR_KEYWORD,
}
'''
Set of all `inspect.Parameter.kind` constants to be ignored during
annotation- based type checking in the `@beartype` decorator.
This includes:
* Constants specific to variadic parameters (e.g., `*args`, `**kwargs`).
Variadic parameters cannot be annotated and hence cannot be type checked.
* Constants specific to positional-only parameters, which apply to non-pure-
Python callables (e.g., defined by C extensions). The `@beartype`
decorator applies _only_ to pure-Python callables, which provide no
syntactic means of specifying positional-only parameters.
'''
_RETURN_ANNOTATION_IGNORED = {Signature.empty, None}
'''
Set of all annotations for return values to be ignored during annotation-
based type checking in the `@beartype` decorator.
This includes:
* `Signature.empty`, signifying a callable whose return value is _not_
annotated.
* `None`, signifying a callable returning no value. By convention, callables
returning no value are typically annotated to return `None`. Technically,
callables whose return values are annotated as `None` _could_ be
explicitly checked to return `None` rather than a none-`None` value. Since
return values are safely ignorable by callers, however, there appears to
be little real-world utility in enforcing this constraint.
'''
def _check_type_annotation(annotation: object, label: str) -> None:
'''
Validate the passed annotation to be a valid type supported by the
`@beartype` decorator.
Parameters
----------
annotation : object
Annotation to be validated.
label : str
Human-readable label describing this annotation, interpolated into
exceptions raised by this function.
Raises
----------
TypeError
If this annotation is neither a new-style class nor a tuple of
new-style classes.
'''
# If this annotation is a tuple, raise an exception if any member of
# this tuple is not a new-style class. Note that the "__name__"
# attribute tested below is not defined by old-style classes and hence
# serves as a helpful means of identifying new-style classes.
if isinstance(annotation, tuple):
for member in annotation:
if not (
isinstance(member, type) and hasattr(member, '__name__')):
raise TypeError(
'{} tuple member {} not a new-style class'.format(
label, member))
# Else if this annotation is not a new-style class, raise an exception.
elif not (
isinstance(annotation, type) and hasattr(annotation, '__name__')):
raise TypeError(
'{} {} neither a new-style class nor '
'tuple of such classes'.format(label, annotation))
# Else, the active Python interpreter is optimized. In this case, disable type
# checking by reducing this decorator to the identity decorator.
else:
def beartype(func: callable) -> callable:
return func
And leycec said, Let the @beartype
bring forth type checking fastly: and it was so.
Nothing is perfect. Even bear typing.
Bear typing does not type check unpassed parameters assigned default values. In theory, it could. But not in 275 lines or less and certainly not as a stackoverflow answer.
The safe (...probably totally unsafe) assumption is that function implementers claim they knew what they were doing when they defined default values. Since default values are typically constants (...they'd better be!), rechecking the types of constants that never change on each function call assigned one or more default values would contravene the fundamental tenet of bear typing: "Don't repeat yourself over and oooover and oooo-oooover again."
Show me wrong and I will shower you with upvotes.
PEP 484 ("Type Hints") formalized the use of function annotations first introduced by PEP 3107 ("Function Annotations"). Python 3.5 superficially supports this formalization with a new top-level typing
module, a standard API for composing arbitrarily complex types from simpler types (e.g., Callable[[Arg1Type, Arg2Type], ReturnType]
, a type describing a function accepting two arguments of type Arg1Type
and Arg2Type
and returning a value of type ReturnType
).
Bear typing supports none of them. In theory, it could. But not in 275 lines or less and certainly not as a stackoverflow answer.
Bear typing does, however, support unions of types in the same way that the isinstance()
builtin supports unions of types: as tuples. This superficially corresponds to the typing.Union
type – with the obvious caveat that typing.Union
supports arbitrarily complex types, while tuples accepted by @beartype
support only simple classes. In my defense, 275 lines.
Here's the gist of it. Get it, gist? I'll stop now.
As with the @beartype
decorator itself, these py.test
tests may be seamlessly integrated into existing test suites without modification. Precious, isn't it?
Now the mandatory neckbeard rant nobody asked for.
Python 3.5 provides no actual support for using PEP 484 types. wat?
It's true: no type checking, no type inference, no type nuthin'. Instead, developers are expected to routinely run their entire codebases through heavyweight third-party CPython interpreter wrappers implementing a facsimile of such support (e.g., mypy). Of course, these wrappers impose:
I ask Guido: "Why? Why bother inventing an abstract API if you weren't willing to pony up a concrete API actually doing something with that abstraction?" Why leave the fate of a million Pythonistas to the arthritic hand of the free open-source marketplace? Why create yet another techno-problem that could have been trivially solved with a 275-line decorator in the official Python stdlib?
I have no Python and I must scream.