Appendix A: helpers.py
tinygrad/helpers.py contains a bunch of helper functions that are used all over TinyGrad.
Go over the list, it will make reading the code easier. When in doubt, always refer to the code.
dedup(x: Iterable[T]) -> list[T]
- Description: Removes duplicate elements from an iterable while preserving the original order of the first appearance.
- Input: An iterable
x
. - Output: A
list
containing unique elements fromx
in their original order. - Example:
dedup([1, 3, 2, 3, 1])
returns[1, 3, 2]
. - Usage: Used throughout tinygrad to ensure uniqueness in lists where order matters, like kernel arguments, variable lists, or UOp sources.
- Gotchas: Requires hashable elements.
argfix(*x) -> tuple
- Description: Normalizes function arguments. It allows a function to accept arguments either as a sequence (list/tuple) or as individual arguments.
- Input: Variable positional arguments
*x
. - Output: A
tuple
. - Example:
argfix(1, 2, 3)
returns(1, 2, 3)
.argfix((1, 2, 3))
orargfix([1, 2, 3])
returns(1, 2, 3)
.
- Usage: Commonly used in
Tensor
methods (e.g.,reshape
,pad
,permute
,expand
) to allow flexible input liket.reshape(2, 3)
ort.reshape((2, 3))
. - Gotchas: Raises a
ValueError
if the first argument is a list/tuple and there are more arguments (e.g.,argfix((1, 2), 3)
is invalid).
argsort(x: Union[list, tuple])
- Description: Returns the indices that would sort the input iterable
x
. Mimicsnumpy.argsort
. - Input: An iterable
x
. - Output: A sequence of the same type as
x
(e.g., list or tuple) containing integer indices. - Example:
argsort([30, 10, 20])
returns[1, 2, 0]
. - Usage: Used internally, for example, in
gradient.py
to compute the inverse permutation needed for the gradient ofpermute
. Also used inView.__add__
for sorting strides during view merging/inversion. - Gotchas: Returns a list or a tuple, depending on the input type. Not intended to be used on Tensors.
all_same(items: Union[tuple[T, ...], list[T]]) -> bool
- Description: Checks if all elements in a non-empty list or tuple are identical (equal to the first element).
- Input: A
list
ortuple
. - Output:
bool
. - Example:
all_same([1, 1, 1])
returnsTrue
.all_same((2, 3, 2))
returnsFalse
. - Usage: Used for assertions and checks, e.g., ensuring source UOps have the same shape (
UOp.st
), checking if all elements in aVCONST
are the same (UOp.const
), verifying multi-device tensors have consistent properties (tensor.py
,multi.py
), checking kernel arguments (Kernel.linearize
). - Gotchas: Returns
True
for empty or single-element lists/tuples.
all_int(t: Sequence[Any]) -> TypeGuard[tuple[int, ...]]
- Description: Checks if all elements in a sequence are of type
int
. UsesTypeGuard
for static analysis benefits. - Input: A
Sequence
(likelist
ortuple
). - Output:
bool
. IfTrue
, type checkers can infer the input is specifically a sequence ofint
. - Example:
all_int([1, 2, 3])
returnsTrue
.all_int((1, 2.0))
returnsFalse
. - Usage: Crucial for distinguishing between concrete shapes (tuples of
int
) and symbolic shapes (tuples containingVariable
orUOp
). Used extensively inShapeTracker
,View
, andTensor
methods to determine if operations like size calculation or certain reshapes are possible without symbolic evaluation.
colored(st, color: Optional[str], background=False) -> str
- Description: Adds ANSI escape codes to a string
st
to color it for terminal output. Supports foreground and background colors. Bright colors can be selected by passing the color name in uppercase (e.g., “RED”). - Input: String
st
, optional color namecolor
(e.g., “red”, “GREEN”), booleanbackground
. - Output: A string, potentially with ANSI color codes.
- Example:
colored("Error", "RED")
returns"\u001b[91mError\u001b[0m"
. - Usage: Used for enhancing debug prints, progress bars (
tqdm
), visualizing kernel shapes (Kernel.colored_shape
), and highlighting warnings/errors. - Gotchas: Only effective in terminals that support ANSI escape codes.
colorize_float(x: float) -> str
- Description: Formats a float
x
(intended as a ratio, often speedup/slowdown) and colors it: green if < 0.75, yellow if 0.75 <= x <= 1.15, red if > 1.15. - Input: A
float
x
. - Output: A colored string representation (e.g.,
" 0.50x"
). - Example:
colorize_float(0.5)
returns a green string.colorize_float(1.2)
returns a red string. - Usage: Used in profiling output (
helpers.Profiling
) to visually indicate performance ratios.
time_to_str(t: float, w=8) -> str
- Description: Converts a time duration
t
(in seconds) into a human-readable string, automatically selecting units (s, ms, us) based on magnitude. Formats the output to a specified widthw
. - Input: Time
t
in seconds (float
), optional field widthw
(int
). - Output: A formatted string representing the time.
- Example:
time_to_str(0.0015)
might return" 1.50ms"
.time_to_str(1.2)
might return" 1.20s "
. - Usage: Used extensively in debug output (
DEBUG>=2
), profiling (Profiling
,Timing
), and beam search logging to display kernel runtimes and compile times concisely.
ansistrip(s: str) -> str
- Description: Removes ANSI escape codes (primarily color/formatting codes) from a string
s
. - Input: A string
s
. - Output: The string
s
with ANSI codes removed. - Example:
ansistrip("\u001b[91mError\u001b[0m")
returns"Error"
. - Usage: Used by
ansilen
to calculate visible length and byto_function_name
to sanitize potentially colored strings before creating identifiers.
ansilen(s: str) -> int
- Description: Calculates the visible length of a string
s
by first removing ANSI escape codes usingansistrip
. - Input: A string
s
. - Output: The length of the string without ANSI codes (
int
). - Example:
ansilen(colored("hello", "red"))
returns5
. - Usage: Useful for formatting output containing colored text, ensuring correct alignment based on visible characters (e.g., in
Kernel.colored_shape(pad=...)
).
make_tuple(x: Union[int, Sequence[int]], cnt: int) -> tuple[int, ...]
- Description: Creates a tuple of integers of length
cnt
. Ifx
is an integer, it’s repeatedcnt
times. Ifx
is a sequence, it’s converted to a tuple (its length should matchcnt
). - Input: An integer or sequence of integers
x
, and a target countcnt
. - Output: A
tuple
of integers of lengthcnt
. - Example:
make_tuple(3, 2)
returns(3, 3)
.make_tuple([1, 2], 2)
returns(1, 2)
. - Usage: Normalizes arguments like
stride
,padding
which can be a single value or a sequence. - Gotchas: Does not check if
len(x) == cnt
ifx
is not anint
.
flatten(l: Iterable[Iterable[T]]) -> list[T]
- Description: Flattens an iterable of iterables by one level.
- Input: An iterable of iterables
l
. - Output: A flattened
list
. - Example:
flatten([[1, 2], [3, 4]])
returns[1, 2, 3, 4]
. - Usage: Widespread utility function used throughout the codebase.
fully_flatten(l)
- Description: Recursively flattens nested lists/tuples into a single list. Handles numpy 0-dim arrays.
- Input: A potentially nested structure
l
. - Output: A completely flattened
list
. - Example:
fully_flatten([1, [2, [3, 4]]])
returns[1, 2, 3, 4]
. - Usage: Primarily used by
_frompy
to prepare data forstruct.pack
.
fromimport(mod, frm)
- Description: Dynamically imports
frm
from modulemod
. Equivalent tofrom mod import frm
, returningfrm
. - Input: Module name
mod
(str), item namefrm
(str). - Output: The imported object.
- Example:
fromimport("os", "path")
is equivalent tofrom os import path
.
strip_parens(fst: str) -> str
- Description: Heuristically removes the outermost matching parentheses from a string.
- Input: A string
fst
. - Output: The string with outermost parentheses removed, if they match.
- Example:
strip_parens("(a+b)")
returns"a+b"
.strip_parens("(a+b)*(c+d)")
returns"(a+b)*(c+d)"
. - Usage: Used by C-style renderers to simplify generated code expressions.
ceildiv(num, amt) -> int
- Description: Computes
ceil(num / amt)
using integer arithmetic. - Input: Numerator
num
, denominatoramt
. - Output: Ceiling of the division as an
int
. - Example:
ceildiv(5, 2)
returns3
. - Usage: Ubiquitous for calculating sizes, bounds, counts that require rounding up.
round_up(num: int, amt: int) -> int
- Description: Rounds
num
up to the nearest multiple ofamt
. - Input: Integer
num
, integeramt
. - Output:
num
rounded up to the nearest multiple ofamt
. - Example:
round_up(5, 4)
returns8
. - Usage: Used for alignment and ensuring sizes are multiples of factors (memory, block sizes).
lo32(x: Any) -> Any
, hi32(x: Any) -> Any
- Description: Extract the lower (
lo32
) or upper (hi32
) 32 bits of a (presumably 64-bit) integer or symbolic integer (sint
). - Input: A value
x
(typically an integer or symbolic integer). - Output: The lower or upper 32 bits of
x
. - Example:
lo32(0x12345678ABCDEF)
returns0xABCDEF
.hi32(0x12345678ABCDEF)
returns0x12345678
. - Usage: Used in low-level code generation or runtime interactions needing 32-bit components of 64-bit values (e.g., specific hardware instructions, register packing).
- Gotchas: Uses bitwise AND (
&
) and right shift (>>
). TheAny
type hint means behavior with non-integer/sint types is undefined.
data64(data: Any) -> tuple[Any, Any]
, data64_le(data: Any) -> tuple[Any, Any]
- Description: Split a 64-bit value into two 32-bit components.
data64
returns(high_32, low_32)
.data64_le
returns(low_32, high_32)
(Little-Endian order). - Input: A 64-bit value
data
. - Output: A tuple of two 32-bit components.
- Example:
data64(0x12345678ABCDEF)
returns(0x12345678, 0xABCDEF)
. - Usage: Similar to
lo32
/hi32
, used for low-level interactions requiring split 64-bit values, potentially considering endianness.
getbits(value: int, start: int, end: int) -> int
- Description: Extracts bits
start
throughend
(inclusive) fromvalue
. - Input: Integer
value
, bit positionsstart
andend
. - Output: The extracted bits as an integer.
- Example:
getbits(0b110101, 1, 3)
returns0b101
(5).
i2u(bits: int, value: int) -> int
- Description: Converts a signed
bits
-bit integervalue
to its unsigned representation. - Input: Bit width
bits
, signed integervalue
. - Output: Unsigned integer representation.
- Example:
i2u(8, -1)
returns255
.
merge_dicts(ds: Iterable[dict[T, U]]) -> dict[T, U]
- Description: Merges dictionaries, asserting that shared keys have identical values.
- Input: An iterable of dictionaries
ds
. - Output: A merged dictionary.
- Example:
merge_dicts([{"a": 1}, {"b": 2}])
returns{"a": 1, "b": 2}
. - Usage: Consistently combines
var_vals
dictionaries during graph processing and JIT compilation.
partition(itr: Iterable[T], fxn: Callable[[T], bool]) -> tuple[list[T], list[T]]
- Description: Splits iterable
itr
into two lists based on predicatefxn
. - Input: An iterable
itr
and a predicate functionfxn
. - Output: A tuple of two lists:
(items_where_fxn_is_true, items_where_fxn_is_false)
. - Example:
partition([1, 2, 3, 4], lambda x: x % 2 == 0)
returns([2, 4], [1, 3])
. - Usage: General utility for separating elements based on a condition, used in various places like separating UOps by type.
unwrap(x: Optional[T]) -> T
- Description: Asserts
x
is notNone
and returns it, satisfying type checkers. - Input: A potentially
None
valuex
. - Output: The value
x
if it’s notNone
. - Example:
unwrap(some_dict.get("key"))
raises an assertion error if the key doesn’t exist. - Usage: Widely used for safety and to satisfy type checkers, especially when accessing optional values that should be present.
get_single_element(x: list[T]) -> T
- Description: Asserts list
x
has exactly one element and returns it. - Input: A list
x
. - Output: The single element in the list.
- Example:
get_single_element([42])
returns42
. - Usage: Used when a list is expected to contain exactly one item, often in graph processing or optimization passes.
get_child(obj, key)
- Description: Accesses nested attributes/items in
obj
using a dot-separatedkey
string. - Input: An object
obj
and a stringkey
with dot notation (e.g., “attr1.attr2.0”). - Output: The nested attribute or item.
- Example:
get_child({"a": {"b": [10, 20]}}, "a.b.1")
returns20
. - Usage: Used for accessing deeply nested structures, especially in debugging and serialization.
- Note: Handles both attribute access (for objects) and item access (for dicts/lists), converting numeric strings to integers for indexing.
word_wrap(x, wrap=80)
- Description: Basic line wrapping at
wrap
characters. - Input: A string
x
and optional wrap lengthwrap
. - Output: The string with newlines inserted every
wrap
characters. - Example:
word_wrap("abcdefghijklmnopqrstuvwxyz", 10)
returns"abcdefghij\nklmnopqrst\nuvwxyz"
. - Usage: Used for formatting debug output and error messages.
- Note: Doesn’t respect word boundaries, simply inserts newlines every
wrap
characters.
pluralize(st: str, cnt: int) -> str
- Description: Formats count
cnt
with nounst
, adding ‘s’ ifcnt != 1
. - Input: A string
st
(noun) and an integercnt
. - Output: A formatted string with the count and properly pluralized noun.
- Example:
pluralize("item", 1)
returns"1 item"
.pluralize("item", 2)
returns"2 items"
. - Usage: Used for generating human-readable log messages and status reports.
- Note: Uses simple ‘s’ pluralization, which doesn’t handle irregular plurals.
polyN(x: T, p: list[float]) -> T
- Description: Evaluates polynomial
p
atx
using Horner’s method. - Input: A value
x
and a list of coefficientsp
in descending order of power. - Output: The polynomial evaluation result.
- Example:
polyN(2, [1, 2, 3])
computes1*2^2 + 2*2^1 + 3*2^0
=1*4 + 2*2 + 3*1
=11
. - Usage: Used in
transcendental.py
for efficient function approximations wherex
is often aUOp
. - Note: Coefficients are in descending order (highest power first).
to_function_name(s: str) -> str
- Description: Converts string
s
to a valid C/Python function name by removing ANSI codes and hex-encoding invalid characters. - Input: A string
s
. - Output: A valid function name string.
- Example:
to_function_name("kernel(a+b)")
might return"kernel_a_b_"
. - Usage: Generates valid, deterministic kernel function names for code generation.
- Note: Results are cached for performance. Handles ANSI color codes by removing them.
getenv(key: str, default=0)
- Description: Gets environment variable
key
, casting to type ofdefault
. - Input: Environment variable name
key
and optional default value. - Output: The environment variable value cast to the type of
default
. - Example:
getenv("DEBUG", 0)
returns the integer value of the DEBUG environment variable, or 0 if not set. - Usage: Pervasive configuration mechanism for tinygrad features and debug levels.
- Gotchas: Caches results for performance, which means runtime changes to environment variables aren’t seen after first access.
temp(x: str, append_user: bool = False) -> str
- Description: Creates a path string in the system’s temporary directory, optionally making it user-specific.
- Input: A string
x
(path component) and optional flagappend_user
. - Output: A full path string in the temporary directory.
- Example:
temp("tinygrad/cache")
might return/tmp/tinygrad/cache
. - Usage: Locating caches, downloads, profiling outputs in a system-appropriate temporary location.
- Gotchas: If
append_user
is True, appends the username to make the path user-specific.