Replies: 6 comments 9 replies
-
I like this idea and I'd be willing to sponsor a PEP for it. Some early feedback:
-
A few thoughts...
Here are a few specific reactions to the initial proposal:
I spent some time this morning playing with some variant ideas. Here are some of my musings. Note that my example code builds on PEP 695 & 696 syntax and the experimental inlined TypedDict syntax discussed in this typing-sig thread.
Exploration 1: How can we eliminate the magic of automatically associating KeyType and ValueType? How can we make this more explicit rather than implicit?
I think we need to allow Note: I switch terminology from
# KeyOf is a special form that represents a string literal type that is
# parameterized by a TypedDict type. It represents a key in the TypedDict.
class KeyOf[TD: TypedDict](LiteralString): ...
# ValueOf is a special form that represents the value type associated with
# a key in a TypedDict.
class ValueOf[TD: TypedDict, K: LiteralString]: ...
# Examples:
type TD1 = dict[{"a": int, "b": str}]
KeyOf[TD1] # Evaluates to Literal["a", "b"]
ValueOf[TD1, Literal["a"]] # Evaluates to int
ValueOf[TD1, Literal["a", "b"]] # Evaluates to int | str
ValueOf[TD1, KeyOf[TD1]] # Evaluates to int | str
Exploration 2: How could this be used in a generic function? A generic method?
# Generic function
def get[T: TypedDict, K: KeyOf[T]](d: T, k: K) -> ValueOf[T, K]:
    return d[k]
# Usage:
td1: TD1 = {"a": 1, "b": "hi"}
get(td1, "a") # Evaluates to int
k1: Literal["a", "b"]
get(td1, k1) # Evaluates to int | str
# Generic method that references a class-scoped TypeVar
class C[TD: TypedDict]:
    def __init__(self, **kwargs: Unpack[TD]):
        self.vals = kwargs
    def __setitem__[K: KeyOf[TD]](self, index: K, value: ValueOf[TD, K]) -> None:
        self.vals[index] = value
type MyTypes = dict[{"foo": int, "bar": str, "baz": NotRequired[bool]}]
c = C[MyTypes](**{"foo": 0, "bar": ""})
Exploration 3: How could "mapping" work such that we don't have a "single type param" limitation?
# Variant 1: Use a type variable to represent the key type:
type Immutable[TD: TypedDict, K: KeyOf[TD] = KeyOf[TD]] = dict[K, ReadOnly[TD[K]]]
# Variant 2: A slightly different syntax:
type Immutable[TD: TypedDict, K: KeyOf[TD] = KeyOf[TD]] = dict[{K: ReadOnly[TD[K]]}]
# Variant 3: Using a dictionary comprehension in the type expression:
type Immutable[TD: TypedDict] = dict[{K: ReadOnly[ValueOf[TD, K]] for K in KeyOf[TD]}]
Of these, I prefer variant 3. While it feels a bit weird to use a comprehension in a type expression, I think it's pretty readable and makes good use of existing Python syntax and concepts. Variant 3 is also much more flexible.
type Partial[TD: TypedDict] = dict[{K: NotRequired[ValueOf[TD, K]] for K in KeyOf[TD]}]
type MakeList[TD: TypedDict] = dict[{K: list[ValueOf[TD, K]] for K in KeyOf[TD]}]
type MakeDict[TD: TypedDict] = dict[{K: dict[str, ValueOf[TD, K]] for K in KeyOf[TD]}]
Exploration 4: Could this mapping proposal extend to TypeVarTuple mapping?
def product[*Ts](*it: *tuple[Iterable[T] for T in Ts]) -> Iterable[tuple[*Ts]]: ...
You asked "What if I want to express that my function takes a list of keys and a list of values, but they don't have to match?" That's easy with this proposal.
type MyDict = dict[{"a": int, "b": str}]
# Accepts any value of any key defined in MyDict
def f1(x: ValueOf[MyDict, KeyOf[MyDict]]) -> None:
    pass
f1(3) # OK
f1(3.0) # Error
# Accepts any key and any (unrelated) value
def f2(keys: list[KeyOf[MyDict]], values: list[ValueOf[MyDict, KeyOf[MyDict]]]) -> None:
    pass
f2(["a"], [3]) # OK
f2(["a", "b"], [""]) # OK
f2(["a", "c"], []) # Error
f2([], [3.0]) # Error
# Accepts a list of any keys and a list of related values
def f3[K: KeyOf[MyDict]](keys: list[K], values: list[ValueOf[MyDict, K]]) -> None:
    pass
f3(["a"], [3]) # OK
f3(["a", "b"], [3, ""]) # OK
f3(["a"], [""]) # Error
I hope that's useful. Please keep moving forward on this proposal! I think it has significant value. If/when we start to converge on a proposal, I can implement pieces of it in pyright. I find this useful for playing around with the idea in code before the spec gets cast in concrete.
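For anyone who wants to poke at the intended results of KeyOf and ValueOf before any type checker supports them, here is a rough runtime emulation. It is only a sketch: `key_of` and `value_of` are hypothetical helper names invented for this illustration (they are not part of `typing` or of the proposal), and they compute at runtime what the special forms would evaluate to statically.

```python
from typing import Literal, TypedDict, Union, get_args, get_type_hints


class TD1(TypedDict):
    a: int
    b: str


def key_of(td: type) -> object:
    """Hypothetical runtime stand-in for KeyOf[TD]: a Literal of all declared keys."""
    # Subscripting Literal with a tuple is the same as passing several arguments.
    return Literal[tuple(get_type_hints(td))]


def value_of(td: type, keys: object) -> object:
    """Hypothetical runtime stand-in for ValueOf[TD, K].

    `keys` is a Literal[...] of key names; the result is the union of the
    value types declared for those keys.
    """
    hints = get_type_hints(td)
    return Union[tuple(hints[k] for k in get_args(keys))]


print(key_of(TD1))
print(value_of(TD1, Literal["a"]))
print(value_of(TD1, key_of(TD1)))
```

This mirrors the "Evaluates to" comments in Exploration 1: a single key gives back that key's value type, and the full key set gives back the union of all value types.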
-
I wonder if this could be generalized to attributes of non-TypedDict types. For example, we could perhaps use something similar to add precise types to
-
PEP 728 seems very related to all of this with its Fields and FieldNames special forms python/peps#3326
-
Curious if there has been any recent movement on this? We have a pattern where we need to initialize a
from typing import TypedDict
class Fooy(TypedDict):
    some_str: str
    some_str2: str
    some_num: int
    some_bool: bool
# this data structure will be "reduced" to generate an instance of `Fooy`
init = {
    "some_str": 'from-lit',
    "some_str2": lambda: 'from-fn',
    "some_num": lambda: 42,
    "some_bool": lambda: True
}
It's very important to have type safety, but this currently doesn't seem possible in Python. In TypeScript this could be done very easily:
type Fooy = {
  someStr: string;
  someStr2: string;
  someNum: number;
  someBool: boolean;
}
type InitFunctions<T> = {[key in keyof T]: (() => T[key]) | T[key]}
const init: InitFunctions<Fooy> = {
  someStr: 'from-lit',
  someStr2: () => 'from-fn',
  someNum: () => 42,
  someBool: () => true
}
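To make the gap concrete, here is a sketch of what this pattern takes in Python today. `FooyInit` and `resolve` are names made up for this illustration; the point is that the "value or zero-argument factory" shape has to be spelled out by hand for every field, which is exactly the duplication a mapped type would remove.

```python
from typing import Callable, TypedDict, Union, cast


class Fooy(TypedDict):
    some_str: str
    some_str2: str
    some_num: int
    some_bool: bool


# Hand-written counterpart of the TypeScript InitFunctions<Fooy> type.
# Every field of Fooy is repeated here and must be kept in sync manually.
class FooyInit(TypedDict):
    some_str: Union[str, Callable[[], str]]
    some_str2: Union[str, Callable[[], str]]
    some_num: Union[int, Callable[[], int]]
    some_bool: Union[bool, Callable[[], bool]]


def resolve(init: FooyInit) -> Fooy:
    """Reduce an init mapping to a Fooy by calling any factory values."""
    values = {key: (value() if callable(value) else value) for key, value in init.items()}
    # The comprehension loses the per-key types, so a cast is needed.
    return cast(Fooy, values)


init: FooyInit = {
    "some_str": "from-lit",
    "some_str2": lambda: "from-fn",
    "some_num": lambda: 42,
    "some_bool": lambda: True,
}
fooy = resolve(init)
print(fooy)
```

Note the `callable` check assumes no field of `Fooy` is itself meant to hold a callable; if one were, the sketch could not tell a factory from a plain value.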
-
Coming from this discussion, I wanted to discuss some experiments I did (mainly for There seem to be two ideas emerging from this thread. Introducing a
-
Previously:
The motivation is to provide a typing mechanism for dict-like containers (like pandas.DataFrame) to have key-wise type annotations, in the same way that TypedDict provides key-wise types for dict.
To this end, we introduce three new special forms which act as type operators on Mappings and TypedDicts: KeyType, ElementType and Map.
As a motivating example, here is how you would type-annotate pandas.DataFrame with the proposed mechanism; the most important part is the definition of __getitem__. (In the comments, I’m sometimes using the syntax TypedDict({"foo": int, "bar": str}) for “anonymous TypedDicts”.)
Now I’ll explain everything in more detail.
How do KeyType and ElementType work?
If TD is a TypedDict (or a TypeVar with bound TypedDict), then KeyType[TD] is a union of all key types. Consider:
ElementType, on the other hand, represents the type of a specific element of a TypedDict:
If the second type argument to ElementType is a union, then the resulting type is also a union:
This means we can combine KeyType and ElementType like this:
Let’s unpack what’s happening here. First, we apply KeyType[] to MyDict. This results in Literal["a", "b"]. So, the definition of f above is equivalent to:
Then, we can expand ElementType. It returns the union of all types that are associated with the given key types. The result is:
In general, ElementType[TD, KeyType[TD]] is the union of all element types in TD.
However, where it gets really interesting is when we make the function generic with KeyType as the bound:
As discussed, KeyType[MyDict] is Literal["a", "b"], so the type variable K can either be Literal["a"] or Literal["b"]. Therefore, the above definition is equivalent to this overload:
We can also combine this with a type variable that has Mapping as the bound:
Here, K: KeyType[TD] means that K is a key type of the arbitrary Mapping TD. This is equivalent to:
Example: TypedMapping
Full example putting everything together:
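As a minimal sketch of where the proposed operators would slot in, this is roughly how far a TypedMapping-style wrapper gets with today's `typing` module. It is not the proposal's own listing: the class body and the `Person` type are assumptions for illustration, and the signature shown in the comment is the hypothetical proposed form, which no type checker currently accepts.

```python
from typing import Generic, Mapping, TypedDict, TypeVar

# Today the bound can only say "some string-keyed mapping".
TD = TypeVar("TD", bound=Mapping[str, object])


class Person(TypedDict):
    name: str
    age: int


class TypedMapping(Generic[TD]):
    """Thin dict-like wrapper parameterized by the TypedDict it was built from."""

    def __init__(self, data: TD) -> None:
        self._data = data

    # With current typing, the best available signature is "str -> object".
    # Under the proposal this would (hypothetically) be written as:
    #   def __getitem__[K: KeyType[TD]](self, key: K) -> ElementType[TD, K]
    def __getitem__(self, key: str) -> object:
        return self._data[key]


people = TypedMapping(Person(name="Ada", age=36))
print(people["age"])
```

With the `object` return type, a checker cannot tell that `people["age"]` is an `int`, and it cannot reject a misspelled key; those are the two things `KeyType` and `ElementType` are meant to recover.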
KeyType and ElementType on normal dict types and Mapping types
For a type like this:
we can also use KeyType and ElementType to extract types:
It works the same for Mapping.
How does Map work?
To really make TypeVar with TypedDict bound useful, we introduce the special form Map as well. Map was originally introduced in this proto-PEP.
It works like this:
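A rough runtime emulation of the idea, assuming `Map[G, TD]` keeps the keys of `TD` and wraps each value type in `G`. `map_fields` and `unmap_fields` are hypothetical helpers written for this sketch, not part of `typing`; the second one shows the inverse direction, which is what a checker would have to do when matching a value against `Map[type, TD]` to infer `TD`.

```python
from typing import TypedDict, get_args, get_type_hints


class Row(TypedDict):
    col1: int
    col2: str


def map_fields(wrapper, td, name="Mapped"):
    """Hypothetical stand-in for Map[wrapper, td]: same keys, wrapped value types."""
    hints = get_type_hints(td)
    return TypedDict(name, {key: wrapper[tp] for key, tp in hints.items()})


def unmap_fields(td, name="Unmapped"):
    """Inverse of map_fields for single-argument wrappers: strip the outer generic."""
    hints = get_type_hints(td)
    return TypedDict(name, {key: get_args(tp)[0] for key, tp in hints.items()})


TypeRow = map_fields(type, Row)      # like TypedDict({"col1": type[int], "col2": type[str]})
ListRow = map_fields(list, Row)      # like TypedDict({"col1": list[int], "col2": list[str]})
Recovered = unmap_fields(TypeRow)    # back to the fields of Row

print(get_type_hints(TypeRow))
print(get_type_hints(Recovered))
```

The sketch only handles wrappers with exactly one type argument, which matches the restriction on the first argument to `Map`.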
The first argument to Map has to be a generic type with exactly one TypeVar slot.
This is needed for example in the definition of read_csv:
The dtype object that you pass in will look something like {"col1": np.int64} but that has type TypedDict({"col1": type[np.int64]}), and not type TypedDict({"col1": np.int64}) which is what we need in order to infer the correct type for the DataFrame.
So, the type[] needs to be stripped away somehow. That is what Map does: the dtype we pass in has type TypedDict({"col1": type[np.int64]}) which gets matched to Map[type, TD] which means that TD is inferred as TypedDict({"col1": np.int64}), just as we wanted.
Interaction with KeyType and ElementType
Map does not affect the keys. So we have
The types of the elements are wrapped in the generic type:
Map on normal dict types and Mapping types
For a type like this:
Map still works:
Aside on TypeVarTuple
The proto-PEP linked above defines Map to be used on TypeVarTuples like this:
Interaction with other features of TypedDict
total=False and NotRequired
Not completely sure how to deal with fields from non-total TypedDicts and entries marked as NotRequired. I suppose the easiest solution is to drop those fields entirely in KeyType.
It should be allowed to map Required or NotRequired with Map over a TypedDict.
readonly=True and ReadOnly
There is a PEP proposing to extend TypedDict with the ability to declare read-only fields. It would be nice to somehow honor this in dict-like containers that are derived from TypedDict via the mechanism shown for TypedMapping above, but no proposal has been made yet for this.
Who would use this?
Any library that has DataFrame-like objects:
Dict-wrappers like ModuleDict in PyTorch.
Relation to other PEPs
This proposal would synergize extremely well with a PEP that was proposing an inline syntax for TypedDict.
Survey of other programming languages
TypeScript
KeyType corresponds straightforwardly to the keyof operator in TypeScript:
ValueType can also be expressed in TypeScript:
This is possible because Person["age"], for example, is used in TypeScript to refer to the type of the field "age" in the type Person.
Generic types bound to the key types are also possible:
This syntax in TypeScript is very elegant but it is based on the Person[K] syntax (which returns the value type of the entry with key K), which does not seem to be possible in Python because Python already uses square brackets for generics (whereas TypeScript uses angled brackets for generics).
The functionality of Map can be recreated with mapped types in TypeScript (though note that mapped types are more powerful than the proposed Map operator):
Flow
KeyType[T] is $Keys<T> in Flow:
Flow now uses the same syntax as TypeScript for accessing the type of a certain entry: Person[K], but before that, there was another, now deprecated mechanism for this: $ElementType<Person, K>. This proposal is directly inspired by the deprecated Flow syntax. Again, we unfortunately can’t use the Person[K] syntax because we use square brackets for generics.
$ObjMap in Flow is very similar to the proposed Map, though $ObjMap is more powerful because it accepts an arbitrary lambda to map the entry types to other types.
Prior art in typing of DataFrames
static_frame
static_frame is a Python library for statically typing data frames. It uses variadic generics to allow specifying the types of the columns:
The downside of this approach is that when columns are accessed by column name (as opposed to column index), the type system cannot infer the column's type.
Pandera
Pandera allows validation of pandas DataFrames:
This works for runtime type validation, but static type checkers can't understand it. With this PEP, it could be re-written as: