Split type-checking into interface and implementation in parallel workers #21119
ilevkivskyi wants to merge 23 commits into python:master
|
Oh btw, @JukkaL I think there is a bug in |
|
All things in (small) |
|
Could be worth adding a test for the discord.py improvement |
|
I'm planning to test this on a big internal repo (probably tomorrow). I'll also try parallel checking again -- last time memory usage was too high to use many workers, but things should be better now. |
|
I'm seeing mypy parallel run crashes with this PR when type checking the biggest internal codebase at work, but I'm not sure if they are caused by this -- this may just change the order of processing so that a pre-existing issue gets triggered. I will continue the investigation after the long weekend. |
|
@JukkaL can you post a traceback (and maybe a snippet of code where the crash happens)? It may well be that some implicit assumption breaks when type-checking functions after top-levels. |
|
The internal codebase generates some syntax errors because of an issue with the native parser. After working around the syntax errors, the parallel run completes, so the crashes may be related to syntax errors. However, there are a handful of false positives. Also, this regresses performance -- now parallel checking with two workers is slower than sequential checking (about 10% slower), on macOS. On master parallel checking with two workers is about 13% faster (which is still not great). When looking at |
TBH this is really weird. Can you try running with |
|
I used 2 workers above instead of 3 in my older comment. I can try using 3 workers as well, I think I should have enough RAM for it. |
Some of the items we are already doing, and I am not going to do any of the rest. Performance on Linux seems good (btw do you have numbers for Linux with multiple workers?) If you (or anyone else) wants to work on Mac, you can do it in your own time. |
|
I don't have full numbers for Linux, but here are the ones I have (for split bodies only):
The overhead from multiple processes is much smaller compared to macOS. It's likely faster than sequential on two workers already, which sounds like a reasonable baseline performance target. I can continue working on macOS performance afterwards (doesn't block this PR). I have both personal and work mac laptops, so I can run measurements in a relatively clean environment without extra security software. I can measure parallel self check on my personal mac tomorrow (probably). I'm also planning to create a parallel checking synthetic benchmark with many small files, to measure coordination overhead. We can also add separate benchmarks with larger files, but it looks like the small file one would be the most helpful at this point. |
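
A synthetic benchmark with many small files, as mentioned above, could be generated along these lines. This is only a rough sketch: the module names (`base.py`, `mod_<i>.py`), the function names, and the layout where every small module imports one shared module are all assumptions made for illustration, not the benchmark planned for mypy.

```python
import tempfile
from pathlib import Path


def generate_small_modules(root: Path, count: int) -> list[Path]:
    """Write one shared module plus `count` tiny modules that import it.

    File names and contents are hypothetical; they only aim to produce many
    cheap-to-check files so that per-file coordination cost dominates.
    """
    root.mkdir(parents=True, exist_ok=True)
    base = root / "base.py"
    base.write_text("def helper(x: int) -> int:\n    return x + 1\n")
    paths = [base]
    for i in range(count):
        # Each leaf module depends only on base.py, so leaf modules are
        # independent of each other and can be checked in any order.
        source = (
            "from base import helper\n"
            "\n"
            f"def f_{i}(x: int) -> int:\n"
            "    return helper(x)\n"
        )
        path = root / f"mod_{i}.py"
        path.write_text(source)
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        files = generate_small_modules(Path(tmp), 200)
        print(f"wrote {len(files)} modules to {tmp}")
```

Timing a sequential run against a parallel run over the generated directory would then give a rough measure of coordination overhead, since each file needs very little actual checking.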
|
Couple observations:
I think we should focus on landing the currently open PRs first, so that all building blocks are in place. In particular, this PR introduces (inevitable) semantic differences, and I think we should agree on what to do with them. I wrote a lot of discussion about this in the PR description. On the other hand, it is hard to keep so many moving parts in my head, so I am going to start merging soon. I will make a couple more PRs: one for more performance stats, and one that tries to adjust the GC freeze hack so that it works on all Python versions for both sequential and parallel runs (this may be non-trivial). |
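
For readers unfamiliar with the idea behind a "GC freeze" trick, here is a generic sketch using the standard `gc` module. It is not mypy's actual code: `run_with_gc_frozen`, `build_initial_state` and `do_work` are hypothetical names, and the `hasattr` guard is just one assumed way to cope with interpreters that lack `gc.freeze()`.

```python
import gc


def run_with_gc_frozen(build_initial_state, do_work):
    """Build long-lived state with the GC off, freeze it, then do the rest.

    Both callables are hypothetical placeholders for this sketch.
    """
    was_enabled = gc.isenabled()
    gc.disable()  # avoid collections while the initial object graph is built
    try:
        state = build_initial_state()
        if hasattr(gc, "freeze"):
            # gc.freeze() (CPython 3.7+) moves all currently tracked objects
            # into a permanent generation that later collections ignore.
            gc.freeze()
    finally:
        # Restore the collector to whatever state the caller had.
        if was_enabled:
            gc.enable()
    return do_work(state)
```

The intent is that later collections do not repeatedly traverse a large set of objects that will stay alive anyway; `gc.unfreeze()` undoes the freeze if needed.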
|
I will focus on getting more information about the crash and the false positives that I mentioned above so that we can move forward with this PR soon. Since we have ideas about addressing the macOS bottlenecks (even if we might not fully understand the problem yet), this can happen separately from this PR (and I can work on them). |
|
Here's a simplification of one of the new false positives from the big internal codebase, which seems similar to the

    # mypy: local-partial-types
    class C:
        x = None  # type: ignore[var-annotated]
        def f(self) -> object:
            if not C.x:
                C.x = 1  # New error here with split bodies
            return C.x

|
@JukkaL Yeah, I think this one is fine. Btw I get the same error with |
|
Another simplified regression:

    from typing import Any, Callable, overload

    @overload
    def option(*, callback: Callable[[str], object] = ...) -> Any: ...
    @overload
    def option(*, callback: Callable[[int], object] = ...) -> Any: ...
    def option(**kwargs: object) -> None: pass

    @option(callback=lambda x: [y for y in x])  # Error here
    def f() -> None: pass

When using |
|
I have one more potential regression. I'll investigate it tomorrow. |
|
@JukkaL Yeah, that exposes an existing bug, here is a repro without splitting (fails on master): The fix should be simple: reset lambda argument types when visiting in an empty context. As I mentioned before, this PR may uncover many weird edge cases where the type checker is not idempotent. |
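
The kind of non-idempotency described above can be illustrated with a toy model. This is not mypy's implementation: `ToyLambda`, `visit_lambda` and the `reset` flag are invented for this sketch, and types are represented as plain strings.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyLambda:
    """Hypothetical AST node that caches inferred argument types on itself."""

    arg_names: list[str]
    arg_types: list[Optional[str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.arg_types:
            # None stands for "no type inferred yet".
            self.arg_types = [None] * len(self.arg_names)


def visit_lambda(
    node: ToyLambda, context: Optional[list[str]], reset: bool
) -> list[Optional[str]]:
    """Visit the node, optionally with argument types from a type context."""
    if context is not None:
        # A type context provides argument types, which are stored on the node.
        node.arg_types = list(context)
    elif reset:
        # Assumed shape of the fix: clear stale types in an empty context.
        node.arg_types = [None] * len(node.arg_names)
    return list(node.arg_types)


node = ToyLambda(["x"])
visit_lambda(node, ["str"], reset=False)       # e.g. tried against one overload
stale = visit_lambda(node, None, reset=False)  # empty context keeps "str"
clean = visit_lambda(node, None, reset=True)   # empty context clears it
```

Without the reset, the result of visiting the same node in an empty context depends on what was visited before, which is why a change in processing order can surface such issues.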
|
The final potential regression actually looks like a fixed bug. First we need

    from typing import NewType
    class C:
        V = NewType('V', int)
        X = C.V(0)

Now this file has a new error when using split bodies (

    # mypy: warn-redundant-casts
    from enum import IntEnum
    from typing import cast
    import m

    def g(type: m.C.V) -> None: ...

    def f(type: C) -> None:
        g(cast(m.C.V, type.value))  # new error: Redundant cast to "V"

    class C(IntEnum):
        X = m.C.X

However, it looks like the type of |
|
Actually |
|
Diff from mypy_primer, showing the effect of this PR on open source code:
discord.py (https://github.com/Rapptz/discord.py)
- discord/backoff.py:63: error: Incompatible default for parameter "integral" (default has type "Literal[False]", parameter has type "Literal[True]") [assignment]
+ discord/backoff.py:63: error: Incompatible default for parameter "integral" (default has type "Literal[False]", parameter has type "T") [assignment]
- discord/interactions.py:1109: error: Incompatible default for parameter "delay" (default has type "float | None", parameter has type "float") [assignment]
- discord/interactions.py:1255: error: Incompatible default for parameter "delay" (default has type "float | None", parameter has type "float") [assignment]
- discord/interactions.py:1645: error: Incompatible default for parameter "delay" (default has type "float | None", parameter has type "float") [assignment]
- discord/webhook/async_.py:969: error: Incompatible default for parameter "delay" (default has type "float | None", parameter has type "float") [assignment]
cki-lib (https://gitlab.com/cki-project/cki-lib)
- cki_lib/krb_ticket_refresher.py:26: error: Call to untyped function "_close_to_expire_ticket" in typed context [no-untyped-call]
+ cki_lib/krb_ticket_refresher.py:26: error: Call to untyped function "_close_to_expire_ticket" of "RefreshKerberosTicket" in typed context [no-untyped-call]
freqtrade (https://github.com/freqtrade/freqtrade)
- freqtrade/templates/FreqaiExampleStrategy.py:245: error: Invalid index type "tuple[tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...], list[str]]" for "_LocIndexerFrame[DataFrame]"; expected type "tuple[tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...], Hashable]" [index]
- freqtrade/templates/FreqaiExampleStrategy.py:245: error: Unsupported left operand type for & ("tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]") [operator]
- freqtrade/templates/FreqaiExampleStrategy.py:245: error: Argument 2 to "reduce" has incompatible type "list[Series[builtins.bool]]"; expected "Iterable[tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]]" [arg-type]
- freqtrade/templates/FreqaiExampleStrategy.py:255: error: Invalid index type "tuple[tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...], list[str]]" for "_LocIndexerFrame[DataFrame]"; expected type "tuple[tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...], Hashable]" [index]
- freqtrade/templates/FreqaiExampleStrategy.py:255: error: Unsupported left operand type for & ("tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]") [operator]
- freqtrade/templates/FreqaiExampleStrategy.py:255: error: Argument 2 to "reduce" has incompatible type "list[Series[builtins.bool]]"; expected "Iterable[tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]]" [arg-type]
- freqtrade/templates/FreqaiExampleStrategy.py:263: error: Unsupported left operand type for & ("tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]") [operator]
- freqtrade/templates/FreqaiExampleStrategy.py:263: error: Argument 2 to "reduce" has incompatible type "list[Series[builtins.bool]]"; expected "Iterable[tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]]" [arg-type]
- freqtrade/templates/FreqaiExampleStrategy.py:267: error: Unsupported left operand type for & ("tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]") [operator]
- freqtrade/templates/FreqaiExampleStrategy.py:267: error: Argument 2 to "reduce" has incompatible type "list[Series[builtins.bool]]"; expected "Iterable[tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]]" [arg-type]
|
|
The new entry in the primer ( |
- freqtrade/templates/FreqaiExampleStrategy.py:255: error: Unsupported left operand type for & ("tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]") [operator]
- freqtrade/templates/FreqaiExampleStrategy.py:255: error: Argument 2 to "reduce" has incompatible type "list[Series[builtins.bool]]"; expected "Iterable[tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]]" [arg-type]
- freqtrade/templates/FreqaiExampleStrategy.py:263: error: Unsupported left operand type for & ("tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]") [operator]
- freqtrade/templates/FreqaiExampleStrategy.py:263: error: Argument 2 to "reduce" has incompatible type "list[Series[builtins.bool]]"; expected "Iterable[tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]]" [arg-type]
- freqtrade/templates/FreqaiExampleStrategy.py:267: error: Unsupported left operand type for & ("tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]") [operator]
- freqtrade/templates/FreqaiExampleStrategy.py:267: error: Argument 2 to "reduce" has incompatible type "list[Series[builtins.bool]]"; expected "Iterable[tuple[IndexOpsMixin[Any, Any] | Series[builtins.bool] | ndarray[tuple[Any, ...], dtype[numpy.bool[builtins.bool]]] | list[builtins.bool] | str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any] | Sequence[str | bytes | date | datetime | timedelta | <7 more items> | complex | integer[Any] | floating[Any] | complexfloating[Any, Any]] | slice[Any, Any, Any], ...]]" [arg-type]
Btw, I was thinking about batching SCCs in requests to workers, and I think there may be a simple conservative batching that (almost?) guarantees that we don't lose performance from less optimal bin packing on platforms with low socket overhead. I will play with this later, and if it works, we can just do this on all platforms (as you know I am not a big fan of proliferation of platform-specific logic, even for optimizations). @JukkaL just to clarify: I have no planned changes to this PR, this is ready for review (and I am still waiting for answers on questions I asked in the PR description).
JukkaL
left a comment
Here's my review of the PR. I'll answer the questions in a separate comment.
| a_keys = {k for k in keys if "/a." in k or k.startswith("a.")} | ||
| assert len(a_keys) == 0, f"Unexpected a.* entries in diff: {a_keys}" | ||
| assert len(b_keys) == 2, f"Expected 2 b.* entries in diff, got: {b_keys}" | ||
| assert len(b_keys) == 3, f"Expected 2 b.* entries in diff, got: {b_keys}" |
| write_cache_meta_ex(meta_file, meta_ex, manager) | ||
| scc_result[id] = ModuleResult(None, formatted) | ||
| else: | ||
| # If there are no errors, only write the cache, don't send anything back |
| todo = [] | ||
| # Passing impl_only will select only "leaf" nodes (not the TypeInfos). | ||
| for _, node, info in tree.local_definitions(impl_only=True): | ||
| assert isinstance(node.node, (FuncDef, OverloadedFuncDef, Decorator)) |
Did you check that overload bodies are only checked once? I think visit_overloaded_func_def in checker is now called twice, based on a quick experiment, when using parallel checking.
Yes, the overload "header" is checked twice (and similarly for decorators), while the actual body should be checked only once. I didn't fix this because: a) I am a bit worried that fixing this may (subtly) break the daemon, b) the performance impact from checking the header twice seems to be small.
However, thinking a bit more about this, we should at least try, either in this PR or in a follow-up.
| scc_result = [] | ||
| meta_tuples = {} | ||
| for id in stale: | ||
| meta_tuple = graph[id].write_cache() |
It looks like write_cache() might return None? This seems to crash, and it's potentially related to this:
python -m mypy --cache-dir=/dev/null --num-workers=2 <some_file>.py
Yeah, we should not allow disabling cache in parallel mode.
Agreed.
Yeah, this can be fixed later.
Sounds good.
Agreed, we don't want to have subtle apparently unrelated behavior changes
Yes, I like this idea. |
The idea is simple: the new parser doesn't need the GIL, so we can parse files in parallel. Because it is tricky to apply parallelization _only_ to parallelizable code, the most I see is a ~4-5x speed-up with 8 threads; if I add more threads, it doesn't get visibly faster (I have 16 physical cores).

Some notes on implementation:

* I use stdlib `ThreadPoolExecutor`, it seems to work OK.
* I refactored `parse_file()` a bit, so that we can parallelize (mostly) just the actual parsing. I see measurable degradation if I try to parallelize all of `parse_file()`.
* I do not always use `psutil` because it is an optional dependency. We may want to actually make it a required dependency at some point.
* It looks like there is a weird mypyc bug that causes `ast_serialize` to be `None` sometimes in some threads. I simply add an ugly workaround for now.
* It looks like I need to apply `wrap_context()` more consistently now. A bunch of tests used to pass accidentally before.
* I only implement parallelization in the coordinator process. The workers counterpart can be done after #21119 is merged (it will be trivial).
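As a rough illustration of the thread-pool approach described above, here is a minimal sketch. It is not the code from the PR: `parse_source` and `parse_all` are hypothetical names, and the stdlib `ast.parse` stands in for the native parser. Unlike the parser discussed here, `ast.parse` holds the GIL, so this version shows only the structure (submit just the parsing step to the pool, keep bookkeeping in the calling thread) and would not show the reported speed-up.

```python
import ast
from concurrent.futures import ThreadPoolExecutor


def parse_source(item: tuple[str, str]) -> tuple[str, ast.Module]:
    """Hypothetical stand-in for the real parser: parse one (module_id, source) pair."""
    module_id, source = item
    return module_id, ast.parse(source, filename=module_id)


def parse_all(sources: dict[str, str], max_workers: int = 8) -> dict[str, ast.Module]:
    # Only the parsing step runs in the pool; collecting the results
    # into a dict happens in the calling thread.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(parse_source, sources.items()))


trees = parse_all({"a": "x = 1\n", "b": "def f() -> int:\n    return 2\n"})
```

`pool.map` preserves input order, so the result does not depend on which thread finishes first.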
JukkaL
left a comment
Feel free to merge this when you feel it's ready. Some improvements can be done in follow-up PRs.
The general idea is very straightforward: when doing type-checking, we first type-check only module top-levels and those functions/methods that define/infer externally visible variables. Then we write cache and send new interface hash back to coordinator to unblock more SCCs early. This makes parallel type-checking ~25% faster.
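To make the scheduling effect concrete, here is a toy single-threaded model of the idea. It is a sketch under assumptions, not mypy's coordinator: `schedule` and the `deps` mapping are hypothetical, and the policy of always preferring ready interfaces over pending implementations is my simplification. The point it illustrates is that a dependent SCC becomes ready as soon as the interfaces of its dependencies are done, without waiting for their function bodies.

```python
from collections import deque


def schedule(deps: dict[str, set[str]]) -> list[str]:
    """Return an event log for a toy two-phase (interface, implementation) schedule.

    `deps` maps each SCC name to the SCCs it depends on (hypothetical input).
    """
    done_interfaces: set[str] = set()
    pending = dict(deps)
    impl_queue: deque[str] = deque()
    log: list[str] = []
    while pending or impl_queue:
        # An SCC is ready once the *interfaces* of all its dependencies are done.
        ready = sorted(s for s, d in pending.items() if d <= done_interfaces)
        if ready:
            for scc in ready:
                del pending[scc]
                log.append(f"interface:{scc}")
                done_interfaces.add(scc)
                impl_queue.append(scc)
        elif impl_queue:
            # Nothing new is unblocked, so check one deferred implementation.
            log.append(f"implementation:{impl_queue.popleft()}")
        else:
            raise ValueError("dependency cycle between SCCs")
    return log


events = schedule({"a": set(), "b": {"a"}, "c": {"b"}})
```

In this model the interface of `b` is processed before the implementation of `a`, which is the early unblocking the description refers to.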
However, this simple idea surfaced multiple quirks and old hacks. I address some of them in this PR, but I decided to handle the rest in follow-up PR(s) to limit the size of this one.
First, important implementation details:
* [...] `select()` call, coordinator collects all responses, both interface and implementation ones, and processes them as a single batch. This simplifies reasoning and shouldn't affect performance.
* [...] `foo.meta_ex.ff`. Not 100% sure about the name, couldn't find anything more meaningful.
* [...] `testWalrus`.
* `local_definitions()` now do not yield methods of classes nested in functions. We add such methods to both symbol table of their actual class, and to the module top-level symbol table, thus causing double-processing.

Now some smaller things I already fixed:

* [...] `TypeForm` support. I think two is enough, so I deleted the last one.
* `AwaitableGenerator` return type wrapping used to happen during processing of function body, which is obviously wrong.

Finally, some remaining problems and how I propose to address them in followups:

* [...] `testNarrowingOfFinalPersistsInFunctions`. Supporting this will be tricky/expensive, it would require preserving binder state at the point of each function definition, and restoring it later. IMO this is a relatively niche edge case, and we can simply "un-support" it (there is a simple workaround, add an assert in function body). To be clear, there are no problems with a much more common use of this feature: preserving narrowing in nested functions/lambdas.
* [...] `--disallow-incomplete-defs` in plugins doesn't work, see `testDisallowIncompleteDefsAttrsPartialAnnotations`. I think this should be not hard to fix (with some dedicated cleaner support). I can do this in a follow-up PR soon.
* [...] `testPEP695InferVarianceNotReadyWhenNeeded`. However, when processing function/method bodies in a later phase, variance is ready more often. Although this is an improvement, it creates an inconsistency between parallel mode, and regular mode. I propose to address this by making the two-phase logic default even without parallel checking, see below.
* [...] `--local-partial-types` when behavior is different in parallel mode, see e.g. `testLocalPartialTypesWithGlobalInitializedToNone`. Again the new behavior is IMO clearly better. However, it again creates an inconsistency with non-parallel mode. I propose to address this by enabling two-phase (interface then implementation) checking whenever `--local-partial-types` is enabled (globally, not per-file), even without parallel checking. Since `--local-partial-types` will be default behavior soon (and hopefully the only behavior at some point), this will allow us to avoid discrepancies between parallel and regular checking. @JukkaL what do you think?
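For the narrowing-of-final item above, the suggested workaround ("add an assert in function body") could look roughly like the sketch below. I am assuming the affected pattern is a module-level `Final` of optional type that is narrowed at the top level and then used inside a function; `load_token`, `TOKEN`, and `token_length` are hypothetical names, not taken from the test.

```python
from typing import Final, Optional


def load_token() -> Optional[str]:
    """Hypothetical loader that may return None."""
    return "secret"


TOKEN: Final = load_token()
assert TOKEN is not None  # narrowing at module top level


def token_length() -> int:
    # Repeating the assert here means the function body no longer relies on
    # narrowing carried over from the point where the function is defined.
    assert TOKEN is not None
    return len(TOKEN)


length = token_length()
```

The extra assert is redundant at runtime but keeps the function body self-contained for the type checker.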