feat: add ApifyApiError subclasses grouped by HTTP status#737
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #737 +/- ##
==========================================
- Coverage 95.36% 95.13% -0.24%
==========================================
Files 45 45
Lines 5118 5156 +38
==========================================
+ Hits 4881 4905 +24
- Misses 237 251 +14
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Let me guys know whether you think this is worth it. Currently, I'm slightly more inclined not to do it and keep the status quo. Since the API type errors are just general strings, they can often change. |
Arguments in favor
Arguments against
|
|
I would add one convenience argument against this change, which would only be used as a tie breaker: Adding this feature is non-breaking. Removing it is breaking. So I would wait until we are more certain about it, as we can easily add this in any minor release. |
Addresses #423. Each `ErrorType` enum member from the OpenAPI spec now has a matching `ApifyApiError` subclass (e.g. `RecordNotFoundError`) generated into `_generated_errors.py`. `ApifyApiError.__new__` dispatches to the right subclass based on the response's `error.type`, so `except ApifyApiError` keeps working while `except RecordNotFoundError` becomes possible. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d __init__ `__new__` used to parse `response.json()` for dispatch and `__init__` parsed it again for attribute assignment. Now `__new__` stashes the parsed `error` dict on the instance and `__init__` reads it, so each raised error parses the body once. Also trims the stale star-import comment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
189e0aa to
cb57e87
Compare
There was a problem hiding this comment.
Pull request overview
This PR introduces HTTP-status-based ApifyApiError subclasses so callers can catch specific API failure modes (e.g., NotFoundError for 404) without relying on endpoint-specific error.type strings.
Changes:
- Added
ApifyApiError.__new__dispatch to map selected HTTP statuses to dedicated exception subclasses (and 5xx toServerError). - Refactored API error-body parsing into
_extract_error_payload()and kepttype/message/dataas metadata. - Updated
catch_not_found_or_throw()and unit tests to useNotFoundError/ status-based behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/apify_client/errors.py |
Adds status→exception dispatch and new ApifyApiError subclasses; refactors payload extraction. |
src/apify_client/_utils.py |
Updates not-found suppression to use isinstance(..., NotFoundError). |
tests/unit/test_utils.py |
Adjusts not-found suppression expectations and mocks JSON payload parsing. |
tests/unit/test_client_errors.py |
Adds tests asserting correct subclass dispatch for mapped statuses, streamed responses, 5xx, and fallback cases. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| is_not_found_status = exc.status_code == HTTPStatus.NOT_FOUND | ||
| is_not_found_type = exc.type in ['record-not-found', 'record-or-token-not-found'] | ||
| if not (is_not_found_status and is_not_found_type): | ||
| if not isinstance(exc, NotFoundError): |
There was a problem hiding this comment.
I am not sure about this (but the previous variant had the same issues.)
The problem of swallowing 404s is that you lose the information about what was not found. For example
dataset_1 = client.run(run_id="RUN DOES NOT EXIST").dataset().get() # Run does not exist
dataset_2 = client.run(run_id="RUN EXISTS").dataset().get() # Dataset does not exist
In both cases it returns None and user has no idea which of the cases happened.
There was a problem hiding this comment.
We should probably just swallow 404 related to the last resource in the chain and throw all the others?
There was a problem hiding this comment.
Thanks for bringing this up.
We should probably just swallow 404 related to the last resource in the chain and throw all the others?
I implemented it this way. It's certainly better, but I'm not sure in terms of consistency - now for the get we for some cases return None and for some other raise exception, hmm?
There was a problem hiding this comment.
Ok, lets discuss this in the team https://apify.slack.com/archives/C02JQSN79V4/p1776845186944399
There was a problem hiding this comment.
Let's remove the last commit, merge it as it is, and do this in a dedicated one if there is a consensus.
Pijukatel
left a comment
There was a problem hiding this comment.
I am fine with both 404 variants.
c6ee543 to
7bc0d51
Compare
Follow-up to #737 (comment) `.get()` previously collapsed every 404 into `None` via `catch_not_found_or_throw`. That works for direct, ID-identified fetches (`client.dataset(id).get()`), where a 404 unambiguously means the named resource is missing. It's misleading whenever a 404 is ambiguous — the client can't tell which resource in the path is actually gone, and silently returning `None` hides the cause. This PR identifies three categories where a 404 is ambiguous and propagates `NotFoundError` instead: 1. **Chained calls without an ID** (`run.dataset()`, `run.key_value_store()`, `run.request_queue()`, `run.log()`) — a 404 could mean either the parent run is missing or the default sub-resource. Covered by the base `_get` / `_delete` via a `resource_id is None → raise` guard, so `.delete()` on a chained client also raises now; direct `dataset(id).delete()` keeps its idempotent-DELETE semantics. 2. **Chained `LogClient`** — `run.log().get()` / `.get_as_bytes()` / `.stream()` raise; direct `client.log(id).get()` still returns `None`. 3. **Singleton sub-path endpoints** — `ScheduleClient.get_log`, `TaskClient.get_input`, `DatasetClient.get_statistics`, `UserClient.monthly_usage`, `UserClient.limits`, `WebhookClient.test`. These hit a fixed path (`/.../{id}/log`, `/.../{id}/input`, etc.) where a 404 effectively always means the parent is missing. Return types moved from `T | None` to `T`. Record-by-key lookups (`KeyValueStoreClient.get_record(key)`, `RequestQueueClient.get_request(request_id)`) keep the existing "None on missing" behavior — the 404 is specifically about the record/request, which is the natural meaning. A shared helper `catch_not_found_for_resource_or_throw(exc, resource_id)` in `_utils.py` centralizes the `resource_id is None → raise` pattern across all 10 call sites. The v3 upgrade guide documents the new semantics. Sync/async tests cover every code path, including `client.actor('id').last_run().dataset().get()` (happy path + ambiguous-404 case).
Summary
Addresses #423. Adds a small set of
ApifyApiErrorsubclasses so callers can narrowexceptclauses to specific failure modes. Dispatch is driven by HTTP status — a stable contract — rather than the per-endpointerror.typestring.type,message, anddataremain exposed as metadata.All subclasses inherit from
ApifyApiError, so existingexcept ApifyApiErrorhandlers keep matching.InvalidRequestErrorUnauthorizedErrorForbiddenErrorNotFoundErrorConflictErrorRateLimitErrorServerErrorUnmapped statuses and unparsable bodies fall back to the base
ApifyApiError.Migration note
Constructing
ApifyApiError(response, ...)directly now returns an instance of the matching subclass.except ApifyApiErroris unaffected; tests assertingtype(exc) is ApifyApiErroron a mapped status would need updates.catch_not_found_or_thrownow usesisinstance(exc, NotFoundError)instead of a hardcodederror.typeallowlist.