@mikejmorgan-ai mikejmorgan-ai commented Dec 9, 2025

Summary

Implements Issue #43: Smart Retry Logic with Exponential Backoff

This PR adds a comprehensive retry module (cortex/retry.py) that provides robust retry mechanisms for operations that may fail transiently.

Features

  • RetryConfig: Configurable retry behavior (attempts, delays, strategies)
  • RetryManager: Core class for executing operations with retry logic
  • Multiple Backoff Strategies:
    • Exponential (default) - 1s, 2s, 4s, 8s...
    • Linear - 1s, 2s, 3s, 4s...
    • Constant - fixed delay
    • Fibonacci - 1s, 1s, 2s, 3s, 5s...
  • Jitter: Randomized delay variation to prevent thundering herd
  • @retry Decorator: Easy function decoration
  • Preset Configs: Optimized settings for network, API, and apt operations
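
As a rough illustration of the four backoff strategies listed above, the sketch below computes the delay before each retry. The names `Strategy`, `fibonacci`, and `backoff_delay` are hypothetical and are not taken from cortex/retry.py; the formulas are simply the ones that reproduce the sequences quoted in the list, with an assumed `max_delay` cap.

```python
from enum import Enum


class Strategy(Enum):
    """Hypothetical stand-in for the module's RetryStrategy enum."""
    EXPONENTIAL = "exponential"
    LINEAR = "linear"
    CONSTANT = "constant"
    FIBONACCI = "fibonacci"


def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number, with fibonacci(1) == fibonacci(2) == 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def backoff_delay(attempt: int, strategy: Strategy,
                  base_delay: float = 1.0, max_delay: float = 60.0) -> float:
    """Delay in seconds before retry number `attempt` (1-based), capped at max_delay."""
    if strategy is Strategy.EXPONENTIAL:
        delay = base_delay * 2 ** (attempt - 1)
    elif strategy is Strategy.LINEAR:
        delay = base_delay * attempt
    elif strategy is Strategy.FIBONACCI:
        delay = base_delay * fibonacci(attempt)
    else:  # Strategy.CONSTANT
        delay = base_delay
    return min(delay, max_delay)
```

With `base_delay=1.0`, the exponential strategy yields 1, 2, 4, 8 seconds for attempts 1 through 4, and the cap keeps later attempts from growing without bound.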

Usage Examples

from cortex.retry import retry, RetryStrategy, retry_api_call

# Using decorator
@retry(max_attempts=3, base_delay=1.0)
def fetch_packages():
    return api.get_packages()

# Using convenience function
result = retry_api_call(llm.generate, prompt="Install docker")
if result.success:
    print(result.result)
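
For readers unfamiliar with how such a decorator can work, here is a minimal sketch of a retry decorator with exponential backoff. It is not the implementation in cortex/retry.py: the `sleep` parameter is an assumption added so the wait can be stubbed out, and re-raising the last error after the final attempt is a design choice of this sketch, not confirmed behavior of the PR.

```python
import functools
import time


def retry(max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Retry the wrapped function with exponential backoff (illustrative only)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    # Wait base_delay, 2 * base_delay, 4 * base_delay, ...
                    sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


calls = []


@retry(max_attempts=3, base_delay=0.5, sleep=lambda seconds: None)
def flaky():
    """Fail twice with a transient error, then succeed."""
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("transient")
    return "ok"
```

Calling `flaky()` here succeeds on the third attempt without actually sleeping, because the stubbed `sleep` does nothing.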

Test Plan

  • 33 unit tests covering all functionality
  • Tests for all backoff strategies
  • Tests for jitter behavior
  • Tests for edge cases and error handling
  • Integration scenario tests

Files Changed

  • cortex/retry.py - New retry module (320 lines)
  • tests/test_retry.py - Comprehensive test suite (441 lines)

Closes #43

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added comprehensive retry system supporting multiple backoff strategies: exponential, linear, constant, and Fibonacci with intelligent jitter.
    • Preconfigured retry settings optimized for network operations, API calls, and system-level tasks.
    • Built-in error tracking with callback support to monitor and respond to retry events.
  • Tests

    • Comprehensive test coverage validating all retry scenarios and edge cases.

Implements Issue #43 - Smart Retry Logic with Exponential Backoff

Features:
- RetryConfig dataclass with configurable parameters
- RetryManager class for executing operations with retry logic
- Multiple backoff strategies: exponential, linear, constant, fibonacci
- Configurable jitter to prevent thundering herd problem
- @retry decorator for easy function decoration
- Preset configurations for network, API, and apt operations
- Comprehensive test suite (33 tests)

The module provides robust retry mechanisms for:
- Network operations with transient failures
- LLM API calls with rate limiting
- Package manager operations with lock file issues

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
coderabbitai bot commented Dec 9, 2025

Walkthrough

Introduces a new retry module with configurable retry logic supporting multiple backoff strategies (exponential, linear, constant, Fibonacci) and jitter. Includes RetryManager class for executing retries, a decorator for easy function wrapping, preset configurations for common use cases (network, API, APT operations), and a comprehensive test suite validating all functionality.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Core retry module: cortex/retry.py | Introduces RetryStrategy enum, RetryConfig dataclass with validation, RetryResult container, and RetryManager class handling retry execution with delay calculation across multiple strategies, jitter support, and error tracking. Includes retry decorator, preset configurations (NETWORK_RETRY_CONFIG, API_RETRY_CONFIG, APT_RETRY_CONFIG), and convenience functions (retry_apt_operation, retry_api_call, retry_network_operation). |
| Comprehensive test suite: tests/test_retry.py | Adds extensive tests validating RetryConfig validation, RetryManager behavior (successful execution, error handling, delay calculations, jitter bounds, callbacks), retry decorator functionality, preset configurations, and convenience helpers. Covers all backoff strategies, argument propagation, timing assertions, and integration scenarios. |

Sequence Diagram

sequenceDiagram
    actor User
    participant RetryManager
    participant TargetFunc as Target Function
    participant Callback as on_retry Callback
    
    User->>RetryManager: execute(func, args, kwargs)
    
    loop Retry Attempts (max_attempts)
        RetryManager->>TargetFunc: invoke with args/kwargs
        
        alt Success
            TargetFunc-->>RetryManager: result
            Note over RetryManager: Return RetryResult (success=True)
        else Exception Raised
            TargetFunc-->>RetryManager: exception
            
            alt Retryable Exception & Attempts Remaining
                RetryManager->>RetryManager: _calculate_delay(attempt, strategy)
                Note over RetryManager: Apply jitter if configured
                
                RetryManager->>Callback: on_retry(attempt, exception, delay)
                Callback-->>RetryManager: callback complete
                
                Note over RetryManager: Wait delay seconds
                Note over RetryManager: Collect error, increment attempt
            else Non-Retryable or No Attempts Left
                Note over RetryManager: Return RetryResult<br/>(success=False, final_error)
            end
        end
    end
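
The flow in the diagram can be sketched in Python roughly as follows. This is a simplified illustration, not the code from cortex/retry.py: the `execute` function, its keyword parameters, the injected `sleep`, and the `errors` field are assumptions, while `success`, `result`, `attempts`, and `final_error` follow names that appear elsewhere in this conversation.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class RetryResult:
    success: bool
    result: Any = None
    attempts: int = 0
    final_error: Optional[Exception] = None
    errors: List[Exception] = field(default_factory=list)  # assumed field


def execute(func: Callable[..., Any], *args: Any,
            max_attempts: int = 3,
            retryable: tuple = (Exception,),
            delay_for: Callable[[int], float] = lambda attempt: 2.0 ** (attempt - 1),
            on_retry: Optional[Callable[[int, Exception, float], None]] = None,
            sleep: Callable[[float], None] = time.sleep,
            **kwargs: Any) -> RetryResult:
    """Call func until it succeeds, a non-retryable error occurs, or attempts run out."""
    errors: List[Exception] = []
    for attempt in range(1, max_attempts + 1):
        try:
            value = func(*args, **kwargs)
        except Exception as exc:
            errors.append(exc)
            # Stop on a non-retryable error or when no attempts remain.
            if not isinstance(exc, retryable) or attempt == max_attempts:
                return RetryResult(False, None, attempt, exc, errors)
            delay = delay_for(attempt)
            if on_retry is not None:
                on_retry(attempt, exc, delay)  # notify before waiting
            sleep(delay)
        else:
            return RetryResult(True, value, attempt, None, errors)
    raise AssertionError("unreachable when max_attempts >= 1")
```

Passing `sleep=lambda seconds: None` makes the loop easy to exercise in tests without real waiting.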

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • RetryConfig validation: Verify __post_init__ edge cases handle all constraint combinations correctly (max_attempts, base_delay, max_delay, jitter_range bounds)
  • Delay calculation logic: Cross-check _calculate_delay implementation for exponential, linear, constant, and Fibonacci strategies; verify max_delay capping and jitter application are correct
  • Error handling & propagation: Ensure retryable_exceptions filtering works properly and final_error is correctly set when retries exhaust
  • Callback integration: Validate on_retry callback receives correct parameters and doesn't interfere with retry flow
  • Timing accuracy: Verify RetryResult.total_time is accurately calculated across all strategies and jitter scenarios
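
One way to reason about the jitter and max_delay points above is the sketch below. It assumes jitter scales the delay by a random factor in [1 - jitter_range, 1 + jitter_range] and that the cap is re-applied afterwards; how cortex/retry.py actually orders these steps is not shown in this conversation, and `apply_jitter` is a hypothetical helper.

```python
import random


def apply_jitter(delay: float, jitter_range: float, max_delay: float,
                 rng: random.Random) -> float:
    """Scale `delay` by a random factor in [1 - jitter_range, 1 + jitter_range]."""
    if not 0 <= jitter_range <= 1:
        raise ValueError("jitter_range must be between 0 and 1")
    capped = min(delay, max_delay)
    factor = 1 + rng.uniform(-jitter_range, jitter_range)
    # Clamp again so jitter can neither exceed max_delay nor go negative.
    return max(0.0, min(capped * factor, max_delay))
```

Taking the random generator as a parameter lets a test seed it, which makes jitter bounds checkable without flaky assertions. The standard `random` module is adequate here because jitter is not a security-sensitive use.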

Possibly related issues

  • #43 (Smart Retry Logic with Exponential Backoff): This PR directly implements the issue's requirements for exponential backoff, configurable max retries, logging of retry attempts, different handling strategies (network vs other errors), and comprehensive test coverage with examples.

Poem

🐰 Hops with glee through retry streams
Exponential dreams, jitter beams!
Five times max, but backoff's wise,
Transient faults? Watch code rise!

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding smart retry logic with exponential backoff, directly addressing Issue #43. |
| Description check | ✅ Passed | The description follows the repository template with all required sections (Summary, Type of Change, Checklist, Testing) and provides comprehensive implementation details with usage examples. |
| Linked Issues check | ✅ Passed | The PR implementation meets most core requirements: exponential backoff [#43], configurable max retries [#43], retry logging [#43], multiple error-handling strategies [#43], jitter for mitigation [#43], and extensive tests >80% coverage [#43]. Supports user progress notification through on_retry callback. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing Issue #43: the new retry.py module and test suite for retry functionality. No unrelated modifications detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 80.39%, which is sufficient. The required threshold is 80.00%. |

sonarqubecloud bot commented Dec 9, 2025

Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)


import random
import logging
import functools
from typing import Callable, TypeVar, Optional, Tuple, Type, Union, List
"""

import pytest
import time
from unittest.mock import Mock, patch, call
Comment on lines +10 to +22
from cortex.retry import (
    RetryConfig,
    RetryStrategy,
    RetryResult,
    RetryManager,
    retry,
    NETWORK_RETRY_CONFIG,
    API_RETRY_CONFIG,
    APT_RETRY_CONFIG,
    retry_apt_operation,
    retry_api_call,
    retry_network_operation,
)
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
cortex/retry.py (3)

53-61: Consider validating exponential_base.

The validation logic is thorough, but exponential_base is not validated. A value ≤ 0 would cause issues (negative/zero delays or math errors), and a value < 1 would cause decreasing delays instead of increasing backoff.

     def __post_init__(self):
         if self.max_attempts < 1:
             raise ValueError("max_attempts must be at least 1")
         if self.base_delay < 0:
             raise ValueError("base_delay must be non-negative")
         if self.max_delay < self.base_delay:
             raise ValueError("max_delay must be >= base_delay")
         if not 0 <= self.jitter_range <= 1:
             raise ValueError("jitter_range must be between 0 and 1")
+        if self.exponential_base <= 0:
+            raise ValueError("exponential_base must be positive")
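
To make the concern concrete, the sketch below assumes the exponential delay is computed as base_delay * exponential_base ** (attempt - 1); that formula matches the 1s, 2s, 4s, 8s sequence in the PR description but is not confirmed by the diff shown here, and `exponential_delays` is a hypothetical helper.

```python
def exponential_delays(base_delay: float, exponential_base: float, attempts: int) -> list:
    """Delays for attempts 1..attempts using base_delay * exponential_base ** (attempt - 1)."""
    return [base_delay * exponential_base ** (n - 1) for n in range(1, attempts + 1)]


growing = exponential_delays(1.0, 2.0, 4)     # base > 1: delays increase
shrinking = exponential_delays(1.0, 0.5, 4)   # 0 < base < 1: delays decrease
negative = exponential_delays(1.0, -2.0, 3)   # base < 0: sign alternates
```

Under that assumption, a base of 0.5 passes the suggested `<= 0` check yet produces shrinking delays, so a stricter `exponential_base < 1` check may be worth considering if decreasing backoff is never intended.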

87-88: Annotate class constant with ClassVar and prefer tuple for immutability.

The _FIBONACCI sequence is a class-level constant that should be annotated with ClassVar and made immutable.

+from typing import ClassVar
+
 class RetryManager:
     """Manages retry operations with configurable backoff strategies."""
 
     # Precomputed Fibonacci sequence for fibonacci backoff
-    _FIBONACCI = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
+    _FIBONACCI: ClassVar[tuple[int, ...]] = (1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144)

179-183: Use logging.exception to preserve the stack trace.

When logging the final failure, logging.exception automatically includes the traceback, which aids debugging.

                 else:
-                    logger.error(
+                    logger.exception(
                         f"All {self.config.max_attempts} attempts failed. "
                         f"Final error: {e}"
                     )
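
The difference between the two calls can be seen with the standard library alone. The logger name and messages below are made up for the demonstration; the point is that `logger.exception` logs at ERROR level and appends the active traceback, which only works as intended inside an exception handler.

```python
import io
import logging

# Capture log output in memory so it can be inspected.
stream = io.StringIO()
logger = logging.getLogger("retry_demo")
logger.setLevel(logging.ERROR)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

try:
    raise ConnectionError("connection reset")
except ConnectionError as e:
    logger.error("error(): attempt failed: %s", e)          # message only
    logger.exception("exception(): attempt failed: %s", e)  # message plus traceback

output = stream.getvalue()
```

Only the second record in `output` is followed by a "Traceback (most recent call last):" section.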
tests/test_retry.py (1)

350-375: Consider adding retry scenario tests for convenience functions.

The convenience function tests only cover the immediate success case. Consider adding at least one test that verifies the retry behavior through these helpers (e.g., a mock that fails once then succeeds) to ensure the preset configs are applied correctly.

def test_retry_network_operation_with_retries(self):
    """Test retry_network_operation actually retries on failure."""
    mock_func = Mock(side_effect=[ConnectionError("fail"), b"data"])
    result = retry_network_operation(mock_func)

    assert result.success is True
    assert result.attempts == 2
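
The suggested test relies on how `Mock` treats an iterable `side_effect`: each call consumes the next item, raising it if it is an exception and returning it otherwise. The standalone snippet below shows just that behavior, independent of the retry helpers.

```python
from unittest.mock import Mock

mock_func = Mock(side_effect=[ConnectionError("fail"), b"data"])

try:
    mock_func()          # first call raises the queued exception
    first = "returned"
except ConnectionError:
    first = "raised"

second = mock_func()     # second call returns the queued value
```

Both calls are counted in `mock_func.call_count`, which is why a retry test can assert on the number of attempts.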
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f18bc09 and dbc70eb.

📒 Files selected for processing (2)
  • cortex/retry.py (1 hunks)
  • tests/test_retry.py (1 hunks)
🧰 Additional context used
🪛 GitHub Check: SonarCloud Code Analysis
tests/test_retry.py

[warning] 340-340: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7WE&open=AZsDZ6yu-W86pjtWi7WE&pullRequest=278


[warning] 176-176: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7V7&open=AZsDZ6yu-W86pjtWi7V7&pullRequest=278


[warning] 162-162: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7V5&open=AZsDZ6yu-W86pjtWi7V5&pullRequest=278


[warning] 181-181: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7WA&open=AZsDZ6yu-W86pjtWi7WA&pullRequest=278


[warning] 48-48: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7Vu&open=AZsDZ6yu-W86pjtWi7Vu&pullRequest=278


[warning] 334-334: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7WD&open=AZsDZ6yu-W86pjtWi7WD&pullRequest=278


[warning] 130-130: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7Vw&open=AZsDZ6yu-W86pjtWi7Vw&pullRequest=278


[warning] 195-195: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7WB&open=AZsDZ6yu-W86pjtWi7WB&pullRequest=278


[warning] 333-333: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7WC&open=AZsDZ6yu-W86pjtWi7WC&pullRequest=278


[warning] 34-34: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7Vs&open=AZsDZ6yu-W86pjtWi7Vs&pullRequest=278


[warning] 132-132: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7Vy&open=AZsDZ6yu-W86pjtWi7Vy&pullRequest=278


[warning] 179-179: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7V-&open=AZsDZ6yu-W86pjtWi7V-&pullRequest=278


[warning] 161-161: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7V4&open=AZsDZ6yu-W86pjtWi7V4&pullRequest=278


[warning] 49-49: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7Vv&open=AZsDZ6yu-W86pjtWi7Vv&pullRequest=278


[warning] 341-341: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7WF&open=AZsDZ6yu-W86pjtWi7WF&pullRequest=278


[warning] 149-149: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7V3&open=AZsDZ6yu-W86pjtWi7V3&pullRequest=278


[warning] 148-148: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7V2&open=AZsDZ6yu-W86pjtWi7V2&pullRequest=278


[warning] 131-131: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7Vx&open=AZsDZ6yu-W86pjtWi7Vx&pullRequest=278


[warning] 346-346: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7WG&open=AZsDZ6yu-W86pjtWi7WG&pullRequest=278


[warning] 33-33: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7Vr&open=AZsDZ6yu-W86pjtWi7Vr&pullRequest=278


[warning] 32-32: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7Vq&open=AZsDZ6yu-W86pjtWi7Vq&pullRequest=278


[warning] 133-133: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7Vz&open=AZsDZ6yu-W86pjtWi7Vz&pullRequest=278


[warning] 177-177: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7V8&open=AZsDZ6yu-W86pjtWi7V8&pullRequest=278


[warning] 163-163: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7V6&open=AZsDZ6yu-W86pjtWi7V6&pullRequest=278


[warning] 414-414: Replace this generic exception class with a more specific one.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7WH&open=AZsDZ6yu-W86pjtWi7WH&pullRequest=278


[warning] 36-36: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7Vt&open=AZsDZ6yu-W86pjtWi7Vt&pullRequest=278


[warning] 178-178: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7V9&open=AZsDZ6yu-W86pjtWi7V9&pullRequest=278


[warning] 147-147: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7V1&open=AZsDZ6yu-W86pjtWi7V1&pullRequest=278


[warning] 180-180: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7V_&open=AZsDZ6yu-W86pjtWi7V_&pullRequest=278


[warning] 146-146: Do not perform equality checks with floating point values.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZsDZ6yu-W86pjtWi7V0&open=AZsDZ6yu-W86pjtWi7V0&pullRequest=278
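
For context on the floating-point warnings above, the snippet below shows why exact equality can fail and one tolerance-based alternative from the standard library. In a pytest suite the usual idiom is `assert delay == pytest.approx(0.3)`; `math.isclose` is used here only to keep the example dependency-free.

```python
import math

delay = 0.1 * 3  # not exactly 0.3 in binary floating point

exact_match = (delay == 0.3)
close_match = math.isclose(delay, 0.3, rel_tol=1e-9)
```

Here `exact_match` is False while `close_match` is True. Exact comparison remains safe when both sides are produced by the same deterministic arithmetic on exactly representable values, such as 1.0 * 2 ** 3 == 8.0.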

🪛 Ruff (0.14.8)
cortex/retry.py

55-55: Avoid specifying long messages outside the exception class

(TRY003)


57-57: Avoid specifying long messages outside the exception class

(TRY003)


59-59: Avoid specifying long messages outside the exception class

(TRY003)


61-61: Avoid specifying long messages outside the exception class

(TRY003)


88-88: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


126-126: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


180-183: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

tests/test_retry.py

292-292: Avoid specifying long messages outside the exception class

(TRY003)


302-302: Avoid specifying long messages outside the exception class

(TRY003)


389-389: Avoid specifying long messages outside the exception class

(TRY003)


414-414: Create your own exception

(TRY002)


414-414: Avoid specifying long messages outside the exception class

(TRY003)


428-428: Avoid specifying long messages outside the exception class

(TRY003)
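
The TRY003 and TRY002 findings are commonly addressed by defining a dedicated exception class that builds its own message. The sketch below is one way to do that; `RetriesExhaustedError` and `give_up` are hypothetical names and do not appear in the PR.

```python
class RetriesExhaustedError(Exception):
    """Raised when every retry attempt has failed (hypothetical name)."""

    def __init__(self, attempts: int, last_error: Exception) -> None:
        self.attempts = attempts
        self.last_error = last_error
        # The message is built here, so call sites pass data rather than prose.
        super().__init__(f"all {attempts} attempts failed; last error: {last_error!r}")


def give_up(attempts: int, last_error: Exception) -> None:
    """Raise the dedicated error instead of a generic Exception with a long message."""
    raise RetriesExhaustedError(attempts, last_error)
```

Callers can then catch `RetriesExhaustedError` specifically and read `attempts` and `last_error` as attributes instead of parsing a message string.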

🔇 Additional comments (5)
cortex/retry.py (2)

44-44: Verify default max_attempts against issue requirements.

Issue #43 specifies "Maximum retry attempts configurable (default specified as 5)", but the default here is 3. Please confirm whether this is intentional or should be updated to match the specification.


250-273: LGTM!

The preset configurations are well-designed for their respective use cases, with appropriate tuning of delays and jitter settings.

tests/test_retry.py (3)

25-70: LGTM!

Comprehensive validation tests covering all edge cases in RetryConfig.__post_init__. The floating-point equality checks are safe here since jitter is disabled and values are deterministic.


117-181: LGTM!

Thorough delay calculation tests covering all four backoff strategies with deterministic assertions.


378-434: LGTM!

Solid integration tests covering realistic retry scenarios including transient failures, rate limiting, and permanent failures.

Development

Successfully merging this pull request may close these issues.

Smart Retry Logic with Exponential Backoff