{
  "id": "tensorflow/tf-function-recompilation-cache-limit",
  "signature": "ResourceExhaustedError: The function 'train_step' has been retraced 1000 times. The tracing limit has been reached. This may be caused by passing Python literals or tensors with changing shapes.",
  "signature_zh": "资源耗尽错误：函数 'train_step' 已被重跟踪 1000 次。已达到跟踪限制。这可能是由于传递 Python 字面量或形状变化的张量导致的。",
  "regex": "The function.*has been retraced.*times.*tracing limit",
  "domain": "tensorflow",
  "category": "resource_error",
  "subcategory": null,
  "root_cause": "A tf.function-decorated function (e.g., train_step) is being re-traced excessively because it receives arguments with varying shapes or Python values that are not cached as part of the function's input signature, exhausting the tracing cache.",
  "root_cause_type": "generic",
  "root_cause_zh": "被 tf.function 装饰的函数（例如 train_step）因接收形状变化或 Python 值的参数而被过度重跟踪，这些参数未作为函数输入签名的一部分被缓存，耗尽了跟踪缓存。",
  "versions": [
    {
      "version": "TensorFlow 2.6.0",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    },
    {
      "version": "TensorFlow 2.10.0",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    }
  ],
  "os_specific": {},
  "dead_ends": [
    {
      "action": "",
      "why_fails": "Running eagerly defeats the purpose of tf.function (performance), and increasing the limit only delays the error without fixing the root cause of shape variability.",
      "fail_rate": 0.9,
      "condition": "",
      "sources": []
    },
    {
      "action": "",
      "why_fails": "While this can reduce retracing for shape changes, it does not address retracing caused by changing Python literal values (e.g., integer arguments), and may still hit the limit.",
      "fail_rate": 0.7,
      "condition": "",
      "sources": []
    },
    {
      "action": "",
      "why_fails": "This eliminates retracing but also removes the performance benefits of graph compilation, potentially slowing training significantly.",
      "fail_rate": 0.6,
      "condition": "",
      "sources": []
    }
  ],
  "workarounds": [
    {
      "action": "Ensure that all tensor arguments to the tf.function have consistent shapes. Pad or resize inputs to a fixed shape before passing them. For Python arguments, convert them to tensors or use tf.constant() to make them part of the graph signature.",
      "success_rate": 0.9,
      "how": "Ensure that all tensor arguments to the tf.function have consistent shapes. Pad or resize inputs to a fixed shape before passing them. For Python arguments, convert them to tensors or use tf.constant() to make them part of the graph signature.",
      "condition": "",
      "sources": []
    },
    {
      "action": "Define the input signature explicitly using tf.TensorSpec to prevent retracing due to shape or dtype variations. This tells TensorFlow to use a single graph for all calls matching the signature.",
      "success_rate": 0.85,
      "how": "Define the input signature explicitly using tf.TensorSpec to prevent retracing due to shape or dtype variations. This tells TensorFlow to use a single graph for all calls matching the signature.",
      "condition": "",
      "sources": []
    },
    {
      "action": "If the function uses Python integer or boolean arguments that change, convert them to tensors or move them outside the tf.function by using tf.cond() or tf.switch_case() for control flow.",
      "success_rate": 0.8,
      "how": "If the function uses Python integer or boolean arguments that change, convert them to tensors or move them outside the tf.function by using tf.cond() or tf.switch_case() for control flow.",
      "condition": "",
      "sources": []
    }
  ],
  "workarounds_zh": [
    "确保传递给 tf.function 的所有张量参数具有一致的形状。在传递之前将输入填充或调整为固定形状。对于 Python 参数，将其转换为张量或使用 tf.constant() 使其成为图签名的一部分。",
    "使用 tf.TensorSpec 显式定义输入签名，以防止因形状或数据类型变化而重跟踪。这告诉 TensorFlow 对匹配签名的所有调用使用单个图。",
    "如果函数使用变化的 Python 整数或布尔参数，将其转换为张量，或使用 tf.cond() 或 tf.switch_case() 进行控制流，将其移出 tf.function。"
  ],
  "transition_graph": {
    "leads_to": [],
    "preceded_by": [],
    "frequently_confused_with": []
  },
  "official_doc_url": "https://www.tensorflow.org/guide/function#controlling_retracing",
  "official_doc_section": null,
  "error_code": "ERTL",
  "verification_tier": "ai_generated",
  "confidence": 0.89,
  "fix_success_rate": 0.88,
  "resolvable": "true",
  "first_seen": "2023-08-12",
  "last_confirmed": "2024-06-01",
  "last_updated": "2024-06-01",
  "evidence_count": 1,
  "tags": [],
  "locale": "en",
  "aliases": []
}