{
  "id": "tensorflow/tfdata-shuffle-buffer-size",
  "signature": "InvalidArgumentError: shuffle buffer must have at least one element. [Op:ShuffleDataset]",
  "signature_zh": "无效参数错误：shuffle 缓冲区必须至少包含一个元素。 [Op:ShuffleDataset]",
  "regex": "shuffle buffer must have at least one element",
  "domain": "tensorflow",
  "category": "data_error",
  "subcategory": null,
  "root_cause": "The tf.data.Dataset.shuffle() method is called with a buffer_size that is larger than the dataset size, or the dataset is empty, causing the shuffle operation to fail because it cannot fill the buffer.",
  "root_cause_type": "generic",
  "root_cause_zh": "tf.data.Dataset.shuffle() 方法的 buffer_size 参数大于数据集大小，或者数据集为空，导致 shuffle 操作无法填充缓冲区而失败。",
  "versions": [
    {
      "version": "TensorFlow 2.9.0",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    },
    {
      "version": "TensorFlow 2.11.0",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    }
  ],
  "os_specific": {},
  "dead_ends": [
    {
      "action": "",
      "why_fails": "If the dataset has fewer elements than the buffer_size, the shuffle operation still fails because it cannot fill the buffer.",
      "fail_rate": 0.9,
      "condition": "",
      "sources": []
    },
    {
      "action": "",
      "why_fails": "This avoids the error but loses the desired data shuffling, which may negatively affect model training convergence.",
      "fail_rate": 0.5,
      "condition": "",
      "sources": []
    },
    {
      "action": "",
      "why_fails": "While repeat() can increase the effective dataset size, it does not change the underlying cardinality; the error persists if the original dataset is empty or too small.",
      "fail_rate": 0.7,
      "condition": "",
      "sources": []
    }
  ],
  "workarounds": [
    {
      "action": "Ensure the dataset has at least as many elements as the buffer_size. Use dataset.cardinality() to check the size and set buffer_size to min(dataset_size, buffer_size). For example: buffer_size = min(1000, dataset.cardinality().numpy()).",
      "success_rate": 0.95,
      "how": "Ensure the dataset has at least as many elements as the buffer_size. Use dataset.cardinality() to check the size and set buffer_size to min(dataset_size, buffer_size). For example: buffer_size = min(1000, dataset.cardinality().numpy()).",
      "condition": "",
      "sources": []
    },
    {
      "action": "If the dataset is empty, add a dummy element or filter out empty datasets before shuffling. Use dataset.filter() to remove empty entries.",
      "success_rate": 0.85,
      "how": "If the dataset is empty, add a dummy element or filter out empty datasets before shuffling. Use dataset.filter() to remove empty entries.",
      "condition": "",
      "sources": []
    },
    {
      "action": "Use a fallback: if the dataset is small, skip shuffling or use a smaller buffer. This can be done with a conditional: if dataset.cardinality() > 1: dataset = dataset.shuffle(buffer_size).",
      "success_rate": 0.9,
      "how": "Use a fallback: if the dataset is small, skip shuffling or use a smaller buffer. This can be done with a conditional: if dataset.cardinality() > 1: dataset = dataset.shuffle(buffer_size).",
      "condition": "",
      "sources": []
    }
  ],
  "workarounds_zh": [
    "确保数据集至少与 buffer_size 有相同数量的元素。使用 dataset.cardinality() 检查大小，并将 buffer_size 设置为 min(数据集大小, buffer_size)。例如：buffer_size = min(1000, dataset.cardinality().numpy())。",
    "如果数据集为空，在 shuffle 之前添加一个虚拟元素或过滤掉空数据集。使用 dataset.filter() 移除空条目。",
    "使用回退方案：如果数据集很小，跳过 shuffle 或使用更小的缓冲区。可以使用条件语句：if dataset.cardinality() > 1: dataset = dataset.shuffle(buffer_size)。"
  ],
  "transition_graph": {
    "leads_to": [],
    "preceded_by": [],
    "frequently_confused_with": []
  },
  "official_doc_url": "https://www.tensorflow.org/api_docs/python/tf/data/Dataset#shuffle",
  "official_doc_section": null,
  "error_code": "EDSF",
  "verification_tier": "ai_generated",
  "confidence": 0.87,
  "fix_success_rate": 0.9,
  "resolvable": "true",
  "first_seen": "2023-04-10",
  "last_confirmed": "2024-06-01",
  "last_updated": "2024-06-01",
  "evidence_count": 1,
  "tags": [],
  "locale": "en",
  "aliases": []
}