
Compare commits

...

6 Commits

54 changed files with 618 additions and 6 deletions


@@ -383,3 +383,16 @@ The computation reads `data/intermediate/parts_filtered.csv`, `data/raw/parts.csv`, `dat
The download step relies on `REBRICKABLE_TOKEN` and stores the part visuals under `figures/rebrickable/{set_id}/rare_parts/{part_num}.jpg`, logging missing images to `data/intermediate/part_rarity_download_log.csv`.
The plot `figures/step34/part_rarity.png` juxtaposes, for each part in `part_rarity_exclusive.csv`, its occurrences in the filtered sets versus the rest of the catalog, with the part images inset.
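For a quick check of what step 34 produces, here is a minimal sketch (not part of the pipeline; the column names are the ones `write_part_rarity` emits in this diff, and the filter mirrors `select_printed_exclusive`):

```python
# Sketch: list printed parts that never occur outside the filtered sets.
# Reads the CSV emitted by scripts.compute_part_rarity; stdlib only.
import csv
from pathlib import Path

with Path("data/intermediate/part_rarity_exclusive.csv").open() as fh:
    for row in csv.DictReader(fh):
        if row["other_sets_quantity"] == "0" and "print" in row["part_name"].lower():
            print(row["part_num"], row["part_name"], row["sample_set_num"])
```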
### Step 35: sticker sheets (collage)
1. `source .venv/bin/activate`
2. `python -m scripts.compute_sticker_parts`
3. `python -m scripts.download_sticker_resources`
4. `python -m scripts.plot_sticker_sheets`
The computation reads `data/intermediate/parts_filtered.csv`, `data/raw/parts.csv`, and `data/intermediate/sets_enriched.csv`, keeps the non-spare parts of category 58 (stickers), and produces `data/intermediate/sticker_parts.csv` with set, year, name, part number, and quantity.
The download step relies on `REBRICKABLE_TOKEN` and saves the visuals under `figures/rebrickable/{set_id}/stickers/{part_num}.jpg`, logging missing images to `data/intermediate/sticker_download_log.csv` (shared cache `data/intermediate/part_img_cache.csv`).
The collage `figures/step35/sticker_sheets.png` assembles every sheet found (sorted by year, then set), with the year, the set id, and the sheet's part number under each image.
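As a rough illustration of consuming the step 35 output (a sketch, not a project script; the schema is the one `write_sticker_parts` produces):

```python
# Sketch: count sticker sheets per year from data/intermediate/sticker_parts.csv.
import csv
from collections import Counter
from pathlib import Path

per_year: Counter = Counter()
with Path("data/intermediate/sticker_parts.csv").open() as fh:
    for row in csv.DictReader(fh):
        per_year[row["year"]] += 1
print(dict(sorted(per_year.items())))
```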

[37 binary image files added (PNG figures and downloaded JPG part/sticker visuals), sized from 5.2 KiB to 2.3 MiB; previews not shown.]


@@ -101,6 +101,41 @@ def plot_part_categories_heatmap(categories_by_year_path: Path, destination_path
    plt.close(fig)


def plot_part_categories_heatmap_log(categories_by_year_path: Path, destination_path: Path) -> None:
    """Heatmap of quantities (log1p) by category and by year, excluding empty categories."""
    rows = load_rows(categories_by_year_path)
    years = extract_years(rows)
    totals: Dict[str, int] = {}
    quantity_lookup = {(row["year"], row["category_id"]): int(row["quantity_non_spare"]) for row in rows}
    for row in rows:
        totals[row["category_id"]] = totals.get(row["category_id"], 0) + int(row["quantity_non_spare"])
    categories = sorted([cat_id for cat_id, total in totals.items() if total > 0], key=lambda cat_id: -totals[cat_id])
    if not categories:
        return
    matrix = np.zeros((len(categories), len(years)))
    for i, cat_id in enumerate(categories):
        for j, year in enumerate(years):
            matrix[i, j] = np.log1p(quantity_lookup.get((year, cat_id), 0))
    fig, ax = plt.subplots(figsize=(12, 10))
    cmap = plt.get_cmap("magma")
    im = ax.imshow(matrix, aspect="auto", cmap=cmap, norm=Normalize(vmin=0, vmax=matrix.max() if matrix.max() > 0 else 1))
    ax.set_xticks(np.arange(len(years)))
    ax.set_xticklabels(years, rotation=45, ha="right")
    labels = {row["category_id"]: row["category_name"] for row in rows}
    ax.set_yticks(np.arange(len(categories)))
    ax.set_yticklabels([labels[cat_id] for cat_id in categories])
    ax.set_xlabel("Année")
    ax.set_ylabel("Catégorie de pièce")
    ax.set_title("Intensité des catégories de pièces par année (log des quantités)")
    cbar = fig.colorbar(ScalarMappable(norm=im.norm, cmap=cmap), ax=ax, fraction=0.025, pad=0.015)
    cbar.ax.set_ylabel("log1p(quantité)", rotation=90)
    ensure_parent_dir(destination_path)
    fig.tight_layout()
    fig.savefig(destination_path, dpi=170)
    plt.close(fig)


def plot_structural_share_timeline(categories_by_year_path: Path, destination_path: Path) -> None:
    """Plot the evolution of the structural categories' share over time."""
    rows = load_rows(categories_by_year_path)


@@ -6,7 +6,7 @@ from typing import List
import matplotlib.pyplot as plt
from matplotlib.offsetbox import AnnotationBbox, OffsetImage
-from PIL import Image
+from PIL import Image, ImageDraw, ImageFont

from lib.filesystem import ensure_parent_dir
@@ -21,6 +21,22 @@ def load_part_rarity(path: Path) -> List[dict]:
    return rows


def select_printed_exclusive(rows: List[dict], resources_dir: Path) -> List[dict]:
    """Filter the printed parts exclusive to the filtered sets that have a local image."""
    filtered: List[dict] = []
    for row in rows:
        if row.get("other_sets_quantity", "0") != "0":
            continue
        if "print" not in row["part_name"].lower():
            continue
        image_path = resources_dir / row.get("sample_set_id", "") / "rare_parts" / f"{row['part_num']}.jpg"
        if not image_path.exists():
            continue
        filtered.append(row)
    filtered.sort(key=lambda r: (r["part_name"], r["part_num"]))
    return filtered


def format_label(row: dict) -> str:
    """Format the vertical-axis label."""
    return f"{row['part_num']} · {row['part_name']}"
@@ -84,3 +100,56 @@ def plot_part_rarity(
    ensure_parent_dir(destination_path)
    fig.savefig(destination_path, dpi=150)
    plt.close(fig)

def plot_printed_exclusive_parts(
    path: Path,
    destination_path: Path,
    resources_dir: Path = Path("figures/rebrickable"),
    columns: int = 5,
) -> None:
    """Assemble the images of the printed parts exclusive to the filtered sets."""
    rows = load_part_rarity(path)
    selected = select_printed_exclusive(rows, resources_dir)
    selected.sort(key=lambda r: (int(r.get("sample_set_year", "9999") or 9999), r["sample_set_num"], r["part_num"]))
    if not selected:
        return
    images: List[Image.Image] = []
    labels: List[str] = []
    for row in selected:
        image_path = resources_dir / row["sample_set_id"] / "rare_parts" / f"{row['part_num']}.jpg"
        img = Image.open(image_path).convert("RGBA")
        max_side = 180
        ratio = min(max_side / img.width, max_side / img.height, 1.0)
        if ratio < 1.0:
            img = img.resize((int(img.width * ratio), int(img.height * ratio)))
        images.append(img)
        labels.append(f"{row.get('sample_set_year', '')} · {row['sample_set_num']}")
    columns = max(1, columns)
    rows_count = (len(images) + columns - 1) // columns
    cell_width = 220
    font = ImageFont.load_default()
    draw_temp = ImageDraw.Draw(Image.new("RGB", (10, 10)))

    def measure(text: str) -> tuple[int, int]:
        bbox = draw_temp.textbbox((0, 0), text, font=font)
        return bbox[2] - bbox[0], bbox[3] - bbox[1]

    text_height = max(measure(label)[1] for label in labels)
    cell_height = 190 + text_height + 14
    width = columns * cell_width
    height = rows_count * cell_height
    canvas = Image.new("RGBA", (width, height), (255, 255, 255, 255))
    draw = ImageDraw.Draw(canvas)
    for index, (img, label) in enumerate(zip(images, labels)):
        col = index % columns
        row_idx = index // columns
        x = col * cell_width + (cell_width - img.width) // 2
        y = row_idx * cell_height + 8
        canvas.paste(img, (x, y), img)
        text_width, _ = measure(label)
        text_x = col * cell_width + (cell_width - text_width) // 2
        text_y = y + img.height + 6
        draw.text((text_x, text_y), label, fill="#111111", font=font)
    ensure_parent_dir(destination_path)
    canvas.convert("RGB").save(destination_path, "PNG")


@@ -0,0 +1,72 @@
"""Visual assembly of the sticker sheets of the filtered sets."""
from pathlib import Path
from typing import List

from PIL import Image, ImageDraw, ImageFont

from lib.filesystem import ensure_parent_dir
from lib.rebrickable.stats import read_rows


def load_sticker_parts(path: Path) -> List[dict]:
    """Load the list of stickers per set."""
    return read_rows(path)


def plot_sticker_sheets(
    stickers_path: Path,
    destination_path: Path,
    resources_dir: Path = Path("figures/rebrickable"),
    columns: int = 6,
) -> None:
    """Assemble the sticker-sheet images into a grid sorted by year."""
    rows = load_sticker_parts(stickers_path)
    rows.sort(key=lambda r: (int(r["year"]), r["set_num"], r["part_num"]))
    selected: List[dict] = []
    images: List[Image.Image] = []
    for row in rows:
        image_path = resources_dir / row["set_id"] / "stickers" / f"{row['part_num']}.jpg"
        if not image_path.exists():
            continue
        img = Image.open(image_path).convert("RGBA")
        max_side = 260
        ratio = min(max_side / img.width, max_side / img.height, 1.0)
        if ratio < 1.0:
            img = img.resize((int(img.width * ratio), int(img.height * ratio)))
        images.append(img)
        selected.append(row)
    if not images:
        return
    font = ImageFont.load_default()

    def measure(text: str) -> tuple[int, int]:
        bbox = ImageDraw.Draw(Image.new("RGB", (10, 10))).textbbox((0, 0), text, font=font)
        return bbox[2] - bbox[0], bbox[3] - bbox[1]

    labels = [f"{row['year']} · {row['set_id']} · {row['part_num']}" for row in selected]
    text_height = max(measure(label)[1] for label in labels)
    max_width = max(img.width for img in images)
    max_height = max(img.height for img in images)
    columns = max(1, columns)
    rows_count = (len(images) + columns - 1) // columns
    cell_width = max(max_width + 40, 240)
    cell_height = max_height + text_height + 20
    width = columns * cell_width
    height = rows_count * cell_height
    canvas = Image.new("RGBA", (width, height), (255, 255, 255, 255))
    draw = ImageDraw.Draw(canvas)
    for index, (img, label) in enumerate(zip(images, labels)):
        col = index % columns
        row_idx = index // columns
        x = col * cell_width + (cell_width - img.width) // 2
        y = row_idx * cell_height + 6
        canvas.paste(img, (x, y), img)
        text_width, _ = measure(label)
        text_x = col * cell_width + (cell_width - text_width) // 2
        text_y = y + img.height + 6
        draw.text((text_x, text_y), label, fill="#111111", font=font)
    ensure_parent_dir(destination_path)
    canvas.convert("RGB").save(destination_path, "PNG")


@@ -46,6 +46,7 @@ def aggregate_filtered_parts(
    parts_catalog: Dict[str, dict],
    ignored_categories: Set[str] = IGNORED_PART_CATEGORY_IDS,
    ignored_minifig_categories: Set[str] = MINIFIG_PART_CATEGORY_IDS,
    exclude_printed: bool = False,
) -> Dict[str, dict]:
    """Aggregate quantities per part for the filtered sets (spares included)."""
    aggregated: Dict[str, dict] = {}
@@ -57,6 +58,8 @@
            continue
        if part["part_cat_id"] in ignored_minifig_categories:
            continue
        if exclude_printed and "print" in part["name"].lower():
            continue
        entry = aggregated.get(row["part_num"])
        if entry is None:
            entry = {"quantity": 0, "set_numbers": set()}
@@ -73,6 +76,7 @@ def compute_other_set_usage(
    filtered_set_numbers: Set[str],
    ignored_categories: Set[str] = IGNORED_PART_CATEGORY_IDS,
    ignored_minifig_categories: Set[str] = MINIFIG_PART_CATEGORY_IDS,
    exclude_printed: bool = False,
) -> Dict[str, int]:
    """Count part occurrences across the rest of the catalog (spares included)."""
    inventories = select_latest_inventories(inventories_path)
@@ -87,6 +91,8 @@
            continue
        if part["part_cat_id"] in ignored_minifig_categories:
            continue
        if exclude_printed and "print" in part["name"].lower():
            continue
        totals[row["part_num"]] = totals.get(row["part_num"], 0) + int(row["quantity"])
    return totals
@@ -98,6 +104,7 @@ def build_part_rarity(
    parts_catalog_path: Path,
    part_categories_path: Path,
    filtered_sets_path: Path,
    exclude_printed: bool = False,
) -> List[dict]:
    """Build the rarity ranking of the filtered parts."""
    parts_catalog = load_parts_catalog(parts_catalog_path)
@@ -105,12 +112,13 @@
    filtered_sets = load_filtered_sets(filtered_sets_path)
    filtered_set_numbers = set(filtered_sets.keys())
    filtered_rows = read_rows(parts_filtered_path)
-    filtered_usage = aggregate_filtered_parts(filtered_rows, parts_catalog)
+    filtered_usage = aggregate_filtered_parts(filtered_rows, parts_catalog, exclude_printed=exclude_printed)
    other_usage = compute_other_set_usage(
        inventories_path,
        inventory_parts_path,
        parts_catalog,
        filtered_set_numbers,
        exclude_printed=exclude_printed,
    )
    rows: List[dict] = []
    for part_num, entry in filtered_usage.items():
@@ -118,7 +126,8 @@
        other_quantity = other_usage.get(part_num, 0)
        total_quantity = entry["quantity"] + other_quantity
        sample_set_num = sorted(entry["set_numbers"])[0]
-        sample_set_id = filtered_sets[sample_set_num]["set_id"]
+        sample_set_row = filtered_sets[sample_set_num]
+        sample_set_id = sample_set_row["set_id"]
        rows.append(
            {
                "part_num": part_num,
@@ -127,6 +136,7 @@
                "part_category": categories[part["part_cat_id"]],
                "sample_set_num": sample_set_num,
                "sample_set_id": sample_set_id,
                "sample_set_year": sample_set_row["year"],
                "filtered_quantity": str(entry["quantity"]),
                "filtered_set_count": str(len(entry["set_numbers"])),
                "other_sets_quantity": str(other_quantity),
@@ -148,6 +158,7 @@ def write_part_rarity(destination_path: Path, rows: Sequence[dict]) -> None:
        "part_category",
        "sample_set_num",
        "sample_set_id",
        "sample_set_year",
        "filtered_quantity",
        "filtered_set_count",
        "other_sets_quantity",

@@ -0,0 +1,86 @@
"""Selection of the sticker sheets for the filtered sets."""
import csv
from pathlib import Path
from typing import Dict, Iterable, List, Tuple

from lib.filesystem import ensure_parent_dir
from lib.rebrickable.stats import read_rows

STICKER_CATEGORY_ID = "58"


def load_parts_catalog(path: Path) -> Dict[str, dict]:
    """Index the parts by part number."""
    catalog: Dict[str, dict] = {}
    with path.open() as csv_file:
        reader = csv.DictReader(csv_file)
        for row in reader:
            catalog[row["part_num"]] = row
    return catalog


def load_sets(path: Path) -> Dict[str, dict]:
    """Index the enriched sets by set_num."""
    lookup: Dict[str, dict] = {}
    for row in read_rows(path):
        lookup[row["set_num"]] = row
    return lookup


def aggregate_stickers(
    rows: Iterable[dict],
    parts_catalog: Dict[str, dict],
) -> Dict[Tuple[str, str], int]:
    """Accumulate sticker quantities per set and part number."""
    aggregated: Dict[Tuple[str, str], int] = {}
    for row in rows:
        if row["is_spare"] == "true":
            continue
        part = parts_catalog[row["part_num"]]
        if part["part_cat_id"] != STICKER_CATEGORY_ID:
            continue
        key = (row["set_num"], row["part_num"])
        aggregated[key] = aggregated.get(key, 0) + int(row["quantity_in_set"])
    return aggregated


def build_sticker_parts(
    parts_filtered_path: Path,
    parts_catalog_path: Path,
    sets_path: Path,
) -> List[dict]:
    """Build the list of sticker sheets per set."""
    rows = read_rows(parts_filtered_path)
    parts_catalog = load_parts_catalog(parts_catalog_path)
    sets_lookup = load_sets(sets_path)
    aggregated = aggregate_stickers(rows, parts_catalog)
    stickers: List[dict] = []
    for (set_num, part_num), quantity in aggregated.items():
        set_row = sets_lookup[set_num]
        part = parts_catalog[part_num]
        stickers.append(
            {
                "set_num": set_num,
                "set_id": set_row["set_id"],
                "year": set_row["year"],
                "name": set_row["name"],
                "part_num": part_num,
                "part_name": part["name"],
                "quantity": str(quantity),
            }
        )
    stickers.sort(key=lambda r: (int(r["year"]), r["set_num"], r["part_num"]))
    return stickers


def write_sticker_parts(destination_path: Path, rows: Iterable[dict]) -> None:
    """Write the CSV of stickers per set."""
    ensure_parent_dir(destination_path)
    fieldnames = ["set_num", "set_id", "year", "name", "part_num", "part_name", "quantity"]
    with destination_path.open("w", newline="") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            writer.writerow(row)


@@ -12,7 +12,9 @@ PARTS_CATALOG_PATH = Path("data/raw/parts.csv")
PART_CATEGORIES_PATH = Path("data/raw/part_categories.csv")
FILTERED_SETS_PATH = Path("data/intermediate/sets_enriched.csv")
DESTINATION_PATH = Path("data/intermediate/part_rarity.csv")
DESTINATION_PRINTED_EXCLUDED_PATH = Path("data/intermediate/part_rarity_no_print.csv")
TOP_DESTINATION_PATH = Path("data/intermediate/part_rarity_exclusive.csv")
TOP_PRINTED_EXCLUDED_PATH = Path("data/intermediate/part_rarity_exclusive_no_print.csv")


def main() -> None:
@@ -29,6 +31,19 @@ def main() -> None:
    top_rows = select_until_reused(rows)
    write_part_rarity(TOP_DESTINATION_PATH, top_rows)

    rows_no_print = build_part_rarity(
        PARTS_FILTERED_PATH,
        INVENTORIES_PATH,
        INVENTORY_PARTS_PATH,
        PARTS_CATALOG_PATH,
        PART_CATEGORIES_PATH,
        FILTERED_SETS_PATH,
        exclude_printed=True,
    )
    write_part_rarity(DESTINATION_PRINTED_EXCLUDED_PATH, rows_no_print)
    top_rows_no_print = select_until_reused(rows_no_print)
    write_part_rarity(TOP_PRINTED_EXCLUDED_PATH, top_rows_no_print)


if __name__ == "__main__":
    main()


@@ -0,0 +1,21 @@
"""Extract the sticker sheets of the filtered sets."""
from pathlib import Path

from lib.rebrickable.sticker_parts import build_sticker_parts, write_sticker_parts

PARTS_FILTERED_PATH = Path("data/intermediate/parts_filtered.csv")
PARTS_CATALOG_PATH = Path("data/raw/parts.csv")
SETS_PATH = Path("data/intermediate/sets_enriched.csv")
DESTINATION_PATH = Path("data/intermediate/sticker_parts.csv")


def main() -> None:
    """Build the CSV of stickers present in the filtered sets."""
    stickers = build_sticker_parts(PARTS_FILTERED_PATH, PARTS_CATALOG_PATH, SETS_PATH)
    write_sticker_parts(DESTINATION_PATH, stickers)


if __name__ == "__main__":
    main()


@@ -0,0 +1,76 @@
"""Download the sticker-sheet images of the filtered sets."""
import csv
import os
from pathlib import Path

import requests
from dotenv import load_dotenv

from lib.filesystem import ensure_parent_dir
from lib.rebrickable.resources import (
    build_part_img_lookup,
    download_binary,
    download_resources,
    fetch_part_img_url,
    load_part_img_cache,
    persist_part_img_cache,
)
from lib.rebrickable.stats import read_rows

STICKER_PARTS_PATH = Path("data/intermediate/sticker_parts.csv")
RESOURCES_DIR = Path("figures/rebrickable")
PART_IMG_CACHE_PATH = Path("data/intermediate/part_img_cache.csv")
DOWNLOAD_LOG_PATH = Path("data/intermediate/sticker_download_log.csv")
REQUEST_DELAY_SECONDS_IMAGES = 0.35
REQUEST_DELAY_SECONDS_LOOKUP = 0.6


def main() -> None:
    """Build the missing image URLs and download the sticker sheets."""
    load_dotenv()
    token = os.environ["REBRICKABLE_TOKEN"]
    session = requests.Session()
    stickers = read_rows(STICKER_PARTS_PATH)
    cache = load_part_img_cache(PART_IMG_CACHE_PATH)
    part_img_lookup = build_part_img_lookup(
        {row["part_num"] for row in stickers},
        fetcher=lambda part_num: fetch_part_img_url(part_num, token, session),
        cache_path=PART_IMG_CACHE_PATH,
        existing_cache=cache,
        delay_seconds=REQUEST_DELAY_SECONDS_LOOKUP,
    )
    # Prefer previously cached URLs, then persist the merged cache.
    if cache:
        part_img_lookup.update(cache)
    persist_part_img_cache(PART_IMG_CACHE_PATH, part_img_lookup)
    plan = []
    missing_log = []
    for row in stickers:
        url = part_img_lookup.get(row["part_num"])
        path = RESOURCES_DIR / row["set_id"] / "stickers" / f"{row['part_num']}.jpg"
        if not url or not str(url).startswith("http"):
            missing_log.append({"url": url or "", "path": str(path), "status": "missing_url"})
            continue
        plan.append({"url": url, "path": path})
    download_resources(
        plan,
        downloader=lambda url, path: download_binary(url, path, session),
        delay_seconds=REQUEST_DELAY_SECONDS_IMAGES,
        log_path=DOWNLOAD_LOG_PATH if not missing_log else None,
    )
    if missing_log:
        ensure_parent_dir(DOWNLOAD_LOG_PATH)
        with DOWNLOAD_LOG_PATH.open("w", newline="") as csv_file:
            writer = csv.DictWriter(csv_file, fieldnames=["url", "path", "status"])
            writer.writeheader()
            for row in missing_log:
                writer.writerow(row)


if __name__ == "__main__":
    main()


@@ -4,6 +4,7 @@ from pathlib import Path
from lib.plots.part_categories import (
    plot_part_categories_heatmap,
    plot_part_categories_heatmap_log,
    plot_structural_share_timeline,
    plot_top_part_categories_area,
)
@@ -13,6 +14,7 @@ CATEGORIES_BY_YEAR_PATH = Path("data/intermediate/part_categories_by_year.csv")
CATEGORIES_GLOBAL_PATH = Path("data/intermediate/part_categories_global.csv")
AREA_DESTINATION = Path("figures/step29/top_part_categories_area.png")
HEATMAP_DESTINATION = Path("figures/step29/part_categories_heatmap.png")
HEATMAP_LOG_DESTINATION = Path("figures/step29/part_categories_heatmap_log.png")
STRUCTURAL_DESTINATION = Path("figures/step29/structural_share_timeline.png")
@@ -20,6 +22,7 @@ def main() -> None:
    """Generate the per-category distribution visuals."""
    plot_top_part_categories_area(CATEGORIES_BY_YEAR_PATH, CATEGORIES_GLOBAL_PATH, AREA_DESTINATION)
    plot_part_categories_heatmap(CATEGORIES_BY_YEAR_PATH, HEATMAP_DESTINATION)
    plot_part_categories_heatmap_log(CATEGORIES_BY_YEAR_PATH, HEATMAP_LOG_DESTINATION)
    plot_structural_share_timeline(CATEGORIES_BY_YEAR_PATH, STRUCTURAL_DESTINATION)


@@ -2,17 +2,23 @@
from pathlib import Path

-from lib.plots.part_rarity import plot_part_rarity
+from lib.plots.part_rarity import plot_part_rarity, plot_printed_exclusive_parts

PART_RARITY_TOP_PATH = Path("data/intermediate/part_rarity_exclusive.csv")
DESTINATION_PATH = Path("figures/step34/part_rarity.png")
RESOURCES_DIR = Path("figures/rebrickable")
PART_RARITY_NO_PRINT_PATH = Path("data/intermediate/part_rarity_exclusive_no_print.csv")
DESTINATION_NO_PRINT = Path("figures/step34/part_rarity_no_print.png")
PART_RARITY_FULL_PATH = Path("data/intermediate/part_rarity.csv")
DESTINATION_PRINTED_COLLAGE = Path("figures/step34/printed_exclusive_parts.png")


def main() -> None:
    """Load the top rare parts and produce the illustrated plots."""
    plot_part_rarity(PART_RARITY_TOP_PATH, DESTINATION_PATH, resources_dir=RESOURCES_DIR)
    plot_part_rarity(PART_RARITY_NO_PRINT_PATH, DESTINATION_NO_PRINT, resources_dir=RESOURCES_DIR)
    plot_printed_exclusive_parts(PART_RARITY_FULL_PATH, DESTINATION_PRINTED_COLLAGE, resources_dir=RESOURCES_DIR)


if __name__ == "__main__":


@@ -0,0 +1,19 @@
"""Assemble the sticker-sheet visuals of the filtered sets."""
from pathlib import Path

from lib.plots.sticker_sheets import plot_sticker_sheets

STICKER_PARTS_PATH = Path("data/intermediate/sticker_parts.csv")
DESTINATION_PATH = Path("figures/step35/sticker_sheets.png")
RESOURCES_DIR = Path("figures/rebrickable")


def main() -> None:
    """Build the sticker-sheet collage."""
    plot_sticker_sheets(STICKER_PARTS_PATH, DESTINATION_PATH, resources_dir=RESOURCES_DIR)


if __name__ == "__main__":
    main()


@@ -5,6 +5,7 @@ from pathlib import Path
from lib.plots.part_categories import (
    plot_part_categories_heatmap,
    plot_part_categories_heatmap_log,
    plot_structural_share_timeline,
    plot_top_part_categories_area,
)
@@ -31,15 +32,19 @@ def test_plot_part_categories_outputs_images(tmp_path: Path) -> None:
    )
    area_dest = tmp_path / "figures" / "step29" / "top_part_categories_area.png"
    heatmap_dest = tmp_path / "figures" / "step29" / "part_categories_heatmap.png"
    heatmap_log_dest = tmp_path / "figures" / "step29" / "part_categories_heatmap_log.png"
    structural_dest = tmp_path / "figures" / "step29" / "structural_share_timeline.png"
    plot_top_part_categories_area(by_year, by_global, area_dest, top_n=2)
    plot_part_categories_heatmap(by_year, heatmap_dest)
    plot_part_categories_heatmap_log(by_year, heatmap_log_dest)
    plot_structural_share_timeline(by_year, structural_dest)
    assert area_dest.exists()
    assert heatmap_dest.exists()
    assert heatmap_log_dest.exists()
    assert structural_dest.exists()
    assert area_dest.stat().st_size > 0
    assert heatmap_dest.stat().st_size > 0
    assert heatmap_log_dest.stat().st_size > 0
    assert structural_dest.stat().st_size > 0


@@ -58,6 +58,7 @@ def test_build_part_rarity_counts_spares_and_ignores_categories(tmp_path: Path)
            ["p4", "Figure Limb", "41", "Plastic"],
            ["p5", "Sticker Sheet", "58", "Plastic"],
            ["p6", "Exclusive Tile", "1", "Plastic"],
            ["p7", "Slope 45 print", "1", "Plastic"],
        ],
    )
    part_categories = tmp_path / "part_categories.csv"
@@ -95,6 +96,7 @@ def test_build_part_rarity_counts_spares_and_ignores_categories(tmp_path: Path)
            ["3", "p4", "1", "4", "True", ""],
            ["4", "p1", "1", "8", "False", ""],
            ["5", "p5", "1", "9", "False", ""],
            ["5", "p7", "1", "5", "False", ""],
        ],
    )
@@ -115,6 +117,7 @@ def test_build_part_rarity_counts_spares_and_ignores_categories(tmp_path: Path)
            "part_category": "Bricks",
            "sample_set_num": "2000-1",
            "sample_set_id": "2000",
            "sample_set_year": "2021",
            "filtered_quantity": "1",
            "filtered_set_count": "1",
            "other_sets_quantity": "0",
@@ -128,6 +131,7 @@ def test_build_part_rarity_counts_spares_and_ignores_categories(tmp_path: Path)
            "part_category": "Bricks",
            "sample_set_num": "1000-1",
            "sample_set_id": "1000",
            "sample_set_year": "2020",
            "filtered_quantity": "3",
            "filtered_set_count": "2",
            "other_sets_quantity": "3",
@@ -141,6 +145,7 @@ def test_build_part_rarity_counts_spares_and_ignores_categories(tmp_path: Path)
            "part_category": "Large Buildable Figures",
            "sample_set_num": "2000-1",
            "sample_set_id": "2000",
            "sample_set_year": "2021",
            "filtered_quantity": "2",
            "filtered_set_count": "1",
            "other_sets_quantity": "4",
@@ -150,6 +155,17 @@ def test_build_part_rarity_counts_spares_and_ignores_categories(tmp_path: Path)
        },
    ]
    assert select_until_reused(rows) == [rows[0], rows[1]]

    rows_no_print = build_part_rarity(
        parts_filtered,
        inventories,
        inventory_parts,
        parts_catalog,
        part_categories,
        sets_enriched,
        exclude_printed=True,
    )
    assert all(r["part_num"] != "p7" for r in rows_no_print)


def test_write_part_rarity_outputs_csv(tmp_path: Path) -> None:
    """Serialize the rarity ranking."""
@@ -162,6 +178,7 @@ def test_write_part_rarity_outputs_csv(tmp_path: Path) -> None:
            "part_category": "Bricks",
            "sample_set_num": "123-1",
            "sample_set_id": "123",
            "sample_set_year": "2020",
            "filtered_quantity": "3",
            "filtered_set_count": "2",
            "other_sets_quantity": "3",
@@ -175,7 +192,7 @@ def test_write_part_rarity_outputs_csv(tmp_path: Path) -> None:
    assert destination.exists()
    content = destination.read_text().strip().splitlines()
    assert content[0] == (
-        "part_num,part_name,part_cat_id,part_category,sample_set_num,sample_set_id,filtered_quantity,filtered_set_count,"
+        "part_num,part_name,part_cat_id,part_category,sample_set_num,sample_set_id,sample_set_year,filtered_quantity,filtered_set_count,"
        "other_sets_quantity,catalog_total_quantity,filtered_share"
    )
-    assert content[1] == "p1,Brick 1x1,1,Bricks,123-1,123,3,2,3,6,0.5000"
+    assert content[1] == "p1,Brick 1x1,1,Bricks,123-1,123,2020,3,2,3,6,0.5000"


@@ -0,0 +1,33 @@
"""Tests for the collage of printed exclusive parts."""
import matplotlib
from pathlib import Path

from PIL import Image

from lib.plots.part_rarity import plot_printed_exclusive_parts

matplotlib.use("Agg")


def test_plot_printed_exclusive_parts(tmp_path: Path) -> None:
    """Generate a collage of printed exclusive parts with local images."""
    data_path = tmp_path / "part_rarity.csv"
    resources_dir = tmp_path / "figures" / "rebrickable"
    resources_dir.mkdir(parents=True)
    (resources_dir / "1000" / "rare_parts").mkdir(parents=True)
    (resources_dir / "2000" / "rare_parts").mkdir(parents=True)
    Image.new("RGB", (60, 40), color=(255, 0, 0)).save(resources_dir / "1000" / "rare_parts" / "p1.jpg")
    Image.new("RGB", (60, 40), color=(0, 255, 0)).save(resources_dir / "2000" / "rare_parts" / "p2.jpg")
    data_path.write_text(
        "part_num,part_name,part_cat_id,part_category,sample_set_num,sample_set_id,sample_set_year,filtered_quantity,filtered_set_count,other_sets_quantity,catalog_total_quantity,filtered_share\n"
        "p1,Slope print,1,Bricks,1000-1,1000,2020,3,2,0,3,1.0000\n"
        "p2,Tile print,1,Bricks,2000-1,2000,2021,2,1,0,2,1.0000\n"
        "p3,Tile plain,1,Bricks,2000-1,2000,2021,2,1,0,2,1.0000\n"
    )
    destination = tmp_path / "figures" / "step34" / "printed_exclusive_parts.png"
    plot_printed_exclusive_parts(data_path, destination, resources_dir=resources_dir, columns=2)
    assert destination.exists()
    assert destination.stat().st_size > 0

tests/test_sticker_parts.py (new file, 104 lines)

@@ -0,0 +1,104 @@
"""Tests for the sticker-sheet extraction."""
import csv
from pathlib import Path

from lib.rebrickable.sticker_parts import build_sticker_parts, write_sticker_parts


def write_csv(path: Path, headers: list[str], rows: list[list[str]]) -> None:
    """Write a simple CSV."""
    with path.open("w", newline="") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(headers)
        writer.writerows(rows)


def test_build_sticker_parts_filters_category_and_spares(tmp_path: Path) -> None:
    """Keep only the stickers (category 58), excluding spares."""
    parts_filtered = tmp_path / "parts_filtered.csv"
    write_csv(
        parts_filtered,
        [
            "part_num",
            "color_rgb",
            "is_translucent",
            "set_num",
            "set_id",
            "year",
            "quantity_in_set",
            "is_spare",
            "is_minifig_part",
        ],
        [
            ["st1", "AAAAAA", "false", "1000-1", "1000", "2020", "1", "false", "false"],
            ["st1", "AAAAAA", "false", "1000-1", "1000", "2020", "2", "true", "false"],
            ["br1", "BBBBBB", "false", "1000-1", "1000", "2020", "5", "false", "false"],
            ["st2", "CCCCCC", "false", "2000-1", "2000", "2021", "3", "false", "false"],
        ],
    )
    parts_catalog = tmp_path / "parts.csv"
    write_csv(
        parts_catalog,
        ["part_num", "name", "part_cat_id", "part_material"],
        [
            ["st1", "Sticker Sheet 1", "58", "Plastic"],
            ["st2", "Sticker Sheet 2", "58", "Plastic"],
            ["br1", "Brick", "1", "Plastic"],
        ],
    )
    sets_enriched = tmp_path / "sets_enriched.csv"
    write_csv(
        sets_enriched,
        ["set_num", "set_id", "name", "year", "in_collection"],
        [
            ["1000-1", "1000", "Set A", "2020", "true"],
            ["2000-1", "2000", "Set B", "2021", "false"],
        ],
    )
    stickers = build_sticker_parts(parts_filtered, parts_catalog, sets_enriched)
    assert stickers == [
        {
            "set_num": "1000-1",
            "set_id": "1000",
            "year": "2020",
            "name": "Set A",
            "part_num": "st1",
            "part_name": "Sticker Sheet 1",
            "quantity": "1",
        },
        {
            "set_num": "2000-1",
            "set_id": "2000",
            "year": "2021",
            "name": "Set B",
            "part_num": "st2",
            "part_name": "Sticker Sheet 2",
            "quantity": "3",
        },
    ]


def test_write_sticker_parts_outputs_csv(tmp_path: Path) -> None:
    """Serialize the list of stickers per set."""
    destination = tmp_path / "sticker_parts.csv"
    rows = [
        {
            "set_num": "123-1",
            "set_id": "123",
            "year": "2020",
            "name": "Set",
            "part_num": "st1",
            "part_name": "Sticker",
            "quantity": "1",
        }
    ]
    write_sticker_parts(destination, rows)
    assert destination.exists()
    content = destination.read_text().strip().splitlines()
    assert content[0] == "set_num,set_id,year,name,part_num,part_name,quantity"
    assert content[1] == "123-1,123,2020,Set,st1,Sticker,1"


@@ -0,0 +1,27 @@
"""Tests for the sticker-sheet collage."""
from pathlib import Path

from PIL import Image

from lib.plots.sticker_sheets import plot_sticker_sheets


def test_plot_sticker_sheets(tmp_path: Path) -> None:
    """Generate a grid of sticker sheets with labels."""
    stickers_path = tmp_path / "sticker_parts.csv"
    resources_dir = tmp_path / "figures" / "rebrickable"
    (resources_dir / "1000" / "stickers").mkdir(parents=True)
    (resources_dir / "2000" / "stickers").mkdir(parents=True)
    Image.new("RGB", (120, 80), color=(255, 0, 0)).save(resources_dir / "1000" / "stickers" / "st1.jpg")
    Image.new("RGB", (100, 60), color=(0, 255, 0)).save(resources_dir / "2000" / "stickers" / "st2.jpg")
    stickers_path.write_text(
        "set_num,set_id,year,name,part_num,part_name,quantity\n"
        "1000-1,1000,2020,Set A,st1,Sticker 1,1\n"
        "2000-1,2000,2021,Set B,st2,Sticker 2,1\n"
    )
    destination = tmp_path / "figures" / "step35" / "sticker_sheets.png"
    plot_sticker_sheets(stickers_path, destination, resources_dir=resources_dir, columns=2)
    assert destination.exists()
    assert destination.stat().st_size > 0