
Tool replacement

2026-03-28 02:00:07 +01:00
parent a2cbaeacd1
commit 44dc63bebf
80 changed files with 73 additions and 13693 deletions

.gitignore vendored

@@ -13,3 +13,4 @@ __pycache__
.env
.codex
drafts
+.cache


@@ -69,8 +69,8 @@ Installing a [runner](https://docs.gitea.com/next/usage/actions/act-runner)
First of all, you need to obtain a token from Gitea.
Depending on where you request it, the runner may be assigned to a single project, to an organization, or to every organization on the instance.
-In any case, you have to go into the settings (of the project, the organization, or the Gitea instance), then into the "*Runners*" tab.
-Then click the "*Create runner*" button and copy the token.
+In any case, you have to go into the settings (of the project, the organization, or the Gitea instance), then into the "_Runners_" tab.
+Then click the "_Create runner_" button and copy the token.
To integrate this token into our NixOS configuration, we have several options:
@@ -130,7 +130,7 @@ echo -n "<token>" > ./gitea-actions-runner-token
As a reminder, the `-n` option avoids the trailing newline at the end of the file (which would invalidate the token), and here too, `<token>` is to be replaced with the token copied earlier.
-***
+---
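A quick demonstration of why the `-n` matters (the value below is a placeholder, not a real token):

```shell
echo "dummy" > with-newline          # echo appends a trailing newline
echo -n "dummy" > without-newline    # -n suppresses it
wc -c < with-newline                 # 6 bytes: 5 characters plus the newline
wc -c < without-newline              # 5 bytes: the characters only
```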
Next, we have to configure some [labels](https://docs.gitea.com/next/usage/actions/act-runner#labels).
Very roughly speaking, this means associating a name with a container in which the actions will be executed, or with the host machine itself.
@@ -138,8 +138,8 @@ Very roughly speaking, this means associating a name with a container in which the actions will be executed
The NixOS package [does not yet provide a default value](https://search.nixos.org/options?channel=23.05&show=services.gitea-actions-runner.instances.%3Cname%3E.labels&from=0&size=50&sort=alpha_asc&type=packages&query=gitea), and this parameter cannot be empty.
Fortunately, the Gitea documentation lists the default values used by the executable, but they are obsolete. I based my own configuration on the example value provided by NixOS.
-In the example I am showing you, I can run my actions on "*ubuntu-latest*", which corresponds to the docker image of nodejs v18 on a debian bullseye.
-And it is only while writing these lines that I realize this example is bad: the *ubuntu-latest* label bears no relation to the image actually used...
+In the example I am showing you, I can run my actions on "_ubuntu-latest_", which corresponds to the docker image of nodejs v18 on a debian bullseye.
+And it is only while writing these lines that I realize this example is bad: the _ubuntu-latest_ label bears no relation to the image actually used...
As an exercise, I leave it to the reader to modify this line however they see fit 😊
Another label I use is `native:host`, which makes it possible to run actions on the `native` target, the host machine.
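As a sketch, such a label list could look like this in the NixOS module (the instance name `main` and the exact image tag are my assumptions, not values from the post):

```nix
services.gitea-actions-runner.instances.main = {
  # "ubuntu-latest" runs jobs in a docker container; "native" runs them on the host
  labels = [
    "ubuntu-latest:docker://node:18-bullseye"
    "native:host"
  ];
};
```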
@@ -181,7 +181,7 @@ With Gitea Actions, this goes - for example - through the following yaml "code":
uses: actions/checkout@v3
```
-So, to fetch **my** sources onto **my** server, I first have to clone the https://github.com/actions/checkout repository.
+So, to fetch **my** sources onto **my** server, I first have to clone the <https://github.com/actions/checkout> repository.
Surprise: it is all javascript.
The redistributable weighs close to 1 MB of javascript, and [the sources](https://github.com/actions/checkout/tree/main/src) are impenetrable and, on top of that, undocumented.
So what does all this javascript do that the `git` binary does not, apart from a few configuration options?
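For comparison, the core of what a checkout step has to do can be sketched with the `git` binary alone (`$REPO_URL` and `$REF` are placeholders):

```shell
# Minimal equivalent of a shallow checkout, without any javascript
git init -q workdir && cd workdir
git remote add origin "$REPO_URL"
git fetch -q --depth=1 origin "$REF"
git checkout -q FETCH_HEAD
```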
@@ -221,7 +221,7 @@ To "enable" the use of LFS, you have to add a line:
```
I suspect here again an integration with the Github API, because at the git level there is no need at all to specify that you want LFS during a `git clone`.
-But without this option, my repository gets cloned without LFS *without my knowing it*.
+But without this option, my repository gets cloned without LFS _without my knowing it_.
It is only much later, when the site's files are being built, that Hugo crashes while trying to access files that do not exist.
And this is symptomatic of absolutely **the entire** javascript ecosystem (and Go's, for that matter): **nothing, absolutely nothing** is intuitive.
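Assembled, the step in question presumably ends up looking like this (`lfs` is a documented input of actions/checkout; the rest of the step is my reconstruction):

```yaml
- uses: actions/checkout@v3
  with:
    lfs: true
```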
@@ -229,7 +229,7 @@ Everything is - badly - rethought, everything is a pile of abstractions, and a user
This counter-intuitive behavior forced me to question my blog's entire production chain: did I introduce a bug in the image handling? am I using the right version of Hugo? am I missing a dependency? why does it work under Drone and not under Actions? what are the differences between the docker images? and so on.
-At no point did I get an *explicit* message.
+At no point did I get an _explicit_ message.
And this is once again symptomatic of the modern philosophy of development: errors and problems must be hidden from the user so as not to create anxiety-inducing situations.
A process is only stopped in a critical situation.
Execution must continue at all costs.
@@ -238,7 +238,7 @@ As a result, Hugo refuses to build some templates because the "format of an image" is incorrect
I modify the template in question, trying to fix a bug that does not exist.
I restart the CI, have to wait to reach the asset compilation step after downloading half the Internet for the hundredth time, only to see that the "bug" has "spread" to other files.
-When an application tells me "incorrect image format", I assume some particular library is missing (something like *libpng* or *libjpeg*).
-+When an application tells me "incorrect image format", I assume some particular library is missing (something like _libpng_ or _libjpeg_).
It does not occur to me at all to enable a yaml option so that the clone uses LFS...
And it's not over yet...
@@ -329,7 +329,7 @@ At once pleased to find this much-awaited feature among the
I launch it, wait until we reach that step, and it crashes with a message telling me my key is invalid, in a masterful illustration of my earlier remark (execution continues no matter what).
The action spammed my SSH server after its unsuccessful attempt to use my "invalid" key, carrying on with each step as if the key had been accepted, in order to do... I'm not quite sure what, actually; all I wanted was an `rsync`...
-``` {linenos=false,class=not-prose}
+```{linenos=false,class=not-prose}
[DIR] Creating /***/.ssh dir in workspace ***
✅ [DIR] dir created.
[FILE] writing /***/.ssh/known_hosts file ... 0
@@ -410,7 +410,7 @@ rsync error: unexplained error (code 255) at io.c(228) [sender=3.2.3]
Yet this same private key had been used by Drone for several years without the slightest problem.
Actually, it is Gitea that should be pointed at here.
-The form for submitting a secret [strips spaces and empty lines](https://github.com/go-gitea/gitea/issues/24721) at the beginning and *at the end* of the submitted string.
+The form for submitting a secret [strips spaces and empty lines](https://github.com/go-gitea/gitea/issues/24721) at the beginning and _at the end_ of the submitted string.
However, an empty line is required at the end of the string for a private key to be considered valid.
And Gitea did warn me: in the text area where I paste my private key, it is clearly stated that leading and trailing whitespace will be stripped, which I only noticed after a day spent searching for everything that could possibly be wrong with my private key.
And since it is a secret, I cannot even try to display it to debug it, and since it would in any case have been formatted for display in the CI tab, I probably would not have noticed that the final empty line was missing.
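The failure mode can be reproduced locally with a throwaway key (a sketch; I am assuming `ssh-keygen`'s parser behaves like the runner's ssh client here):

```shell
ssh-keygen -q -t ed25519 -N "" -f demo_key
ssh-keygen -y -f demo_key > /dev/null && echo "key accepted"
# Strip the trailing newline, as Gitea's secret form does:
printf '%s' "$(cat demo_key)" > stripped_key
chmod 600 stripped_key
ssh-keygen -y -f stripped_key > /dev/null || echo "key rejected without trailing newline"
```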
@@ -441,5 +441,5 @@ But for me, who has never set foot in it, it is, I repeat, absurdly complicated
As far as I'm concerned, I'm dropping Gitea Actions, and I'm going to dig into [Woodpecker](https://woodpecker-ci.org/), a fork of Drone that has packages for NixOS.
-The worst part is that, for once, I'm not proud of my *rant*: I really wanted it to work, to please me, to suit me, to get me excited.
+The worst part is that, for once, I'm not proud of my _rant_: I really wanted it to work, to please me, to suit me, to get me excited.
But I am very far from that...


@@ -1 +1 @@
-attribution: https://commons.wikimedia.org/wiki/File:Chromium_47_Aw,_Snap!_screenshot_(en).png
+attribution: <https://commons.wikimedia.org/wiki/File:Chromium_47_Aw,_Snap!_screenshot_(en).png>


@@ -1 +1 @@
-attribution: https://commons.wikimedia.org/wiki/File:Windows_10_%26_11_BSOD_(new_version).png
+attribution: <https://commons.wikimedia.org/wiki/File:Windows_10_%26_11_BSOD_(new_version).png>


@@ -1,4 +1,4 @@
title: "Catopsbaatar"
-attribution: "By Artwork by Bogusław Waksmundzki. Article by Zofia Kielan-Jaworowska and Jørn H. Hurum - http://app.pan.pl/article/item/app51-393.html, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=27466723"
+attribution: "By Artwork by Bogusław Waksmundzki. Article by Zofia Kielan-Jaworowska and Jørn H. Hurum - <http://app.pan.pl/article/item/app51-393.html>, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=27466723"
description: "Restoration of [Catopsbaatar](https://en.wikipedia.org/wiki/Catopsbaatar)"
#prompt: ""


@@ -1,4 +1,4 @@
#title: ""
-attribution: "Par Zofia Kielan-Jaworowska and Jørn H. Hurum — http://app.pan.pl/article/item/app51-393.html, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=27534900"
+attribution: "Par Zofia Kielan-Jaworowska and Jørn H. Hurum — <http://app.pan.pl/article/item/app51-393.html>, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=27534900"
description: "Early eutherian [Eomaia scansoria](https://fr.wikipedia.org/wiki/Eomaia) Ji Q., Luo, Yuan, Wible, Hang, and Georgi, 2002. (CAGS 01IG1)."
#prompt: ""

data/external_links.yaml Normal file

@@ -0,0 +1,50 @@
generatedAt: '2026-03-28 01:53:10'
deadLinks:
- url: http://abc.go.com/shows/lost
locations:
- /critiques/series/lost
code: 404
- url: http://archive.wikiwix.com/cache/display2.php/WMR_documents.final_27_April_1.FINAL.pdf?url=http%3A%2F%2Fwww.wmo.int%2Fpages%2Fprog%2Farep%2Fwwrp%2Fnew%2Fdocuments%2FWMR_documents.final_27_April_1.FINAL.pdf
locations:
- /interets/philosophie/2023/06/22/lhumain-cette-espece-primitive-emancipation-ou-asservissement
code: 403
- url: https://danielbmarkham.com/twilight-of-the-programmers/
locations:
- /interets/liens-interessants/2023/06/23/a8972f33
code: 404
- url: https://git.dern.ovh/Blog/contenu
locations:
- /interets/informatique/2023/09/03/nouveau-site-en-ligne
code: 404
- url: https://jarredsumner.com/codeblog/
locations:
- /interets/liens-interessants/2022/07/06/536ac204
code: 404
- url: https://kbdfans.com/collections/keyboard-stabilizer/products/gmk-screw-in-stabilizers?variant=22154915348528
locations:
- /interets/informatique/2022/01/11/a-la-recherche-du-clavier-parfait-un-clavier-100-custom
code: 404
- url: https://kbdfans.com/collections/wrist-rest/products/handmade-resin-wrist-rest-1?variant=39444177223819
locations:
- /interets/informatique/2022/01/11/a-la-recherche-du-clavier-parfait-un-clavier-100-custom
code: 404
- url: https://kbdfans.com/products/dz60rgb-ansi-pcb-foam
locations:
- /interets/informatique/2022/01/11/a-la-recherche-du-clavier-parfait-un-clavier-100-custom
code: 404
- url: https://keygem.store/collections/tools/products/kbdfans-switch-lube-station
locations:
- /interets/informatique/2022/01/11/a-la-recherche-du-clavier-parfait-un-clavier-100-custom
code: 404
- url: https://liftoffsoftware.com/Products/GateOne
locations:
- /interets/informatique/2025/12/12/piloter-ses-serveurs-avec-un-emulateur-de-terminal-web
code: 408
- url: https://mintrocketgames.com/en/DaveTheDiver
locations:
- /critiques/jeux-video/dave-the-diver
code: 404
- url: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3956428
locations:
- /interets/informatique/2025/12/10/j-ai-quitte-instagram
code: 403

deploy.sh

@@ -1,117 +0,0 @@
#!/run/current-system/sw/bin/bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "$SCRIPT_DIR/.env" ]; then
# Export all variables from .env for the duration of the script
set -a
. "$SCRIPT_DIR/.env"
set +a
fi
# Configuration
DEST_USER="${DEPLOY_DEST_USER:?DEPLOY_DEST_USER manquant}"
DEST_HOST="${DEPLOY_DEST_HOST:?DEPLOY_DEST_HOST manquant}"
SECOND_DEST_HOST="${DEPLOY_SECOND_DEST_HOST:?DEPLOY_SECOND_DEST_HOST manquant}"
DEST_DIR="/var/lib/www/richard-dern.fr/"
HUGO_ENV="production"
TARGET_OWNER="caddy:caddy"
CHOWN_BIN="/run/current-system/sw/bin/chown"
SETFACL_BIN="$(realpath /run/current-system/sw/bin/setfacl)"
is_local_host() {
local target="$1"
local hostnames
hostnames="$(hostname) $(hostname -f) $(hostname -s) localhost 127.0.0.1 ::1"
for name in $hostnames; do
if [ "$target" = "$name" ]; then
return 0
fi
done
return 1
}
DEST_HOSTS=("$DEST_HOST")
if [ "$SECOND_DEST_HOST" != "$DEST_HOST" ]; then
DEST_HOSTS+=("$SECOND_DEST_HOST")
fi
deploy_to_host() {
local host="$1"
local local_deploy=false
local cmd
if is_local_host "$host"; then
local_deploy=true
fi
if [ "$local_deploy" = true ]; then
if [ ! -d "$DEST_DIR" ]; then
echo "Dossier de destination introuvable: $DEST_DIR" >&2
exit 1
fi
echo "==> Vérification/pose des ACL sur $DEST_DIR"
sudo "$SETFACL_BIN" -R -m u:"$USER":rwx -m m:rwx "$DEST_DIR"
sudo "$SETFACL_BIN" -dR -m u:"$USER":rwx -m m:rwx "$DEST_DIR"
echo "==> Synchronisation locale du site vers $DEST_DIR"
rsync -rlvz --delete \
--exclude='data/***' \
--no-owner --no-group \
--no-perms --omit-dir-times --no-times \
--checksum \
public/ "$DEST_DIR"
find "$DEST_DIR" -type d -name data -exec rm -rf {} +
sudo "$CHOWN_BIN" -R "$TARGET_OWNER" "$DEST_DIR"
return
fi
echo "==> Synchronisation du site vers le serveur $host"
rsync -rlvz --delete \
--exclude='data/***' \
--no-owner --no-group \
--no-perms --omit-dir-times --no-times \
--checksum \
public/ "$DEST_USER@$host:$DEST_DIR"
cmd="find \"$DEST_DIR\" -type d -name data -exec rm -rf {} +"
ssh "$DEST_USER@$host" "$cmd"
ssh "$DEST_USER@$host" "$CHOWN_BIN -R $TARGET_OWNER '$DEST_DIR'"
}
echo "==> Vérification des liens externes"
node "$SCRIPT_DIR/tools/check_external_links.js"
echo "==> Vérification des liens internes"
node "$SCRIPT_DIR/tools/check_internal_links.js"
echo "==> Enrichissement météo des articles"
node "$SCRIPT_DIR/tools/add_weather.js"
echo "==> Synchronisation des commentaires Lemmy"
node "$SCRIPT_DIR/tools/sync_lemmy_comments.js"
echo "==> Génération des statistiques"
npm run stats:generate
# echo "==> Application des taxonomies et mots-clés"
# node "$SCRIPT_DIR/tools/link_taxonomy_terms.js"
# echo "==> Ajout des liens vers les mots-clés du frontmatter"
# node "$SCRIPT_DIR/tools/link_frontmatter_keywords.js"
echo "==> Génération du site Hugo pour l'environnement $HUGO_ENV (avec nettoyage de destination)"
hugo --environment "$HUGO_ENV" --cleanDestinationDir
for host in "${DEST_HOSTS[@]}"; do
deploy_to_host "$host"
done
node tools/update_lemmy_post_dates.js
echo "==> Déploiement terminé avec succès."

manage Normal file

@@ -0,0 +1,4 @@
#!/usr/bin/env bash
set -euo pipefail
export MANAGE_ROOT_DIR="$(CDPATH= cd -- "$(dirname -- "$0")" && pwd)"
exec /var/lib/richard/blog/manager/manage "$@"

test.sh

@@ -1,25 +0,0 @@
#!/run/current-system/sw/bin/bash
set -euo pipefail
detect_dev_host() {
local ip_bin="/run/current-system/sw/bin/ip"
[ -x "$ip_bin" ] || { echo "Binaire ip introuvable : $ip_bin" >&2; exit 1; }
local iface candidate
iface="$("$ip_bin" -4 route show default | awk 'NR==1 {print $5}')"
[ -n "$iface" ] || { echo "Impossible de déterminer l'interface par défaut." >&2; exit 1; }
candidate="$("$ip_bin" -4 addr show dev "$iface" scope global | awk '/inet / {print $2}' | cut -d/ -f1 | head -n1)"
[ -n "$candidate" ] || { echo "Aucune adresse IPv4 trouvée sur l'interface $iface." >&2; exit 1; }
printf '%s\n' "$candidate"
}
DEV_HOST="$(detect_dev_host)"
DEV_PORT=1313
if [ -z "${MEILI_SEARCH_API_KEY+x}" ]; then
export MEILI_SEARCH_API_KEY="dummy"
fi
hugo server -F --bind "$DEV_HOST" --port $DEV_PORT --baseURL "http://$DEV_HOST:$DEV_PORT"


@@ -1,4 +1,4 @@
-{{- $defaultReportPath := "tools/cache/external_links.yaml" -}}
+{{- $defaultReportPath := "data/external_links.yaml" -}}
{{- $reportPath := default $defaultReportPath .ReportPath -}}
{{- $report := default (dict) .Report -}}
{{- if or (eq (len $report) 0) (not (isset $report "links")) -}}


@@ -1,454 +0,0 @@
#!/usr/bin/env node
/*
Adds a LEGO set to the collection using Rebrickable API.
Usage: node tools/add_lego.js <query>
Flow:
1) Search sets with the provided query
2) Interactive choice if multiple results
3) Fetch chosen set details and its theme
4) Create content/collections/lego/<theme-slug>/<set-id>/
- Always create images/
- Create data/images/
- Download all available set images from Rebrickable
- Create data/images/<image-name>.yaml with attribution
- Create index.md with title and date (only if not existing),
including markdown references to images found
*/
const fs = require('fs/promises');
const fsSync = require('fs');
const path = require('path');
const https = require('https');
const readline = require('readline');
const { loadToolsConfig } = require('./lib/config');
const { formatDateTime } = require('./lib/datetime');
const ROOT = process.cwd();
const LEGO_ROOT = path.join(ROOT, 'content', 'collections', 'lego');
const CACHE_ROOT = path.join(ROOT, 'tools', 'cache', 'rebrickable');
const ONE_WEEK_MS = 7 * 24 * 60 * 60 * 1000;
function slugify(input) {
return input
.normalize('NFD')
.replace(/\p{Diacritic}/gu, '')
.toLowerCase()
.replace(/[^a-z0-9]+/g, '-')
.replace(/^-+|-+$/g, '')
.replace(/-{2,}/g, '-');
}
function prompt(question) {
const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
return new Promise((resolve) => rl.question(question, (ans) => { rl.close(); resolve(ans.trim()); }));
}
function apiGetJson(host, path, apiKey) {
return new Promise((resolve) => {
const req = https.request({
host,
path,
method: 'GET',
headers: {
'Authorization': `key ${apiKey}`,
'Accept': 'application/json',
'User-Agent': 'add_lego_script/1.0',
},
}, (res) => {
let data = '';
res.on('data', (chunk) => { data += chunk; });
res.on('end', () => { resolve(JSON.parse(data)); });
});
req.end();
});
}
function downloadFile(url, destPath) {
return new Promise((resolve) => {
const file = fsSync.createWriteStream(destPath);
https.get(url, (res) => {
res.pipe(file);
file.on('finish', () => file.close(() => resolve()));
});
});
}
async function ensureDir(dir) {
await fs.mkdir(dir, { recursive: true });
}
async function saveJson(relPath, data) {
const out = path.join(CACHE_ROOT, relPath);
await ensureDir(path.dirname(out));
await fs.writeFile(out, JSON.stringify(data, null, 2), 'utf8');
}
async function cachedApiJson(relPath, host, apiPath, apiKey, label, options = {}) {
const full = path.join(CACHE_ROOT, relPath);
if (options.noCache) {
const json = await apiGetJson(host, apiPath, apiKey);
await ensureDir(path.dirname(full));
await fs.writeFile(full, JSON.stringify(json, null, 2), 'utf8');
if (label) console.log(`[fetch] ${label} <- ${apiPath} [nocache]`);
return json;
}
try {
const st = await fs.stat(full);
const ageMs = Date.now() - st.mtimeMs;
if (ageMs < ONE_WEEK_MS) {
try {
const raw = await fs.readFile(full, 'utf8');
const json = JSON.parse(raw);
if (label) console.log(`[cache] ${label}`);
return json;
} catch (_) {
// Corrupt cache, fall through to refetch
if (label) console.log(`[stale] ${label} (cache unreadable) -> refetch`);
}
} else {
if (label) console.log(`[stale] ${label} (age ${(ageMs / ONE_WEEK_MS).toFixed(2)} weeks) -> refetch`);
}
} catch (_) {
// No cache, fetch
}
const json = await apiGetJson(host, apiPath, apiKey);
await ensureDir(path.dirname(full));
await fs.writeFile(full, JSON.stringify(json, null, 2), 'utf8');
if (label) console.log(`[fetch] ${label} <- ${apiPath}`);
return json;
}
function escapeMarkdownAlt(text) {
return String(text || '')
.replace(/\r?\n/g, ' ')
.replace(/\\/g, '\\\\')
.replace(/\[/g, '\\[')
.replace(/\]/g, '\\]');
}
// Lightweight dimensions readers for PNG/JPEG/GIF. Fallback to file size.
async function getImageDimensions(filePath) {
const fd = await fs.open(filePath, 'r');
try {
const stat = await fd.stat();
const size = stat.size || 0;
const hdr = Buffer.alloc(64);
const { bytesRead } = await fd.read(hdr, 0, hdr.length, 0);
const b = hdr.slice(0, bytesRead);
// PNG
if (b.length >= 24 && b[0] === 0x89 && b[1] === 0x50 && b[2] === 0x4E && b[3] === 0x47) {
const width = b.readUInt32BE(16);
const height = b.readUInt32BE(20);
return { width, height, area: width * height, size };
}
// GIF
if (b.length >= 10 && b[0] === 0x47 && b[1] === 0x49 && b[2] === 0x46) {
const width = b.readUInt16LE(6);
const height = b.readUInt16LE(8);
return { width, height, area: width * height, size };
}
// JPEG: scan segments for SOF markers
if (b.length >= 4 && b[0] === 0xFF && b[1] === 0xD8) {
let pos = 2;
while (pos + 3 < size) {
const markerBuf = Buffer.alloc(4);
await fd.read(markerBuf, 0, 4, pos);
if (markerBuf[0] !== 0xFF) { pos += 1; continue; }
const marker = markerBuf[1];
const hasLen = marker !== 0xD8 && marker !== 0xD9;
let segLen = 0;
if (hasLen) segLen = markerBuf.readUInt16BE(2);
// SOF0..SOF15 except DHT(0xC4), JPG(0xC8), DAC(0xCC)
if (marker >= 0xC0 && marker <= 0xCF && ![0xC4, 0xC8, 0xCC].includes(marker)) {
const segHdr = Buffer.alloc(7);
await fd.read(segHdr, 0, segHdr.length, pos + 4);
const height = segHdr.readUInt16BE(1);
const width = segHdr.readUInt16BE(3);
return { width, height, area: width * height, size };
}
if (!hasLen || segLen < 2) { pos += 2; continue; }
pos += 2 + segLen;
}
}
return { width: 0, height: 0, area: 0, size };
} catch (_) {
return { width: 0, height: 0, area: 0, size: 0 };
} finally {
await fd.close();
}
}
async function readYamlTitle(yamlPath) {
try {
const content = await fs.readFile(yamlPath, 'utf8');
// Match title: "..." or title: '...' or title: text
const m = content.match(/^\s*title\s*:\s*(?:\"([^\"]*)\"|'([^']*)'|([^\n\r#]+))/m);
if (!m) return null;
const val = (m[1] || m[2] || m[3] || '').trim();
return val;
} catch (_) {
return null;
}
}
async function main() {
// Args parsing: support --no-cache and a free-form query
const argv = process.argv.slice(2);
let noCache = false;
const parts = [];
for (const a of argv) {
if (a === '--no-cache' || a === '--nocache' || a === '--fresh') {
noCache = true;
} else if (a === '--') {
// everything after is query literal
const idx = argv.indexOf('--');
parts.push(argv.slice(idx + 1).join(' '));
break;
} else if (a.startsWith('--')) {
// ignore unknown flags for now
} else {
parts.push(a);
}
}
const query = parts.join(' ').trim();
if (!query) {
console.log('Usage: node tools/add_lego.js [--no-cache] <query|set-id>');
process.exit(1);
}
const config = await loadToolsConfig();
const apiKey = config.rebrickable.apiKey;
const host = 'rebrickable.com';
// 1) Search (cached ≤ 1 week)
const searchPath = `/api/v3/lego/sets/?search=${encodeURIComponent(query)}&page_size=25`;
const search = await cachedApiJson(
`search/${encodeURIComponent(query)}.json`,
host,
searchPath,
apiKey,
`search:${query}`,
{ noCache }
);
const results = search.results || [];
if (results.length === 0) {
console.log('No sets found for query:', query);
process.exit(0);
}
// 2) Selection
let selected = results[0];
if (results.length > 1) {
console.log(`Found ${results.length} sets:`);
results.forEach((r, i) => {
const id = (r.set_num || '').split('-')[0];
console.log(`${i + 1}. ${r.set_num} - ${r.name} (${r.year}) [id: ${id}]`);
});
const ans = await prompt('Choose a set (1..N): ');
const idx = Math.max(1, Math.min(results.length, parseInt(ans, 10) || 1)) - 1;
selected = results[idx];
}
const setNum = selected.set_num; // e.g., 10283-1
const setId = (setNum || '').split('-')[0];
// 3) Fetch exact set and theme (cached ≤ 1 week)
const setDetails = await cachedApiJson(
`sets/${setNum}.json`,
host,
`/api/v3/lego/sets/${encodeURIComponent(setNum)}/`,
apiKey,
`set:${setNum}`,
{ noCache }
);
const themeId = setDetails.theme_id || selected.theme_id;
const theme = await cachedApiJson(
`themes/${themeId}.json`,
host,
`/api/v3/lego/themes/${themeId}/`,
apiKey,
`theme:${themeId}`,
{ noCache }
);
const themeName = theme.name;
const themeSlug = slugify(themeName);
// 4) Prepare dirs
const setDir = path.join(LEGO_ROOT, themeSlug, setId);
const imagesDir = path.join(setDir, 'images');
const dataImagesDir = path.join(setDir, 'data', 'images');
await ensureDir(imagesDir); // created even if no images provided
await ensureDir(dataImagesDir);
// Images provided by Rebrickable (set images)
const downloadedImages = new Set();
const imageUrls = [];
if (setDetails.set_img_url) imageUrls.push(setDetails.set_img_url);
if (setDetails.set_img_url_2) imageUrls.push(setDetails.set_img_url_2);
for (const url of imageUrls) {
const urlObj = new URL(url);
const base = path.basename(urlObj.pathname);
const dest = path.join(imagesDir, base);
if (fsSync.existsSync(dest)) {
console.log(`[skip] image exists: ${path.relative(ROOT, dest)}`);
} else {
await downloadFile(url, dest);
downloadedImages.add(base);
console.log(`[download] ${url} -> ${path.relative(ROOT, dest)}`);
}
const nameNoExt = base.replace(/\.[^.]+$/, '');
const yamlPath = path.join(dataImagesDir, `${nameNoExt}.yaml`);
const titleForSetImage = (setDetails.name || selected.name || '').replace(/"/g, '\\"');
const yaml = `attribution: "&copy; [Rebrickable](https://rebrickable.com/)"\n` +
`title: "${titleForSetImage}"\n`;
await fs.writeFile(yamlPath, yaml, 'utf8');
}
// Minifigs images (stored directly in images/)
const figsResp = await cachedApiJson(
`sets/${setNum}_minifigs.json`,
host,
`/api/v3/lego/sets/${encodeURIComponent(setNum)}/minifigs/?page_size=1000`,
apiKey,
`minifigs:${setNum}`,
{ noCache }
);
const figs = figsResp.results || [];
if (figs.length > 0) {
for (const fig of figs) {
const figNum = fig.fig_num || fig.set_num;
let figImg = fig.set_img_url || fig.fig_img_url;
let figTitle = fig.name || '';
if (!figImg && figNum) {
const figDetails = await cachedApiJson(
`minifigs/${figNum}.json`,
host,
`/api/v3/lego/minifigs/${encodeURIComponent(figNum)}/`,
apiKey,
`minifig:${figNum}`,
{ noCache }
);
figImg = figDetails.set_img_url || figDetails.fig_img_url;
figTitle = figTitle || figDetails.name || '';
}
if (!figImg) continue;
const urlObj = new URL(figImg);
const base = path.basename(urlObj.pathname);
const dest = path.join(imagesDir, base);
if (fsSync.existsSync(dest)) {
console.log(`[skip] image exists: ${path.relative(ROOT, dest)}`);
} else {
await downloadFile(figImg, dest);
downloadedImages.add(base);
console.log(`[download] ${figImg} -> ${path.relative(ROOT, dest)}`);
}
const nameNoExt = base.replace(/\.[^.]+$/, '');
const yamlPath = path.join(dataImagesDir, `${nameNoExt}.yaml`);
const cleanedTitle = (figTitle || '').trim();
const yamlLines = [`attribution: "&copy; [Rebrickable](https://rebrickable.com/)"`];
if (cleanedTitle) {
yamlLines.push(`title: "${cleanedTitle.replace(/"/g, '\\"')}"`);
}
const yaml = `${yamlLines.join('\n')}\n`;
await fs.writeFile(yamlPath, yaml, 'utf8');
}
}
// index.md (do not overwrite if exists)
const indexPath = path.join(setDir, 'index.md');
const indexExists = fsSync.existsSync(indexPath);
if (!indexExists) {
const pageTitle = setDetails.name || selected.name;
const createdAt = formatDateTime();
// Collect images present in imagesDir (existing or just downloaded)
let imageFiles = [];
try {
const allFiles = await fs.readdir(imagesDir);
imageFiles = allFiles.filter((f) => /\.(png|jpe?g|gif|webp)$/i.test(f));
} catch (_) {
imageFiles = Array.from(downloadedImages);
}
// Choose cover: prefer downloaded this run; fallback to all images
let coverFile = null;
const coverCandidates = (downloadedImages.size > 0)
? Array.from(downloadedImages).filter((b) => imageFiles.includes(b))
: imageFiles.slice();
if (coverCandidates.length > 0) {
let best = null;
let bestScore = -1;
for (const base of coverCandidates) {
const full = path.join(imagesDir, base);
const dim = await getImageDimensions(full);
const score = (dim.area && dim.area > 0) ? dim.area : (dim.size || 0);
if (score > bestScore) { bestScore = score; best = base; }
}
coverFile = best;
}
let body = '';
if (imageFiles.length > 0) {
const ordered = [
...Array.from(downloadedImages).filter((b) => imageFiles.includes(b)),
...imageFiles.filter((b) => !downloadedImages.has(b)),
];
const lines = [];
for (const base of ordered) {
if (coverFile && base === coverFile) continue;
const nameNoExt = base.replace(/\.[^.]+$/, '');
const yamlPath = path.join(dataImagesDir, `${nameNoExt}.yaml`);
const altRaw = (await readYamlTitle(yamlPath)) || `${pageTitle}`;
const altEsc = escapeMarkdownAlt(altRaw);
const titleAttr = altRaw.replace(/\"/g, '\\"');
lines.push(`\n![${altEsc}](images/${base} \"${titleAttr}\")`);
}
body = lines.join('\n') + '\n';
}
const coverLine = coverFile ? `cover: images/${coverFile}\n` : '';
const md = `---\n` +
`title: "${pageTitle.replace(/"/g, '\\"')}"\n` +
`date: "${createdAt}"\n` +
coverLine +
`---\n\n` +
body;
await fs.writeFile(indexPath, md, 'utf8');
const coverInfo = coverFile ? `, cover=images/${coverFile}` : '';
console.log(`[create] ${path.relative(ROOT, indexPath)} with ${imageFiles.length} image reference(s)${coverInfo}`);
}
// Report downloaded images that are not referenced in an existing index.md
if (indexExists && downloadedImages.size > 0) {
try {
const mdContent = await fs.readFile(indexPath, 'utf8');
const refSet = new Set();
const imgRegex = /!\[[^\]]*\]\(([^)]+)\)/g; // capture URL inside ()
let m;
while ((m = imgRegex.exec(mdContent)) !== null) {
const p = m[1].trim();
const base = path.basename(p.split('?')[0].split('#')[0]);
if (base) refSet.add(base);
}
const notRef = Array.from(downloadedImages).filter((b) => !refSet.has(b));
if (notRef.length > 0) {
console.log(`\n[note] ${notRef.length} downloaded image(s) not referenced in index.md:`);
notRef.forEach((b) => console.log(` - images/${b}`));
}
} catch (e) {
console.log(`[warn] could not analyze index.md references: ${e.message}`);
}
}
console.log(`\nAdded set ${setNum} -> ${path.relative(ROOT, setDir)}`);
}
main();


@@ -1,199 +0,0 @@
#!/usr/bin/env node
const { execSync } = require("child_process");
const fs = require("fs");
const crypto = require("crypto");
const path = require("path");
const os = require("os");
const YAML = require("yaml");
const { buildUserAgent, checkUrl } = require("./lib/http");
const { getArchiveUrl, saveToArchive } = require("./lib/archive");
const { scrapePage } = require("./lib/puppeteer");
const { formatDateTime, toHugoDateTime } = require("./lib/datetime");
const LINKS_ROOT = path.join("content", "interets", "liens-interessants");
if (process.argv.length < 3) {
console.error("Usage: add_link.js <URL> [optional: YYYY-MM-DD]");
process.exit(1);
}
const url = process.argv[2];
const customDate = process.argv[3] || null;
// Generate an MD5 hash of the URL
const urlHash = crypto.createHash("md5").update(url).digest("hex").slice(0, 8);
const hugoRoot = path.resolve(process.cwd());
const interestingLinksRoot = path.join(hugoRoot, LINKS_ROOT);
function findExistingLinkBundle(hash) {
if (!fs.existsSync(interestingLinksRoot)) {
return null;
}
const stack = [interestingLinksRoot];
while (stack.length > 0) {
const current = stack.pop();
if (path.basename(current) === hash) {
return current;
}
let entries = [];
try {
entries = fs.readdirSync(current, { withFileTypes: true });
} catch (error) {
continue;
}
for (const entry of entries) {
if (entry.isDirectory()) {
stack.push(path.join(current, entry.name));
}
}
}
return null;
}
const duplicateBundlePath = findExistingLinkBundle(urlHash);
if (duplicateBundlePath) {
const relative = path.relative(hugoRoot, duplicateBundlePath);
console.log(`⚠ Link already exists at ${relative}: ${url}`);
process.exit(0);
}
// Check URL accessibility and Archive.org availability
(async () => {
const userAgent = buildUserAgent();
const initialCheck = await checkUrl(url, { userAgent, timeoutMs: 8000 });
if (initialCheck.errorType || (typeof initialCheck.status === "number" && initialCheck.status >= 400)) {
    console.warn(`⚠ HTTP check before scraping: ${initialCheck.errorType || initialCheck.status || "undetermined"}`);
} else {
    console.log(`🌐 HTTP check before scraping: ${initialCheck.status ?? "unknown"}`);
}
let archiveUrl = await getArchiveUrl(url);
// If the URL is not archived, attempt to save it
if (!archiveUrl) {
console.log(`📂 No archive found. Attempting to save ${url}...`);
archiveUrl = await saveToArchive(url);
if (!archiveUrl) {
console.log(`⚠ Warning: Unable to archive ${url}. Continuing without archive.`);
}
}
  if (archiveUrl) {
    console.log(`📂 Archive URL: ${archiveUrl}`);
  }
  // Determine the entry date and time
const entryDate = customDate ? toHugoDateTime(customDate) : toHugoDateTime();
const formattedEntryDate = formatDateTime(entryDate);
const formattedStatusDate = formatDateTime();
const formattedDateFrench = entryDate.setLocale("fr").toFormat("d LLLL yyyy 'à' HH:mm");
const year = entryDate.year;
const month = String(entryDate.month).padStart(2, "0");
const day = String(entryDate.day).padStart(2, "0");
// Define paths
const bundlePath = path.join(hugoRoot, `content/interets/liens-interessants/${year}/${month}/${day}/${urlHash}/`);
const imagesPath = path.join(bundlePath, "images");
const dataPath = path.join(bundlePath, "data");
const finalScreenshotPath = path.join(imagesPath, "screenshot.png");
const metadataPath = path.join(dataPath, "screenshot.yaml");
// Store screenshot in a temporary location first
const tempScreenshotPath = path.join(os.tmpdir(), `screenshot_${urlHash}.png`);
// Scrape the page and capture a screenshot
console.log(`🔍 Scraping page and capturing screenshot...`);
const metadata = await scrapePage(url, tempScreenshotPath, { userAgent });
// If Puppeteer failed, do not proceed
if (!metadata || !fs.existsSync(tempScreenshotPath)) {
console.error(`❌ Scraping failed. No bundle will be created.`);
process.exit(1);
}
if (!metadata.httpStatus && typeof initialCheck.status === "number") {
metadata.httpStatus = initialCheck.status;
}
// Create Hugo bundle only if scraping succeeded
console.log(`📦 Creating Hugo bundle for ${url}...`);
execSync(`hugo new --kind liens-interessants interets/liens-interessants/${year}/${month}/${day}/${urlHash}/`, { stdio: "inherit" });
if (!fs.existsSync(bundlePath)) {
console.error("❌ Failed to create the bundle.");
process.exit(1);
}
// Move the screenshot to the final destination
if (!fs.existsSync(imagesPath)) fs.mkdirSync(imagesPath, { recursive: true });
fs.renameSync(tempScreenshotPath, finalScreenshotPath);
// Modify the frontmatter
const indexPath = path.join(bundlePath, "index.md");
let content = fs.readFileSync(indexPath, "utf8");
// Inject date
content = content.replace(/^date: .*/m, `date: "${formattedEntryDate}"`);
// Inject status
const statusEntry = `{"date": "${formattedStatusDate}", "http_code": ${metadata.httpStatus || "null"}}`;
content = content.replace("status: []", `status: [${statusEntry}]`);
// Inject title and description
if (metadata.title) {
content = content.replace(/title: ".*?"/, `title: "${metadata.title.replace(/"/g, '\\"')}"`);
}
if (metadata.description) {
content = content.replace("> [description]", `> ${metadata.description.replace(/"/g, '\\"')}`);
} else {
content = content.replace("> [description]\n\n", ""); // Remove placeholder if no description
}
// Inject keywords
if (metadata.keywords.length > 0) {
content = content.replace("keywords: []", `keywords: ["${metadata.keywords.join('", "')}"]`);
}
// Inject cover
content = content.replace('cover: ""', `cover: "images/screenshot.png"`);
  // Inject links (and remove any pre-existing urls/links keys)
const links = [];
links.push({
name: "Page d'origine",
url: url,
lang: metadata.lang || "unknown",
});
if (archiveUrl) {
links.push({
name: "Archive",
url: archiveUrl,
archive: true,
});
}
const linksYaml = YAML.stringify({ links }).trim();
content = content.replace(/^urls: \[\]\n?/m, "");
content = content.replace(/^links: \[\]\n?/m, "");
content = content.replace(/^---/, `---\n${linksYaml}`);
fs.writeFileSync(indexPath, content);
// Create metadata folder if necessary
if (!fs.existsSync(dataPath)) fs.mkdirSync(dataPath, { recursive: true });
// Write metadata for the screenshot
console.log("📝 Writing metadata...");
const metadataContent = `title: "Capture d'écran de ${url}"
description: "Capture effectuée le ${formattedDateFrench}"
attribution: "Richard Dern"
file: "images/screenshot.png"
`;
fs.writeFileSync(metadataPath, metadataContent);
console.log(`✔ Metadata saved: ${metadataPath}`);
console.log(`🎉 Link successfully added! Bundle path: ${bundlePath}`);
console.log(bundlePath);
})();


@@ -1,66 +0,0 @@
#!/usr/bin/env node
const fs = require("fs");
const path = require("path");
const yaml = require("js-yaml");
const { scrapePage } = require("./lib/puppeteer");
if (process.argv.length < 4) {
console.error("Usage: add_link_to_article.js <URL> <path_to_article>");
process.exit(1);
}
const url = process.argv[2];
const articlePath = process.argv[3];
const indexPath = path.join(articlePath, "index.md");
// Check that the `index.md` file exists
if (!fs.existsSync(indexPath)) {
console.error(`❌ The article at ${indexPath} does not exist.`);
process.exit(1);
}
// Load the YAML frontmatter from `index.md`
const fileContent = fs.readFileSync(indexPath, "utf8");
const frontmatterMatch = fileContent.match(/^---\n([\s\S]+?)\n---/);
if (!frontmatterMatch) {
console.error("❌ No valid frontmatter found in the article.");
process.exit(1);
}
let frontmatter = yaml.load(frontmatterMatch[1]);
if (!frontmatter.links) frontmatter.links = [];
// Check whether the link is already present
if (frontmatter.links.some(link => link.url === url)) {
console.log(`⚠ The link "${url}" is already in the article.`);
process.exit(0);
}
// Scrape the page to retrieve the title and language
(async () => {
console.log(`🔍 Retrieving metadata for ${url}...`);
  const metadata = await scrapePage(url, "/dev/null"); // Discard the screenshot for this script
if (!metadata.title) {
console.error("❌ Failed to retrieve page title. Skipping.");
process.exit(1);
}
frontmatter.links.push({
name: metadata.title,
lang: metadata.lang || "unknown",
url: url,
official: false
});
  // Rebuild the YAML frontmatter
const newFrontmatter = yaml.dump(frontmatter);
const newContent = `---\n${newFrontmatter}---\n${fileContent.replace(frontmatterMatch[0], "").trim()}`;
  // Write the file
fs.writeFileSync(indexPath, newContent, "utf8");
console.log(`✔ Link successfully added to ${indexPath}`);
})();


@@ -1,151 +0,0 @@
#!/usr/bin/env node
const path = require("path");
const { resolveMarkdownTargets } = require("./lib/content");
const { extractRawDate, readFrontmatter, writeFrontmatter } = require("./lib/weather/frontmatter");
const { resolveArticleDate } = require("./lib/weather/time");
const { fetchWeather, hasConfiguredProvider, mergeWeather } = require("./lib/weather/providers");
const { loadWeatherConfig } = require("./lib/weather/config");
const { isEffectivelyPublishedDocument } = require("./lib/publication");
const CONTENT_ROOT = path.resolve("content");
async function processFile(filePath, config, { force = false } = {}) {
const frontmatter = await readFrontmatter(filePath);
if (!frontmatter) {
return { status: "no-frontmatter" };
}
if (isEffectivelyPublishedDocument(frontmatter.doc) === false) {
return { status: "draft" };
}
let existingWeather = null;
if (frontmatter.doc.has("weather")) {
existingWeather = frontmatter.doc.get("weather");
}
if (!force && existingWeather) {
return { status: "already-set" };
}
const dateValue = frontmatter.doc.get("date");
if (!dateValue) {
return { status: "no-date" };
}
const rawDate = extractRawDate(frontmatter.frontmatterText);
const targetDate = resolveArticleDate(dateValue, rawDate, config);
if (!targetDate) {
return { status: "invalid-date" };
}
const weather = await fetchWeather(targetDate, config);
let finalWeather = {};
if (force) {
if (weather) {
finalWeather = weather;
}
} else if (existingWeather && typeof existingWeather === "object") {
finalWeather = JSON.parse(JSON.stringify(existingWeather));
if (weather) {
let providerName = null;
if (weather.source && weather.source.length > 0) {
providerName = weather.source[0];
}
const added = mergeWeather(finalWeather, weather, providerName);
if (!added && Object.keys(finalWeather).length === 0) {
finalWeather = weather;
}
}
} else if (weather) {
finalWeather = weather;
}
frontmatter.doc.set("weather", finalWeather);
await writeFrontmatter(filePath, frontmatter.doc, frontmatter.body);
let status = "empty";
if (weather) {
status = "updated";
}
let sources = [];
if (weather && weather.source) {
sources = weather.source;
}
return {
status,
sources,
};
}
async function main() {
const cliArgs = process.argv.slice(2);
let force = false;
if (cliArgs.includes("--force") || cliArgs.includes("-f")) {
force = true;
}
const pathArgs = cliArgs.filter((arg) => arg !== "--force" && arg !== "-f");
const config = loadWeatherConfig();
if (!hasConfiguredProvider(config)) {
console.error("No weather provider configured. Update tools/config/config.json (weather.providers) before running this script.");
process.exit(1);
}
const files = await resolveMarkdownTargets(pathArgs, { rootDir: CONTENT_ROOT });
if (files.length === 0) {
console.log("No matching markdown files found.");
return;
}
let updated = 0;
let skipped = 0;
for (const file of files) {
const relativePath = path.relative(process.cwd(), file);
try {
const result = await processFile(file, config, { force });
switch (result.status) {
case "updated": {
updated += 1;
let sourcesLabel = "unknown source";
if (result.sources.length > 0) {
sourcesLabel = result.sources.join(", ");
}
console.log(`✔ Added weather to ${relativePath} (${sourcesLabel})`);
break;
}
case "empty":
updated += 1;
console.log(`• Added empty weather to ${relativePath}`);
break;
case "draft":
skipped += 1;
console.log(`• Skipped draft article ${relativePath}`);
break;
default:
skipped += 1;
}
} catch (error) {
skipped += 1;
console.error(`✖ Failed ${relativePath}: ${error.message}`);
}
}
console.log(`\nSummary: ${updated} updated, ${skipped} skipped.`);
}
main().catch((error) => {
console.error(error);
process.exit(1);
});


@@ -1,131 +0,0 @@
#!/usr/bin/env node
const fs = require("node:fs");
const fsPromises = require("node:fs/promises");
const path = require("node:path");
const { ensureBundleExists, promptForBundlePath } = require("./lib/bundles");
const {
extractFileTitleFromUrl,
fetchWikimediaAsset,
downloadFile,
} = require("./lib/wikimedia");
const CONTENT_DIR = path.resolve("content");
const TEMPLATE_PATH = path.resolve("data/metadata_template.yaml");
const IMAGES_DIR = "images";
const DATA_IMAGES_DIR = path.join("data", "images");
/**
 * Prints the script's minimal usage help.
*/
function showUsage() {
console.error("Usage: node tools/add_wikimedia_image.js <url_wikimedia> [bundle_path]");
}
/**
 * Loads the YAML template used for attachment metadata.
 * @returns {Promise<string>} Raw template content.
*/
async function loadMetadataTemplate() {
return fsPromises.readFile(TEMPLATE_PATH, "utf8");
}
/**
 * Injects the attribution and description into the existing YAML template.
 * @param {string} template Raw template.
 * @param {{ attribution: string, description: string }} metadata Values to inject.
 * @returns {string} Final YAML ready to be written.
*/
function fillMetadataTemplate(template, metadata) {
let output = template;
output = output.replace('#attribution: ""', `attribution: ${JSON.stringify(metadata.attribution)}`);
output = output.replace('#description: ""', `description: ${JSON.stringify(metadata.description)}`);
return output;
}
/**
 * Returns the final paths for the image and its metadata file.
 * @param {string} bundlePath Target bundle.
 * @param {string} fileName Downloaded file name.
 * @returns {{ imagePath: string, metadataPath: string }}
*/
function buildTargetPaths(bundlePath, fileName) {
const extension = path.extname(fileName);
const baseName = path.basename(fileName, extension);
return {
imagePath: path.join(bundlePath, IMAGES_DIR, fileName),
metadataPath: path.join(bundlePath, DATA_IMAGES_DIR, `${baseName}.yaml`),
};
}
/**
 * Ensures no existing file will be overwritten.
 * @param {string} imagePath Final image path.
 * @param {string} metadataPath Final metadata path.
*/
function ensureTargetsDoNotExist(imagePath, metadataPath) {
if (fs.existsSync(imagePath)) {
    throw new Error(`Image ${imagePath} already exists.`);
}
if (fs.existsSync(metadataPath)) {
    throw new Error(`Metadata file ${metadataPath} already exists.`);
}
}
/**
 * Ensures the required parent directories exist.
 * @param {string} imagePath Final image path.
 * @param {string} metadataPath Final metadata path.
*/
async function ensureTargetDirectories(imagePath, metadataPath) {
await fsPromises.mkdir(path.dirname(imagePath), { recursive: true });
await fsPromises.mkdir(path.dirname(metadataPath), { recursive: true });
}
/**
 * Main entry point of the script.
*/
async function main() {
const rawUrl = process.argv[2];
const manualBundlePath = process.argv[3];
if (!rawUrl) {
showUsage();
process.exit(1);
}
const bundlePath = await promptForBundlePath(manualBundlePath, {
contentDir: CONTENT_DIR,
prompts: {
confirmLatest(latest) {
        return `Use the latest bundle found: ${latest}? (Y/n) `;
},
      manualPath: "Enter the bundle's relative path: ",
},
});
ensureBundleExists(bundlePath);
const fileTitle = extractFileTitleFromUrl(rawUrl);
  console.log(`Fetching Wikimedia metadata for ${fileTitle}...`);
const asset = await fetchWikimediaAsset(fileTitle);
const targets = buildTargetPaths(bundlePath, asset.fileName);
ensureTargetsDoNotExist(targets.imagePath, targets.metadataPath);
await ensureTargetDirectories(targets.imagePath, targets.metadataPath);
  console.log(`Downloading ${asset.fileName}...`);
await downloadFile(asset.imageUrl, targets.imagePath);
const template = await loadMetadataTemplate();
const metadataContent = fillMetadataTemplate(template, asset);
await fsPromises.writeFile(targets.metadataPath, metadataContent, "utf8");
  console.log(`Image saved: ${targets.imagePath}`);
  console.log(`Metadata saved: ${targets.metadataPath}`);
}
main();


@@ -1,962 +0,0 @@
#!/usr/bin/env node
const fs = require("fs");
const path = require("path");
const yaml = require("js-yaml");
const { buildUserAgent, checkUrl, checkWithPlaywright } = require("./lib/http");
const {
collectMarkdownLinksFromFile,
extractLinksFromText,
} = require("./lib/markdown_links");
const SITE_ROOT = path.resolve(__dirname, "..");
const CONTENT_DIR = path.join(SITE_ROOT, "content");
const CONFIG_PATH = path.join(__dirname, "config", "config.json");
const DAY_MS = 24 * 60 * 60 * 1000;
const DEFAULT_CONFIG = {
cacheDir: path.join(__dirname, "cache"),
cacheFile: "external_links.yaml",
hostDelayMs: 2000,
requestTimeoutSeconds: 5,
cacheTtlSuccessDays: 30,
cacheTtlClientErrorDays: 7,
cacheTtlServerErrorDays: 1,
cacheTtlTimeoutDays: 7,
maxConcurrentHosts: 4,
maxRedirects: 5,
userAgent: null,
ignoreHosts: [],
usePlaywright: false,
playwrightTimeoutSeconds: 10,
playwrightExecutablePath: null,
};
function loadConfig() {
if (!fs.existsSync(CONFIG_PATH)) {
return {};
}
try {
return JSON.parse(fs.readFileSync(CONFIG_PATH, "utf8"));
} catch (error) {
console.warn(
      `Unable to parse ${path.relative(SITE_ROOT, CONFIG_PATH)} (${error.message}).`
);
return {};
}
}
const rawConfig = loadConfig();
const settings = {
...DEFAULT_CONFIG,
...(rawConfig.externalLinks || {}),
};
const CACHE_DIR = path.isAbsolute(settings.cacheDir)
? settings.cacheDir
: path.resolve(SITE_ROOT, settings.cacheDir);
const REPORT_PATH = path.isAbsolute(settings.cacheFile)
? settings.cacheFile
: path.join(CACHE_DIR, settings.cacheFile);
const HOST_DELAY_MS = Math.max(0, Number(settings.hostDelayMs) || 0);
const REQUEST_TIMEOUT_SECONDS = Math.max(1, Number(settings.requestTimeoutSeconds) || 5);
const REQUEST_TIMEOUT_MS = REQUEST_TIMEOUT_SECONDS * 1000;
const MAX_CONCURRENT_HOSTS = Math.max(
1,
Number.isFinite(Number(settings.maxConcurrentHosts))
? Number(settings.maxConcurrentHosts)
: DEFAULT_CONFIG.maxConcurrentHosts
);
const MAX_REDIRECTS = Math.max(
0,
Number.isFinite(Number(settings.maxRedirects))
? Number(settings.maxRedirects)
: DEFAULT_CONFIG.maxRedirects
);
const DEFAULT_USER_AGENT = buildUserAgent(settings.userAgent);
const IGNORE_HOSTS = parseIgnoreHosts(settings.ignoreHosts);
let PLAYWRIGHT_EXECUTABLE = null;
if (settings.usePlaywright === true && typeof settings.playwrightExecutablePath === "string") {
const trimmedExecutable = settings.playwrightExecutablePath.trim();
if (trimmedExecutable) {
const resolution = resolvePlaywrightExecutable(trimmedExecutable);
if (resolution.missing) {
console.error(
[
          "Playwright is enabled but the requested executable",
          `"${resolution.missing}" could not be found.`,
          `Fix externalLinks.playwrightExecutablePath in ${path.relative(
            SITE_ROOT,
            CONFIG_PATH
          )} or leave the field empty to use the default Playwright configuration.`,
].join(" ")
);
process.exit(1);
}
PLAYWRIGHT_EXECUTABLE = resolution.path;
}
}
const PLAYWRIGHT_ENABLED = settings.usePlaywright === true;
const PLAYWRIGHT_TIMEOUT_MS = Math.max(
1000,
(Number.isFinite(Number(settings.playwrightTimeoutSeconds))
? Number(settings.playwrightTimeoutSeconds)
: DEFAULT_CONFIG.playwrightTimeoutSeconds) * 1000
);
const PLAYWRIGHT_RECHECK_STATUSES = new Set([403, 426, 429, 502]);
const CACHE_TTL_SUCCESS_MS = daysToMs(
pickNumber(settings.cacheTtlSuccessDays, DEFAULT_CONFIG.cacheTtlSuccessDays)
);
const CACHE_TTL_CLIENT_ERROR_MS = daysToMs(
pickNumber(settings.cacheTtlClientErrorDays, DEFAULT_CONFIG.cacheTtlClientErrorDays)
);
const CACHE_TTL_SERVER_ERROR_MS = daysToMs(
pickNumber(settings.cacheTtlServerErrorDays, DEFAULT_CONFIG.cacheTtlServerErrorDays)
);
const CACHE_TTL_TIMEOUT_MS = daysToMs(
pickNumber(settings.cacheTtlTimeoutDays, DEFAULT_CONFIG.cacheTtlTimeoutDays)
);
fs.mkdirSync(CACHE_DIR, { recursive: true });
const BASE_HTTP_OPTIONS = {
userAgent: DEFAULT_USER_AGENT,
timeoutMs: REQUEST_TIMEOUT_MS,
maxRedirects: MAX_REDIRECTS,
};
function pickNumber(value, fallback) {
const parsed = Number(value);
if (Number.isFinite(parsed)) {
return parsed;
}
return fallback;
}
// Normalizes a host for comparison.
function normalizeHost(value) {
if (typeof value !== "string") {
return null;
}
const normalized = value.trim().toLowerCase();
return normalized || null;
}
// Strips any port from a string describing a host.
function stripPort(hostValue) {
if (typeof hostValue !== "string") {
return null;
}
const trimmed = hostValue.trim();
if (!trimmed) {
return null;
}
const bracketMatch = trimmed.match(/^\[([^\]]+)\](?::\d+)?$/);
if (bracketMatch) {
return bracketMatch[1].toLowerCase();
}
const colonIndex = trimmed.lastIndexOf(":");
if (colonIndex > -1 && trimmed.indexOf(":") === colonIndex) {
const hostPart = trimmed.slice(0, colonIndex).trim();
if (hostPart) {
return hostPart.toLowerCase();
}
}
return trimmed.toLowerCase();
}
// Cleans an ignore-host configuration entry (scheme, path, port).
function normalizeIgnoreHostEntry(value) {
if (typeof value !== "string") {
return null;
}
let candidate = value.trim().toLowerCase();
if (!candidate) {
return null;
}
candidate = candidate.replace(/^[a-z][a-z0-9+.-]*:\/\//, "");
candidate = candidate.replace(/^\/\//, "");
candidate = candidate.replace(/\/.*$/, "");
candidate = candidate.replace(/[?#].*$/, "");
return stripPort(candidate);
}
// Builds the set of hosts to ignore from the configuration.
function parseIgnoreHosts(raw) {
const set = new Set();
if (Array.isArray(raw)) {
for (const entry of raw) {
const host = normalizeIgnoreHostEntry(entry);
if (host) {
set.add(host);
}
}
}
return set;
}
// Resolves the Playwright executable defined in the configuration, accepting a bare name found in PATH.
function resolvePlaywrightExecutable(rawPath) {
if (typeof rawPath !== "string") {
return { path: null, missing: null };
}
const trimmed = rawPath.trim();
if (!trimmed) {
return { path: null, missing: null };
}
if (fs.existsSync(trimmed)) {
return { path: trimmed, missing: null };
}
if (!containsPathSeparator(trimmed)) {
const resolved = findExecutableInPath(trimmed);
if (resolved) {
return { path: resolved, missing: null };
}
}
return { path: null, missing: trimmed };
}
function containsPathSeparator(value) {
return value.includes("/") || value.includes("\\");
}
// Walks the PATH environment variable to find a binary by name.
function findExecutableInPath(binaryName) {
if (typeof binaryName !== "string" || !binaryName.trim()) {
return null;
}
const searchPath = process.env.PATH || "";
const directories = searchPath.split(path.delimiter);
for (const directory of directories) {
if (!directory) {
continue;
}
const candidate = path.join(directory, binaryName);
if (fs.existsSync(candidate)) {
return candidate;
}
}
return null;
}
function daysToMs(days) {
if (!Number.isFinite(days) || days <= 0) {
return 0;
}
return days * DAY_MS;
}
function ensureDirectoryExists(targetFile) {
fs.mkdirSync(path.dirname(targetFile), { recursive: true });
}
function toPosix(relativePath) {
return typeof relativePath === "string" ? relativePath.split(path.sep).join("/") : relativePath;
}
function relativeToSite(filePath) {
return toPosix(path.relative(SITE_ROOT, filePath));
}
function toPagePath(relativeContentPath) {
if (!relativeContentPath) return null;
let normalized = toPosix(relativeContentPath);
if (!normalized) return null;
normalized = normalized.replace(/^content\//, "");
if (!normalized) {
return "/";
}
normalized = normalized.replace(/\/index\.md$/i, "");
normalized = normalized.replace(/\/_index\.md$/i, "");
normalized = normalized.replace(/\.md$/i, "");
normalized = normalized.replace(/\/+/g, "/");
normalized = normalized.replace(/\/+$/, "");
normalized = normalized.replace(/^\/+/, "");
if (!normalized) {
return "/";
}
return `/${normalized}`;
}
function deriveBundlePagePath(contentRelative) {
if (!contentRelative) return null;
const bundleRoot = contentRelative.replace(/\/data\/.*$/, "");
const candidates = [`${bundleRoot}/index.md`, `${bundleRoot}/_index.md`];
for (const candidate of candidates) {
const absolute = path.join(CONTENT_DIR, candidate);
if (fs.existsSync(absolute)) {
return toPagePath(candidate);
}
}
return toPagePath(bundleRoot);
}
function derivePagePath(relativeFile) {
if (typeof relativeFile !== "string") return null;
const normalized = toPosix(relativeFile);
if (!normalized || !normalized.startsWith("content/")) return null;
const contentRelative = normalized.slice("content/".length);
if (contentRelative.includes("/data/")) {
return deriveBundlePagePath(contentRelative);
}
return toPagePath(contentRelative);
}
function loadState() {
if (!fs.existsSync(REPORT_PATH)) {
return { generatedAt: null, links: [], entries: {} };
}
try {
const payload = yaml.load(fs.readFileSync(REPORT_PATH, "utf8")) || {};
if (payload.entries && typeof payload.entries === "object") {
return {
generatedAt: payload.generatedAt || null,
links: Array.isArray(payload.links) ? payload.links : [],
entries: normalizeEntries(payload.entries),
};
}
return {
generatedAt: payload.generatedAt || null,
links: Array.isArray(payload.links) ? payload.links : [],
entries: normalizeEntries(payload),
};
} catch (error) {
console.warn(
      `Unable to read ${path.relative(SITE_ROOT, REPORT_PATH)} (${error.message}).`
);
return { generatedAt: null, links: [], entries: {} };
}
}
function normalizeEntries(rawEntries) {
const normalized = {};
if (!rawEntries || typeof rawEntries !== "object") {
return normalized;
}
for (const [url, data] of Object.entries(rawEntries)) {
if (!url.includes("://")) {
continue;
}
normalized[url] = normalizeEntryShape(url, data);
}
return normalized;
}
function normalizeEntryShape(url, raw) {
const checkedAt = raw?.checkedAt || raw?.checked || null;
const locations = normalizeLocations(raw?.locations, raw?.files);
return {
url,
status: typeof raw?.status === "number" ? raw.status : null,
errorType: raw?.errorType || null,
method: raw?.method || null,
checkedAt,
locations,
};
}
function normalizeLocations(locations, fallbackFiles) {
const items = [];
if (Array.isArray(locations)) {
for (const entry of locations) {
if (!entry) continue;
if (typeof entry === "string") {
const [filePart, linePart] = entry.split(":");
const filePath = toPosix(filePart.trim());
items.push({
file: filePath,
line: linePart ? Number.parseInt(linePart, 10) || null : null,
page: derivePagePath(filePath),
});
} else if (typeof entry === "object") {
const file = sizeof(entry.file) ? entry.file : null;
if (file) {
const normalizedFile = toPosix(file);
items.push({
file: normalizedFile,
line: typeof entry.line === "number" ? entry.line : null,
page:
typeof entry.page === "string" && entry.page.trim()
? toPosix(entry.page.trim())
: derivePagePath(normalizedFile),
});
}
}
}
}
if (items.length === 0 && Array.isArray(fallbackFiles)) {
for (const file of fallbackFiles) {
if (!file) continue;
const normalizedFile = toPosix(file);
items.push({
file: normalizedFile,
line: null,
page: derivePagePath(normalizedFile),
});
}
}
return dedupeAndSortLocations(items);
}
// Returns true when the value is a non-empty string (despite the name, unrelated to C's sizeof).
function sizeof(value) {
return typeof value === "string" && value.trim().length > 0;
}
function dedupeAndSortLocations(list) {
if (!Array.isArray(list) || list.length === 0) {
return [];
}
const map = new Map();
for (const item of list) {
if (!item?.file) continue;
const key = `${item.file}::${item.line ?? ""}`;
if (!map.has(key)) {
const normalizedFile = toPosix(item.file);
map.set(key, {
file: normalizedFile,
line: typeof item.line === "number" ? item.line : null,
page:
typeof item.page === "string" && item.page.trim()
? toPosix(item.page.trim())
: derivePagePath(normalizedFile),
});
}
}
return Array.from(map.values()).sort((a, b) => {
const fileDiff = a.file.localeCompare(b.file);
if (fileDiff !== 0) return fileDiff;
const lineA = a.line ?? Number.POSITIVE_INFINITY;
const lineB = b.line ?? Number.POSITIVE_INFINITY;
return lineA - lineB;
});
}
function saveState(state) {
ensureDirectoryExists(REPORT_PATH);
fs.writeFileSync(REPORT_PATH, yaml.dump(state), "utf8");
}
function createEntry(url, existing = {}) {
return {
url,
status: typeof existing.status === "number" ? existing.status : null,
errorType: existing.errorType || null,
method: existing.method || null,
checkedAt: existing.checkedAt || null,
locations: Array.isArray(existing.locations) ? dedupeAndSortLocations(existing.locations) : [],
};
}
function mergeOccurrences(entries, occurrences) {
const merged = {};
for (const [url, urlOccurrences] of occurrences.entries()) {
const existing = entries[url] || createEntry(url);
merged[url] = {
...existing,
url,
locations: dedupeAndSortLocations(urlOccurrences),
};
}
return merged;
}
// Lists the URLs actually present in the content so stale entries can be dropped.
function buildActiveUrlSet(occurrences) {
const active = new Set();
for (const url of occurrences.keys()) {
active.add(url);
}
return active;
}
// Keeps only the report links still used in the content.
function filterReportLinks(links, activeUrls) {
if (!Array.isArray(links) || activeUrls.size === 0) {
return [];
}
const filtered = [];
for (const link of links) {
if (!link?.url || !activeUrls.has(link.url)) {
continue;
}
filtered.push({
...link,
locations: Array.isArray(link.locations) ? dedupeAndSortLocations(link.locations) : [],
});
}
return filtered;
}
// Removes URLs whose host is ignored from the collected set.
function filterIgnoredHosts(occurrences, ignoreHosts) {
if (!ignoreHosts || ignoreHosts.size === 0) {
return occurrences;
}
const filtered = new Map();
for (const [url, urlOccurrences] of occurrences.entries()) {
const host = extractHost(url);
if (host && ignoreHosts.has(host)) {
continue;
}
filtered.set(url, urlOccurrences);
}
return filtered;
}
function shouldRecheckWithPlaywright(result) {
if (!PLAYWRIGHT_ENABLED) {
return false;
}
if (!result) {
return true;
}
if (result.errorType === "timeout" || result.errorType === "network") {
return true;
}
const status = typeof result.status === "number" ? result.status : null;
if (status === null) {
return true;
}
return PLAYWRIGHT_RECHECK_STATUSES.has(status);
}
function recordOccurrence(map, filePath, line, url) {
if (!map.has(url)) {
map.set(url, []);
}
const relativeFile = relativeToSite(filePath);
const normalizedLine = typeof line === "number" && Number.isFinite(line) ? line : null;
const pagePath = derivePagePath(relativeFile);
const list = map.get(url);
const key = `${relativeFile}:${normalizedLine ?? ""}`;
if (!list.some((item) => `${item.file}:${item.line ?? ""}` === key)) {
list.push({ file: relativeFile, line: normalizedLine, page: pagePath });
}
}
function stripYamlInlineComment(line) {
let inSingle = false;
let inDouble = false;
for (let i = 0; i < line.length; i++) {
const ch = line[i];
if (ch === "'" && !inDouble) {
const next = line[i + 1];
if (inSingle && next === "'") {
i++;
continue;
}
inSingle = !inSingle;
} else if (ch === '"' && !inSingle) {
if (!inDouble) {
inDouble = true;
} else if (line[i - 1] !== "\\") {
inDouble = false;
}
} else if (ch === "#" && !inSingle && !inDouble) {
return line.slice(0, i);
} else if (ch === "\\" && inDouble) {
i++;
}
}
return line;
}
function isYamlCommentLine(line) {
return line.trim().startsWith("#");
}
function isBlockScalarIndicator(line) {
const cleaned = stripYamlInlineComment(line).trim();
return /:\s*[>|][0-9+-]*\s*$/.test(cleaned);
}
function processYamlRecursively(obj, links = new Set()) {
if (typeof obj === "string") {
for (const link of extractLinksFromText(obj)) {
links.add(link);
}
} else if (Array.isArray(obj)) {
for (const item of obj) {
processYamlRecursively(item, links);
}
} else if (obj && typeof obj === "object") {
for (const value of Object.values(obj)) {
processYamlRecursively(value, links);
}
}
return links;
}
async function collectMarkdownLinks(filePath, occurrences) {
const entries = await collectMarkdownLinksFromFile(filePath);
for (const { url, line } of entries) {
recordOccurrence(occurrences, filePath, line, url);
}
}
async function collectYamlLinks(filePath, occurrences) {
let linkSet = new Set();
try {
const doc = yaml.load(fs.readFileSync(filePath, "utf8"));
if (doc) {
linkSet = processYamlRecursively(doc);
}
} catch (error) {
    console.warn(`Unable to parse ${relativeToSite(filePath)} (${error.message}).`);
return;
}
if (linkSet.size === 0) {
return;
}
const recorded = new Map();
const lines = fs.readFileSync(filePath, "utf8").split(/\r?\n/);
let inBlockScalar = false;
let blockIndent = 0;
const mark = (url, lineNumber) => {
if (!recorded.has(url)) {
recorded.set(url, new Set());
}
const set = recorded.get(url);
if (!set.has(lineNumber)) {
set.add(lineNumber);
recordOccurrence(occurrences, filePath, lineNumber, url);
}
};
for (let index = 0; index < lines.length; index++) {
const lineNumber = index + 1;
const line = lines[index];
const indent = line.match(/^\s*/)?.[0].length ?? 0;
const trimmed = line.trim();
if (inBlockScalar) {
if (trimmed === "" || indent >= blockIndent) {
// Blank lines and sufficiently indented lines are part of the block scalar.
if (!isYamlCommentLine(line)) {
for (const link of extractLinksFromText(line)) {
if (linkSet.has(link)) {
mark(link, lineNumber);
}
}
}
continue;
}
// A non-empty line with less indentation ends the block scalar.
inBlockScalar = false;
}
const withoutComment = stripYamlInlineComment(line);
const trimmedWithoutComment = withoutComment.trim();
if (isBlockScalarIndicator(line)) {
inBlockScalar = true;
blockIndent = indent + 1;
}
if (isYamlCommentLine(line) || !trimmedWithoutComment) {
continue;
}
for (const link of extractLinksFromText(withoutComment)) {
if (linkSet.has(link)) {
mark(link, lineNumber);
}
}
}
for (const link of linkSet) {
if (!recorded.has(link) || recorded.get(link).size === 0) {
recordOccurrence(occurrences, filePath, null, link);
}
}
}
function walk(dir, exts) {
let results = [];
const entries = fs.readdirSync(dir, { withFileTypes: true });
for (const entry of entries) {
const fullPath = path.join(dir, entry.name);
if (entry.isDirectory()) {
results = results.concat(walk(fullPath, exts));
} else if (exts.includes(path.extname(entry.name))) {
results.push(fullPath);
}
}
return results;
}
function delay(ms) {
return new Promise((resolve) => setTimeout(resolve, ms));
}
const lastHostChecks = new Map();
async function applyHostDelay(host) {
if (!host || HOST_DELAY_MS <= 0) {
return;
}
const last = lastHostChecks.get(host);
if (last) {
const elapsed = Date.now() - last;
const wait = HOST_DELAY_MS - elapsed;
if (wait > 0) {
await delay(wait);
}
}
}
function recordHostCheck(host) {
if (host) {
lastHostChecks.set(host, Date.now());
}
}
function extractHost(url) {
try {
const hostname = new URL(url).hostname;
return normalizeHost(hostname);
} catch (_) {
return null;
}
}
function getTtlMs(entry) {
if (!entry) return 0;
if (entry.errorType === "timeout" || entry.status === 0 || entry.status === null) {
return CACHE_TTL_TIMEOUT_MS;
}
const status = Number(entry.status);
if (Number.isNaN(status)) {
return CACHE_TTL_TIMEOUT_MS;
}
if (status >= 500) {
return CACHE_TTL_SERVER_ERROR_MS;
}
if (status >= 400) {
return CACHE_TTL_CLIENT_ERROR_MS;
}
if (status >= 200 && status < 400) {
return CACHE_TTL_SUCCESS_MS;
}
return CACHE_TTL_TIMEOUT_MS;
}
function needsCheck(entry) {
if (!entry?.checkedAt) {
return true;
}
const checked = Date.parse(entry.checkedAt);
if (Number.isNaN(checked)) {
return true;
}
const ttl = getTtlMs(entry);
if (ttl <= 0) {
return true;
}
return Date.now() - checked >= ttl;
}
function groupEntriesByHost(entries) {
const groups = new Map();
for (const entry of entries) {
const host = extractHost(entry.url);
const key = host || `__invalid__:${entry.url}`;
if (!groups.has(key)) {
groups.set(key, { host, entries: [] });
}
groups.get(key).entries.push(entry);
}
return Array.from(groups.values());
}
async function runWithConcurrency(items, worker, concurrency) {
const executing = new Set();
for (const item of items) {
const task = Promise.resolve().then(() => worker(item));
executing.add(task);
const clean = () => executing.delete(task);
task.then(clean).catch(clean);
if (executing.size >= concurrency) {
await Promise.race(executing);
}
}
await Promise.allSettled(executing);
}
function updateEntryWithResult(entry, result) {
const now = new Date().toISOString();
entry.status = typeof result.status === "number" ? result.status : null;
entry.errorType = result.errorType || null;
entry.method = result.method;
entry.checkedAt = now;
}
function formatStatusForReport(entry) {
if (!entry) return "error";
if (entry.errorType === "timeout") return "timeout";
if (typeof entry.status === "number") return entry.status;
return "error";
}
function isDead(entry) {
if (!entry) return false;
if (entry.errorType === "timeout") return true;
if (typeof entry.status !== "number") return true;
return entry.status >= 400;
}
function getStatusOrder(value) {
if (typeof value === "number" && Number.isFinite(value)) {
return value;
}
const label = typeof value === "string" ? value.toLowerCase() : "";
if (label === "timeout") {
return 10000;
}
return 10001;
}
function buildDeadLinks(entries) {
const list = [];
for (const entry of Object.values(entries)) {
if (!isDead(entry)) continue;
list.push({
url: entry.url,
status: formatStatusForReport(entry),
locations: entry.locations || [],
});
}
return list.sort((a, b) => {
const orderDiff = getStatusOrder(a.status) - getStatusOrder(b.status);
if (orderDiff !== 0) return orderDiff;
if (typeof a.status === "number" && typeof b.status === "number") {
return a.status - b.status;
}
const labelDiff = String(a.status).localeCompare(String(b.status));
if (labelDiff !== 0) return labelDiff;
return a.url.localeCompare(b.url);
});
}
function logProgress(processed, total) {
process.stdout.write(`\rURLs vérifiées ${processed}/${total}`);
}
async function collectOccurrences() {
const occurrences = new Map();
const mdFiles = walk(CONTENT_DIR, [".md", ".markdown"]);
for (const file of mdFiles) {
await collectMarkdownLinks(file, occurrences);
}
const yamlFiles = walk(CONTENT_DIR, [".yaml", ".yml"]);
for (const file of yamlFiles) {
await collectYamlLinks(file, occurrences);
}
return occurrences;
}
function persistEntriesSnapshot(entries, snapshotMeta) {
const payload = {
generatedAt: snapshotMeta?.generatedAt || null,
links: Array.isArray(snapshotMeta?.links) ? snapshotMeta.links : [],
entries,
};
saveState(payload);
}
async function checkEntries(entriesToCheck, entries, snapshotMeta) {
if (entriesToCheck.length === 0) {
return;
}
const hostGroups = groupEntriesByHost(entriesToCheck);
const concurrency = Math.max(1, Math.min(MAX_CONCURRENT_HOSTS, hostGroups.length));
let processed = 0;
process.stdout.write(`Vérification de ${entriesToCheck.length} URL...\n`);
await runWithConcurrency(
hostGroups,
async ({ host, entries: groupEntries }) => {
for (const entry of groupEntries) {
if (host) {
await applyHostDelay(host);
}
let result = await checkUrl(entry.url, {
...BASE_HTTP_OPTIONS,
firstMethod: "GET",
retryWithGet: false,
});
recordHostCheck(host);
if (shouldRecheckWithPlaywright(result)) {
if (host) {
await applyHostDelay(host);
}
const playResult = await checkWithPlaywright(entry.url, {
userAgent: DEFAULT_USER_AGENT,
timeoutMs: PLAYWRIGHT_TIMEOUT_MS,
executablePath: PLAYWRIGHT_EXECUTABLE,
});
recordHostCheck(host);
result = playResult;
}
updateEntryWithResult(entries[entry.url], result);
persistEntriesSnapshot(entries, snapshotMeta);
processed += 1;
logProgress(processed, entriesToCheck.length);
}
},
concurrency
);
process.stdout.write("\n");
}
async function main() {
let occurrences = await collectOccurrences();
occurrences = filterIgnoredHosts(occurrences, IGNORE_HOSTS);
if (occurrences.size === 0) {
const emptyState = { generatedAt: new Date().toISOString(), links: [], entries: {} };
saveState(emptyState);
console.log("Aucun lien externe détecté.");
return;
}
const activeUrls = buildActiveUrlSet(occurrences);
const state = loadState();
const mergedEntries = mergeOccurrences(state.entries, occurrences);
const entriesArray = Object.values(mergedEntries);
const pending = entriesArray.filter((entry) => needsCheck(entry));
const snapshotMeta = {
generatedAt: state.generatedAt || null,
links: filterReportLinks(state.links, activeUrls),
};
await checkEntries(pending, mergedEntries, snapshotMeta);
const deadLinks = buildDeadLinks(mergedEntries);
const nextState = {
generatedAt: new Date().toISOString(),
links: filterReportLinks(deadLinks, activeUrls),
entries: mergedEntries,
};
saveState(nextState);
console.log(
`Liens externes analysés: ${entriesArray.length} URL (${deadLinks.length} mort(s)). Données écrites dans ${path.relative(
SITE_ROOT,
REPORT_PATH
)}`
);
}
main().catch((error) => {
console.error("Erreur lors de la vérification des liens:", error);
process.exitCode = 1;
});

@@ -1,524 +0,0 @@
#!/usr/bin/env node
const fs = require("fs");
const path = require("path");
const yaml = require("js-yaml");
const { sanitizeUrlCandidate } = require("./lib/markdown_links");
const SITE_ROOT = path.resolve(__dirname, "..");
const CONTENT_DIR = path.join(SITE_ROOT, "content");
const TAXONOMIES_FILE = path.join(SITE_ROOT, "config", "_default", "taxonomies.yaml");
const TARGET_EXTENSIONS = new Set([".md", ".markdown", ".mdx", ".yaml", ".yml"]);
const MARKDOWN_EXTENSIONS = new Set([".md", ".markdown", ".mdx"]);
const INTERNAL_LINK_REGEX = /\/[^\s"'`<>\\\[\]{}|]+/g;
const VALID_PREFIX_REGEX = /[\s"'`([<{=:]/;
const PATH_KEY_REGEX = /^\s*(?:"path"|'path'|path)\s*:/i;
const FRONTMATTER_PATTERN = /^---\r?\n([\s\S]+?)\r?\n---\r?\n?/;
function toPosix(value) {
return value.split(path.sep).join("/");
}
function relativeToSite(filePath) {
return toPosix(path.relative(SITE_ROOT, filePath));
}
function isTargetFile(filePath) {
const ext = path.extname(filePath).toLowerCase();
return TARGET_EXTENSIONS.has(ext);
}
function isMarkdownFile(filePath) {
const ext = path.extname(filePath).toLowerCase();
return MARKDOWN_EXTENSIONS.has(ext);
}
function isYamlFile(filePath) {
const ext = path.extname(filePath).toLowerCase();
return ext === ".yaml" || ext === ".yml";
}
function collectContentEntries(rootDir) {
const files = [];
const directories = new Set(["/"]);
function walk(currentDir) {
const entries = fs.readdirSync(currentDir, { withFileTypes: true });
for (const entry of entries) {
const fullPath = path.join(currentDir, entry.name);
if (entry.isDirectory()) {
const relative = path.relative(rootDir, fullPath);
const normalized = relative ? `/${toPosix(relative)}` : "/";
directories.add(normalized);
walk(fullPath);
} else if (entry.isFile() && isTargetFile(fullPath)) {
files.push(fullPath);
}
}
}
walk(rootDir);
return { files, directories };
}
function collectTaxonomyKeywordPaths(files) {
const mapping = loadTaxonomyMapping(TAXONOMIES_FILE);
if (!mapping) {
return new Set();
}
const keywordPaths = new Set();
for (const filePath of files) {
if (!isMarkdownFile(filePath)) {
continue;
}
let raw;
try {
raw = fs.readFileSync(filePath, "utf8");
} catch (error) {
console.warn(
`Impossible de lire ${relativeToSite(filePath)} pour extraire les taxonomies (${error.message}).`,
);
continue;
}
const frontmatterMatch = raw.match(FRONTMATTER_PATTERN);
if (!frontmatterMatch) {
continue;
}
let frontmatter = {};
try {
frontmatter = yaml.load(frontmatterMatch[1]) || {};
} catch (error) {
console.warn(`Frontmatter invalide dans ${relativeToSite(filePath)} (${error.message}).`);
continue;
}
const keywords = extractTaxonomyKeywords(
frontmatter,
frontmatterMatch[1],
mapping.fieldToCanonical,
);
for (const keyword of keywords) {
const normalized = normalizeInternalLink(keyword.url);
if (normalized) {
keywordPaths.add(normalized);
}
}
}
return keywordPaths;
}
function loadTaxonomyMapping(configPath) {
let raw;
try {
raw = fs.readFileSync(configPath, "utf8");
} catch (error) {
console.warn(`Impossible de lire ${relativeToSite(configPath)} (${error.message}).`);
return null;
}
let data;
try {
data = yaml.load(raw) || {};
} catch (error) {
console.warn(`YAML invalide dans ${relativeToSite(configPath)} (${error.message}).`);
return null;
}
if (typeof data !== "object" || data === null) {
console.warn(`Format inattendu dans ${relativeToSite(configPath)}.`);
return null;
}
const fieldToCanonical = new Map();
for (const [singular, plural] of Object.entries(data)) {
const canonical =
typeof plural === "string" && plural.trim().length > 0 ? plural.trim() : singular.trim();
if (!canonical) continue;
const candidates = new Set([singular, canonical].filter(Boolean));
for (const candidate of candidates) {
fieldToCanonical.set(candidate, canonical);
}
}
if (fieldToCanonical.size === 0) {
console.warn("Aucune taxonomie valide n'a été trouvée.");
return null;
}
return { fieldToCanonical };
}
function extractTaxonomyKeywords(frontmatter, frontmatterRaw, fieldToCanonical) {
const keywords = [];
const seen = new Set();
function addKeyword(taxonomy, term) {
if (!taxonomy || typeof term !== "string") return;
const normalized = term.trim();
if (!normalized) return;
const slug = slugify(normalized);
if (!slug) return;
const key = `${taxonomy}::${normalized.toLowerCase()}`;
if (seen.has(key)) return;
seen.add(key);
keywords.push({
taxonomy,
term: normalized,
url: `/${taxonomy}/${slug}/`,
});
}
if (typeof frontmatter === "object" && frontmatter !== null) {
for (const [field, value] of Object.entries(frontmatter)) {
const canonical = fieldToCanonical.get(field);
if (!canonical) continue;
const terms = normalizeTerms(value);
for (const term of terms) {
addKeyword(canonical, term);
}
}
}
for (const entry of extractCommentedTerms(frontmatterRaw, fieldToCanonical)) {
addKeyword(entry.taxonomy, entry.term);
}
return keywords;
}
function normalizeTerms(value) {
if (Array.isArray(value)) {
return value.map((item) => normalizeTerm(item)).filter(Boolean);
}
const single = normalizeTerm(value);
return single ? [single] : [];
}
function normalizeTerm(value) {
if (typeof value !== "string") return null;
const trimmed = value.trim();
return trimmed.length > 0 ? trimmed : null;
}
function extractCommentedTerms(frontmatterRaw, fieldToCanonical) {
if (typeof frontmatterRaw !== "string" || frontmatterRaw.length === 0) {
return [];
}
const results = [];
const lines = frontmatterRaw.split(/\r?\n/);
let currentCanonical = null;
let currentIndent = 0;
for (const line of lines) {
const indent = getIndentation(line);
const fieldMatch = line.match(/^\s*([A-Za-z0-9_]+):\s*(?:#.*)?$/);
if (fieldMatch) {
const fieldName = fieldMatch[1];
currentCanonical = fieldToCanonical.get(fieldName) || null;
currentIndent = indent;
continue;
}
if (!currentCanonical) continue;
const commentMatch = line.match(/^\s*#\s*-\s+(.*)$/);
if (!commentMatch) continue;
if (indent <= currentIndent) continue;
const term = commentMatch[1].trim();
if (!term) continue;
results.push({ taxonomy: currentCanonical, term });
}
return results;
}
function getIndentation(line) {
if (typeof line !== "string" || line.length === 0) return 0;
const match = line.match(/^\s*/);
return match ? match[0].length : 0;
}
function slugify(value) {
return value
.normalize("NFD")
.replace(/\p{Diacritic}/gu, "")
.toLowerCase()
.replace(/[^a-z0-9]+/g, "-")
.replace(/^-+|-+$/g, "")
.replace(/-{2,}/g, "-");
}
function sanitizeInternalLink(raw) {
const candidate = sanitizeUrlCandidate(raw);
if (!candidate) return null;
if (!candidate.startsWith("/")) return null;
if (candidate.startsWith("//")) return null;
if (candidate.includes("://")) return null;
return candidate;
}
function normalizeInternalLink(link) {
if (typeof link !== "string" || !link.startsWith("/")) {
return null;
}
let normalized = link.split("?")[0];
normalized = normalized.split("#")[0];
normalized = normalized.replace(/\/+/g, "/");
normalized = normalized.replace(/\/+$/, "");
if (!normalized) {
normalized = "/";
}
return normalized;
}
function expectedDirForLink(link) {
if (link === "/") {
return CONTENT_DIR;
}
const relative = link.slice(1);
const segments = relative.split("/").filter(Boolean);
return path.join(CONTENT_DIR, ...segments);
}
function countRepeatedChar(text, startIndex, char) {
let count = 0;
while (text[startIndex + count] === char) {
count++;
}
return count;
}
function findMatchingPair(text, startIndex, openChar, closeChar) {
let depth = 0;
for (let i = startIndex; i < text.length; i++) {
const ch = text[i];
if (ch === "\\") {
i++;
continue;
}
if (ch === openChar) {
depth++;
} else if (ch === closeChar) {
depth--;
if (depth === 0) {
return i;
}
}
}
return -1;
}
function extractMarkdownLinksFromLine(line) {
const results = [];
let inlineFence = null;
for (let i = 0; i < line.length; i++) {
const ch = line[i];
if (ch === "`") {
const runLength = countRepeatedChar(line, i, "`");
if (!inlineFence) {
inlineFence = runLength;
} else if (inlineFence === runLength) {
inlineFence = null;
}
i += runLength - 1;
continue;
}
if (inlineFence) {
continue;
}
if (ch !== "[") {
continue;
}
const closeBracket = findMatchingPair(line, i, "[", "]");
if (closeBracket === -1) {
break;
}
let pointer = closeBracket + 1;
while (pointer < line.length && /\s/.test(line[pointer])) {
pointer++;
}
if (pointer >= line.length || line[pointer] !== "(") {
i = closeBracket;
continue;
}
const closeParen = findMatchingPair(line, pointer, "(", ")");
if (closeParen === -1) {
break;
}
const destination = line.slice(pointer + 1, closeParen);
results.push({ destination });
i = closeParen;
}
return results;
}
function extractInternalLinks(filePath) {
const content = fs.readFileSync(filePath, "utf8");
const lines = content.split(/\r?\n/);
const entries = [];
const skipPathKey = isYamlFile(filePath);
const treatAsMarkdown = isMarkdownFile(filePath);
let fenceDelimiter = null;
let inFrontMatter = false;
for (let index = 0; index < lines.length; index++) {
const line = lines[index];
const trimmed = line.trim();
if (treatAsMarkdown) {
if (index === 0 && trimmed === "---") {
inFrontMatter = true;
continue;
}
if (inFrontMatter) {
if (trimmed === "---") {
inFrontMatter = false;
}
continue;
}
const fenceMatch = trimmed.match(/^(```+|~~~+)/);
if (fenceMatch) {
const delimiterChar = fenceMatch[1][0];
if (!fenceDelimiter) {
fenceDelimiter = delimiterChar;
} else if (delimiterChar === fenceDelimiter) {
fenceDelimiter = null;
}
continue;
}
if (fenceDelimiter) {
continue;
}
const markdownLinks = extractMarkdownLinksFromLine(line);
for (const { destination } of markdownLinks) {
const sanitized = sanitizeInternalLink(destination);
if (!sanitized) continue;
const normalized = normalizeInternalLink(sanitized);
if (!normalized || normalized === "//") continue;
entries.push({ link: normalized, line: index + 1 });
}
continue;
}
if (skipPathKey && PATH_KEY_REGEX.test(line)) {
continue;
}
for (const match of line.matchAll(INTERNAL_LINK_REGEX)) {
const raw = match[0];
const startIndex = match.index ?? line.indexOf(raw);
if (startIndex > 0) {
const prevChar = line[startIndex - 1];
if (!VALID_PREFIX_REGEX.test(prevChar)) {
continue;
}
}
const sanitized = sanitizeInternalLink(raw);
if (!sanitized) continue;
const normalized = normalizeInternalLink(sanitized);
if (!normalized || normalized === "//") continue;
entries.push({ link: normalized, line: index + 1 });
}
}
return entries;
}
function addMissingLink(missingMap, link, filePath, line) {
let entry = missingMap.get(link);
if (!entry) {
entry = {
expectedPath: expectedDirForLink(link),
references: [],
referenceKeys: new Set(),
};
missingMap.set(link, entry);
}
const referenceKey = `${filePath}:${line}`;
if (entry.referenceKeys.has(referenceKey)) {
return;
}
entry.referenceKeys.add(referenceKey);
entry.references.push({
file: relativeToSite(filePath),
line,
});
}
function main() {
if (!fs.existsSync(CONTENT_DIR)) {
console.error(`Le dossier content est introuvable (${CONTENT_DIR}).`);
process.exit(1);
}
const { files, directories } = collectContentEntries(CONTENT_DIR);
const taxonomyPaths = collectTaxonomyKeywordPaths(files);
for (const keywordPath of taxonomyPaths) {
directories.add(keywordPath);
}
const missingLinks = new Map();
for (const filePath of files) {
let entries;
try {
entries = extractInternalLinks(filePath);
} catch (error) {
console.warn(`Impossible de lire ${relativeToSite(filePath)} (${error.message}).`);
continue;
}
for (const { link, line } of entries) {
if (directories.has(link)) {
continue;
}
addMissingLink(missingLinks, link, filePath, line);
}
}
if (missingLinks.size === 0) {
console.log("Tous les liens internes pointent vers un dossier existant.");
return;
}
console.error(`Liens internes cassés détectés: ${missingLinks.size}`);
const sorted = Array.from(missingLinks.entries()).sort((a, b) => a[0].localeCompare(b[0], "fr"));
for (const [link, data] of sorted) {
const expectedRelative = relativeToSite(data.expectedPath);
console.error(`- ${link} (attendu: ${expectedRelative})`);
for (const reference of data.references) {
console.error(`  ${reference.file}:${reference.line}`);
}
}
process.exitCode = 1;
}
if (require.main === module) {
try {
main();
} catch (error) {
console.error(`Erreur lors de la vérification des liens internes: ${error.message}`);
process.exit(1);
}
}

@@ -1,136 +0,0 @@
{
"rebrickable": {
"apiKey": null
},
"externalLinks": {
"cacheDir": "tools/cache",
"cacheFile": "external_links.yaml",
"hostDelayMs": 2000,
"requestTimeoutSeconds": 5,
"cacheTtlSuccessDays": 30,
"cacheTtlClientErrorDays": 7,
"outputFormat": "markdown",
"outputFile": "tools/cache/external_links_report.md",
"userAgent": null,
"enableCookies": true,
"cookieJar": "tools/cache/curl_cookies.txt",
"usePlaywright": true,
"playwrightTimeoutSeconds": 10,
"playwrightExecutablePath": null,
"ignoreHosts": [
"10.0.2.1",
"web.archive.org",
"localhost",
"nas",
"selenium",
"ci.athaliasoft.com",
"rebrickable.com"
]
},
"weather": {
"timezone": "Europe/Paris",
"defaultHour": 12,
"defaultMinute": 0,
"windowMinutes": 60,
"precipitationThreshold": 0.1,
"providers": {
"influxdb": {
"url": null,
"org": "Dern",
"bucket": "weather",
"token": null,
"windowMinutes": 60,
"precipitationThreshold": 0.1,
"sensors": {
"temperature": {
"measurement": "°C",
"field": "value",
"tags": {
"entity_id": "station_meteo_bresser_exterieur_temperature"
}
},
"humidity": {
"measurement": "%",
"field": "value",
"tags": {
"entity_id": "station_meteo_bresser_exterieur_humidite_relative"
}
},
"pressure": {
"measurement": "hPa",
"field": "value",
"tags": {
"entity_id": "station_meteo_bresser_exterieur_pression_atmospherique"
}
},
"illuminance": {
"measurement": "lx",
"field": "value",
"tags": {
"entity_id": "station_meteo_bresser_exterieur_luminance"
}
},
"precipitations": {
"measurement": "mm/h",
"field": "value",
"threshold": 0.1,
"tags": {
"entity_id": "station_meteo_bresser_exterieur_precipitations"
}
},
"wind_speed": {
"measurement": "km/h",
"field": "value",
"unit": "km/h",
"tags": {
"entity_id": "station_meteo_bresser_exterieur_vitesse_du_vent"
}
},
"wind_direction": {
"measurement": "°",
"field": "value",
"tags": {
"entity_id": "station_meteo_bresser_exterieur_direction_du_vent"
}
}
}
},
"openMeteo": {
"latitude": null,
"longitude": null,
"timezone": "Europe/Paris",
"pressureOffset": 40,
"illuminanceToLuxFactor": 126.7,
"windowMinutes": 90,
"precipitationThreshold": 0.1
}
}
},
"goaccess": {
"url": null
},
"lemmy": {
"instanceUrl": null,
"siteUrl": "https://richard-dern.fr",
"auth": {
"username": null,
"password": null,
"jwt": null
},
"community": {
"prefixOverrides": {
"collections": "collection",
"critiques": "critique",
"interets": "interet"
},
"visibility": "Public",
"nsfw": false,
"descriptionTemplate": "Espace dédié aux échanges autour de {{path}}."
},
"cacheFile": "tools/cache/lemmy_sync.json",
"verificationTtlHours": {
"community": 168,
"post": 24
}
}
}

@@ -1,107 +0,0 @@
{
"config": {
"dataOutput": "content/stats/data/stats.json",
"defaultImageDir": "content/stats/images"
},
"sections": [
{
"title": "Habitudes d'écriture",
"statistics": [
{
"key": "most_prolific_month",
"title": "Mois le plus prolifique",
"type": "variable",
"script": "tools/stats/most_prolific_month.js"
},
{
"key": "weekday_activity",
"title": "Articles et mots par jour",
"type": "graphic",
"script": "tools/stats/weekday_activity.py",
"image": "content/stats/images/weekday_activity.png"
},
{
"key": "articles_avg_per_month",
"title": "Moyenne d'articles par mois",
"type": "variable",
"script": "tools/stats/articles_avg_per_month.js"
},
{
"key": "articles_per_month",
"title": "Articles par mois",
"type": "graphic",
"script": "tools/stats/articles_per_month.py",
"image": "content/stats/images/articles_per_month.png"
},
{
"key": "articles_per_year",
"title": "Articles par an",
"type": "graphic",
"script": "tools/stats/articles_per_year.py",
"image": "content/stats/images/articles_per_year.png"
},
{
"key": "cumulative_articles",
"title": "Cumul articles / mots",
"type": "graphic",
"script": "tools/stats/cumulative_articles.py",
"image": "content/stats/images/cumulative_articles.png"
},
{
"key": "articles_per_section",
"title": "Articles par section",
"type": "graphic",
"script": "tools/stats/articles_per_section.py",
"image": "content/stats/images/articles_per_section.png"
},
{
"key": "words_per_article",
"title": "Nombre de mots par article",
"type": "graphic",
"script": "tools/stats/words_per_article.py",
"image": "content/stats/images/words_per_article.png"
},
{
"key": "words_histogram",
"title": "Distribution des longueurs",
"type": "graphic",
"script": "tools/stats/words_histogram.py",
"image": "content/stats/images/words_histogram.png"
},
{
"key": "weather_hexbin",
"title": "Conditions météo à la publication",
"type": "graphic",
"script": "tools/stats/weather_hexbin.py",
"image": "content/stats/images/weather_hexbin.png"
}
]
},
{
"title": "Visites",
"statistics": [
{
"key": "pageviews_per_month",
"title": "Pages vues (mois courant)",
"type": "variable",
"script": "tools/stats/goaccess_monthly.js",
"metric": "hits"
},
{
"key": "unique_visitors_per_month_value",
"title": "Visiteurs uniques (mois courant)",
"type": "variable",
"script": "tools/stats/goaccess_monthly.js",
"metric": "visitors"
},
{
"key": "top_requests",
"title": "Top requêtes (30 jours)",
"type": "graphic",
"script": "tools/stats/top_requests.py",
"image": "content/stats/images/top_requests.png"
}
]
}
]
}

@@ -1,53 +0,0 @@
#!/usr/bin/env node
const fs = require("fs");
const path = require("path");
const sharp = require("sharp");
const PROJECT_ROOT = path.resolve(__dirname, "..");
const SOURCE_ICON_PATH = path.join(PROJECT_ROOT, "static", "favicon.png");
const APPLE_TOUCH_ICON_SIZE = 180;
const APPLE_TOUCH_ICON_BACKGROUND = "#060c14";
const OUTPUT_ICON_PATHS = [
path.join(PROJECT_ROOT, "static", "apple-touch-icon.png"),
path.join(PROJECT_ROOT, "static", "apple-touch-icon-precomposed.png"),
];
/**
* Génère le PNG Apple Touch à partir du favicon principal du site.
*
* L'image finale est rendue opaque sur le fond sombre du thème actif pour
* éviter les rendus incohérents des zones transparentes sur certains appareils iOS.
*
* @returns {Promise<Buffer>}
*/
function buildAppleTouchIconBuffer() {
return sharp(SOURCE_ICON_PATH)
.resize(APPLE_TOUCH_ICON_SIZE, APPLE_TOUCH_ICON_SIZE, {
fit: "cover",
})
.flatten({
background: APPLE_TOUCH_ICON_BACKGROUND,
})
.png({
compressionLevel: 9,
adaptiveFiltering: true,
})
.toBuffer();
}
/**
* Écrit la même icône sous les deux noms historiques encore demandés par les navigateurs.
*
* @param {Buffer} iconBuffer
*/
function writeAppleTouchIcons(iconBuffer) {
for (const outputPath of OUTPUT_ICON_PATHS) {
fs.writeFileSync(outputPath, iconBuffer);
}
}
(async function main() {
const iconBuffer = await buildAppleTouchIconBuffer();
writeAppleTouchIcons(iconBuffer);
})().catch((error) => {
console.error("Failed to generate Apple Touch icons:", error);
process.exitCode = 1;
});

@@ -1,199 +0,0 @@
const fs = require('fs/promises');
const path = require('path');
const { spawn } = require('child_process');
const os = require('os');
const { promptForBundlePath } = require('./lib/bundles');
const CONTENT_DIR = path.resolve('content');
const DIAGRAMS_DIR = 'diagrams';
const OUTPUT_DIR = 'images';
const MERMAID_EXTENSION = '.mermaid';
async function directoryExists(dirPath) {
try {
const stat = await fs.stat(dirPath);
return stat.isDirectory();
} catch (error) {
if (error.code === 'ENOENT') {
return false;
}
throw error;
}
}
async function collectMermaidFiles(root) {
async function walk(current) {
const entries = await fs.readdir(current, { withFileTypes: true });
const files = [];
for (const entry of entries) {
const entryPath = path.join(current, entry.name);
if (entry.isDirectory()) {
files.push(...await walk(entryPath));
} else if (entry.isFile() && path.extname(entry.name) === MERMAID_EXTENSION) {
files.push(entryPath);
}
}
return files;
}
return walk(root);
}
function getMermaidCliPath() {
const cliName = process.platform === 'win32' ? 'mmdc.cmd' : 'mmdc';
return path.resolve('node_modules', '.bin', cliName);
}
let cachedPuppeteerExecutablePath;
let cachedPuppeteerConfigPath;
function resolvePuppeteerExecutablePath() {
if (cachedPuppeteerExecutablePath !== undefined) {
return cachedPuppeteerExecutablePath;
}
const candidates = [
() => require('puppeteer'),
() => require('@mermaid-js/mermaid-cli/node_modules/puppeteer')
];
for (const load of candidates) {
try {
const module = load();
const executablePath = typeof module.executablePath === 'function' ? module.executablePath() : module.executablePath;
if (executablePath) {
cachedPuppeteerExecutablePath = executablePath;
return cachedPuppeteerExecutablePath;
}
} catch (error) {
if (error.code !== 'MODULE_NOT_FOUND') {
cachedPuppeteerExecutablePath = null;
throw error;
}
}
}
cachedPuppeteerExecutablePath = null;
return cachedPuppeteerExecutablePath;
}
async function ensurePuppeteerConfig() {
if (cachedPuppeteerConfigPath) {
return cachedPuppeteerConfigPath;
}
const executablePath = resolvePuppeteerExecutablePath();
const config = {
headless: 'new',
args: ['--no-sandbox', '--disable-setuid-sandbox', '--disable-dev-shm-usage'],
timeout: 60000
};
if (executablePath) {
config.executablePath = executablePath;
}
const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'mmdc-'));
const filePath = path.join(dir, 'puppeteer-config.json');
await fs.writeFile(filePath, JSON.stringify(config, null, 2), 'utf8');
cachedPuppeteerConfigPath = filePath;
return cachedPuppeteerConfigPath;
}
async function renderMermaidDiagram(inputPath, outputPath) {
const cliPath = getMermaidCliPath();
const args = ['-i', inputPath, '-o', outputPath, '-e', 'png'];
const env = { ...process.env };
const executablePath = resolvePuppeteerExecutablePath();
if (executablePath) {
env.PUPPETEER_EXECUTABLE_PATH = executablePath;
}
const puppeteerConfig = await ensurePuppeteerConfig();
if (puppeteerConfig) {
args.push('-p', puppeteerConfig);
}
return new Promise((resolve, reject) => {
const child = spawn(cliPath, args, { stdio: 'inherit', env });
child.on('error', reject);
child.on('close', code => {
if (code === 0) {
resolve();
} else {
reject(new Error(`mmdc exited with code ${code}`));
}
});
});
}
async function ensureDirectory(dirPath) {
await fs.mkdir(dirPath, { recursive: true });
}
async function generateDiagrams(bundlePath) {
const diagramsRoot = path.join(bundlePath, DIAGRAMS_DIR);
if (!(await directoryExists(diagramsRoot))) {
console.log(`No "${DIAGRAMS_DIR}" directory found in ${bundlePath}. Nothing to generate.`);
return;
}
const mermaidFiles = await collectMermaidFiles(diagramsRoot);
if (mermaidFiles.length === 0) {
console.log(`No Mermaid diagrams found in ${diagramsRoot}. Nothing to generate.`);
return;
}
for (const mermaidFile of mermaidFiles) {
const contents = await fs.readFile(mermaidFile, 'utf8');
if (!contents.trim()) {
console.log(`Skipped: ${mermaidFile} (file is empty)`);
continue;
}
const relativePath = path.relative(diagramsRoot, mermaidFile);
const outputRelative = relativePath.replace(/\.mermaid$/i, '.png');
const outputPath = path.join(bundlePath, OUTPUT_DIR, outputRelative);
await ensureDirectory(path.dirname(outputPath));
console.log(`Rendering ${mermaidFile} -> ${outputPath}`);
await renderMermaidDiagram(mermaidFile, outputPath);
console.log(`Generated: ${outputPath}`);
}
}
async function main() {
const manualPath = process.argv[2];
const bundle = await promptForBundlePath(manualPath, { contentDir: CONTENT_DIR });
try {
await generateDiagrams(bundle);
} catch (error) {
console.error('Failed to generate Mermaid diagrams.');
console.error(error);
process.exitCode = 1;
}
}
main().catch((error) => {
console.error(error);
process.exitCode = 1;
});

@@ -1,71 +0,0 @@
const fs = require('fs/promises');
const fsSync = require('fs');
const path = require('path');
const { promptForBundlePath } = require('./lib/bundles');
const CONTENT_DIR = path.resolve('content');
const TEMPLATE_PATH = path.resolve('data/metadata_template.yaml');
const MEDIA_TYPES = ['images', 'sounds', 'videos'];
async function loadTemplate() {
return fs.readFile(TEMPLATE_PATH, 'utf8');
}
async function generateYamlFiles(bundlePath, yamlTemplate) {
console.log(`\nProcessing bundle: ${bundlePath}`);
for (const type of MEDIA_TYPES) {
const mediaDir = path.join(bundlePath, type);
const dataDir = path.join(bundlePath, 'data', type);
let dataDirEnsured = false; // Create lazily only if a YAML file must be written
if (!fsSync.existsSync(mediaDir)) {
console.log(`Skipped: no folder "${type}" found.`);
continue;
}
const files = await fs.readdir(mediaDir);
for (const file of files) {
const ext = path.extname(file);
if (!ext) {
console.log(`Ignored: ${file} (no extension)`);
continue;
}
const yamlName = path.basename(file, ext) + '.yaml';
const yamlPath = path.join(dataDir, yamlName);
if (fsSync.existsSync(yamlPath)) {
console.log(`Skipped: ${yamlPath} (already exists)`);
continue;
}
if (!dataDirEnsured) {
await fs.mkdir(dataDir, { recursive: true });
dataDirEnsured = true;
}
await fs.writeFile(yamlPath, yamlTemplate, 'utf8');
console.log(`Created: ${yamlPath}`);
}
}
}
async function main() {
const manualPath = process.argv[2];
const bundle = await promptForBundlePath(manualPath, { contentDir: CONTENT_DIR });
const template = await loadTemplate();
await generateYamlFiles(bundle, template);
}
main();


@@ -1,360 +0,0 @@
#!/usr/bin/env node
const fs = require("fs/promises");
const path = require("path");
const { loadEnv } = require("./lib/env");
loadEnv();
const DEFAULT_CONFIG_PATH = "tools/config/stats.json";
const DEFAULT_DATA_OUTPUT = "content/stats/data/stats.json";
const DEFAULT_IMAGE_DIR = "content/stats/images";
function parseArgs(argv) {
const args = {};
for (let index = 0; index < argv.length; index += 1) {
const current = argv[index];
const next = argv[index + 1];
switch (current) {
case "--config":
case "-c":
args.config = next;
index += 1;
break;
case "--data":
case "-d":
args.data = next;
index += 1;
break;
case "--only":
case "-o":
args.only = next;
index += 1;
break;
default:
break;
}
}
return args;
}
async function loadDefinition(configPath) {
const raw = await fs.readFile(configPath, "utf8");
try {
return JSON.parse(raw);
} catch (error) {
throw new Error(`Unable to parse ${configPath}: ${error.message}`);
}
}
async function loadModule(scriptPath) {
const resolved = path.resolve(scriptPath);
await fs.access(resolved);
// Drop any cached copy so the script can be re-run with fresh code.
// require caches under require.resolve keys, so resolve the same way.
delete require.cache[require.resolve(resolved)];
const mod = require(resolved);
if (!mod || typeof mod.run !== "function") {
throw new Error(`Le script ${scriptPath} doit exporter une fonction run(context)`);
}
return mod;
}
function resolvePythonInterpreter() {
const envPython = process.env.VIRTUAL_ENV
? path.join(process.env.VIRTUAL_ENV, "bin", "python")
: null;
const candidates = [
envPython,
path.join(process.cwd(), ".venv", "bin", "python"),
path.join(process.cwd(), ".venv", "bin", "python3"),
"python3",
].filter(Boolean);
return candidates.find((candidate) => {
try {
const stat = require("fs").statSync(candidate);
return stat.isFile() || stat.isSymbolicLink();
} catch (_error) {
return false;
}
}) || "python3";
}
async function runPython(scriptPath, payload) {
const resolved = path.resolve(scriptPath);
await fs.access(resolved);
const interpreter = resolvePythonInterpreter();
return new Promise((resolve, reject) => {
const child = require("child_process").spawn(interpreter, [resolved], {
stdio: ["pipe", "pipe", "pipe"],
});
let stdout = "";
let stderr = "";
child.stdout.on("data", (data) => {
stdout += data.toString();
});
child.stderr.on("data", (data) => {
stderr += data.toString();
});
child.on("error", (error) => {
reject(error);
});
child.on("close", (code) => {
if (code !== 0) {
const err = new Error(stderr || `Python exited with code ${code}`);
err.code = code;
return reject(err);
}
const trimmed = stdout.trim();
if (!trimmed) return resolve({});
try {
resolve(JSON.parse(trimmed));
} catch (error) {
reject(new Error(`Invalid JSON from ${scriptPath}: ${error.message}`));
}
});
child.stdin.write(JSON.stringify(payload));
child.stdin.end();
});
}
function toPublicPath(target, { rootDir = process.cwd(), staticDir = path.resolve("static"), absOutput } = {}) {
if (!target) return "";
const normalized = target.replace(/\\/g, "/");
if (/^https?:\/\//i.test(normalized)) return normalized;
if (normalized.startsWith("/")) {
return normalized.replace(/\/{2,}/g, "/");
}
const absolute = absOutput || path.resolve(rootDir, target);
// Containment test via path.relative, so that sibling folders such as
// "static-backup" are not mistaken for static/.
const relFromStatic = path.relative(staticDir, absolute);
if (!relFromStatic.startsWith("..") && !path.isAbsolute(relFromStatic)) {
return `/${relFromStatic.replace(/\\/g, "/")}`;
}
if (!path.isAbsolute(target) && !target.startsWith("/")) {
return normalized;
}
const relRoot = path.relative(rootDir, absolute).replace(/\\/g, "/");
return `/${relRoot}`;
}
function resolveGraphicPaths(stat, defaultImageDir, { rootDir, staticDir }) {
const target = stat.image || (defaultImageDir ? path.join(defaultImageDir, `${stat.key}.png`) : null);
if (!target) {
throw new Error("Chemin d'image manquant (image ou defaultImageDir)");
}
const absOutput = path.isAbsolute(target) ? target : path.resolve(rootDir, target);
const publicPath = toPublicPath(target, { rootDir, staticDir, absOutput });
return { publicPath, outputPath: absOutput };
}
function mergeResult(base, result, { publicPath } = {}) {
const entry = { ...base };
if (publicPath && base.type === "graphic") {
entry.image = publicPath;
}
if (result === undefined || result === null) {
return entry;
}
if (typeof result === "object" && !Array.isArray(result)) {
if (Object.prototype.hasOwnProperty.call(result, "value")) {
entry.value = result.value;
}
if (result.image) {
entry.image = toPublicPath(result.image);
}
if (result.meta) {
entry.meta = result.meta;
}
if (result.data) {
entry.data = result.data;
}
return entry;
}
entry.value = result;
return entry;
}
async function runStat(stat, context) {
const base = {
key: stat.key,
title: stat.title,
type: stat.type,
};
if (!stat.key) {
throw new Error("Cle manquante pour cette statistique");
}
if (!stat.script) {
throw new Error(`Script manquant pour ${stat.key}`);
}
if (!stat.type) {
throw new Error(`Type manquant pour ${stat.key}`);
}
const isPython = stat.script.endsWith(".py");
if (isPython) {
if (stat.type === "graphic") {
const { publicPath, outputPath } = resolveGraphicPaths(stat, context.defaultImageDir, context);
await fs.mkdir(path.dirname(outputPath), { recursive: true });
const result = await runPython(stat.script, {
...context,
stat,
outputPath,
publicPath,
});
return mergeResult(base, result, { publicPath });
}
if (stat.type === "variable") {
const result = await runPython(stat.script, {
...context,
stat,
});
return mergeResult(base, result);
}
throw new Error(`Type inconnu pour ${stat.key}: ${stat.type}`);
} else {
const mod = await loadModule(stat.script);
if (stat.type === "graphic") {
const { publicPath, outputPath } = resolveGraphicPaths(stat, context.defaultImageDir, context);
await fs.mkdir(path.dirname(outputPath), { recursive: true });
const result = await mod.run({
...context,
stat,
outputPath,
publicPath,
});
return mergeResult(base, result, { publicPath });
}
if (stat.type === "variable") {
const result = await mod.run({
...context,
stat,
});
return mergeResult(base, result);
}
throw new Error(`Type inconnu pour ${stat.key}: ${stat.type}`);
}
}
function buildOnlyFilter(onlyRaw) {
if (!onlyRaw) return null;
const parts = onlyRaw
.split(",")
.map((part) => part.trim())
.filter(Boolean);
return new Set(parts);
}
async function main() {
const cliArgs = parseArgs(process.argv.slice(2));
const definitionPath = path.resolve(cliArgs.config || DEFAULT_CONFIG_PATH);
const definition = await loadDefinition(definitionPath);
const statsConfig = definition.config || {};
const dataOutput = path.resolve(cliArgs.data || statsConfig.dataOutput || DEFAULT_DATA_OUTPUT);
const defaultImageDir = statsConfig.defaultImageDir || DEFAULT_IMAGE_DIR;
const onlyFilter = buildOnlyFilter(cliArgs.only);
const context = {
rootDir: process.cwd(),
contentDir: path.resolve("content"),
staticDir: path.resolve("static"),
definitionPath,
defaultImageDir,
config: statsConfig,
};
const output = {
generated_at: new Date().toISOString(),
sections: [],
};
const errors = [];
for (const section of definition.sections || []) {
const results = [];
for (const stat of section.statistics || []) {
if (onlyFilter && !onlyFilter.has(stat.key)) {
continue;
}
try {
const entry = await runStat(stat, context);
results.push(entry);
console.log(`[ok] ${stat.key}`);
} catch (error) {
errors.push({ key: stat.key, message: error.message });
console.error(`[err] ${stat.key}: ${error.message}`);
results.push({
key: stat.key,
title: stat.title,
type: stat.type,
error: error.message,
image: stat.image ? toPublicPath(stat.image) : undefined,
});
}
}
if (results.length > 0) {
output.sections.push({
title: section.title,
statistics: results,
});
}
}
if (errors.length > 0) {
output.errors = errors;
}
await fs.mkdir(path.dirname(dataOutput), { recursive: true });
await fs.writeFile(dataOutput, `${JSON.stringify(output, null, 2)}\n`, "utf8");
const relativeOutput = path.relative(process.cwd(), dataOutput);
console.log(`\nFichier de donnees genere: ${relativeOutput}`);
if (errors.length > 0) {
console.log(`Statistiques en erreur: ${errors.length}. Les entrees concernees contiennent le message d'erreur.`);
}
}
main().catch((error) => {
console.error(error.message);
process.exit(1);
});


@@ -1 +0,0 @@
"""Shared Python helpers for tools."""


@@ -1,167 +0,0 @@
const { fetch } = require("undici");
const ARCHIVE_CDX_URL = "https://web.archive.org/cdx/search/cdx";
const ARCHIVE_SAVE_URL = "https://web.archive.org/save/";
const ARCHIVE_REQUEST_TIMEOUT_MS = 15000;
/**
 * Builds the public URL of a Wayback capture.
 * @param {string} originalUrl Original URL.
 * @param {string} timestamp Wayback timestamp.
 * @returns {string} Directly usable archive.org URL.
 */
function buildArchiveCaptureUrl(originalUrl, timestamp) {
return `https://web.archive.org/web/${timestamp}/${originalUrl}`;
}
/**
 * Clamps a numeric value to a strictly positive integer.
 * @param {unknown} value Value to check.
 * @param {number} fallback Default value.
 * @returns {number} Positive integer.
 */
function normalizePositiveInteger(value, fallback) {
const parsed = Number.parseInt(String(value), 10);
if (Number.isNaN(parsed)) {
return fallback;
}
if (parsed <= 0) {
return fallback;
}
return parsed;
}
/**
 * Fetches an Archive.org JSON document with a request timeout.
 * @param {string|URL} url URL to call.
 * @returns {Promise<unknown>} Decoded JSON document.
 */
async function fetchArchiveJson(url) {
const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), ARCHIVE_REQUEST_TIMEOUT_MS);
const response = await fetch(url, { signal: controller.signal }).finally(() => clearTimeout(timer));
if (!response.ok) {
throw new Error(`Erreur de l'API Archive.org (${response.status})`);
}
return response.json();
}
/**
 * Lists the recent Wayback captures available for a URL.
 * @param {string} url Original URL to look up.
 * @param {{ limit?: number }} options Request options.
 * @returns {Promise<Array<{ timestamp: string, originalUrl: string, statusCode: number|null, mimetype: string|null, url: string }>>}
 */
async function listArchiveCaptures(url, options = {}) {
const limit = normalizePositiveInteger(options.limit, 10);
const requestUrl = new URL(ARCHIVE_CDX_URL);
requestUrl.searchParams.set("url", url);
requestUrl.searchParams.set("output", "json");
requestUrl.searchParams.set("fl", "timestamp,original,statuscode,mimetype,digest");
requestUrl.searchParams.set("filter", "statuscode:200");
requestUrl.searchParams.set("collapse", "digest");
requestUrl.searchParams.set("fastLatest", "true");
requestUrl.searchParams.set("limit", `-${limit}`);
const rows = await fetchArchiveJson(requestUrl);
if (!Array.isArray(rows)) {
return [];
}
if (rows.length <= 1) {
return [];
}
const header = rows[0];
if (!Array.isArray(header)) {
return [];
}
const timestampIndex = header.indexOf("timestamp");
const originalIndex = header.indexOf("original");
const statusCodeIndex = header.indexOf("statuscode");
const mimetypeIndex = header.indexOf("mimetype");
const captures = [];
for (const row of rows.slice(1)) {
if (!Array.isArray(row)) {
continue;
}
const timestamp = row[timestampIndex];
const originalUrl = row[originalIndex];
if (typeof timestamp !== "string") {
continue;
}
if (typeof originalUrl !== "string") {
continue;
}
let statusCode = null;
if (statusCodeIndex > -1) {
const parsedStatusCode = Number.parseInt(row[statusCodeIndex], 10);
if (!Number.isNaN(parsedStatusCode)) {
statusCode = parsedStatusCode;
}
}
let mimetype = null;
if (mimetypeIndex > -1) {
const rawMimetype = row[mimetypeIndex];
if (typeof rawMimetype === "string" && rawMimetype.trim()) {
mimetype = rawMimetype.trim();
}
}
captures.push({
timestamp,
originalUrl,
statusCode,
mimetype,
url: buildArchiveCaptureUrl(originalUrl, timestamp),
});
}
captures.sort((left, right) => right.timestamp.localeCompare(left.timestamp));
return captures.slice(0, limit);
}
/**
 * Returns the most recent capture available for a URL.
 * @param {string} url Original URL.
 * @returns {Promise<string|null>} archive.org URL, or null if no capture exists.
 */
async function getArchiveUrl(url) {
const captures = await listArchiveCaptures(url, { limit: 1 });
if (captures.length === 0) {
return null;
}
return captures[0].url;
}
/**
 * Asks Archive.org to archive a URL.
 * @param {string} url URL to archive.
 * @returns {Promise<string|null>} Final capture URL if available.
 */
async function saveToArchive(url) {
const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), ARCHIVE_REQUEST_TIMEOUT_MS);
const response = await fetch(`${ARCHIVE_SAVE_URL}${encodeURIComponent(url)}`, {
method: "POST",
signal: controller.signal,
}).finally(() => clearTimeout(timer));
if (!response.ok) {
throw new Error(`Erreur de sauvegarde Archive.org (${response.status})`);
}
if (response.url.includes("/save/")) {
return null;
}
return response.url;
}
module.exports = {
buildArchiveCaptureUrl,
listArchiveCaptures,
getArchiveUrl,
saveToArchive,
};
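The CDX endpoint returns a JSON array whose first row is a header and whose remaining rows are positional values, which is why listArchiveCaptures looks up column indexes by name. A minimal sketch of that parsing against a hard-coded sample response:

```javascript
// Sample shaped like the CDX "output=json" format: header row first,
// then positional data rows. Values here are made up for the demo.
const sample = [
  ["timestamp", "original", "statuscode", "mimetype", "digest"],
  ["20240101120000", "https://example.com/", "200", "text/html", "ABC"],
  ["20230615080000", "https://example.com/", "200", "text/html", "DEF"],
];

// Turn each data row into an object keyed by the header columns.
function parseCdxRows(rows) {
  const [header, ...data] = rows;
  return data.map((row) =>
    Object.fromEntries(header.map((key, i) => [key, row[i]])),
  );
}

const captures = parseCdxRows(sample);
console.log(captures[0].timestamp); // "20240101120000"
```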


@@ -1,172 +0,0 @@
const fs = require("node:fs");
const path = require("node:path");
const { readFrontmatterFile, writeFrontmatterFile } = require("./frontmatter");
const { resolveBundlePath, ensureBundleExists } = require("./bundles");
/**
 * Checks that the path stays under content/.
 * @param {string} targetPath Absolute path to check.
 * @param {string} contentRoot content/ root.
 */
function ensureWithinContent(targetPath, contentRoot) {
const relative = path.relative(contentRoot, targetPath);
if (relative.startsWith("..") || path.isAbsolute(relative)) {
throw new Error(`Le chemin ${targetPath} est en dehors de content/.`);
}
}
/**
 * Splits a bundle path into relative segments.
 * @param {string} bundleDir Absolute bundle path.
 * @param {string} contentRoot content/ root.
 * @returns {string[]} Relative segments.
 */
function splitRelativeParts(bundleDir, contentRoot) {
const relative = path.relative(contentRoot, bundleDir);
return relative.split(path.sep).filter(Boolean);
}
/**
 * Resolves the final destination, taking the date hierarchy into account.
 * @param {string} input Destination path provided by the user.
 * @param {string} slug Slug of the source bundle.
 * @param {{ segments: string[] }|null} sourceDate Date segments of the source.
 * @param {string} contentRoot content/ root.
 * @param {Function} isDateSegment Date-detection function.
 * @returns {{ bundleDir: string }} Final bundle path.
 */
function resolveDestination(input, slug, sourceDate, contentRoot, isDateSegment) {
const resolved = path.resolve(input);
let destinationDir = resolved;
if (resolved.toLowerCase().endsWith(`${path.sep}index.md`)) {
destinationDir = path.dirname(resolved);
}
let includesSlug = false;
if (path.basename(destinationDir) === slug) {
includesSlug = true;
}
let baseDir = destinationDir;
if (includesSlug) {
baseDir = path.dirname(destinationDir);
}
const destDate = findDateSegments(splitRelativeParts(baseDir, contentRoot), isDateSegment);
if (sourceDate && !destDate) {
baseDir = path.join(baseDir, ...sourceDate.segments);
}
let bundleDir = baseDir;
if (!includesSlug) {
bundleDir = path.join(baseDir, slug);
}
return { bundleDir };
}
/**
 * Moves a bundle to its new destination.
 * @param {string} sourceDir Source path.
 * @param {string} destinationDir Target path.
 */
function moveBundle(sourceDir, destinationDir) {
fs.mkdirSync(path.dirname(destinationDir), { recursive: true });
fs.renameSync(sourceDir, destinationDir);
}
/**
 * Adds a Hugo alias pointing to the old path.
 * @param {string} bundleDir Path of the moved bundle.
 * @param {string[]} oldParts Segments of the old relative path.
 */
function addAlias(bundleDir, oldParts) {
const indexPath = path.join(bundleDir, "index.md");
const frontmatter = readFrontmatterFile(indexPath);
if (!frontmatter) {
throw new Error(`Frontmatter introuvable pour ${bundleDir}.`);
}
const alias = `/${oldParts.join("/")}/`;
const aliases = normalizeAliases(frontmatter.data.aliases);
if (!aliases.includes(alias)) {
aliases.push(alias);
}
frontmatter.data.aliases = aliases;
writeFrontmatterFile(indexPath, frontmatter.data, frontmatter.body);
}
/**
 * Normalizes an aliases field into an array of strings.
 * @param {unknown} value Raw frontmatter value.
 * @returns {string[]} Cleaned array.
 */
function normalizeAliases(value) {
const aliases = [];
if (Array.isArray(value)) {
for (const entry of value) {
if (typeof entry === "string" && entry.trim()) {
aliases.push(entry.trim());
}
}
return aliases;
}
if (typeof value === "string" && value.trim()) {
aliases.push(value.trim());
}
return aliases;
}
/**
 * Removes empty parent directories up to content/.
 * @param {string} startDir Starting directory.
 * @param {string} stopDir Root directory to preserve.
 */
function cleanupEmptyParents(startDir, stopDir) {
let current = startDir;
// Containment test via path.relative, so that sibling folders whose
// name merely starts with stopDir are not matched.
while (!path.relative(stopDir, current).startsWith("..")) {
if (!fs.existsSync(current)) {
current = path.dirname(current);
continue;
}
const entries = fs.readdirSync(current);
if (entries.length > 0) {
return;
}
// Check before removing anything: stopDir itself must be preserved.
if (current === stopDir) {
return;
}
fs.rmdirSync(current);
current = path.dirname(current);
}
}
/**
 * Detects a date hierarchy in a path.
 * @param {string[]} parts Segments to analyze.
 * @param {Function} isDateSegment Date-detection function.
 * @returns {{ segments: string[] }|null} Date segments, or null.
 */
function findDateSegments(parts, isDateSegment) {
let index = 0;
while (index < parts.length - 2) {
const dateSegments = isDateSegment(parts, index);
if (dateSegments) {
return { segments: dateSegments };
}
index += 1;
}
return null;
}
module.exports = {
resolveBundlePath,
ensureBundleExists,
ensureWithinContent,
splitRelativeParts,
resolveDestination,
moveBundle,
addAlias,
cleanupEmptyParents,
findDateSegments,
};
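The alias bookkeeping done by addAlias boils down to two steps: derive the alias from the old relative path segments, and append it only if it is not already present. A standalone sketch:

```javascript
// Build the Hugo alias for a moved bundle from its old relative path
// segments, appending it to the list only once.
function addAliasEntry(aliases, oldParts) {
  const alias = `/${oldParts.join("/")}/`;
  return aliases.includes(alias) ? aliases : [...aliases, alias];
}

const aliases = addAliasEntry([], ["blog", "2023", "my-post"]);
console.log(aliases); // [ "/blog/2023/my-post/" ]
console.log(addAliasEntry(aliases, ["blog", "2023", "my-post"]).length); // 1
```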


@@ -1,163 +0,0 @@
const fs = require("node:fs");
const fsPromises = require("node:fs/promises");
const path = require("node:path");
const readline = require("node:readline/promises");
const { stdin, stdout } = require("node:process");
/**
 * Normalizes user input into the bundle directory.
 * @param {string} input Path entered by the user.
 * @returns {string} Absolute bundle path.
 */
function resolveBundlePath(input) {
if (typeof input !== "string" || !input.trim()) {
throw new Error("Le chemin du bundle est vide.");
}
const resolved = path.resolve(input);
if (resolved.toLowerCase().endsWith(`${path.sep}index.md`)) {
return path.dirname(resolved);
}
return resolved;
}
/**
 * Checks that a directory really is a Hugo bundle.
 * @param {string} bundleDir Absolute bundle path.
 */
function ensureBundleExists(bundleDir) {
if (!fs.existsSync(bundleDir)) {
throw new Error(`Le bundle ${bundleDir} est introuvable.`);
}
const stats = fs.statSync(bundleDir);
if (!stats.isDirectory()) {
throw new Error(`Le bundle ${bundleDir} n'est pas un dossier.`);
}
const indexPath = path.join(bundleDir, "index.md");
if (!fs.existsSync(indexPath)) {
throw new Error(`Le bundle ${bundleDir} ne contient pas index.md.`);
}
}
/**
 * Asks the user a simple question.
 * @param {string} query Text shown in the terminal.
 * @returns {Promise<string>} Trimmed answer.
 */
async function askQuestion(query) {
const rl = readline.createInterface({ input: stdin, output: stdout });
const answer = await rl.question(query);
rl.close();
return answer.trim();
}
/**
 * Finds the most recently modified bundle under a root directory.
 * @param {string} rootDir Root to walk.
 * @returns {Promise<string|null>} Absolute path of the latest bundle found.
 */
async function findLatestBundle(rootDir) {
let latestPath = null;
let latestTime = 0;
await walk(rootDir);
return latestPath;
/**
 * Recursively walks the tree and keeps the most recent bundle.
 * @param {string} currentDir Directory being analyzed.
 */
async function walk(currentDir) {
const entries = await fsPromises.readdir(currentDir, { withFileTypes: true });
let hasIndex = false;
for (const entry of entries) {
if (entry.isFile() && entry.name === "index.md") {
hasIndex = true;
break;
}
}
if (hasIndex) {
const stats = await fsPromises.stat(currentDir);
if (stats.mtimeMs > latestTime) {
latestTime = stats.mtimeMs;
latestPath = currentDir;
}
return;
}
for (const entry of entries) {
if (!entry.isDirectory()) {
continue;
}
const childDir = path.join(currentDir, entry.name);
await walk(childDir);
}
}
}
/**
 * Resolves the target bundle from a manual path or the latest bundle found.
 * @param {string|null|undefined} manualPath Optional path given as an argument.
 * @param {{ contentDir: string, prompts?: { confirmLatest: Function, manualPath: string } }} options Resolution options.
 * @returns {Promise<string>} Absolute path of the selected bundle.
 */
async function promptForBundlePath(manualPath, options) {
let contentDir = path.resolve("content");
if (options && typeof options.contentDir === "string" && options.contentDir.trim()) {
contentDir = path.resolve(options.contentDir);
}
const defaultPrompts = {
confirmLatest(latest) {
return `Use latest bundle found: ${latest}? (Y/n) `;
},
manualPath: "Enter the relative path to your bundle: ",
};
let prompts = defaultPrompts;
if (options && options.prompts && typeof options.prompts === "object") {
prompts = {
confirmLatest: defaultPrompts.confirmLatest,
manualPath: defaultPrompts.manualPath,
};
if (typeof options.prompts.confirmLatest === "function") {
prompts.confirmLatest = options.prompts.confirmLatest;
}
if (typeof options.prompts.manualPath === "string" && options.prompts.manualPath.trim()) {
prompts.manualPath = options.prompts.manualPath;
}
}
if (typeof manualPath === "string" && manualPath.trim()) {
return resolveBundlePath(manualPath);
}
const latest = await findLatestBundle(contentDir);
if (!latest) {
throw new Error("No bundle was found under content/.");
}
const confirm = await askQuestion(prompts.confirmLatest(latest));
if (confirm.toLowerCase() === "n") {
const inputPath = await askQuestion(prompts.manualPath);
return resolveBundlePath(inputPath);
}
return latest;
}
module.exports = {
resolveBundlePath,
ensureBundleExists,
askQuestion,
findLatestBundle,
promptForBundlePath,
};


@@ -1,95 +0,0 @@
const fs = require("fs/promises");
const path = require("path");
const { loadEnv } = require("./env");
let cached = null;
function parseNumber(value) {
const parsed = Number(value);
return Number.isFinite(parsed) ? parsed : null;
}
function applyEnvOverrides(config = {}) {
const merged = { ...config };
merged.rebrickable = { ...(config.rebrickable || {}) };
if (process.env.REBRICKABLE_API_KEY) {
merged.rebrickable.apiKey = process.env.REBRICKABLE_API_KEY;
}
const weather = config.weather || {};
const providers = weather.providers || {};
merged.weather = { ...weather, providers: { ...providers } };
merged.weather.providers.influxdb = { ...(providers.influxdb || {}) };
merged.weather.providers.openMeteo = { ...(providers.openMeteo || {}) };
if (process.env.WEATHER_INFLUXDB_URL) {
merged.weather.providers.influxdb.url = process.env.WEATHER_INFLUXDB_URL;
}
if (process.env.WEATHER_INFLUXDB_TOKEN) {
merged.weather.providers.influxdb.token = process.env.WEATHER_INFLUXDB_TOKEN;
}
const envLatitude = parseNumber(process.env.WEATHER_OPEN_METEO_LATITUDE);
if (envLatitude !== null) {
merged.weather.providers.openMeteo.latitude = envLatitude;
}
const envLongitude = parseNumber(process.env.WEATHER_OPEN_METEO_LONGITUDE);
if (envLongitude !== null) {
merged.weather.providers.openMeteo.longitude = envLongitude;
}
merged.goaccess = { ...(config.goaccess || {}) };
if (process.env.GOACCESS_URL) {
merged.goaccess.url = process.env.GOACCESS_URL;
}
const lemmy = config.lemmy || {};
const community = lemmy.community || {};
merged.lemmy = {
...lemmy,
auth: { ...(lemmy.auth || {}) },
community: {
...community,
prefixOverrides: { ...(community.prefixOverrides || {}) },
},
verificationTtlHours: { ...(lemmy.verificationTtlHours || {}) },
};
if (process.env.LEMMY_INSTANCE_URL) {
merged.lemmy.instanceUrl = process.env.LEMMY_INSTANCE_URL;
}
if (process.env.LEMMY_SITE_URL) {
merged.lemmy.siteUrl = process.env.LEMMY_SITE_URL;
}
if (process.env.LEMMY_JWT) {
merged.lemmy.auth.jwt = process.env.LEMMY_JWT;
}
if (process.env.LEMMY_USERNAME) {
merged.lemmy.auth.username = process.env.LEMMY_USERNAME;
}
if (process.env.LEMMY_PASSWORD) {
merged.lemmy.auth.password = process.env.LEMMY_PASSWORD;
}
return merged;
}
async function loadToolsConfig(configPath = "tools/config/config.json") {
const resolved = path.resolve(configPath);
if (cached && cached.path === resolved) {
return cached.data;
}
loadEnv();
const raw = await fs.readFile(resolved, "utf8");
const data = applyEnvOverrides(JSON.parse(raw));
cached = { path: resolved, data };
return data;
}
module.exports = {
applyEnvOverrides,
loadToolsConfig,
};
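The precedence rule applyEnvOverrides implements (environment variables win over values from the JSON file) reduced to a single key; the key and values below are made up for the demo:

```javascript
// Return a copy of the config with the environment value taking
// precedence over the file value when it is set.
function applyOverride(config, key, envValue) {
  const merged = { ...config };
  if (envValue) {
    merged[key] = envValue;
  }
  return merged;
}

const fromFile = { apiKey: "file-value" };
console.log(applyOverride(fromFile, "apiKey", undefined).apiKey); // "file-value"
console.log(applyOverride(fromFile, "apiKey", "env-value").apiKey); // "env-value"
```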


@@ -1,216 +0,0 @@
const fs = require("fs/promises");
const path = require("path");
async function collectMarkdownFiles(rootDir, { skipIndex = true } = {}) {
const entries = await fs.readdir(rootDir, { withFileTypes: true });
const files = [];
for (const entry of entries) {
const fullPath = path.join(rootDir, entry.name);
if (entry.isDirectory()) {
const nested = await collectMarkdownFiles(fullPath, { skipIndex });
files.push(...nested);
continue;
}
if (!entry.isFile()) continue;
if (!entry.name.toLowerCase().endsWith(".md")) continue;
if (skipIndex && entry.name === "_index.md") continue;
files.push(fullPath);
}
return files;
}
async function collectSectionIndexDirs(rootDir) {
const sections = new Set();
async function walk(dir) {
let entries;
try {
entries = await fs.readdir(dir, { withFileTypes: true });
} catch (error) {
console.error(`Skipping section scan for ${dir}: ${error.message}`);
return;
}
let hasIndex = false;
for (const entry of entries) {
if (entry.isFile() && entry.name.toLowerCase() === "_index.md") {
hasIndex = true;
break;
}
}
if (hasIndex) {
sections.add(path.resolve(dir));
}
for (const entry of entries) {
if (entry.isDirectory()) {
await walk(path.join(dir, entry.name));
}
}
}
await walk(rootDir);
return sections;
}
async function resolveMarkdownTargets(inputs, { rootDir = process.cwd(), skipIndex = true } = {}) {
if (!inputs || inputs.length === 0) {
return collectMarkdownFiles(rootDir, { skipIndex });
}
const targets = new Set();
for (const input of inputs) {
const resolved = path.resolve(input);
try {
const stat = await fs.stat(resolved);
if (stat.isDirectory()) {
const nested = await collectMarkdownFiles(resolved, { skipIndex });
nested.forEach((file) => targets.add(file));
continue;
}
if (stat.isFile()) {
const lower = resolved.toLowerCase();
if (!lower.endsWith(".md")) continue;
if (skipIndex && path.basename(resolved) === "_index.md") continue;
targets.add(resolved);
}
} catch (error) {
console.error(`Skipping ${input}: ${error.message}`);
}
}
return Array.from(targets);
}
/**
 * Collects every file matching a list of extensions.
 * @param {string} rootDir Root to walk.
 * @param {string[]} extensions Expected extensions, including the dot.
 * @param {{ skipDirs?: string[] }} options Walk options.
 * @returns {Promise<string[]>} Files found, sorted by path.
 */
async function collectFilesByExtensions(rootDir, extensions, options = {}) {
const normalizedExtensions = new Set();
for (const extension of extensions) {
if (typeof extension !== "string") {
continue;
}
const candidate = extension.trim().toLowerCase();
if (!candidate) {
continue;
}
normalizedExtensions.add(candidate);
}
if (normalizedExtensions.size === 0) {
return [];
}
const skipDirs = new Set([".git", "node_modules"]);
if (Array.isArray(options.skipDirs)) {
for (const directoryName of options.skipDirs) {
if (typeof directoryName !== "string") {
continue;
}
const candidate = directoryName.trim();
if (!candidate) {
continue;
}
skipDirs.add(candidate);
}
}
const files = [];
await walk(rootDir);
files.sort((a, b) => a.localeCompare(b));
return files;
async function walk(currentDir) {
const entries = await fs.readdir(currentDir, { withFileTypes: true });
for (const entry of entries) {
const fullPath = path.join(currentDir, entry.name);
if (entry.isDirectory()) {
if (skipDirs.has(entry.name)) {
continue;
}
await walk(fullPath);
continue;
}
if (!entry.isFile()) {
continue;
}
const extension = path.extname(entry.name).toLowerCase();
if (!normalizedExtensions.has(extension)) {
continue;
}
files.push(fullPath);
}
}
}
async function collectBundles(rootDir) {
const bundles = [];
await walk(rootDir, rootDir, bundles);
bundles.sort((a, b) => a.relativePath.localeCompare(b.relativePath));
return bundles;
}
async function walk(rootDir, currentDir, bucket) {
let entries;
try {
entries = await fs.readdir(currentDir, { withFileTypes: true });
} catch (error) {
console.warn(`⚠️ Lecture impossible de ${currentDir}: ${error.message}`);
return;
}
let hasIndex = false;
for (const entry of entries) {
if (entry.isFile() && entry.name === "index.md") {
hasIndex = true;
break;
}
}
if (hasIndex) {
const relative = path.relative(rootDir, currentDir);
const parts = relative.split(path.sep).filter(Boolean);
const slug = parts[parts.length - 1] || path.basename(currentDir);
bucket.push({
dir: currentDir,
indexPath: path.join(currentDir, "index.md"),
relativePath: parts.join("/"),
parts,
slug,
});
}
for (const entry of entries) {
if (!entry.isDirectory()) continue;
if (entry.name === ".git" || entry.name === "node_modules") continue;
await walk(rootDir, path.join(currentDir, entry.name), bucket);
}
}
module.exports = {
collectMarkdownFiles,
collectSectionIndexDirs,
resolveMarkdownTargets,
collectFilesByExtensions,
collectBundles,
};


@@ -1,191 +0,0 @@
const fs = require("fs");
const path = require("path");
const YAML = require("yaml");
const { DateTime } = require("luxon");
const HUGO_CONFIG_PATH = path.join(process.cwd(), "config", "_default", "config.yaml");
let cachedTimeZone = null;
/**
 * Returns the time zone configured for Hugo.
 * @returns {string} IANA identifier of Hugo's time zone.
 */
function getHugoTimeZone() {
if (cachedTimeZone) {
return cachedTimeZone;
}
const rawConfig = fs.readFileSync(HUGO_CONFIG_PATH, "utf8");
const parsedConfig = YAML.parse(rawConfig);
if (!parsedConfig || !parsedConfig.timeZone) {
throw new Error("Aucun fuseau horaire Hugo n'a été trouvé dans config/_default/config.yaml.");
}
cachedTimeZone = String(parsedConfig.timeZone).trim();
if (!cachedTimeZone) {
throw new Error("Le fuseau horaire Hugo est vide ou invalide.");
}
return cachedTimeZone;
}
/**
 * Parses a date string using the expected Hugo formats.
 * @param {string} value Date string.
 * @param {string} zone IANA time zone.
 * @param {number} defaultHour Default hour when missing.
 * @param {number} defaultMinute Default minute when missing.
 * @returns {import("luxon").DateTime|null} DateTime, or null if invalid.
 */
function parseHugoDateString(value, zone, defaultHour = 12, defaultMinute = 0) {
let trimmed = value.trim();
if (!trimmed) {
return null;
}
if (
(trimmed.startsWith("'") && trimmed.endsWith("'")) ||
(trimmed.startsWith("\"") && trimmed.endsWith("\""))
) {
trimmed = trimmed.slice(1, -1).trim();
}
const iso = DateTime.fromISO(trimmed, { setZone: true });
if (iso.isValid) {
return iso.setZone(zone);
}
const formats = [
"yyyy-LL-dd HH:mm:ss",
"yyyy-LL-dd'T'HH:mm:ss",
"yyyy-LL-dd HH:mm",
"yyyy-LL-dd'T'HH:mm",
"yyyy-LL-dd",
];
for (const format of formats) {
const parsed = DateTime.fromFormat(trimmed, format, { zone });
if (parsed.isValid) {
if (format === "yyyy-LL-dd") {
return parsed.set({ hour: defaultHour, minute: defaultMinute, second: 0, millisecond: 0 });
}
return parsed;
}
}
const rfc2822 = DateTime.fromRFC2822(trimmed, { setZone: true });
if (rfc2822.isValid) {
return rfc2822.setZone(zone);
}
return null;
}
/**
* Convertit une valeur vers un DateTime positionné sur le fuseau horaire Hugo.
* @param {Date|import("luxon").DateTime|string|number|null} value Valeur à convertir (null : maintenant).
* @returns {import("luxon").DateTime} Instance DateTime alignée sur le fuseau de Hugo.
*/
function toHugoDateTime(value = null) {
const zone = getHugoTimeZone();
if (value === null || value === undefined) {
const now = DateTime.now().setZone(zone);
if (!now.isValid) {
throw new Error(now.invalidReason || "Date actuelle invalide pour le fuseau Hugo.");
}
return now;
}
if (DateTime.isDateTime(value)) {
const zoned = value.setZone(zone);
if (!zoned.isValid) {
throw new Error(zoned.invalidReason || "DateTime invalide après application du fuseau Hugo.");
}
return zoned;
}
if (value instanceof Date) {
const zoned = DateTime.fromJSDate(value, { zone });
if (!zoned.isValid) {
throw new Error(zoned.invalidReason || "Date JS invalide pour le fuseau Hugo.");
}
return zoned;
}
if (typeof value === "string") {
const parsed = parseHugoDateString(value, zone);
if (!parsed) {
throw new Error(`Chaîne de date invalide : ${value}`);
}
return parsed;
}
if (typeof value === "number") {
const parsed = DateTime.fromMillis(value, { zone }).setZone(zone);
if (!parsed.isValid) {
throw new Error(parsed.invalidReason || "Horodatage numérique invalide pour le fuseau Hugo.");
}
return parsed;
}
throw new Error("Type de date non pris en charge pour le fuseau horaire Hugo.");
}
/**
* Formate une date selon le format Hugo simple (sans offset).
* @param {Date|import("luxon").DateTime|string|number|null} value Valeur à formater.
* @returns {string} Date formatée.
*/
function formatDateTime(value = null) {
const zoned = toHugoDateTime(value);
const normalized = zoned.set({ millisecond: 0 });
const formatted = normalized.toFormat("yyyy-LL-dd HH:mm:ss");
if (!formatted) {
throw new Error("Impossible de formater la date avec le fuseau Hugo.");
}
return formatted;
}
/**
* Convertit une valeur de frontmatter en DateTime si elle est valide.
* @param {import("luxon").DateTime|Date|string|number|null|undefined} value Valeur lue depuis le frontmatter.
* @returns {import("luxon").DateTime|null} DateTime utilisable ou null si invalide.
*/
function parseFrontmatterDate(value) {
const zone = getHugoTimeZone();
if (DateTime.isDateTime(value)) {
const zoned = value.setZone(zone);
return zoned.isValid ? zoned : null;
}
if (value instanceof Date) {
const zoned = DateTime.fromJSDate(value, { zone });
return zoned.isValid ? zoned : null;
}
if (typeof value === "string") {
const parsed = parseHugoDateString(value, zone);
return parsed;
}
if (typeof value === "number" && Number.isFinite(value)) {
const zoned = DateTime.fromMillis(value, { zone }).setZone(zone);
return zoned.isValid ? zoned : null;
}
return null;
}
module.exports = {
formatDateTime,
getHugoTimeZone,
toHugoDateTime,
parseFrontmatterDate,
parseHugoDateString,
};
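
Esquisse minimale, hors dépendance luxon, qui isole le préambule de normalisation appliqué par `parseHugoDateString` avant tout parsing : trim, puis retrait des guillemets englobants. Le nom `normalizeDateString` est hypothétique, introduit pour l'exemple.

```javascript
// Esquisse : normalisation d'une chaîne de date avant parsing,
// reprenant le préambule de parseHugoDateString (sans luxon).
function normalizeDateString(value) {
  let trimmed = value.trim();
  if (
    (trimmed.startsWith("'") && trimmed.endsWith("'")) ||
    (trimmed.startsWith('"') && trimmed.endsWith('"'))
  ) {
    trimmed = trimmed.slice(1, -1).trim();
  }
  return trimmed;
}

console.log(normalizeDateString("  '2024-05-01' ")); // "2024-05-01"
```

Ce détour évite que des dates citées dans le frontmatter YAML (`date: '2024-05-01'`) soient rejetées par les formats stricts de luxon.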


@@ -1,33 +0,0 @@
const fs = require("fs");
const path = require("path");
let envLoaded = false;
function loadEnv(envPath = path.resolve(__dirname, "..", "..", ".env")) {
if (envLoaded) return;
envLoaded = true;
if (!fs.existsSync(envPath)) return;
const content = fs.readFileSync(envPath, "utf8");
const lines = content.split(/\r?\n/);
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || trimmed.startsWith("#")) continue;
const separator = trimmed.indexOf("=");
if (separator === -1) continue;
const key = trimmed.slice(0, separator).trim();
const value = trimmed.slice(separator + 1);
if (!key || process.env[key] !== undefined) continue;
process.env[key] = value;
}
}
module.exports = {
loadEnv,
};
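
Esquisse de la règle de priorité appliquée par `loadEnv()` : une variable déjà présente dans l'environnement n'est jamais écrasée par le fichier `.env`. La boucle est recopiée du module, mais appliquée à un objet en mémoire ; `parseEnvLines` est un nom hypothétique, absent du module d'origine.

```javascript
// Esquisse : la logique de loadEnv() appliquée à un objet en mémoire.
function parseEnvLines(lines, env) {
  for (const line of lines) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue;
    const separator = trimmed.indexOf("=");
    if (separator === -1) continue;
    const key = trimmed.slice(0, separator).trim();
    if (!key || env[key] !== undefined) continue;
    env[key] = trimmed.slice(separator + 1);
  }
  return env;
}

const env = parseEnvLines(
  ["# commentaire", "TOKEN=abc", "PATH=/tmp/override", "SANS_EGAL"],
  { PATH: "/usr/bin" }
);
console.log(env.TOKEN); // "abc"
console.log(env.PATH);  // "/usr/bin" : la valeur existante est conservée
```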


@@ -1,198 +0,0 @@
const fs = require("node:fs");
const path = require("node:path");
const yaml = require("js-yaml");
const { loadToolsConfig } = require("./config");
const DEFAULT_CACHE_DIR = "tools/cache";
const DEFAULT_CACHE_FILE = "external_links.yaml";
/**
 * Résout le chemin du rapport des liens externes à partir de la configuration.
* @param {string} siteRoot Racine du projet.
* @returns {Promise<string>} Chemin absolu du rapport YAML.
*/
async function resolveExternalLinksReportPath(siteRoot) {
const rootDir = path.resolve(siteRoot);
const configPath = path.join(rootDir, "tools", "config", "config.json");
const config = await loadToolsConfig(configPath);
let cacheDir = DEFAULT_CACHE_DIR;
const externalLinks = config.externalLinks;
if (externalLinks && typeof externalLinks.cacheDir === "string" && externalLinks.cacheDir.trim()) {
cacheDir = externalLinks.cacheDir.trim();
}
let cacheFile = DEFAULT_CACHE_FILE;
if (externalLinks && typeof externalLinks.cacheFile === "string" && externalLinks.cacheFile.trim()) {
cacheFile = externalLinks.cacheFile.trim();
}
let resolvedCacheDir = cacheDir;
if (!path.isAbsolute(resolvedCacheDir)) {
resolvedCacheDir = path.join(rootDir, resolvedCacheDir);
}
if (path.isAbsolute(cacheFile)) {
return cacheFile;
}
return path.join(resolvedCacheDir, cacheFile);
}
/**
 * Normalise la liste des emplacements associés à un lien.
* @param {unknown[]} rawLocations Emplacements bruts.
* @returns {Array<{ file: string, line: number|null, page: string|null }>}
*/
function normalizeLocations(rawLocations) {
if (!Array.isArray(rawLocations)) {
return [];
}
const locations = [];
for (const rawLocation of rawLocations) {
if (!rawLocation || typeof rawLocation !== "object") {
continue;
}
let file = null;
if (typeof rawLocation.file === "string" && rawLocation.file.trim()) {
file = rawLocation.file.trim();
}
if (!file) {
continue;
}
let line = null;
if (typeof rawLocation.line === "number" && Number.isFinite(rawLocation.line)) {
line = rawLocation.line;
}
let page = null;
if (typeof rawLocation.page === "string" && rawLocation.page.trim()) {
page = rawLocation.page.trim();
}
locations.push({ file, line, page });
}
return locations;
}
/**
 * Normalise une entrée du rapport.
 * @param {unknown} rawLink Entrée brute.
* @returns {{ url: string, status: number|null, locations: Array<{ file: string, line: number|null, page: string|null }> }|null}
*/
function normalizeLink(rawLink) {
if (!rawLink || typeof rawLink !== "object") {
return null;
}
if (typeof rawLink.url !== "string" || !rawLink.url.trim()) {
return null;
}
let status = null;
if (typeof rawLink.status === "number" && Number.isFinite(rawLink.status)) {
status = rawLink.status;
}
if (typeof rawLink.status === "string" && rawLink.status.trim()) {
const parsedStatus = Number.parseInt(rawLink.status, 10);
if (!Number.isNaN(parsedStatus)) {
status = parsedStatus;
}
}
return {
url: rawLink.url.trim(),
status,
locations: normalizeLocations(rawLink.locations),
};
}
/**
 * Reconstitue une liste de liens à partir de la section entries du cache.
 * @param {Record<string, unknown>} entries Entrées brutes.
* @returns {Array<{ url: string, status: number|null, locations: Array<{ file: string, line: number|null, page: string|null }> }>}
*/
function buildLinksFromEntries(entries) {
const links = [];
for (const [url, rawEntry] of Object.entries(entries)) {
let status = null;
let locations = null;
if (rawEntry && typeof rawEntry === "object") {
status = rawEntry.status;
locations = rawEntry.locations;
}
const normalized = normalizeLink({
url,
status,
locations,
});
if (normalized) {
links.push(normalized);
}
}
return links;
}
/**
* Charge le rapport des liens externes.
* @param {string} reportPath Chemin absolu ou relatif du rapport YAML.
* @returns {{ generatedAt: string|null, links: Array<{ url: string, status: number|null, locations: Array<{ file: string, line: number|null, page: string|null }> }> }}
*/
function loadExternalLinksReport(reportPath) {
const resolvedPath = path.resolve(reportPath);
if (!fs.existsSync(resolvedPath)) {
return { generatedAt: null, links: [] };
}
const raw = yaml.load(fs.readFileSync(resolvedPath, "utf8")) || {};
let links = [];
if (Array.isArray(raw.links)) {
for (const rawLink of raw.links) {
const normalized = normalizeLink(rawLink);
if (normalized) {
links.push(normalized);
}
}
} else if (raw.entries && typeof raw.entries === "object") {
links = buildLinksFromEntries(raw.entries);
}
return {
generatedAt: raw.generatedAt || null,
links,
};
}
/**
* Filtre les liens du rapport par code de statut HTTP.
 * @param {{ links?: Array<{ status: number|null }> }} report Rapport chargé.
 * @param {number} statusCode Code à retenir.
* @returns {Array<{ url: string, status: number|null, locations: Array<{ file: string, line: number|null, page: string|null }> }>}
*/
function getLinksByStatus(report, statusCode) {
if (!report || !Array.isArray(report.links)) {
return [];
}
const links = [];
for (const link of report.links) {
if (!link || typeof link !== "object") {
continue;
}
if (link.status !== statusCode) {
continue;
}
links.push(link);
}
return links;
}
module.exports = {
resolveExternalLinksReportPath,
loadExternalLinksReport,
getLinksByStatus,
};
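
Exemple d'utilisation de la logique de `getLinksByStatus` sur un rapport factice construit à la main : on ne retient que les entrées dont le statut correspond exactement au code demandé.

```javascript
// Esquisse : filtrage d'un rapport par code HTTP, comme le fait getLinksByStatus.
const report = {
  generatedAt: "2024-01-01T00:00:00Z",
  links: [
    { url: "https://example.org/a", status: 200, locations: [] },
    { url: "https://example.org/b", status: 404, locations: [] },
    { url: "https://example.org/c", status: null, locations: [] },
  ],
};

const broken = report.links.filter(
  (link) => link && typeof link === "object" && link.status === 404
);
console.log(broken.map((link) => link.url)); // [ "https://example.org/b" ]
```

On notera qu'un statut `null` (timeout ou erreur réseau) n'est jamais retenu : seul un code numérique exact passe le filtre.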


@@ -1,53 +0,0 @@
const fs = require("node:fs");
const yaml = require("js-yaml");
const FRONTMATTER_PATTERN = /^---\n([\s\S]*?)\n---\n?/;
/**
* Lit le frontmatter d'un fichier Markdown et retourne son contenu brut.
* La fonction préserve également le corps du fichier afin de permettre
* une réécriture propre après modification.
* @param {string} filePath Chemin absolu du fichier à analyser.
* @returns {{ data: Record<string, any>, body: string, frontmatterText: string, raw: string }|null}
*/
function readFrontmatterFile(filePath) {
const raw = fs.readFileSync(filePath, "utf8");
const match = raw.match(FRONTMATTER_PATTERN);
if (!match) {
return null;
}
const frontmatterText = match[1];
const data = yaml.load(frontmatterText) || {};
const body = raw.slice(match[0].length);
return {
data,
body,
frontmatterText,
raw,
};
}
/**
* Réécrit complètement le fichier Markdown avec un frontmatter mis à jour.
* @param {string} filePath Chemin absolu du fichier.
* @param {Record<string, any>} frontmatter Objet contenant les métadonnées.
* @param {string} body Corps Markdown déjà prêt à être réinséré.
*/
function writeFrontmatterFile(filePath, frontmatter, body) {
if (
typeof frontmatter !== "object" ||
frontmatter === null ||
Array.isArray(frontmatter)
) {
throw new Error(`Frontmatter invalide pour ${filePath}`);
}
const serialized = yaml.dump(frontmatter, { lineWidth: 120, sortKeys: false }).trimEnd();
const contentBody = typeof body === "string" ? body : "";
const rewritten = `---\n${serialized}\n---\n${contentBody}`;
fs.writeFileSync(filePath, rewritten, "utf8");
}
module.exports = {
readFrontmatterFile,
writeFrontmatterFile,
};
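
Aller-retour lecture/écriture du frontmatter sur une chaîne en mémoire, avec le même motif regex que le module (sans js-yaml ni accès disque) : le motif capture le bloc YAML entre les deux `---`, et le corps est tout ce qui suit.

```javascript
// Esquisse : découpage puis reconstruction d'un fichier Markdown à frontmatter.
const FRONTMATTER_PATTERN = /^---\n([\s\S]*?)\n---\n?/;
const raw = "---\ntitle: Exemple\ndraft: true\n---\nCorps de l'article.\n";

const match = raw.match(FRONTMATTER_PATTERN);
const frontmatterText = match[1]; // "title: Exemple\ndraft: true"
const body = raw.slice(match[0].length); // "Corps de l'article.\n"

// Réécriture : on reconstruit le fichier à l'identique.
const rewritten = `---\n${frontmatterText}\n---\n${body}`;
console.log(rewritten === raw); // true
```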


@@ -1,299 +0,0 @@
const { fetch } = require("undici");
const DEFAULT_ACCEPT =
"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8";
const DEFAULT_ACCEPT_LANGUAGE = "fr-FR,fr;q=0.9,en;q=0.7";
const DEFAULT_ACCEPT_ENCODING = "gzip, deflate, br";
const DEFAULT_CACHE_CONTROL = "no-cache";
const DEFAULT_PRAGMA = "no-cache";
const DEFAULT_TIMEOUT_MS = 5000;
const DEFAULT_MAX_REDIRECTS = 5;
const DEFAULT_USER_AGENTS = [
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
];
const DEFAULT_VIEWPORT = { width: 1366, height: 768 };
const DEFAULT_PLAYWRIGHT_ARGS = ["--disable-blink-features=AutomationControlled"];
let playwrightModule = null;
// Retourne le user-agent fourni s'il est valide, sinon un Chrome récent au hasard.
function buildUserAgent(preferred) {
if (typeof preferred === "string" && preferred.trim()) {
return preferred.trim();
}
const index = Math.floor(Math.random() * DEFAULT_USER_AGENTS.length);
return DEFAULT_USER_AGENTS[index];
}
function extractChromeVersion(userAgent) {
if (typeof userAgent !== "string") {
return null;
}
const match = userAgent.match(/Chrome\/(\d+)/i);
if (match && match[1]) {
return match[1];
}
return null;
}
function isChromeLike(userAgent) {
if (typeof userAgent !== "string") {
return false;
}
return /Chrome\/\d+/i.test(userAgent);
}
function derivePlatform(userAgent) {
if (typeof userAgent !== "string") {
return null;
}
if (/Windows NT/i.test(userAgent)) {
return "Windows";
}
if (/Mac OS X/i.test(userAgent)) {
return "macOS";
}
if (/Android/i.test(userAgent)) {
return "Android";
}
if (/iPhone|iPad|iPod/i.test(userAgent)) {
return "iOS";
}
if (/Linux/i.test(userAgent)) {
return "Linux";
}
return null;
}
function isMobileUserAgent(userAgent) {
if (typeof userAgent !== "string") {
return false;
}
return /Mobile|Android|iPhone|iPad|iPod/i.test(userAgent);
}
function buildSecChUa(userAgent) {
if (!isChromeLike(userAgent)) {
return null;
}
const version = extractChromeVersion(userAgent) || "122";
return `"Chromium";v="${version}", "Not A(Brand";v="24", "Google Chrome";v="${version}"`;
}
// Construit des en-têtes HTTP imitant une navigation Chrome réelle (sec-ch-ua compris).
function buildNavigationHeaders(url, userAgent, extraHeaders = {}) {
const platform = derivePlatform(userAgent);
const secChUa = buildSecChUa(userAgent);
const secChUaMobile = isMobileUserAgent(userAgent) ? "?1" : "?0";
const secChUaPlatform = platform ? `"${platform}"` : null;
const baseHeaders = {
"user-agent": userAgent,
accept: DEFAULT_ACCEPT,
"accept-language": DEFAULT_ACCEPT_LANGUAGE,
"accept-encoding": DEFAULT_ACCEPT_ENCODING,
"cache-control": DEFAULT_CACHE_CONTROL,
pragma: DEFAULT_PRAGMA,
dnt: "1",
connection: "keep-alive",
"upgrade-insecure-requests": "1",
"sec-fetch-site": "none",
"sec-fetch-mode": "navigate",
"sec-fetch-user": "?1",
"sec-fetch-dest": "document",
...extraHeaders,
};
if (secChUa) {
baseHeaders["sec-ch-ua"] = secChUa;
}
if (secChUaMobile) {
baseHeaders["sec-ch-ua-mobile"] = secChUaMobile;
}
if (secChUaPlatform) {
baseHeaders["sec-ch-ua-platform"] = secChUaPlatform;
}
return baseHeaders;
}
function loadPlaywright() {
if (playwrightModule) {
return playwrightModule;
}
playwrightModule = require("playwright");
return playwrightModule;
}
// Vérifie une URL via Playwright, en se rapprochant d'une navigation réelle.
async function checkWithPlaywright(url, options = {}) {
const userAgent = buildUserAgent(options.userAgent);
const timeoutMs = Number.isFinite(options.timeoutMs) ? options.timeoutMs : DEFAULT_TIMEOUT_MS;
const executablePath =
typeof options.executablePath === "string" && options.executablePath.trim()
? options.executablePath.trim()
: null;
const playwright = loadPlaywright();
let browser = null;
let context = null;
try {
browser = await playwright.chromium.launch({
headless: true,
executablePath: executablePath || undefined,
args: DEFAULT_PLAYWRIGHT_ARGS,
});
context = await browser.newContext({
viewport: { ...DEFAULT_VIEWPORT },
userAgent,
extraHTTPHeaders: buildNavigationHeaders(url, userAgent),
});
const page = await context.newPage();
try {
const response = await page.goto(url, { waitUntil: "domcontentloaded", timeout: timeoutMs });
const status = response ? response.status() : null;
const finalUrl = page.url() || url;
return {
status,
finalUrl,
method: "GET",
errorType: null,
};
} catch (error) {
return {
status: null,
finalUrl: url,
method: "GET",
errorType: error?.name === "TimeoutError" ? "timeout" : "network",
message: error?.message || null,
};
} finally {
if (context) {
await context.close();
}
if (browser) {
await browser.close();
}
}
} catch (error) {
// Toute erreur de chargement/initialisation Playwright doit interrompre le script.
throw error;
}
}
// Suit manuellement les redirections HTTP, dans la limite de maxRedirects.
async function fetchWithRedirects(targetUrl, options, maxRedirects) {
let currentUrl = targetUrl;
let response = null;
let redirects = 0;
while (redirects <= maxRedirects) {
response = await fetch(currentUrl, { ...options, redirect: "manual" });
const location = response.headers.get("location");
if (
response.status >= 300 &&
response.status < 400 &&
location &&
redirects < maxRedirects
) {
if (response.body && typeof response.body.cancel === "function") {
try {
await response.body.cancel();
} catch (_) {
// Ignore cancellation errors; we're moving to the next hop.
}
}
currentUrl = new URL(location, currentUrl).toString();
redirects += 1;
continue;
}
break;
}
return response;
}
// Effectue une requête unique (HEAD ou GET) et retourne le statut observé.
async function probeUrl(url, options = {}) {
const method = typeof options.method === "string" ? options.method.toUpperCase() : "GET";
const timeoutMs = Number.isFinite(options.timeoutMs) ? options.timeoutMs : DEFAULT_TIMEOUT_MS;
const maxRedirects = Number.isFinite(options.maxRedirects)
? options.maxRedirects
: DEFAULT_MAX_REDIRECTS;
const userAgent = buildUserAgent(options.userAgent);
const headers = buildNavigationHeaders(url, userAgent, options.headers || {});
const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), timeoutMs);
try {
const response = await fetchWithRedirects(
url,
{
method,
headers,
signal: controller.signal,
},
maxRedirects
);
const status = response ? response.status : null;
const finalUrl = response?.url || url;
if (response?.body && typeof response.body.cancel === "function") {
try {
await response.body.cancel();
} catch (_) {
// Ignore cancellation errors; the status is all we needed.
}
}
return {
status,
finalUrl,
method,
errorType: null,
};
} catch (error) {
if (error.name === "AbortError") {
return {
status: null,
finalUrl: url,
method,
errorType: "timeout",
};
}
return {
status: null,
finalUrl: url,
method,
errorType: "network",
message: error.message,
};
} finally {
clearTimeout(timer);
}
}
// Indique si un résultat (erreur, statut absent ou >= 400) mérite une nouvelle tentative.
function shouldRetry(result) {
if (!result) return true;
if (result.errorType) return true;
if (typeof result.status !== "number") return true;
return result.status >= 400;
}
// Vérifie une URL, avec repli automatique en GET lorsque le premier essai échoue.
async function checkUrl(url, options = {}) {
const firstMethod = options.firstMethod || "GET";
const retryWithGet =
typeof options.retryWithGet === "boolean"
? options.retryWithGet
: firstMethod === "HEAD";
let result = await probeUrl(url, { ...options, method: firstMethod });
if (retryWithGet && shouldRetry(result)) {
result = await probeUrl(url, { ...options, method: "GET" });
}
return result;
}
module.exports = {
buildUserAgent,
checkUrl,
probeUrl,
shouldRetry,
checkWithPlaywright,
};
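
La table de décision de `shouldRetry` peut se résumer ainsi (logique recopiée à l'identique du module) : toute absence de résultat, erreur réseau, statut manquant ou statut >= 400 déclenche la nouvelle tentative en GET.

```javascript
// Esquisse : la table de décision de shouldRetry (logique recopiée du module).
function shouldRetry(result) {
  if (!result) return true;
  if (result.errorType) return true;
  if (typeof result.status !== "number") return true;
  return result.status >= 400;
}

console.log(shouldRetry(null));                                   // true
console.log(shouldRetry({ status: null, errorType: "timeout" })); // true
console.log(shouldRetry({ status: 405, errorType: null }));       // true : HEAD souvent refusé
console.log(shouldRetry({ status: 200, errorType: null }));       // false
```

C'est ce qui rend le couple HEAD-puis-GET de `checkUrl` robuste face aux serveurs qui refusent HEAD (405) sans que l'URL soit réellement cassée.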


@@ -1,404 +0,0 @@
const crypto = require("node:crypto");
const { LemmyHttp } = require("lemmy-js-client");
const MAX_COMMUNITY_NAME_LENGTH = 20;
const MIN_COMMUNITY_NAME_LENGTH = 3;
/**
* Normalise la configuration Lemmy extraite des fichiers tools/config.
* @param {object} rawConfig Configuration brute.
* @returns {{ instanceUrl: string, siteUrl: string, auth: { jwt: string|null, username: string|null, password: string|null }, community: object }} Configuration prête à l'emploi.
*/
function normalizeLemmyConfig(rawConfig) {
if (!rawConfig || typeof rawConfig !== "object") {
throw new Error("La configuration Lemmy est manquante (tools/config/config.json).");
}
const instanceUrl = normalizeUrl(rawConfig.instanceUrl);
if (!instanceUrl) {
throw new Error(
"lemmy.instanceUrl doit être renseigné dans tools/config/config.json ou via l'environnement."
);
}
const siteUrl = normalizeUrl(rawConfig.siteUrl);
if (!siteUrl) {
throw new Error("lemmy.siteUrl doit être défini pour construire les URLs des articles.");
}
const auth = rawConfig.auth || {};
const hasJwt = typeof auth.jwt === "string" && auth.jwt.trim().length > 0;
const hasCredentials =
typeof auth.username === "string" &&
auth.username.trim().length > 0 &&
typeof auth.password === "string" &&
auth.password.length > 0;
if (!hasJwt && !hasCredentials) {
throw new Error("lemmy.auth.jwt ou lemmy.auth.username + lemmy.auth.password doivent être fournis.");
}
const prefixOverrides = buildOverrides(rawConfig.community?.prefixOverrides || {});
let jwt = null;
let username = null;
let password = null;
if (hasJwt) {
jwt = auth.jwt.trim();
}
if (hasCredentials) {
username = auth.username.trim();
password = auth.password;
}
let descriptionTemplate = "Espace dédié aux échanges autour de {{path}}.";
if (typeof rawConfig.community?.descriptionTemplate === "string") {
const trimmed = rawConfig.community.descriptionTemplate.trim();
if (trimmed) {
descriptionTemplate = trimmed;
}
}
return {
instanceUrl,
siteUrl,
auth: {
jwt,
username,
password,
},
community: {
prefixOverrides,
visibility: rawConfig.community?.visibility || "Public",
nsfw: rawConfig.community?.nsfw === true,
descriptionTemplate,
},
};
}
/**
* Crée un client Lemmy authentifié via JWT ou couple utilisateur/mot de passe.
* @param {object} lemmyConfig Configuration normalisée.
* @returns {Promise<LemmyHttp>} Client prêt pour les appels API.
*/
async function createLemmyClient(lemmyConfig) {
const client = new LemmyHttp(lemmyConfig.instanceUrl);
if (lemmyConfig.auth.jwt) {
client.setHeaders({ Authorization: `Bearer ${lemmyConfig.auth.jwt}` });
return client;
}
const loginResponse = await client.login({
username_or_email: lemmyConfig.auth.username,
password: lemmyConfig.auth.password,
});
client.setHeaders({ Authorization: `Bearer ${loginResponse.jwt}` });
return client;
}
/**
* Construit l'URL publique d'un article à partir de son chemin Hugo.
* @param {string} siteUrl Domaine du site.
* @param {string[]} parts Segments du chemin du bundle.
* @returns {string} URL finale.
*/
function buildArticleUrl(siteUrl, parts) {
const relative = parts.join("/");
return `${siteUrl}/${relative}`;
}
/**
* Transforme les segments de chemin en nom et titre de communauté Lemmy.
* @param {string[]} parts Segments du chemin du bundle.
* @param {object} communityConfig Configuration de la section community.
* @returns {{ name: string, title: string, description: string }}
*/
function buildCommunityDescriptor(parts, communityConfig) {
const intermediate = stripDateSegments(parts.slice(0, -1));
if (intermediate.length === 0) {
throw new Error(`Impossible de déduire une communauté depuis ${parts.join("/")}.`);
}
const normalized = intermediate.map((segment) => applyOverride(segment, communityConfig.prefixOverrides));
const sanitized = normalized.map((segment) => sanitizeSegment(segment)).filter(Boolean);
if (sanitized.length === 0) {
throw new Error(`Les segments ${intermediate.join("/")} sont invalides pour une communauté.`);
}
const name = enforceCommunityLength(sanitized);
if (name.length < MIN_COMMUNITY_NAME_LENGTH) {
throw new Error(`Nom de communauté trop court pour ${parts.join("/")}.`);
}
const title = normalized.map((segment) => capitalizeLabel(segment)).join(" / ");
const labelPath = normalized.join("/");
const description = buildCommunityDescription(communityConfig.descriptionTemplate, labelPath);
return { name, title, description };
}
/**
* Recherche une communauté par nom, la crée si nécessaire et force la restriction de publication aux modérateurs.
* @param {LemmyHttp} client Client Lemmy.
* @param {object} descriptor Nom, titre et description attendus.
* @param {object} communityConfig Paramètres nsfw/visibilité.
* @returns {Promise<{ id: number, name: string, title: string }>} Communauté finalisée.
*/
async function ensureCommunity(client, descriptor, communityConfig) {
const existing = await searchCommunity(client, descriptor.name);
if (existing) {
const existingCommunity = existing.community;
if (existingCommunity.posting_restricted_to_mods !== true) {
const edited = await client.editCommunity({
community_id: existingCommunity.id,
posting_restricted_to_mods: true,
});
const editedCommunity = edited.community_view.community;
return {
id: editedCommunity.id,
name: editedCommunity.name,
title: editedCommunity.title,
};
}
return {
id: existingCommunity.id,
name: existingCommunity.name,
title: existingCommunity.title,
};
}
const response = await client.createCommunity({
name: descriptor.name,
title: descriptor.title,
description: descriptor.description,
nsfw: communityConfig.nsfw,
posting_restricted_to_mods: true,
visibility: communityConfig.visibility,
});
const createdCommunity = response.community_view.community;
return {
id: createdCommunity.id,
name: createdCommunity.name,
title: createdCommunity.title,
};
}
/**
* Recherche une communauté précise via l'API de recherche.
* @param {LemmyHttp} client Client Lemmy.
* @param {string} name Nom recherché.
* @returns {Promise<object|null>} Vue de communauté ou null.
*/
async function searchCommunity(client, name) {
const response = await client.search({ q: name, type_: "Communities", limit: 50 });
if (!response.communities || response.communities.length === 0) {
return null;
}
return response.communities.find((communityView) => communityView.community.name === name) || null;
}
/**
* Crée une description de communauté à partir du template configuré.
* @param {string} template Modèle issu de la configuration.
* @param {string} labelPath Chemin textuel des segments.
* @returns {string} Description finale.
*/
function buildCommunityDescription(template, labelPath) {
if (template.includes("{{path}}")) {
return template.replace("{{path}}", labelPath);
}
return `${template} (${labelPath})`;
}
/**
* Supprime les segments correspondant à un pattern année/mois/jour consécutif.
* @param {string[]} segments Segments intermédiaires du chemin.
* @returns {string[]} Segments épurés des dates.
*/
function stripDateSegments(segments) {
const filtered = [];
let index = 0;
while (index < segments.length) {
if (
isYearSegment(segments[index]) &&
isMonthSegment(segments[index + 1]) &&
isDaySegment(segments[index + 2])
) {
index += 3;
continue;
}
filtered.push(segments[index]);
index += 1;
}
return filtered;
}
/**
* Applique une table de remplacements sur un segment en respectant la casse initiale.
* @param {string} segment Segment brut.
* @param {Record<string, string>} overrides Table de substitutions.
* @returns {string} Segment éventuellement remplacé.
*/
function applyOverride(segment, overrides) {
const lookup = segment.toLowerCase();
if (overrides[lookup]) {
return overrides[lookup];
}
return segment;
}
/**
* Nettoie un segment pour l'utiliser dans un nom de communauté Lemmy.
* @param {string} segment Valeur brute.
* @returns {string} Segment assaini.
*/
function sanitizeSegment(segment) {
return segment
.normalize("NFKD")
.replace(/[\u0300-\u036f]/g, "")
.toLowerCase()
.replace(/[^a-z0-9]+/g, "_")
.replace(/_{2,}/g, "_")
.replace(/^_|_$/g, "");
}
/**
* Garantit que le nom de communauté respecte la longueur imposée par Lemmy.
* @param {string[]} segments Segments déjà assainis.
* @returns {string} Nom final.
*/
function enforceCommunityLength(segments) {
const reduced = segments.map((segment) => segment.slice(0, MAX_COMMUNITY_NAME_LENGTH));
let current = reduced.join("_");
if (current.length <= MAX_COMMUNITY_NAME_LENGTH) {
return current;
}
const working = [...reduced];
let cursor = working.length - 1;
while (current.length > MAX_COMMUNITY_NAME_LENGTH && cursor >= 0) {
if (working[cursor].length > 1) {
working[cursor] = working[cursor].slice(0, -1);
current = working.join("_");
continue;
}
cursor -= 1;
}
if (current.length <= MAX_COMMUNITY_NAME_LENGTH) {
return current;
}
const compactSource = segments.join("_");
const compact = compactSource.replace(/_/g, "").slice(0, Math.max(1, MAX_COMMUNITY_NAME_LENGTH - 5));
const hash = crypto.createHash("sha1").update(compactSource).digest("hex");
const suffixLength = Math.max(2, MAX_COMMUNITY_NAME_LENGTH - compact.length - 1);
const suffix = hash.slice(0, suffixLength);
return `${compact}_${suffix}`;
}
/**
* Construit un titre lisible pour la communauté.
* @param {string} value Segment brut.
* @returns {string} Version capitalisée.
*/
function capitalizeLabel(value) {
const spaced = value.replace(/[-_]+/g, " ").replace(/\s+/g, " ").trim();
if (!spaced) {
return value;
}
return spaced.charAt(0).toUpperCase() + spaced.slice(1);
}
/**
* Indique si un segment représente une année sur 4 chiffres.
* @param {string|undefined} value Segment à tester.
* @returns {boolean} true si le segment est une année.
*/
function isYearSegment(value) {
if (typeof value !== "string") {
return false;
}
return /^\d{4}$/.test(value);
}
/**
* Indique si un segment représente un mois numérique valide.
* @param {string|undefined} value Segment à tester.
* @returns {boolean} true si le segment est un mois.
*/
function isMonthSegment(value) {
if (typeof value !== "string") {
return false;
}
if (!/^\d{1,2}$/.test(value)) {
return false;
}
const numeric = Number.parseInt(value, 10);
return numeric >= 1 && numeric <= 12;
}
/**
* Indique si un segment représente un jour numérique valide.
* @param {string|undefined} value Segment à tester.
* @returns {boolean} true si le segment est un jour.
*/
function isDaySegment(value) {
if (typeof value !== "string") {
return false;
}
if (!/^\d{1,2}$/.test(value)) {
return false;
}
const numeric = Number.parseInt(value, 10);
return numeric >= 1 && numeric <= 31;
}
/**
* Nettoie les URLs en supprimant les slashs terminaux et espaces superflus.
* @param {string|null} url URL brute.
* @returns {string|null} URL normalisée ou null.
*/
function normalizeUrl(url) {
if (typeof url !== "string") {
return null;
}
const trimmed = url.trim();
if (!trimmed) {
return null;
}
return trimmed.replace(/\/+$/, "");
}
/**
* Transforme l'objet de remplacements brut en table normalisée.
* @param {Record<string, string>} overrides Remplacements issus de la configuration.
* @returns {Record<string, string>} Table prête à l'emploi.
*/
function buildOverrides(overrides) {
const table = {};
for (const [key, value] of Object.entries(overrides)) {
if (typeof key !== "string" || typeof value !== "string") {
continue;
}
const normalizedKey = key.trim().toLowerCase();
const normalizedValue = value.trim();
if (normalizedKey && normalizedValue) {
table[normalizedKey] = normalizedValue;
}
}
return table;
}
module.exports = {
normalizeLemmyConfig,
createLemmyClient,
buildArticleUrl,
buildCommunityDescriptor,
ensureCommunity,
isYearSegment,
isMonthSegment,
isDaySegment,
};
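
Exemple de l'assainissement d'un segment de chemin en nom de communauté Lemmy (logique recopiée de `sanitizeSegment` ; les entrées d'exemple sont hypothétiques) : décomposition Unicode, retrait des accents, puis remplacement de tout ce qui n'est pas alphanumérique par des underscores.

```javascript
// Esquisse : assainissement d'un segment pour un nom de communauté Lemmy.
function sanitizeSegment(segment) {
  return segment
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "") // retire les diacritiques décomposés
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/_{2,}/g, "_")
    .replace(/^_|_$/g, "");
}

console.log(sanitizeSegment("Ingénierie Système")); // "ingenierie_systeme"
console.log(sanitizeSegment("C++ & Rust!"));        // "c_rust"
```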


@@ -1,400 +0,0 @@
const fs = require("fs");
const readline = require("readline");
// Retire les délimiteurs fermants excédentaires en fin de chaîne (ex. ")" orphelines).
function trimUnbalancedTrailing(value, openChar, closeChar) {
let result = value;
while (result.endsWith(closeChar)) {
const openCount = (result.match(new RegExp(`\\${openChar}`, "g")) || []).length;
const closeCount = (result.match(new RegExp(`\\${closeChar}`, "g")) || []).length;
if (closeCount > openCount) {
result = result.slice(0, -1);
} else {
break;
}
}
return result;
}
function stripTrailingPunctuation(value) {
let result = value;
while (/[.,;:!?'"\u2018\u2019\u201C\u201D]+$/.test(result)) {
result = result.slice(0, -1);
}
return result;
}
// Nettoie une URL candidate extraite du Markdown : chevrons, ponctuation finale et délimiteurs orphelins.
function sanitizeUrlCandidate(raw, options = {}) {
if (typeof raw !== "string") return null;
let candidate = raw.trim();
if (!candidate) return null;
if (candidate.startsWith("<") && candidate.endsWith(">")) {
candidate = candidate.slice(1, -1).trim();
}
candidate = stripTrailingPunctuation(candidate);
if (!options.keepTrailingParens) {
candidate = trimUnbalancedTrailing(candidate, "(", ")");
} else if (candidate.endsWith(")")) {
const openCount = (candidate.match(/\(/g) || []).length;
const closeCount = (candidate.match(/\)/g) || []).length;
if (closeCount > openCount) {
candidate = trimUnbalancedTrailing(candidate, "(", ")");
}
}
candidate = trimUnbalancedTrailing(candidate, "[", "]");
candidate = trimUnbalancedTrailing(candidate, "{", "}");
candidate = stripTrailingPunctuation(candidate);
candidate = candidate.replace(/[)]+$/g, (suffix) => {
const toTrim = !options.keepTrailingParens ? suffix.length : Math.max(0, suffix.length - 1);
return ")".repeat(suffix.length - toTrim);
});
candidate = candidate.replace(/[*_]+$/, "");
candidate = candidate.replace(/\[\^[^\]]*\]$/, "");
candidate = stripTrailingPunctuation(candidate);
if (!options.keepTrailingParens) {
candidate = trimUnbalancedTrailing(candidate, "(", ")");
}
if ((candidate.match(/\(/g) || []).length > (candidate.match(/\)/g) || []).length) {
return null;
}
if ((candidate.match(/\[/g) || []).length > (candidate.match(/]/g) || []).length) {
return null;
}
if ((candidate.match(/{/g) || []).length > (candidate.match(/}/g) || []).length) {
return null;
}
return candidate || null;
}
function findMatchingPair(text, startIndex, openChar, closeChar) {
let depth = 0;
for (let i = startIndex; i < text.length; i++) {
const ch = text[i];
if (ch === "\\") {
i++;
continue;
}
if (ch === openChar) {
depth++;
} else if (ch === closeChar) {
depth--;
if (depth === 0) {
return i;
}
}
}
return -1;
}
function parseLinkDestination(raw) {
if (typeof raw !== "string") return null;
let candidate = raw.trim();
if (!candidate) return null;
if (candidate.startsWith("<")) {
const closeIndex = candidate.indexOf(">");
if (closeIndex > 0) {
return candidate.slice(1, closeIndex).trim();
}
}
let result = "";
let escaping = false;
let parenDepth = 0;
for (let i = 0; i < candidate.length; i++) {
const ch = candidate[i];
if (escaping) {
result += ch;
escaping = false;
continue;
}
if (ch === "\\") {
escaping = true;
continue;
}
if (ch === "(") {
parenDepth++;
} else if (ch === ")" && parenDepth > 0) {
parenDepth--;
} else if (/\s/.test(ch) && parenDepth === 0) {
break;
}
result += ch;
}
return result;
}
function extractMarkdownLinkTokens(text) {
const tokens = [];
for (let i = 0; i < text.length; i++) {
if (text[i] === "!") {
if (text[i + 1] !== "[") continue;
i += 1;
}
if (text[i] !== "[") continue;
const closeBracket = findMatchingPair(text, i, "[", "]");
if (closeBracket === -1) continue;
let pointer = closeBracket + 1;
while (pointer < text.length && /\s/.test(text[pointer])) pointer++;
if (pointer >= text.length || text[pointer] !== "(") {
i = closeBracket;
continue;
}
const openParen = pointer;
const closeParen = findMatchingPair(text, openParen, "(", ")");
if (closeParen === -1) {
break;
}
const rawDestination = text.slice(openParen + 1, closeParen);
const candidate = parseLinkDestination(rawDestination);
if (candidate) {
const startOffset = rawDestination.indexOf(candidate);
if (startOffset > -1) {
tokens.push({
url: candidate,
start: openParen + 1 + startOffset,
end: openParen + 1 + startOffset + candidate.length,
});
} else {
tokens.push({
url: candidate,
start: openParen + 1,
end: closeParen,
});
}
}
i = closeParen;
}
return tokens;
}
function extractMarkdownDestinations(text) {
return extractMarkdownLinkTokens(text).map((token) => token.url);
}
function isExternalLink(link) {
return typeof link === "string" && link.includes("://");
}
function stripMarkdownInlineCode(text) {
if (typeof text !== "string" || !text.includes("`")) {
return text;
}
let result = "";
let index = 0;
while (index < text.length) {
if (text[index] !== "`") {
result += text[index];
index += 1;
continue;
}
let fenceLength = 1;
while (index + fenceLength < text.length && text[index + fenceLength] === "`") {
fenceLength += 1;
}
const fence = "`".repeat(fenceLength);
const closingIndex = text.indexOf(fence, index + fenceLength);
if (closingIndex === -1) {
result += text.slice(index, index + fenceLength);
index += fenceLength;
continue;
}
const spanLength = closingIndex + fenceLength - index;
result += " ".repeat(spanLength);
index = closingIndex + fenceLength;
}
return result;
}
function parseMarkdownFence(line) {
if (typeof line !== "string") {
return null;
}
const match = line.match(/^[ ]{0,3}([`~]{3,})/);
if (!match) {
return null;
}
return {
marker: match[1][0],
length: match[1].length,
};
}
function isFenceClosingLine(line, activeFence) {
if (!activeFence || typeof line !== "string") {
return false;
}
const match = line.match(/^[ ]{0,3}([`~]{3,})[ \t]*$/);
if (!match) {
return false;
}
if (match[1][0] !== activeFence.marker) {
return false;
}
return match[1].length >= activeFence.length;
}
function isIndentedCodeLine(line) {
if (typeof line !== "string" || !line) {
return false;
}
return line.startsWith(" ") || line.startsWith("\t");
}
function extractLinksFromText(text) {
if (typeof text !== "string" || !text.includes("http")) {
return [];
}
const strippedText = stripMarkdownInlineCode(text);
if (typeof strippedText !== "string" || !strippedText.includes("http")) {
return [];
}
const results = [];
const seen = new Set();
const markdownLinkTokens = extractMarkdownLinkTokens(strippedText);
function addCandidate(candidate, options = {}) {
const sanitized = sanitizeUrlCandidate(candidate, options);
if (!sanitized) return;
if (!isExternalLink(sanitized)) return;
if (seen.has(sanitized)) return;
seen.add(sanitized);
results.push(sanitized);
}
for (const token of markdownLinkTokens) {
addCandidate(token.url, { keepTrailingParens: true });
}
const angleRegex = /<\s*(https?:\/\/[^>\s]+)\s*>/gi;
let match;
while ((match = angleRegex.exec(strippedText)) !== null) {
addCandidate(match[1]);
}
const autoRegex = /https?:\/\/[^\s<>"`]+/gi;
while ((match = autoRegex.exec(strippedText)) !== null) {
let overlapsMarkdownDestination = false;
for (const token of markdownLinkTokens) {
if (match.index >= token.start && match.index < token.end) {
overlapsMarkdownDestination = true;
break;
}
}
if (overlapsMarkdownDestination) {
continue;
}
addCandidate(match[0]);
}
return results;
}
async function collectMarkdownLinksFromStream(stream) {
const rl = readline.createInterface({ input: stream, crlfDelay: Infinity });
const results = [];
let lineNumber = 0;
let inFrontMatter = false;
let activeFence = null;
let inIndentedCodeBlock = false;
let previousLineBlank = true;
try {
for await (const line of rl) {
lineNumber++;
const trimmed = line.trim();
// Skip YAML front matter entirely; only scan Markdown content
if (lineNumber === 1 && trimmed === "---") {
inFrontMatter = true;
continue;
}
if (inFrontMatter) {
if (trimmed === "---") {
inFrontMatter = false;
}
continue;
}
if (activeFence) {
if (isFenceClosingLine(line, activeFence)) {
activeFence = null;
}
previousLineBlank = trimmed === "";
continue;
}
const openingFence = parseMarkdownFence(line);
if (openingFence) {
activeFence = openingFence;
previousLineBlank = trimmed === "";
continue;
}
if (inIndentedCodeBlock) {
if (trimmed === "") {
previousLineBlank = true;
continue;
}
if (isIndentedCodeLine(line)) {
previousLineBlank = false;
continue;
}
inIndentedCodeBlock = false;
}
if (previousLineBlank && isIndentedCodeLine(line)) {
inIndentedCodeBlock = true;
previousLineBlank = false;
continue;
}
for (const url of extractLinksFromText(line)) {
results.push({ url, line: lineNumber });
}
previousLineBlank = trimmed === "";
}
} finally {
rl.close();
if (typeof stream.close === "function") {
stream.close();
}
}
return results;
}
async function collectMarkdownLinksFromFile(filePath) {
const stream = fs.createReadStream(filePath, { encoding: "utf8" });
try {
return await collectMarkdownLinksFromStream(stream);
} catch (error) {
stream.destroy();
throw error;
}
}
module.exports = {
collectMarkdownLinksFromFile,
collectMarkdownLinksFromStream,
extractLinksFromText,
sanitizeUrlCandidate,
};
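The core idea of `extractLinksFromText` above can be shown in a standalone sketch: blank out inline code spans first so their contents cannot match, then scan what remains for bare URLs. This is a simplified illustration, not the module's full pipeline; the function names and the (stricter) URL character class are mine.

```javascript
// Standalone sketch of the two-step extraction idea (not the full pipeline).
function stripInline(text) {
  // Replace each `...` span with spaces of equal length to keep offsets stable.
  return text.replace(/`[^`]*`/g, (span) => " ".repeat(span.length));
}

function findUrls(text) {
  // Stricter than the autoRegex above: also excludes ")" and "]".
  return stripInline(text).match(/https?:\/\/[^\s<>"`)\]]+/g) || [];
}
```

The space-for-space replacement matters when callers need character offsets into the original line, as the token-based extractor above does.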

View File

@@ -1,71 +0,0 @@
/**
 * Interprets a possibly serialized boolean value.
 * @param {unknown} value Raw value from the front matter.
 * @returns {boolean|null} true/false when interpretable, otherwise null.
 */
function parseBoolean(value) {
if (typeof value === "boolean") {
return value;
}
if (typeof value !== "string") {
return null;
}
const normalized = value.trim().toLowerCase();
if (!normalized) {
return null;
}
if (normalized === "true" || normalized === "1" || normalized === "yes" || normalized === "on") {
return true;
}
if (normalized === "false" || normalized === "0" || normalized === "no" || normalized === "off") {
return false;
}
return null;
}
/**
 * Determines whether the `draft` value marks the article as a draft.
 * @param {unknown} value Raw value of the `draft` attribute.
 * @returns {boolean} true if the article is a draft.
 */
function isDraftValue(value) {
return parseBoolean(value) === true;
}
/**
 * Tells whether a front matter object corresponds to a published article.
 * @param {Record<string, unknown>|null|undefined} frontmatterData Parsed front matter data.
 * @returns {boolean} true if the article is considered published.
 */
function isEffectivelyPublished(frontmatterData) {
if (!frontmatterData || typeof frontmatterData !== "object") {
return true;
}
return isDraftValue(frontmatterData.draft) === false;
}
/**
 * Tells whether a YAML front matter document corresponds to a published article.
 * @param {{ get: (key: string) => unknown }|null|undefined} doc YAML document.
 * @returns {boolean} true if the article is considered published.
 */
function isEffectivelyPublishedDocument(doc) {
if (!doc || typeof doc.get !== "function") {
return true;
}
return isDraftValue(doc.get("draft")) === false;
}
module.exports = {
parseBoolean,
isDraftValue,
isEffectivelyPublished,
isEffectivelyPublishedDocument,
};
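The publication rules above boil down to: a handful of string spellings map to true/false, anything else is "unknown", and an article is published unless `draft` is affirmatively true. A trimmed, standalone restatement (function names are mine, for illustration only):

```javascript
// Trimmed restatement of the draft/published rules above.
function toBool(value) {
  if (typeof value === "boolean") return value;
  if (typeof value !== "string") return null;
  const v = value.trim().toLowerCase();
  if (["true", "1", "yes", "on"].includes(v)) return true;
  if (["false", "0", "no", "off"].includes(v)) return false;
  return null; // unknown spellings are neither true nor false
}

function isPublished(frontmatter) {
  // Missing or non-object front matter counts as published, as above.
  if (!frontmatter || typeof frontmatter !== "object") return true;
  return toBool(frontmatter.draft) !== true;
}
```

Note the asymmetry: an unparseable `draft: "maybe"` yields null, which is not true, so the article is still treated as published.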

View File

@@ -1,107 +0,0 @@
const puppeteer = require("puppeteer-extra");
const StealthPlugin = require("puppeteer-extra-plugin-stealth");
const { buildUserAgent } = require("./http");
puppeteer.use(StealthPlugin());
/**
* Scrape a webpage to extract metadata and take a screenshot.
* @param {string} url - The URL of the page to scrape.
* @param {string} screenshotPath - Path where the screenshot should be saved.
* @param {object} options
* @param {string} [options.userAgent] - Optional user agent to use for the session.
* @returns {Promise<object>} - Metadata including title, description, keywords, language, and HTTP status.
*/
async function scrapePage(url, screenshotPath, options = {}) {
console.log(`🔍 Scraping: ${url}`);
const browser = await puppeteer.launch({
headless: true,
ignoreHTTPSErrors: true, // ✅ Ignore invalid SSL certificates
args: [
"--no-sandbox",
"--disable-setuid-sandbox",
"--disable-blink-features=AutomationControlled",
"--disable-web-security",
"--ignore-certificate-errors", // ✅ Disable strict SSL checking
"--ssl-version-min=tls1", // ✅ Allow older SSL/TLS versions
"--disable-features=IsolateOrigins,site-per-process", // ✅ Avoid site isolation (fixes blocked resources)
"--disable-site-isolation-trials",
"--disable-backgrounding-occluded-windows",
"--disable-renderer-backgrounding",
"--disable-background-timer-throttling",
"--disable-client-side-phishing-detection",
],
});
const page = await browser.newPage();
const userAgent = buildUserAgent(options.userAgent);
await page.setUserAgent(userAgent);
// Add headers to simulate a real browser
await page.setExtraHTTPHeaders({
"Referer": "https://www.google.com/",
"Accept-Language": "fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7",
"Upgrade-Insecure-Requests": "1",
});
// Prevent detection of Puppeteer
await page.evaluateOnNewDocument(() => {
Object.defineProperty(navigator, "webdriver", { get: () => undefined });
});
await page.setViewport({ width: 1920, height: 1080 });
let metadata = {
title: "",
description: "",
keywords: [],
lang: "unknown",
httpStatus: null,
};
try {
await page.emulateMediaFeatures([
{ name: "prefers-color-scheme", value: "dark" },
]);
const response = await page.goto(url, { waitUntil: "domcontentloaded", timeout: 30000 });
metadata.httpStatus = response.status();
// Extract metadata
metadata.title = await page.title();
metadata.description = await page.$eval('meta[name="description"]', el => el.content).catch(() => "");
metadata.keywords = await page.$eval('meta[name="keywords"]', el => el.content)
.then(content => content.split(",").map(k => k.trim()))
.catch(() => []);
// 🌍 Detect page language
metadata.lang = await page.evaluate(() => {
// 1) Try to get language from <html lang="xx">
let lang = document.documentElement.lang;
if (lang) return lang.toLowerCase();
// 2) Try meta tags (og:locale)
const metaLang = document.querySelector('meta[property="og:locale"]');
if (metaLang) return metaLang.content.split("_")[0].toLowerCase(); // Convert "fr_FR" to "fr"
return "unknown";
});
if (screenshotPath) {
await page.screenshot({ path: screenshotPath, fullPage: true });
console.log(`✔ Screenshot saved: ${screenshotPath}`);
}
} catch (error) {
console.error(`❌ Error scraping page: ${error.message}`);
}
await browser.close();
return metadata;
}
module.exports = { scrapePage };
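The language detection that `scrapePage` runs inside `page.evaluate` is easy to reason about as a pure function over the two values it reads, `<html lang>` and `og:locale`. A standalone sketch (the function name and signature are mine):

```javascript
// The language-detection fallback used in scrapePage, as a pure function.
function detectLang(htmlLang, ogLocale) {
  if (htmlLang) return htmlLang.toLowerCase();
  if (ogLocale) return ogLocale.split("_")[0].toLowerCase(); // "fr_FR" -> "fr"
  return "unknown";
}
```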

View File

@@ -1,528 +0,0 @@
#!/usr/bin/env python3
import argparse
import json
import math
import sys
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt # noqa: E402
import matplotlib.colors as mcolors # noqa: E402
import numpy as np # noqa: E402
PALETTE = [
"#467FFF", # blue-500
"#40C474", # green-500
"#FF4D5A", # red-500
"#FFA93D", # amber-500
"#9E63E9", # purple-500
"#2FC4FF", # cyan-500
"#98C0FF", # blue-300
"#8FE4A2", # green-300
"#FF939B", # red-300
"#FFD08C", # amber-300
"#D2AAF7", # purple-300
"#8EE8FF", # cyan-300
]
BACKGROUND = "#0F1114" # gray-900
TEXT = "#D9E0E8" # gray-300
GRID = (1.0, 1.0, 1.0, 0.16) # soft white grid
FIG_WIDTH = 20.0 # ~1920px at DPI=96
FIG_HEIGHT = 10.8 # 16:9 ratio
DPI = 96
BASE_FONT_SIZE = 16
TICK_FONT_SIZE = 15
LEGEND_FONT_SIZE = 14
TITLE_FONT_SIZE = 18
def setup_rcparams():
matplotlib.rcParams.update(
{
"figure.figsize": (FIG_WIDTH, FIG_HEIGHT),
"figure.dpi": DPI,
"axes.facecolor": BACKGROUND,
"figure.facecolor": BACKGROUND,
"axes.edgecolor": TEXT,
"axes.labelcolor": TEXT,
"xtick.color": TEXT,
"ytick.color": TEXT,
"text.color": TEXT,
"font.size": BASE_FONT_SIZE,
}
)
def new_axes():
fig, ax = plt.subplots()
fig.set_facecolor(BACKGROUND)
ax.set_facecolor(BACKGROUND)
ax.grid(True, axis="y", color=GRID, linestyle="--", linewidth=0.7)
return fig, ax
def render_articles_per_month(data, output):
labels = data.get("labels") or []
series = data.get("series") or []
title = data.get("title") or "Articles par mois"
if not labels or not series:
fig, ax = new_axes()
ax.text(
0.5,
0.5,
"Aucune donnée",
ha="center",
va="center",
fontsize=BASE_FONT_SIZE,
)
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
return
x = np.arange(len(labels))
fig, ax = new_axes()
bottoms = np.zeros(len(labels))
for index, serie in enumerate(series):
values = np.array(serie.get("values") or [0] * len(labels), dtype=float)
color = PALETTE[index % len(PALETTE)]
ax.bar(x, values, bottom=bottoms, label=str(serie.get("label", "")), color=color, linewidth=0)
bottoms += values
ax.set_xticks(x)
ax.set_xticklabels(labels, rotation=45, ha="right", fontsize=TICK_FONT_SIZE)
ax.tick_params(axis="y", labelsize=TICK_FONT_SIZE)
ax.set_ylabel("Articles")
ax.set_title(title, fontsize=TITLE_FONT_SIZE, color=TEXT)
ax.legend(fontsize=LEGEND_FONT_SIZE)
fig.tight_layout()
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
def render_articles_per_year(data, output):
labels = data.get("labels") or []
values = data.get("values") or []
title = data.get("title") or "Articles par an"
if not labels or not values:
fig, ax = new_axes()
ax.text(
0.5,
0.5,
"Aucune donnée",
ha="center",
va="center",
fontsize=BASE_FONT_SIZE,
)
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
return
x = np.arange(len(labels))
fig, ax = new_axes()
ax.bar(x, values, color=PALETTE[0])
ax.set_xticks(x)
ax.set_xticklabels(labels, rotation=0, fontsize=TICK_FONT_SIZE)
ax.tick_params(axis="y", labelsize=TICK_FONT_SIZE)
ax.set_ylabel("Articles")
ax.set_title(title, fontsize=TITLE_FONT_SIZE, color=TEXT)
fig.tight_layout()
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
def render_articles_per_section(data, output):
labels = data.get("labels") or []
values = data.get("values") or []
title = data.get("title") or "Articles par section"
if not labels or not values:
fig, ax = new_axes()
ax.text(
0.5,
0.5,
"Aucune donnée",
ha="center",
va="center",
fontsize=BASE_FONT_SIZE,
)
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
return
fig, ax = new_axes()
# Donut chart
wedges, _ = ax.pie(
values,
labels=None,
colors=[PALETTE[i % len(PALETTE)] for i in range(len(values))],
startangle=90,
counterclock=False,
)
centre_circle = plt.Circle((0, 0), 0.60, fc=BACKGROUND)
fig.gca().add_artist(centre_circle)
ax.set_title(title, fontsize=TITLE_FONT_SIZE, color=TEXT)
ax.legend(
wedges,
labels,
title="Sections",
loc="center left",
bbox_to_anchor=(1.0, 0.5),
fontsize=LEGEND_FONT_SIZE,
title_fontsize=LEGEND_FONT_SIZE,
)
fig.tight_layout()
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
def render_cumulative(data, output):
labels = data.get("labels") or []
articles = data.get("articles") or []
words = data.get("words") or []
title = data.get("title") or "Cumul articles / mots"
if not labels or (not articles and not words):
fig, ax = new_axes()
ax.text(
0.5,
0.5,
"Aucune donnée",
ha="center",
va="center",
fontsize=BASE_FONT_SIZE,
)
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
return
x = np.arange(len(labels))
fig, ax_words = new_axes()
ax_articles = ax_words.twinx()
lines = []
labels_for_legend = []
if words:
lw = ax_words.plot(
x,
words,
label="Mots cumulés",
color=PALETTE[1],
linewidth=2.2,
marker="o",
markersize=4,
)
lines += lw
labels_for_legend += ["Mots cumulés"]
if articles:
la = ax_articles.plot(
x,
articles,
label="Articles cumulés",
color=PALETTE[0],
linewidth=2.2,
marker="o",
markersize=4,
)
lines += la
labels_for_legend += ["Articles cumulés"]
ax_words.set_xticks(x)
ax_words.set_xticklabels(labels, rotation=45, ha="right", fontsize=TICK_FONT_SIZE)
ax_words.tick_params(axis="y", labelsize=TICK_FONT_SIZE, colors=PALETTE[1])
ax_articles.tick_params(axis="y", labelsize=TICK_FONT_SIZE, colors=PALETTE[0])
ax_words.set_ylabel("Mots cumulés", color=PALETTE[1])
ax_articles.set_ylabel("Articles cumulés", color=PALETTE[0])
ax_words.set_title(title, fontsize=TITLE_FONT_SIZE, color=TEXT)
ax_articles.grid(False)
ax_words.grid(True, axis="y", color=GRID, linestyle="--", linewidth=0.7)
fig.legend(lines, labels_for_legend, loc="upper left", fontsize=LEGEND_FONT_SIZE)
fig.tight_layout()
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
def render_words_histogram(data, output):
values = data.get("values") or []
title = data.get("title") or "Distribution des longueurs d'article"
bins = data.get("bins") or 20
fig, ax = new_axes()
if not values:
ax.text(
0.5,
0.5,
"Aucune donnée",
ha="center",
va="center",
fontsize=BASE_FONT_SIZE,
)
else:
ax.hist(values, bins=bins, color=PALETTE[0], edgecolor=TEXT, alpha=0.9)
ax.set_xlabel("Nombre de mots")
ax.set_ylabel("Articles")
ax.set_title(title, fontsize=TITLE_FONT_SIZE, color=TEXT)
fig.tight_layout()
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
def render_top_requests(data, output):
labels = data.get("labels") or []
values = data.get("values") or []
title = data.get("title") or "Top requêtes"
fig, ax = new_axes()
if not labels or not values:
ax.text(
0.5,
0.5,
"Aucune donnée",
ha="center",
va="center",
fontsize=BASE_FONT_SIZE,
)
else:
y_pos = np.arange(len(labels))
ax.barh(y_pos, values, color=PALETTE[0])
ax.set_yticks(y_pos)
ax.set_yticklabels(labels, fontsize=TICK_FONT_SIZE)
ax.invert_yaxis()
ax.set_xlabel("Hits")
ax.set_title(title, fontsize=TITLE_FONT_SIZE, color=TEXT)
fig.tight_layout()
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
def render_weather_hexbin(data, output):
temps = data.get("temps") or []
hums = data.get("hums") or []
presses = data.get("presses") or []
title = data.get("title") or "Météo à la publication"
fig, ax = new_axes()
if not temps or not hums:
ax.text(
0.5,
0.5,
"Aucune donnée",
ha="center",
va="center",
fontsize=BASE_FONT_SIZE,
)
else:
# If pressures are provided, use them for color; otherwise density
if presses and len(presses) == len(temps):
hb = ax.scatter(temps, hums, c=presses, cmap="viridis", alpha=0.75, s=50, edgecolors="none")
cbar = fig.colorbar(hb, ax=ax)
cbar.set_label("Pression (hPa)", color=TEXT)
cbar.ax.yaxis.set_tick_params(color=TEXT, labelsize=LEGEND_FONT_SIZE)
plt.setp(plt.getp(cbar.ax.axes, "yticklabels"), color=TEXT)
else:
norm = mcolors.LogNorm()  # temps is non-empty in this branch
hb = ax.hexbin(
temps,
hums,
gridsize=28,
cmap="plasma",
mincnt=1,
linewidths=0.2,
edgecolors="none",
alpha=0.9,
norm=norm,
)
cbar = fig.colorbar(hb, ax=ax)
cbar.set_label("Densité", color=TEXT)
cbar.ax.yaxis.set_tick_params(color=TEXT, labelsize=LEGEND_FONT_SIZE)
plt.setp(plt.getp(cbar.ax.axes, "yticklabels"), color=TEXT)
ax.set_xlabel("Température (°C)")
ax.set_ylabel("Humidité (%)")
ax.tick_params(axis="x", labelsize=TICK_FONT_SIZE)
ax.tick_params(axis="y", labelsize=TICK_FONT_SIZE)
ax.set_title(title, fontsize=TITLE_FONT_SIZE, color=TEXT)
fig.tight_layout()
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
def render_weekday_activity(data, output):
labels = data.get("labels") or []
articles = data.get("articles") or []
words = data.get("words") or []
title = data.get("title") or "Activité par jour"
fig, ax_left = new_axes()
ax_right = ax_left.twinx()
if not labels or (not articles and not words):
ax_left.text(
0.5,
0.5,
"Aucune donnée",
ha="center",
va="center",
fontsize=BASE_FONT_SIZE,
)
else:
x = np.arange(len(labels))
width = 0.38
bars_articles = ax_left.bar(
x - width / 2,
articles,
width=width,
label="Articles",
color=PALETTE[0],
)
bars_words = ax_right.bar(
x + width / 2,
words,
width=width,
label="Mots",
color=PALETTE[1],
)
ax_left.set_xticks(x)
ax_left.set_xticklabels(labels, rotation=0, fontsize=TICK_FONT_SIZE)
ax_left.tick_params(axis="y", labelsize=TICK_FONT_SIZE, colors=PALETTE[0])
ax_right.tick_params(axis="y", labelsize=TICK_FONT_SIZE, colors=PALETTE[1])
ax_left.set_ylabel("Articles", color=PALETTE[0])
ax_right.set_ylabel("Mots", color=PALETTE[1])
lines = [bars_articles, bars_words]
labels_for_legend = ["Articles", "Mots"]
fig.legend(lines, labels_for_legend, loc="upper right", fontsize=LEGEND_FONT_SIZE)
ax_left.set_title(title, fontsize=TITLE_FONT_SIZE, color=TEXT)
fig.tight_layout()
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
def render_words_per_article(data, output):
labels = data.get("labels") or []
series = data.get("series") or []
title = data.get("title") or "Moyenne de mots par article (par mois)"
if not labels or not series:
fig, ax = new_axes()
ax.text(
0.5,
0.5,
"Aucune donnée",
ha="center",
va="center",
fontsize=BASE_FONT_SIZE,
)
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
return
x = np.arange(len(labels))
n_series = len(series)
width = 0.8 / max(n_series, 1)
fig, ax = new_axes()
for index, serie in enumerate(series):
values = np.array(serie.get("values") or [0] * len(labels), dtype=float)
color = PALETTE[index % len(PALETTE)]
offset = (index - (n_series - 1) / 2) * width
ax.bar(x + offset, values, width=width, label=str(serie.get("label", "")), color=color, linewidth=0)
ax.set_xticks(x)
ax.set_xticklabels(labels, rotation=45, ha="right", fontsize=TICK_FONT_SIZE)
ax.tick_params(axis="y", labelsize=TICK_FONT_SIZE)
ax.set_ylabel("Mots par article (moyenne)")
ax.set_title(title, fontsize=TITLE_FONT_SIZE, color=TEXT)
ax.legend(fontsize=LEGEND_FONT_SIZE)
fig.tight_layout()
fig.savefig(output, bbox_inches="tight")
plt.close(fig)
def main():
parser = argparse.ArgumentParser(description="Render stats charts from JSON data.")
parser.add_argument(
"--type",
required=True,
choices=[
"articles_per_month",
"articles_per_year",
"articles_per_section",
"words_per_article",
"cumulative",
"words_histogram",
"top_requests",
"weather_hexbin",
"weekday_activity",
],
)
parser.add_argument("--output", required=True)
args = parser.parse_args()
try:
payload = json.load(sys.stdin)
except Exception as exc: # noqa: BLE001
print(f"Failed to read JSON from stdin: {exc}", file=sys.stderr)
sys.exit(1)
setup_rcparams()
chart_type = args.type
if chart_type == "articles_per_month":
render_articles_per_month(payload, args.output)
elif chart_type == "articles_per_year":
render_articles_per_year(payload, args.output)
elif chart_type == "articles_per_section":
render_articles_per_section(payload, args.output)
elif chart_type == "words_per_article":
render_words_per_article(payload, args.output)
elif chart_type == "cumulative":
render_cumulative(payload, args.output)
elif chart_type == "words_histogram":
render_words_histogram(payload, args.output)
elif chart_type == "top_requests":
render_top_requests(payload, args.output)
elif chart_type == "weather_hexbin":
render_weather_hexbin(payload, args.output)
elif chart_type == "weekday_activity":
render_weekday_activity(payload, args.output)
else:
print(f"Unknown chart type: {chart_type}", file=sys.stderr)
sys.exit(1)
if __name__ == "__main__":
main()
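The renderer above takes its data on stdin as JSON and the chart type on the command line. A sketch of a payload for `--type articles_per_month`, with field names taken from `render_articles_per_month` (the example values are invented):

```javascript
// Sketch of the stdin payload render_stats_charts.py expects for
// --type articles_per_month: labels plus one {label, values} entry per series.
const payload = {
  title: "Articles par mois",
  labels: ["2024-01", "2024-02", "2024-03"],
  series: [
    { label: "blog", values: [3, 5, 2] },
    { label: "notes", values: [1, 0, 4] },
  ],
};
const encoded = JSON.stringify(payload);
```

This is the same shape that `renderWithPython` serializes before writing it to the child process's stdin.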

View File

@@ -1,76 +0,0 @@
const path = require("path");
const { collectMarkdownFiles, collectSectionIndexDirs } = require("../content");
const { readFrontmatter } = require("../weather/frontmatter");
const { parseFrontmatterDate } = require("../datetime");
function parseDate(value) {
if (!value) return null;
return parseFrontmatterDate(value);
}
function countWords(body) {
if (!body) return 0;
const cleaned = body
.replace(/```[\s\S]*?```/g, " ") // fenced code blocks
.replace(/`[^`]*`/g, " ") // inline code
.replace(/<[^>]+>/g, " "); // html tags
const words = cleaned.match(/[\p{L}\p{N}'-]+/gu);
return words ? words.length : 0;
}
async function loadArticles(contentDir) {
const files = await collectMarkdownFiles(contentDir);
const sectionDirs = await collectSectionIndexDirs(contentDir);
const rootDir = path.resolve(contentDir);
const articles = [];
function resolveSection(filePath) {
const absolute = path.resolve(filePath);
let current = path.dirname(absolute);
while (current.startsWith(rootDir)) {
if (sectionDirs.has(current)) {
return path.relative(rootDir, current).replace(/\\/g, "/") || ".";
}
const parent = path.dirname(current);
if (parent === current) break;
current = parent;
}
return null;
}
for (const file of files) {
const frontmatter = await readFrontmatter(file);
if (!frontmatter) continue;
const date = parseDate(frontmatter.doc.get("date"));
const title = frontmatter.doc.get("title") || path.basename(file, ".md");
const body = frontmatter.body.trim();
const wordCount = countWords(body);
const relativePath = path.relative(contentDir, file);
const section = resolveSection(file);
articles.push({
path: file,
relativePath,
title,
date,
body,
wordCount,
section,
frontmatter: frontmatter.doc.toJS ? frontmatter.doc.toJS() : frontmatter.doc.toJSON(),
});
}
return articles;
}
module.exports = {
collectMarkdownFiles,
countWords,
loadArticles,
parseDate,
};
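The token rule behind `countWords` above can be isolated: after stripping code and HTML, a "word" is any run of Unicode letters, digits, apostrophes, or hyphens, so accented French words and hyphenated compounds each count once. A minimal sketch of just that rule (the function name is mine):

```javascript
// The Unicode-aware token rule used by countWords, isolated.
function countTokens(text) {
  const tokens = text.match(/[\p{L}\p{N}'-]+/gu);
  return tokens ? tokens.length : 0;
}
```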

View File

@@ -1,131 +0,0 @@
const { request } = require("undici");
const { DateTime } = require("luxon");
async function fetchGoAccessJson(url) {
const res = await request(url, { method: "GET" });
if (res.statusCode < 200 || res.statusCode >= 300) {
throw new Error(`HTTP ${res.statusCode}`);
}
return res.body.json();
}
function crawlerRatios(data) {
const browsers = data.browsers?.data || [];
const crawler = browsers.find((entry) => entry.data === "Crawlers");
if (!crawler) return { hits: 0, visitors: 0 };
const totalHits = (browsers.reduce((sum, entry) => sum + (entry.hits?.count || 0), 0)) || 0;
const totalVisitors = (browsers.reduce((sum, entry) => sum + (entry.visitors?.count || 0), 0)) || 0;
const hitRatio = totalHits > 0 ? Math.min(1, (crawler.hits?.count || 0) / totalHits) : 0;
const visitorRatio = totalVisitors > 0 ? Math.min(1, (crawler.visitors?.count || 0) / totalVisitors) : 0;
return { hits: hitRatio, visitors: visitorRatio };
}
function groupVisitsByMonth(data, { adjustCrawlers = true } = {}) {
const entries = data.visitors?.data || [];
const ratios = adjustCrawlers ? crawlerRatios(data) : { hits: 0, visitors: 0 };
const months = new Map();
for (const entry of entries) {
const dateStr = entry.data;
if (!/^[0-9]{8}$/.test(dateStr)) continue;
const year = dateStr.slice(0, 4);
const month = dateStr.slice(4, 6);
const day = dateStr.slice(6, 8);
const key = `${year}-${month}`;
const hits = entry.hits?.count || 0;
const visitors = entry.visitors?.count || 0;
const current = months.get(key) || { hits: 0, visitors: 0, from: null, to: null };
const isoDate = `${year}-${month}-${day}`;
current.hits += hits;
current.visitors += visitors;
if (!current.from || isoDate < current.from) current.from = isoDate;
if (!current.to || isoDate > current.to) current.to = isoDate;
months.set(key, current);
}
const adjust = (value, ratio) => {
if (!adjustCrawlers) return value;
const scaled = value * (1 - ratio);
return Math.max(0, Math.round(scaled));
};
const sorted = Array.from(months.entries())
.sort((a, b) => a[0].localeCompare(b[0]))
.map(([key, value]) => ({
month: key,
from: value.from,
to: value.to,
hits: adjust(value.hits, ratios.hits),
visitors: adjust(value.visitors, ratios.visitors),
}));
return sorted;
}
function aggregateLastNDays(data, days = 30, { adjustCrawlers = true } = {}) {
const entries = data.visitors?.data || [];
if (!entries.length || days <= 0) {
return { from: null, to: null, hits: 0, visitors: 0 };
}
const valid = entries.filter((entry) => /^[0-9]{8}$/.test(entry.data));
if (valid.length === 0) {
return { from: null, to: null, hits: 0, visitors: 0 };
}
const sorted = valid.slice().sort((a, b) => a.data.localeCompare(b.data));
const last = sorted[sorted.length - 1];
const end = DateTime.fromFormat(last.data, "yyyyLLdd", { zone: "UTC" });
if (!end.isValid) {
return { from: null, to: null, hits: 0, visitors: 0 };
}
const start = end.minus({ days: days - 1 });
let from = null;
let to = null;
let hits = 0;
let visitors = 0;
for (const entry of sorted) {
const current = DateTime.fromFormat(entry.data, "yyyyLLdd", { zone: "UTC" });
if (!current.isValid) continue;
if (current < start || current > end) continue;
const iso = current.toISODate();
if (!from || iso < from) from = iso;
if (!to || iso > to) to = iso;
hits += entry.hits?.count || 0;
visitors += entry.visitors?.count || 0;
}
const ratios = adjustCrawlers ? crawlerRatios(data) : { hits: 0, visitors: 0 };
const adjust = (value, ratio) => {
if (!adjustCrawlers) return value;
const scaled = value * (1 - ratio);
return Math.max(0, Math.round(scaled));
};
return {
from,
to,
hits: adjust(hits, ratios.hits),
visitors: adjust(visitors, ratios.visitors),
};
}
module.exports = {
fetchGoAccessJson,
groupVisitsByMonth,
aggregateLastNDays,
crawlerRatios,
};
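Both `groupVisitsByMonth` and `aggregateLastNDays` apply the same crawler correction: scale a raw count down by the crawler ratio, round, and clamp at zero. Isolated as a standalone function (the name is mine):

```javascript
// The crawler adjustment applied to hit/visitor counts above.
function adjustForCrawlers(value, ratio) {
  const scaled = value * (1 - ratio);
  return Math.max(0, Math.round(scaled));
}
```

A ratio of 0.25 (a quarter of traffic attributed to crawlers) thus turns 1000 raw hits into 750 adjusted hits.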

View File

@@ -1,31 +0,0 @@
const { spawn } = require("child_process");
const path = require("path");
async function renderWithPython({ type, data, outputPath }) {
return new Promise((resolve, reject) => {
const scriptPath = path.resolve(__dirname, "../render_stats_charts.py");
const child = spawn("python3", [scriptPath, "--type", type, "--output", outputPath], {
stdio: ["pipe", "inherit", "inherit"],
});
const payload = JSON.stringify(data);
child.stdin.write(payload);
child.stdin.end();
child.on("error", (error) => {
reject(error);
});
child.on("exit", (code) => {
if (code === 0) {
resolve();
} else {
reject(new Error(`Python renderer exited with code ${code}`));
}
});
});
}
module.exports = {
renderWithPython,
};

View File

@@ -1,92 +0,0 @@
const fs = require("node:fs/promises");
const path = require("node:path");
const { collectFilesByExtensions } = require("./content");
const DEFAULT_URL_TEXT_EXTENSIONS = Object.freeze([
".json",
".markdown",
".md",
".yaml",
".yml",
]);
/**
 * Counts exact occurrences of a string within a text.
 * @param {string} text Text to scan.
 * @param {string} needle String to search for.
 * @returns {number} Number of occurrences found.
 */
function countOccurrences(text, needle) {
if (typeof text !== "string") {
return 0;
}
if (typeof needle !== "string" || !needle) {
return 0;
}
return text.split(needle).length - 1;
}
/**
 * Returns the list of text files containing a given URL.
 * @param {string} rootDir Root directory to walk.
 * @param {string} targetUrl URL to search for.
 * @param {{ extensions?: string[] }} options Search options.
 * @returns {Promise<Array<{ filePath: string, occurrences: number }>>}
 */
async function findUrlOccurrences(rootDir, targetUrl, options = {}) {
let extensions = DEFAULT_URL_TEXT_EXTENSIONS;
if (Array.isArray(options.extensions)) {
extensions = options.extensions;
}
const files = await collectFilesByExtensions(rootDir, extensions);
const matches = [];
for (const filePath of files) {
const content = await fs.readFile(filePath, "utf8");
const occurrences = countOccurrences(content, targetUrl);
if (occurrences <= 0) {
continue;
}
matches.push({ filePath, occurrences });
}
return matches;
}
/**
 * Replaces every exact occurrence of a URL in a list of files.
 * @param {string} rootDir Root directory to search.
 * @param {string} targetUrl URL to replace.
 * @param {string} replacementUrl Replacement URL.
 * @param {{ extensions?: string[], matches?: Array<{ filePath: string, occurrences: number }> }} options Write options.
 * @returns {Promise<{ changedFiles: string[], totalOccurrences: number }>}
 */
async function replaceUrlInFiles(rootDir, targetUrl, replacementUrl, options = {}) {
let matches = [];
if (Array.isArray(options.matches)) {
matches = options.matches;
} else {
matches = await findUrlOccurrences(rootDir, targetUrl, options);
}
const changedFiles = [];
let totalOccurrences = 0;
for (const match of matches) {
const filePath = path.resolve(match.filePath);
const content = await fs.readFile(filePath, "utf8");
const updatedContent = content.split(targetUrl).join(replacementUrl);
await fs.writeFile(filePath, updatedContent, "utf8");
changedFiles.push(filePath);
totalOccurrences += match.occurrences;
}
return { changedFiles, totalOccurrences };
}
module.exports = {
DEFAULT_URL_TEXT_EXTENSIONS,
countOccurrences,
findUrlOccurrences,
replaceUrlInFiles,
};
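The `split`-based counter above counts non-overlapping occurrences, which is the right semantics for whole URLs; a small self-contained sketch makes that behaviour explicit:

```javascript
// Same counting strategy as countOccurrences: split on the needle and
// count the gaps. Matches are non-overlapping, which matters for
// self-overlapping needles but not for full URLs.
function countNonOverlapping(text, needle) {
  if (typeof text !== "string" || typeof needle !== "string" || !needle) {
    return 0;
  }
  return text.split(needle).length - 1;
}
```

For example, `countNonOverlapping("aaaa", "aa")` is 2, not the 3 an overlapping scan would report.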


@@ -1,66 +0,0 @@
const fs = require("fs");
const path = require("path");
const { applyEnvOverrides } = require("../config");
const { loadEnv } = require("../env");
const DEFAULT_WEATHER_CONFIG = {
timezone: "Europe/Paris",
defaultHour: 12,
defaultMinute: 0,
windowMinutes: 60,
precipitationThreshold: 0.1,
providers: {
influxdb: {
windowMinutes: 60,
},
openMeteo: {
timezone: null,
pressureOffset: 0,
illuminanceToLuxFactor: 126.7,
},
},
};
function loadWeatherConfig(configPath = path.resolve(__dirname, "..", "..", "config", "config.json")) {
loadEnv();
let raw = {};
if (fs.existsSync(configPath)) {
try {
raw = JSON.parse(fs.readFileSync(configPath, "utf8"));
} catch (error) {
console.error(`Unable to read weather config at ${configPath}: ${error.message}`);
}
}
const withEnv = applyEnvOverrides(raw);
const weather = withEnv.weather || {};
const providers = {
...DEFAULT_WEATHER_CONFIG.providers,
...(weather.providers || {}),
};
providers.influxdb = {
...DEFAULT_WEATHER_CONFIG.providers.influxdb,
...(weather.providers?.influxdb || {}),
};
providers.openMeteo = {
...DEFAULT_WEATHER_CONFIG.providers.openMeteo,
timezone: weather.providers?.openMeteo?.timezone || weather.timezone || DEFAULT_WEATHER_CONFIG.timezone,
...(weather.providers?.openMeteo || {}),
};
return {
...DEFAULT_WEATHER_CONFIG,
...weather,
providers,
};
}
module.exports = {
DEFAULT_WEATHER_CONFIG,
loadWeatherConfig,
};
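The layered spreads in `loadWeatherConfig` rely on later keys winning: file values beat defaults, and provider-level values beat both. A reduced sketch with made-up keys (the real keys come from `config.json`) shows that precedence:

```javascript
// Later spreads override earlier ones, so values from the config file win
// over defaults, while keys absent from the file fall through untouched.
const defaults = { windowMinutes: 60, precipitationThreshold: 0.1 };
const fromFile = { windowMinutes: 30 }; // hypothetical config.json content

const merged = { ...defaults, ...fromFile };
```

This is the same rule applied twice more for `providers.influxdb` and `providers.openMeteo`.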


@@ -1,18 +0,0 @@
const WEATHER_FIELDS = [
"temperature",
"humidity",
"pressure",
"illuminance",
"precipitations",
"wind_speed",
"wind_direction",
];
function hasValue(value) {
return value !== null && value !== undefined;
}
module.exports = {
WEATHER_FIELDS,
hasValue,
};
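`hasValue` deliberately treats `0` and `false` as present, which is what lets a night-time illuminance of `0` or a `precipitations` value of `false` survive the merge. A quick check, reusing the same definition:

```javascript
// Same predicate as hasValue: only null and undefined count as missing,
// so falsy-but-meaningful readings (0 lux, no rain) are kept.
function isPresent(value) {
  return value !== null && value !== undefined;
}
```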


@@ -1,41 +0,0 @@
const fs = require("fs/promises");
const YAML = require("yaml");
const FRONTMATTER_PATTERN = /^---\n([\s\S]*?)\n---\n?/;
async function readFrontmatter(filePath) {
const raw = await fs.readFile(filePath, "utf8");
const match = raw.match(FRONTMATTER_PATTERN);
if (!match) return null;
const frontmatterText = match[1];
const doc = YAML.parseDocument(frontmatterText);
if (doc.errors.length) {
const [error] = doc.errors;
throw new Error(`Invalid frontmatter in ${filePath}: ${error.message}`);
}
const body = raw.slice(match[0].length);
return { doc, body, frontmatterText };
}
async function writeFrontmatter(filePath, doc, body) {
const serialized = doc.toString().trimEnd();
const nextContent = `---\n${serialized}\n---\n${body}`;
await fs.writeFile(filePath, nextContent, "utf8");
}
function extractRawDate(frontmatterText) {
const match = frontmatterText.match(/^date:\s*(.+)$/m);
return match ? match[1].trim() : null;
}
module.exports = {
extractRawDate,
readFrontmatter,
writeFrontmatter,
};
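The regex split between the YAML block and the body can be seen on a small document. The sketch below reuses the same `FRONTMATTER_PATTERN` and the `extractRawDate` lookup, without the `yaml` dependency, so only the raw text handling is shown:

```javascript
// Reuses the module's pattern: frontmatter is the text between the two
// "---" fences, the body is everything after the closing fence.
const PATTERN = /^---\n([\s\S]*?)\n---\n?/;

function splitFrontmatter(raw) {
  const match = raw.match(PATTERN);
  if (!match) return null;
  return { frontmatterText: match[1], body: raw.slice(match[0].length) };
}

// Same line-anchored lookup as extractRawDate.
function rawDate(frontmatterText) {
  const match = frontmatterText.match(/^date:\s*(.+)$/m);
  return match ? match[1].trim() : null;
}
```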


@@ -1,62 +0,0 @@
const { WEATHER_FIELDS, hasValue } = require("../constants");
const { createInfluxProvider } = require("./influxdb");
const { createOpenMeteoProvider } = require("./open-meteo");
function mergeWeather(target, addition, providerName) {
let added = false;
for (const field of WEATHER_FIELDS) {
if (!hasValue(addition[field])) continue;
if (hasValue(target[field])) continue;
target[field] = addition[field];
added = true;
}
if (added) {
const existing = target.source || [];
const nextSources = Array.from(new Set([...existing, ...(addition.source || [providerName])].filter(Boolean)));
target.source = nextSources;
}
return added;
}
function buildProviders(config) {
const providers = [];
const influx = createInfluxProvider(config.providers?.influxdb, config);
const openMeteo = createOpenMeteoProvider(config.providers?.openMeteo, config);
if (influx) providers.push(influx);
if (openMeteo) providers.push(openMeteo);
return providers;
}
async function fetchWeather(targetDateTime, config) {
const providers = buildProviders(config);
const weather = {};
for (const provider of providers) {
const result = await provider.fetch({ target: targetDateTime });
if (!result) continue;
if (provider.name === "influxdb") {
return result;
}
mergeWeather(weather, result, provider.name);
const complete = WEATHER_FIELDS.every((field) => hasValue(weather[field]));
if (complete) break;
}
return Object.keys(weather).length > 0 ? weather : null;
}
module.exports = {
fetchWeather,
hasConfiguredProvider: (config) => buildProviders(config).length > 0,
mergeWeather,
};
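`mergeWeather` only fills holes: a later provider never overwrites a field an earlier one already supplied, and `source` records every contributor. A self-contained sketch of the same fill-only-missing rule, reduced to two fields:

```javascript
// Fill-only-missing merge, mirroring mergeWeather: existing fields win,
// and the source list is the union of contributing providers.
const FIELDS = ["temperature", "humidity"];
const present = (v) => v !== null && v !== undefined;

function fillMissing(target, addition, providerName) {
  let added = false;
  for (const field of FIELDS) {
    if (!present(addition[field]) || present(target[field])) continue;
    target[field] = addition[field];
    added = true;
  }
  if (added) {
    const sources = new Set([...(target.source || []), providerName]);
    target.source = Array.from(sources);
  }
  return added;
}
```

So filling humidity from open-meteo after influxdb supplied the temperature leaves the temperature untouched.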


@@ -1,157 +0,0 @@
const { InfluxDB } = require("@influxdata/influxdb-client");
const { DateTime } = require("luxon");
const { buildTimeWindow } = require("../time");
const { hasValue } = require("../constants");
function escapeValue(value) {
return String(value).replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}
function buildFluxQuery(bucket, sensor, window) {
const startIso = window.start.toUTC().toISO();
const stopIso = window.end.toUTC().toISO();
const filters = [];
if (sensor.measurement) filters.push(`r._measurement == "${escapeValue(sensor.measurement)}"`);
if (sensor.field) filters.push(`r._field == "${escapeValue(sensor.field)}"`);
if (sensor.tags) {
for (const [key, value] of Object.entries(sensor.tags)) {
filters.push(`r.${key} == "${escapeValue(value)}"`);
}
}
let query = `from(bucket: "${escapeValue(bucket)}")\n |> range(start: time(v: "${startIso}"), stop: time(v: "${stopIso}"))`;
if (filters.length > 0) {
query += `\n |> filter(fn: (r) => ${filters.join(" and ")})`;
}
query += "\n |> keep(columns: [\"_time\", \"_value\"])";
return query;
}
function pickNearestRow(rows, target) {
let closest = null;
let smallestDiff = Number.POSITIVE_INFINITY;
for (const row of rows) {
if (!hasValue(row._time)) continue;
const rowTime = DateTime.fromISO(row._time);
if (!rowTime.isValid) continue;
const diff = Math.abs(rowTime.diff(target).as("milliseconds"));
if (diff < smallestDiff) {
smallestDiff = diff;
closest = { time: rowTime, value: row._value };
}
}
return closest;
}
function normalizeValue(key, rawValue, sensor, precipitationThreshold) {
if (!hasValue(rawValue)) return null;
let value = typeof rawValue === "number" ? rawValue : Number(rawValue);
if (Number.isNaN(value)) return null;
if (sensor.multiplier && Number.isFinite(sensor.multiplier)) {
value *= sensor.multiplier;
}
if (sensor.offset && Number.isFinite(sensor.offset)) {
value += sensor.offset;
}
if (key === "wind_speed" && sensor.unit === "mps") {
value *= 3.6;
}
if (key === "precipitations") {
const threshold = Number.isFinite(sensor.threshold) ? sensor.threshold : precipitationThreshold;
if (!Number.isFinite(threshold)) return null;
return value > threshold;
}
return value;
}
function createInfluxProvider(config = {}, globalConfig = {}) {
const { url, token, org, bucket, sensors } = config;
if (!url || !token || !org || !bucket || !sensors || Object.keys(sensors).length === 0) {
return null;
}
const timezone = config.timezone || globalConfig.timezone || "UTC";
const queryApi = new InfluxDB({ url, token }).getQueryApi(org);
const windowMinutes = config.windowMinutes ?? globalConfig.windowMinutes ?? 60;
const precipitationThreshold = config.precipitationThreshold ?? globalConfig.precipitationThreshold ?? 0.1;
async function fetch({ target }) {
const targetInZone = DateTime.isDateTime(target) ? target.setZone(timezone) : DateTime.fromJSDate(target, { zone: timezone });
const window = buildTimeWindow(targetInZone, windowMinutes);
const weather = {};
const contributed = new Set();
let hasMeasurement = false;
for (const [key, sensor] of Object.entries(sensors)) {
const query = buildFluxQuery(bucket, sensor, window);
try {
const rows = await queryApi.collectRows(query);
const nearest = pickNearestRow(rows, target);
if (!nearest) continue;
const value = normalizeValue(key, nearest.value, sensor, precipitationThreshold);
if (!hasValue(value)) continue;
weather[key] = value;
contributed.add("influxdb");
hasMeasurement = true;
} catch (error) {
console.error(`InfluxDB error for ${key}: ${error.message}`);
}
}
if (!hasMeasurement) return null;
if (!hasValue(weather.illuminance) && sensors.illuminance) {
const hour = targetInZone.hour;
const isNight = hour < 6 || hour >= 18;
if (isNight) {
weather.illuminance = 0;
contributed.add("influxdb");
}
}
if (!hasValue(weather.precipitations) && sensors.precipitations) {
weather.precipitations = false;
contributed.add("influxdb");
}
if (Object.keys(weather).length === 0) return null;
weather.source = Array.from(contributed);
return weather;
}
return {
name: "influxdb",
fetch,
};
}
module.exports = {
createInfluxProvider,
};
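The unit handling in `normalizeValue` is worth isolating: wind speeds tagged `mps` are converted to km/h, and precipitation collapses to a boolean against a threshold. A reduced sketch of those two rules (multiplier/offset handling is left out):

```javascript
// Mirrors normalizeValue's two special cases: m/s -> km/h for wind,
// and a boolean "is it raining" cut-off for precipitation.
function normalizeReading(key, value, sensor, defaultThreshold) {
  let v = Number(value);
  if (Number.isNaN(v)) return null;
  if (key === "wind_speed" && sensor.unit === "mps") {
    v *= 3.6;
  }
  if (key === "precipitations") {
    const threshold = Number.isFinite(sensor.threshold)
      ? sensor.threshold
      : defaultThreshold;
    return v > threshold;
  }
  return v;
}
```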


@@ -1,165 +0,0 @@
const { fetch: httpFetch } = require("undici");
const { DateTime } = require("luxon");
const { hasValue } = require("../constants");
const { buildTimeWindow } = require("../time");
function isConfigured(config) {
return config && Number.isFinite(config.latitude) && Number.isFinite(config.longitude);
}
function normalizeTarget(target, zone) {
if (!target) return null;
if (DateTime.isDateTime(target)) return target.setZone(zone || undefined);
if (target instanceof Date) return DateTime.fromJSDate(target, { zone });
if (typeof target === "string") return DateTime.fromISO(target, { zone });
return null;
}
function selectNearestHour(hourly, target, timezone, windowMinutes) {
if (!hourly || !Array.isArray(hourly.time) || hourly.time.length === 0) return null;
const window = buildTimeWindow(target, windowMinutes);
let nearest = null;
let smallestDiff = Number.POSITIVE_INFINITY;
for (let i = 0; i < hourly.time.length; i++) {
const timeValue = DateTime.fromISO(hourly.time[i], { zone: timezone });
if (!timeValue.isValid) continue;
if (timeValue < window.start || timeValue > window.end) continue;
const diff = Math.abs(timeValue.diff(target).as("milliseconds"));
if (diff < smallestDiff) {
smallestDiff = diff;
nearest = {
time: timeValue,
temperature_2m: hourly.temperature_2m?.[i],
relative_humidity_2m: hourly.relative_humidity_2m?.[i],
surface_pressure: hourly.surface_pressure?.[i],
shortwave_radiation: hourly.shortwave_radiation?.[i],
precipitation: hourly.precipitation?.[i],
wind_speed_10m: hourly.wind_speed_10m?.[i],
wind_direction_10m: hourly.wind_direction_10m?.[i],
};
}
}
return nearest;
}
function createOpenMeteoProvider(config = {}, globalConfig = {}) {
if (!isConfigured(config)) return null;
const windowMinutes = config.windowMinutes ?? globalConfig.windowMinutes ?? 60;
const precipitationThreshold = config.precipitationThreshold ?? globalConfig.precipitationThreshold ?? 0.1;
const pressureOffset = Number.isFinite(config.pressureOffset) ? config.pressureOffset : 0;
const illuminanceToLuxFactor = Number.isFinite(config.illuminanceToLuxFactor)
? config.illuminanceToLuxFactor
: Number.isFinite(globalConfig.providers?.openMeteo?.illuminanceToLuxFactor)
? globalConfig.providers.openMeteo.illuminanceToLuxFactor
: 126.7;
const timezone = config.timezone || globalConfig.timezone || "UTC";
async function fetchData({ target }) {
const normalized = normalizeTarget(target, timezone);
if (!normalized || !normalized.isValid) return null;
const isFuture = normalized > DateTime.now().setZone(timezone).plus({ days: 1 });
const baseUrl = isFuture ? "https://api.open-meteo.com/v1/forecast" : "https://archive-api.open-meteo.com/v1/archive";
const params = new URLSearchParams({
latitude: config.latitude.toString(),
longitude: config.longitude.toString(),
timezone,
start_date: normalized.toISODate(),
end_date: normalized.toISODate(),
hourly: [
"temperature_2m",
"relative_humidity_2m",
"surface_pressure",
"shortwave_radiation",
"precipitation",
"wind_speed_10m",
"wind_direction_10m",
].join(","),
});
if (isFuture) {
params.set("forecast_days", "16");
params.set("past_days", "7");
}
const url = `${baseUrl}?${params.toString()}`;
let json;
try {
const response = await httpFetch(url);
if (!response || typeof response.ok === "undefined") {
const sanitizedUrl = url
.replace(/(latitude=)[^&]+/, "$1***")
.replace(/(longitude=)[^&]+/, "$1***");
console.error("Open-Meteo unexpected response", {
type: typeof response,
ctor: response?.constructor?.name,
url: sanitizedUrl,
});
throw new Error("Invalid response object");
}
if (!response.ok) {
let details = "";
try {
details = await response.text();
} catch (_) {
// ignore
}
const suffix = details ? `: ${details}` : "";
throw new Error(`HTTP ${response.status}${suffix}`);
}
json = await response.json();
} catch (error) {
console.error(`Open-Meteo error: ${error.message}`);
return null;
}
if (!json?.hourly) return null;
const nearest = selectNearestHour(json.hourly, target, timezone, windowMinutes);
if (!nearest) return null;
const weather = {};
if (hasValue(nearest.temperature_2m)) weather.temperature = nearest.temperature_2m;
if (hasValue(nearest.relative_humidity_2m)) weather.humidity = nearest.relative_humidity_2m;
if (hasValue(nearest.surface_pressure)) weather.pressure = nearest.surface_pressure + pressureOffset;
if (hasValue(nearest.shortwave_radiation)) {
const factor = Number.isFinite(illuminanceToLuxFactor) ? illuminanceToLuxFactor : 126.7;
weather.illuminance = nearest.shortwave_radiation * factor;
}
if (hasValue(nearest.precipitation)) {
weather.precipitations = nearest.precipitation > precipitationThreshold;
}
if (hasValue(nearest.wind_speed_10m)) weather.wind_speed = nearest.wind_speed_10m;
if (hasValue(nearest.wind_direction_10m)) weather.wind_direction = nearest.wind_direction_10m;
if (Object.keys(weather).length === 0) return null;
weather.source = ["open-meteo"];
return weather;
}
return {
name: "open-meteo",
fetch: fetchData,
};
}
module.exports = {
createOpenMeteoProvider,
};
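The archive-vs-forecast split and the query assembly can be sketched independently of the HTTP call. The helper below builds the archive URL the same way as `fetchData` for a past date; the coordinates and the shortened variable list are placeholders:

```javascript
// Builds an Open-Meteo archive request the same way as fetchData: one
// day's window, hourly variables joined with commas, everything passed
// through URLSearchParams so it is percent-encoded consistently.
const HOURLY_VARS = [
  "temperature_2m",
  "relative_humidity_2m",
  "surface_pressure",
  "precipitation",
];

function buildArchiveUrl(latitude, longitude, isoDate, timezone) {
  const params = new URLSearchParams({
    latitude: String(latitude),
    longitude: String(longitude),
    timezone,
    start_date: isoDate,
    end_date: isoDate,
    hourly: HOURLY_VARS.join(","),
  });
  return `https://archive-api.open-meteo.com/v1/archive?${params.toString()}`;
}
```

For a target more than a day in the future, the code above swaps the base URL for the forecast endpoint and adds `forecast_days`/`past_days`.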


@@ -1,44 +0,0 @@
const { DateTime } = require("luxon");
const { parseHugoDateString } = require("../datetime");
function hasExplicitTime(rawDate) {
if (!rawDate) return false;
const cleaned = rawDate.replace(/^['"]|['"]$/g, "");
return /\d{2}:\d{2}/.test(cleaned);
}
function resolveArticleDate(dateValue, rawDate, { timezone = "Europe/Paris", defaultHour = 12, defaultMinute = 0 } = {}) {
const hasTime = hasExplicitTime(rawDate);
const zone = timezone || "UTC";
let parsed;
if (typeof dateValue === "string") {
const source = rawDate || dateValue;
parsed = parseHugoDateString(source, zone, defaultHour, defaultMinute);
} else if (dateValue instanceof Date) {
parsed = DateTime.fromJSDate(dateValue, { zone });
}
if (!parsed || !parsed.isValid) return null;
if (!hasTime) {
parsed = parsed.set({ hour: defaultHour, minute: defaultMinute, second: 0, millisecond: 0 });
}
return parsed;
}
function buildTimeWindow(target, windowMinutes = 60) {
const minutes = Number.isFinite(windowMinutes) ? windowMinutes : 60;
return {
start: target.minus({ minutes }),
end: target.plus({ minutes }),
};
}
module.exports = {
buildTimeWindow,
hasExplicitTime,
resolveArticleDate,
};
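`hasExplicitTime` is a plain regex test and `buildTimeWindow` a symmetric offset; both can be sketched without the `luxon` dependency (the `Date`-based window below is an approximation of the `DateTime` one):

```javascript
// Same detection as hasExplicitTime: strip surrounding quotes, then look
// for an HH:MM group anywhere in the raw frontmatter date.
function hasTime(rawDate) {
  if (!rawDate) return false;
  return /\d{2}:\d{2}/.test(rawDate.replace(/^['"]|['"]$/g, ""));
}

// Date-based stand-in for buildTimeWindow: +/- windowMinutes around target.
function timeWindow(target, windowMinutes = 60) {
  const ms = windowMinutes * 60 * 1000;
  return {
    start: new Date(target.getTime() - ms),
    end: new Date(target.getTime() + ms),
  };
}
```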


@@ -1,355 +0,0 @@
const fs = require("node:fs/promises");
const path = require("node:path");
const { fetch } = require("undici");
const COMMONS_API_URL = "https://commons.wikimedia.org/w/api.php";
const COMMONS_HOST = "commons.wikimedia.org";
const UPLOAD_HOST = "upload.wikimedia.org";
/**
 * Extracts a MediaWiki file title from a Wikipedia or Commons URL.
 * @param {string} rawUrl URL provided by the user.
 * @returns {string} Canonical title of the form `File:Name.ext`.
 */
function extractFileTitleFromUrl(rawUrl) {
const url = new URL(rawUrl);
const hostname = url.hostname.toLowerCase();
if (url.hash) {
const hash = decodeURIComponent(url.hash.slice(1));
if (hash.startsWith("/media/")) {
const fileTitle = hash.slice("/media/".length);
return normalizeFileTitle(fileTitle);
}
}
if (pathnameLooksLikeFilePage(url.pathname)) {
const title = decodeURIComponent(url.pathname.slice("/wiki/".length));
return normalizeFileTitle(title);
}
if (hostname === UPLOAD_HOST) {
const fileName = decodeURIComponent(path.basename(url.pathname));
return normalizeFileTitle(`File:${fileName}`);
}
if (hostname === COMMONS_HOST || hostname.endsWith(".wikipedia.org")) {
throw new Error(`L'URL ${rawUrl} ne pointe pas vers une page de fichier Wikimedia.`);
}
throw new Error(`L'URL ${rawUrl} n'appartient pas à Wikipédia ou Wikimedia Commons.`);
}
/**
 * Checks whether a URL path points to a MediaWiki file page.
 * @param {string} pathname Pathname part of the URL.
 * @returns {boolean} `true` if the path targets a file page.
 */
function pathnameLooksLikeFilePage(pathname) {
if (!pathname.startsWith("/wiki/")) {
return false;
}
const decoded = decodeURIComponent(pathname.slice("/wiki/".length));
if (decoded.startsWith("File:")) {
return true;
}
if (decoded.startsWith("Fichier:")) {
return true;
}
return false;
}
/**
 * Normalizes a Wikimedia file title into the `File:` namespace.
 * @param {string} rawTitle Raw title extracted from a URL.
 * @returns {string} Normalized title.
 */
function normalizeFileTitle(rawTitle) {
const cleaned = rawTitle.trim();
if (!cleaned) {
throw new Error("Le titre du fichier Wikimedia est vide.");
}
if (cleaned.startsWith("File:")) {
return cleaned;
}
if (cleaned.startsWith("Fichier:")) {
return `File:${cleaned.slice("Fichier:".length)}`;
}
throw new Error(`Le titre ${rawTitle} ne correspond pas à un fichier Wikimedia.`);
}
/**
 * Queries the Commons API to fetch the image and its metadata.
 * @param {string} fileTitle Title of the target file.
 * @returns {Promise<{ fileTitle: string, fileName: string, imageUrl: string, descriptionUrl: string, descriptionShortUrl: string, description: string, attribution: string }>}
 */
async function fetchWikimediaAsset(fileTitle) {
const url = new URL(COMMONS_API_URL);
url.searchParams.set("action", "query");
url.searchParams.set("titles", fileTitle);
url.searchParams.set("prop", "imageinfo");
url.searchParams.set("iiprop", "url|extmetadata");
url.searchParams.set("iiextmetadatalanguage", "en");
url.searchParams.set("iilimit", "1");
url.searchParams.set("format", "json");
const response = await fetch(url, {
headers: {
accept: "application/json",
},
});
if (!response.ok) {
throw new Error(`L'API Wikimedia Commons a répondu ${response.status} pour ${fileTitle}.`);
}
const data = await response.json();
return extractAssetFromApiResponse(data);
}
/**
 * Extracts the useful information from a Commons API JSON response.
 * @param {Record<string, any>} data Raw JSON response.
 * @returns {{ fileTitle: string, fileName: string, imageUrl: string, descriptionUrl: string, descriptionShortUrl: string, description: string, attribution: string }}
 */
function extractAssetFromApiResponse(data) {
if (!data || typeof data !== "object") {
throw new Error("La réponse de l'API Wikimedia Commons est invalide.");
}
const query = data.query;
if (!query || typeof query !== "object") {
throw new Error("La réponse de l'API Wikimedia Commons ne contient pas de section query.");
}
const pages = query.pages;
if (!pages || typeof pages !== "object") {
throw new Error("La réponse de l'API Wikimedia Commons ne contient pas de pages.");
}
const pageIds = Object.keys(pages);
if (pageIds.length === 0) {
throw new Error("La réponse de l'API Wikimedia Commons ne contient aucune page.");
}
const page = pages[pageIds[0]];
if (!page || typeof page !== "object") {
throw new Error("La page Wikimedia Commons retournée est invalide.");
}
if (Object.prototype.hasOwnProperty.call(page, "missing")) {
throw new Error(`Le fichier Wikimedia ${page.title} est introuvable.`);
}
const imageInfoList = page.imageinfo;
if (!Array.isArray(imageInfoList) || imageInfoList.length === 0) {
throw new Error(`Aucune information image n'a été retournée pour ${page.title}.`);
}
const imageInfo = imageInfoList[0];
const extmetadata = imageInfo.extmetadata;
if (!extmetadata || typeof extmetadata !== "object") {
throw new Error(`Les métadonnées étendues sont absentes pour ${page.title}.`);
}
const imageUrl = imageInfo.url;
const descriptionUrl = imageInfo.descriptionurl;
const descriptionShortUrl = imageInfo.descriptionshorturl;
if (typeof imageUrl !== "string" || !imageUrl) {
throw new Error(`L'URL de téléchargement est absente pour ${page.title}.`);
}
if (typeof descriptionUrl !== "string" || !descriptionUrl) {
throw new Error(`L'URL de description est absente pour ${page.title}.`);
}
if (typeof descriptionShortUrl !== "string" || !descriptionShortUrl) {
throw new Error(`L'URL courte de description est absente pour ${page.title}.`);
}
const imageDescription = readExtMetadataValue(extmetadata, "ImageDescription");
const artist = readExtMetadataValue(extmetadata, "Artist");
const credit = readExtMetadataValue(extmetadata, "Credit");
const licenseShortName = normalizeLicenseName(readExtMetadataValue(extmetadata, "LicenseShortName"));
const attribution = buildAttribution(artist, credit, licenseShortName, descriptionShortUrl);
const fileName = decodeURIComponent(path.basename(new URL(imageUrl).pathname));
if (!imageDescription) {
throw new Error(`La description Wikimedia est absente pour ${page.title}.`);
}
if (!attribution) {
throw new Error(`L'attribution Wikimedia est absente pour ${page.title}.`);
}
return {
fileTitle: page.title,
fileName,
imageUrl,
descriptionUrl,
descriptionShortUrl,
description: imageDescription,
attribution,
};
}
/**
 * Reads an extmetadata field and converts it to plain text.
 * @param {Record<string, any>} extmetadata Extended metadata.
 * @param {string} key Name of the requested field.
 * @returns {string} Cleaned value, possibly empty.
 */
function readExtMetadataValue(extmetadata, key) {
const entry = extmetadata[key];
if (!entry || typeof entry !== "object") {
return "";
}
if (typeof entry.value !== "string") {
return "";
}
return sanitizeMetadataText(entry.value);
}
/**
 * Cleans an HTML value coming from Commons down to plain text.
 * @param {string} value Raw value.
 * @returns {string} Cleaned plain text.
 */
function sanitizeMetadataText(value) {
let sanitized = decodeHtmlEntities(value);
sanitized = sanitized.replace(/<br\s*\/?>/gi, " ");
sanitized = sanitized.replace(/<[^>]+>/g, " ");
sanitized = decodeHtmlEntities(sanitized);
sanitized = sanitized.replace(/\s+/g, " ").trim();
return sanitized;
}
/**
 * Decodes a sufficient subset of the HTML entities used by Commons.
 * @param {string} value HTML-encoded value.
 * @returns {string} Decoded value.
 */
function decodeHtmlEntities(value) {
const namedEntities = {
amp: "&",
apos: "'",
gt: ">",
lt: "<",
nbsp: " ",
quot: "\"",
};
let decoded = value.replace(/&#x([0-9a-f]+);/gi, (match, digits) => {
const codePoint = Number.parseInt(digits, 16);
if (!Number.isInteger(codePoint)) {
return match;
}
if (codePoint < 0 || codePoint > 0x10ffff) {
return match;
}
return String.fromCodePoint(codePoint);
});
decoded = decoded.replace(/&#([0-9]+);/g, (match, digits) => {
const codePoint = Number.parseInt(digits, 10);
if (!Number.isInteger(codePoint)) {
return match;
}
if (codePoint < 0 || codePoint > 0x10ffff) {
return match;
}
return String.fromCodePoint(codePoint);
});
decoded = decoded.replace(/&([a-z]+);/gi, (match, name) => {
const key = name.toLowerCase();
if (Object.prototype.hasOwnProperty.call(namedEntities, key)) {
return namedEntities[key];
}
return match;
});
return decoded;
}
/**
 * Assembles the final attribution as it will be written into the YAML.
 * @param {string} artist Cleaned author.
 * @param {string} credit Cleaned credit.
 * @param {string} licenseShortName Short license name.
 * @param {string} descriptionShortUrl Short Commons URL.
 * @returns {string} Concatenated attribution.
 */
function buildAttribution(artist, credit, licenseShortName, descriptionShortUrl) {
const parts = [];
let creditLine = "";
if (artist) {
creditLine = `By ${artist}`;
}
if (credit) {
if (creditLine) {
creditLine = `${creditLine} - ${credit}`;
} else {
creditLine = credit;
}
}
if (creditLine) {
parts.push(creditLine);
}
if (licenseShortName) {
parts.push(licenseShortName);
}
if (descriptionShortUrl) {
parts.push(descriptionShortUrl);
}
return parts.join(", ");
}
/**
 * Harmonizes some license labels to stay consistent with existing content.
 * @param {string} licenseShortName Raw label returned by Commons.
 * @returns {string} Normalized label.
 */
function normalizeLicenseName(licenseShortName) {
if (licenseShortName === "Public domain") {
return "Public Domain";
}
return licenseShortName;
}
/**
 * Downloads a remote binary file to disk.
 * @param {string} url Source URL.
 * @param {string} targetPath Target path.
 */
async function downloadFile(url, targetPath) {
const response = await fetch(url);
if (!response.ok) {
throw new Error(`Le téléchargement de ${url} a échoué avec le code ${response.status}.`);
}
const buffer = Buffer.from(await response.arrayBuffer());
await fs.writeFile(targetPath, buffer);
}
module.exports = {
extractFileTitleFromUrl,
fetchWikimediaAsset,
extractAssetFromApiResponse,
sanitizeMetadataText,
buildAttribution,
downloadFile,
};
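The URL-to-title logic above handles three shapes: a `#/media/File:…` fragment, a `/wiki/File:…` page, and a raw `upload.wikimedia.org` path. A trimmed sketch of the first two cases, keeping the French `Fichier:` namespace normalization:

```javascript
// Reduced version of extractFileTitleFromUrl covering the fragment and
// /wiki/ cases; normalizes the French "Fichier:" namespace to "File:".
function normalizeTitle(rawTitle) {
  const cleaned = rawTitle.trim();
  if (cleaned.startsWith("File:")) return cleaned;
  if (cleaned.startsWith("Fichier:")) return `File:${cleaned.slice("Fichier:".length)}`;
  throw new Error(`Not a Wikimedia file title: ${rawTitle}`);
}

function titleFromUrl(rawUrl) {
  const url = new URL(rawUrl);
  if (url.hash) {
    const hash = decodeURIComponent(url.hash.slice(1));
    if (hash.startsWith("/media/")) return normalizeTitle(hash.slice("/media/".length));
  }
  if (url.pathname.startsWith("/wiki/")) {
    return normalizeTitle(decodeURIComponent(url.pathname.slice("/wiki/".length)));
  }
  throw new Error(`No file title in ${rawUrl}`);
}
```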


@@ -1,425 +0,0 @@
#!/usr/bin/env node
/**
 * Walks every Markdown article under content/ and automatically links
 * the first occurrence of each term defined in the frontmatter
 * taxonomies to that keyword's page. Occurrences that are already
 * linked are skipped.
 *
 * Exits with a non-zero code when at least one file was modified.
 */
const fs = require("node:fs");
const path = require("node:path");
const yaml = require("js-yaml");
const PROJECT_ROOT = path.resolve(__dirname, "..");
const CONTENT_ROOT = path.join(PROJECT_ROOT, "content");
const TAXONOMIES_FILE = path.join(PROJECT_ROOT, "config", "_default", "taxonomies.yaml");
const FRONTMATTER_PATTERN = /^---\n([\s\S]+?)\n---\n?([\s\S]*)$/;
const WORD_CHAR = /[\p{L}\p{N}]/u;
const INLINE_FORMATTING_CHARS = ["*", "_"];
main();
function main() {
const taxonomyMapping = loadTaxonomyMapping(TAXONOMIES_FILE);
const files = collectMarkdownFiles(CONTENT_ROOT);
if (files.length === 0) {
console.log("Aucun article Markdown trouvé sous content/.");
return;
}
const changed = [];
for (const filePath of files) {
if (processFile(filePath, taxonomyMapping)) {
changed.push(filePath);
}
}
if (changed.length > 0) {
for (const filePath of changed) {
const rel = path.relative(PROJECT_ROOT, filePath);
console.log(`✏️ ${rel}`);
}
console.log("Des modifications ont été effectuées. Merci de les revoir.");
process.exit(2);
} else {
console.log("Tous les articles sont déjà correctement liés.");
}
}
function processFile(filePath, taxonomyMapping) {
let raw;
try {
raw = fs.readFileSync(filePath, "utf8");
} catch (error) {
console.warn(`⚠️ Impossible de lire ${filePath}: ${error.message}`);
return false;
}
const match = raw.match(FRONTMATTER_PATTERN);
if (!match) {
return false;
}
let frontmatter;
try {
frontmatter = yaml.load(match[1]) || {};
} catch (error) {
console.warn(`⚠️ Frontmatter invalide dans ${filePath}: ${error.message}`);
return false;
}
const keywords = extractKeywords(frontmatter, match[1], taxonomyMapping.fieldToCanonical);
if (keywords.length === 0) {
return false;
}
const { body, changed } = linkKeywordsInBody(match[2], keywords);
if (!changed) {
return false;
}
const prefixLength = raw.length - match[2].length;
const updated = raw.slice(0, prefixLength) + body;
fs.writeFileSync(filePath, updated, "utf8");
return true;
}
function loadTaxonomyMapping(configPath) {
let raw;
try {
raw = fs.readFileSync(configPath, "utf8");
} catch (error) {
console.error(`Impossible de lire ${configPath}: ${error.message}`);
process.exit(1);
}
let data;
try {
data = yaml.load(raw) || {};
} catch (error) {
console.error(`YAML invalide dans ${configPath}: ${error.message}`);
process.exit(1);
}
if (typeof data !== "object" || data === null) {
console.error(`Format inattendu dans ${configPath}`);
process.exit(1);
}
const fieldToCanonical = new Map();
for (const [singular, plural] of Object.entries(data)) {
const canonicalName =
typeof plural === "string" && plural.trim().length > 0 ? plural.trim() : singular.trim();
if (!canonicalName) continue;
const candidates = new Set([singular, canonicalName].filter(Boolean));
for (const name of candidates) {
fieldToCanonical.set(name, canonicalName);
}
}
if (fieldToCanonical.size === 0) {
console.error("Aucune taxonomie n'est définie dans la configuration.");
process.exit(1);
}
return { fieldToCanonical };
}
function collectMarkdownFiles(root) {
const files = [];
walk(root, files);
return files.sort((a, b) => a.localeCompare(b));
}
function walk(dir, bucket) {
let entries;
try {
entries = fs.readdirSync(dir, { withFileTypes: true });
} catch (error) {
console.warn(`⚠️ Impossible de parcourir ${dir}: ${error.message}`);
return;
}
for (const entry of entries) {
if (entry.name === ".git" || entry.name === "node_modules") {
continue;
}
const absolute = path.join(dir, entry.name);
if (entry.isDirectory()) {
walk(absolute, bucket);
} else if (entry.isFile() && entry.name.toLowerCase().endsWith(".md")) {
bucket.push(absolute);
}
}
}
function extractKeywords(frontmatter, frontmatterRaw, fieldToCanonical) {
const keywords = [];
const seen = new Set();
function addKeyword(taxonomy, term) {
if (!taxonomy || typeof term !== "string") return;
const normalized = term.trim();
if (!normalized) return;
const key = `${taxonomy}::${normalized.toLowerCase()}`;
if (seen.has(key)) return;
const slug = slugify(normalized);
if (!slug) return;
seen.add(key);
keywords.push({
taxonomy,
term: normalized,
url: `/${taxonomy}/${slug}/`,
});
}
if (typeof frontmatter === "object" && frontmatter !== null) {
for (const [field, value] of Object.entries(frontmatter)) {
const canonical = fieldToCanonical.get(field);
if (!canonical) continue;
const terms = normalizeTerms(value);
for (const term of terms) {
addKeyword(canonical, term);
}
}
}
for (const entry of extractCommentedTerms(frontmatterRaw, fieldToCanonical)) {
addKeyword(entry.taxonomy, entry.term);
}
return keywords;
}
function normalizeTerms(value) {
if (Array.isArray(value)) {
return value.map((item) => normalizeTerm(item)).filter(Boolean);
}
const single = normalizeTerm(value);
return single ? [single] : [];
}
function normalizeTerm(value) {
if (typeof value !== "string") return null;
const trimmed = value.trim();
return trimmed.length > 0 ? trimmed : null;
}
function extractCommentedTerms(frontmatterRaw, fieldToCanonical) {
if (typeof frontmatterRaw !== "string" || frontmatterRaw.length === 0) {
return [];
}
const results = [];
const lines = frontmatterRaw.split(/\r?\n/);
let currentCanonical = null;
let currentIndent = 0;
for (const line of lines) {
const indent = getIndentation(line);
const fieldMatch = line.match(/^\s*([A-Za-z0-9_]+):\s*(?:#.*)?$/);
if (fieldMatch) {
const fieldName = fieldMatch[1];
currentCanonical = fieldToCanonical.get(fieldName) || null;
currentIndent = indent;
continue;
}
if (!currentCanonical) continue;
const commentMatch = line.match(/^\s*#\s*-\s+(.*)$/);
if (!commentMatch) continue;
if (indent <= currentIndent) continue;
const term = commentMatch[1].trim();
if (!term) continue;
results.push({ taxonomy: currentCanonical, term });
}
return results;
}
function linkKeywordsInBody(body, keywords) {
if (typeof body !== "string" || body.length === 0 || keywords.length === 0) {
return { body, changed: false };
}
let updated = body;
let changed = false;
let linkRanges = computeLinkRanges(updated);
for (const keyword of keywords) {
const occurrence = findKeywordOccurrence(updated, keyword.term, linkRanges);
if (!occurrence) continue;
const expanded = includeFormattingCharacters(updated, occurrence.start, occurrence.end);
const before = updated.slice(0, expanded.start);
const label = updated.slice(expanded.start, expanded.end);
const after = updated.slice(expanded.end);
updated = `${before}[${label}](${keyword.url})${after}`;
changed = true;
linkRanges = computeLinkRanges(updated);
}
return { body: updated, changed };
}
function findKeywordOccurrence(text, keyword, linkRanges) {
if (!keyword) return null;
const escaped = escapeRegExp(keyword);
if (!escaped) return null;
const regex = new RegExp(escaped, "giu");
let match;
while ((match = regex.exec(text)) !== null) {
const start = match.index;
const end = start + match[0].length;
if (isInsideExistingLink(start, end, linkRanges)) {
continue;
}
if (!hasWordBoundaries(text, start, end)) {
continue;
}
return { start, end, text: match[0] };
}
return null;
}
function computeLinkRanges(text) {
const ranges = [];
if (typeof text !== "string" || text.length === 0) {
return ranges;
}
for (let i = 0; i < text.length; i++) {
let isImage = false;
if (text[i] === "!" && text[i + 1] === "[") {
isImage = true;
i += 1;
}
if (text[i] !== "[") continue;
const openBracket = i;
const closeBracket = findMatchingPair(text, openBracket, "[", "]");
if (closeBracket === -1) continue;
let pointer = closeBracket + 1;
while (pointer < text.length && /\s/.test(text[pointer])) pointer++;
if (pointer >= text.length || text[pointer] !== "(") {
i = closeBracket;
continue;
}
const openParen = pointer;
const closeParen = findMatchingPair(text, openParen, "(", ")");
if (closeParen === -1) break;
ranges.push({
textStart: openBracket + 1,
textEnd: closeBracket,
destStart: openParen + 1,
destEnd: closeParen,
isImage,
});
i = closeParen;
}
return ranges;
}
function findMatchingPair(text, startIndex, openChar, closeChar) {
let depth = 0;
for (let i = startIndex; i < text.length; i++) {
const ch = text[i];
if (ch === "\\") {
i++;
continue;
}
if (ch === openChar) {
depth++;
} else if (ch === closeChar) {
depth--;
if (depth === 0) {
return i;
}
}
}
return -1;
}
function isInsideExistingLink(start, end, ranges) {
return ranges.some((range) => {
const overlapsText = start < range.textEnd && end > range.textStart;
const overlapsDest =
typeof range.destStart === "number" &&
typeof range.destEnd === "number" &&
start < range.destEnd &&
end > range.destStart;
return overlapsText || overlapsDest;
});
}
function hasWordBoundaries(text, start, end) {
const before = start > 0 ? text[start - 1] : "";
const after = end < text.length ? text[end] : "";
const startChar = text[start];
const endChar = text[end - 1];
if (isWordChar(startChar) && isWordChar(before)) {
return false;
}
if (isWordChar(endChar) && isWordChar(after)) {
return false;
}
return true;
}
function isWordChar(ch) {
return Boolean(ch && WORD_CHAR.test(ch));
}
function slugify(value) {
return value
.normalize("NFD")
.replace(/\p{Diacritic}/gu, "")
.toLowerCase()
.replace(/[^a-z0-9]+/g, "-")
.replace(/^-+|-+$/g, "")
.replace(/-{2,}/g, "-");
}
function escapeRegExp(value) {
return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}
function includeFormattingCharacters(text, start, end) {
let newStart = start;
let newEnd = end;
for (const marker of INLINE_FORMATTING_CHARS) {
let prefixCount = 0;
while (newStart - prefixCount - 1 >= 0 && text[newStart - prefixCount - 1] === marker) {
prefixCount++;
}
let suffixCount = 0;
while (newEnd + suffixCount < text.length && text[newEnd + suffixCount] === marker) {
suffixCount++;
}
const count = Math.min(prefixCount, suffixCount);
if (count > 0) {
newStart -= count;
newEnd += count;
}
}
return { start: newStart, end: newEnd };
}
function getIndentation(line) {
if (typeof line !== "string" || line.length === 0) return 0;
const match = line.match(/^\s*/);
return match ? match[0].length : 0;
}
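Read in isolation, the normalisation pipeline in `slugify` can be sanity-checked with a short standalone sketch. The function is re-declared here exactly as defined above; the sample inputs are purely illustrative assumptions:

```javascript
// Re-declaration of slugify as defined in the file above, for standalone testing.
function slugify(value) {
  return value
    .normalize("NFD")
    .replace(/\p{Diacritic}/gu, "")
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "")
    .replace(/-{2,}/g, "-");
}

// Diacritics are decomposed and stripped before lowercasing,
// so accented French terms slug cleanly.
console.log(slugify("Paléontologie")); // → "paleontologie"

// Runs of punctuation and whitespace collapse to a single hyphen,
// and leading/trailing hyphens are trimmed.
console.log(slugify("  C++ & Rust  ")); // → "c-rust"
```

Because every non-alphanumeric run is collapsed first, the final `-{2,}` pass is a safety net rather than a requirement.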



@@ -1,646 +0,0 @@
#!/usr/bin/env node
/**
* Automatically attaches taxonomy terms to Hugo articles by scanning the body
* of each Markdown file for known keywords that already exist in frontmatter.
*
* Usage:
* node tools/link_taxonomy_terms.js [--dry-run] [paths...]
*
* Without arguments every Markdown file under content/ is processed.
*/
const fs = require("node:fs");
const path = require("node:path");
const yaml = require("js-yaml");
const PROJECT_ROOT = path.resolve(__dirname, "..");
const CONTENT_ROOT = path.join(PROJECT_ROOT, "content");
const TAXONOMIES_PATH = path.join(PROJECT_ROOT, "config", "_default", "taxonomies.yaml");
const FRONTMATTER_PATTERN = /^---\n([\s\S]+?)\n---\n?([\s\S]*)$/;
const collator = new Intl.Collator("fr", { sensitivity: "base", usage: "sort" });
function main() {
const { options, targets } = parseArgs(process.argv.slice(2));
const taxonomyMapping = loadTaxonomyMapping(TAXONOMIES_PATH);
if (taxonomyMapping.canonicalNames.length === 0) {
console.error("❌ No taxonomies found in config/_default/taxonomies.yaml");
process.exit(1);
}
const files = collectMarkdownFiles(targets);
if (files.length === 0) {
console.log("No Markdown content found to analyse.");
return;
}
const articles = files
.map((filePath) => parseArticle(filePath))
.filter((article) => article !== null);
if (articles.length === 0) {
console.log("No articles with valid YAML frontmatter were found.");
return;
}
const { catalog, totalKeywords } = buildKeywordCatalog(articles, taxonomyMapping);
if (totalKeywords === 0) {
console.log("No taxonomy keywords available to propagate.");
return;
}
console.log(
`Catalogued ${totalKeywords} keyword${totalKeywords > 1 ? "s" : ""} across ${
catalog.size
} taxonom${catalog.size > 1 ? "ies" : "y"}.`
);
const modifications = applyTaxonomies(articles, catalog, taxonomyMapping, options);
if (modifications.length === 0) {
console.log("No taxonomy updates required.");
if (options.dryRun) {
console.log("Dry-run only: no files would be modified.");
}
return;
}
for (const change of modifications) {
const relPath = path.relative(PROJECT_ROOT, change.path);
console.log(`✏️ ${relPath}`);
for (const [taxonomy, values] of change.additions.entries()) {
console.log(` ${taxonomy}: ${values.join(", ")}`);
}
}
if (options.dryRun) {
console.log(`Dry-run complete. ${modifications.length} article(s) would be updated.`);
} else {
console.log(`Updated ${modifications.length} article(s).`);
console.log(`Vérifier les modifications.`);
process.exit(2);
}
}
function parseArgs(argv) {
const options = { dryRun: false };
const targets = [];
for (const arg of argv) {
if (arg === "--dry-run" || arg === "--check") {
options.dryRun = true;
} else if (arg === "--help" || arg === "-h") {
showUsage();
process.exit(0);
} else if (arg.startsWith("-")) {
console.error(`Unknown option: ${arg}`);
showUsage();
process.exit(1);
} else {
targets.push(arg);
}
}
return { options, targets };
}
function showUsage() {
console.log(`Usage: node tools/link_taxonomy_terms.js [--dry-run] [path...]
Options
--dry-run Analyse files but do not rewrite anything
--help Show this message
Examples
node tools/link_taxonomy_terms.js --dry-run
node tools/link_taxonomy_terms.js content/interets/paleontologie`);
}
function loadTaxonomyMapping(configPath) {
let raw;
try {
raw = fs.readFileSync(configPath, "utf8");
} catch (error) {
throw new Error(`Unable to read ${configPath}: ${error.message}`);
}
let data;
try {
data = yaml.load(raw) || {};
} catch (error) {
throw new Error(`Invalid YAML in ${configPath}: ${error.message}`);
}
if (typeof data !== "object" || Array.isArray(data)) {
throw new Error(`Unexpected taxonomies format in ${configPath}`);
}
const fieldToCanonical = new Map();
const canonicalToFields = new Map();
for (const [singular, plural] of Object.entries(data)) {
const canonicalName = typeof plural === "string" && plural.trim().length > 0 ? plural.trim() : singular;
if (!canonicalName) continue;
const candidateNames = new Set([singular, canonicalName].filter(Boolean));
for (const name of candidateNames) {
fieldToCanonical.set(name, canonicalName);
if (!canonicalToFields.has(canonicalName)) {
canonicalToFields.set(canonicalName, new Set());
}
canonicalToFields.get(canonicalName).add(name);
}
}
return {
fieldToCanonical,
canonicalToFields,
canonicalNames: Array.from(canonicalToFields.keys()),
};
}
function collectMarkdownFiles(targets) {
const files = new Set();
if (targets.length === 0) {
walkContentTree(CONTENT_ROOT, files);
return Array.from(files).sort();
}
for (const target of targets) {
const absolute = path.resolve(PROJECT_ROOT, target);
if (!fs.existsSync(absolute)) {
console.warn(`⚠️ Skipping missing path: ${target}`);
continue;
}
const stats = fs.statSync(absolute);
if (stats.isDirectory()) {
walkContentTree(absolute, files);
} else if (stats.isFile() && absolute.toLowerCase().endsWith(".md")) {
files.add(absolute);
}
}
return Array.from(files).sort();
}
function walkContentTree(dir, fileSet) {
let entries;
try {
entries = fs.readdirSync(dir, { withFileTypes: true });
} catch (error) {
console.warn(`⚠️ Cannot read ${dir}: ${error.message}`);
return;
}
for (const entry of entries) {
const fullPath = path.join(dir, entry.name);
if (entry.isDirectory()) {
if (entry.name === ".git" || entry.name === "node_modules") continue;
walkContentTree(fullPath, fileSet);
} else if (entry.isFile() && entry.name.toLowerCase().endsWith(".md")) {
fileSet.add(fullPath);
}
}
}
function parseArticle(filePath) {
let raw;
try {
raw = fs.readFileSync(filePath, "utf8");
} catch (error) {
console.warn(`⚠️ Unable to read ${filePath}: ${error.message}`);
return null;
}
const match = raw.match(FRONTMATTER_PATTERN);
if (!match) {
console.warn(`⚠️ ${path.relative(PROJECT_ROOT, filePath)} is missing YAML frontmatter. Skipping.`);
return null;
}
let data = {};
try {
data = yaml.load(match[1]) || {};
} catch (error) {
console.warn(`⚠️ Failed to parse frontmatter in ${filePath}: ${error.message}`);
return null;
}
if (typeof data !== "object" || Array.isArray(data)) {
console.warn(`⚠️ Unexpected frontmatter structure in ${filePath}. Skipping.`);
return null;
}
return {
path: filePath,
frontmatter: data,
frontmatterRaw: match[1],
body: match[2] || "",
};
}
function buildKeywordCatalog(articles, taxonomyMapping) {
const keywordMaps = new Map();
for (const canonical of taxonomyMapping.canonicalNames) {
keywordMaps.set(canonical, new Map());
}
for (const article of articles) {
const frontmatter = article.frontmatter;
for (const [field, value] of Object.entries(frontmatter)) {
const canonical = taxonomyMapping.fieldToCanonical.get(field);
if (!canonical) continue;
const strings = toStringArray(value);
if (strings.length === 0) continue;
const lookup = keywordMaps.get(canonical);
for (const entry of strings) {
const normalized = normalizeTerm(entry);
if (!normalized || lookup.has(normalized)) continue;
lookup.set(normalized, entry);
}
}
}
const catalog = new Map();
let totalKeywords = 0;
for (const [canonical, map] of keywordMaps.entries()) {
if (map.size === 0) continue;
const sortedValues = Array.from(map.values()).sort(compareKeywords);
const entries = [];
for (const value of sortedValues) {
const pattern = buildKeywordPattern(value);
if (!pattern) continue;
entries.push({ value, pattern });
}
if (entries.length === 0) continue;
totalKeywords += entries.length;
catalog.set(canonical, entries);
}
return { catalog, totalKeywords };
}
function applyTaxonomies(articles, catalog, taxonomyMapping, options) {
const changes = [];
for (const article of articles) {
const additions = new Map();
let mutated = false;
const occupiedRanges = [];
const taxonomyStates = new Map();
const keywordTasks = [];
const ignoredKeywords = extractIgnoredKeywords(article.frontmatterRaw, taxonomyMapping);
for (const [canonical, keywordEntries] of catalog.entries()) {
if (keywordEntries.length === 0) continue;
const fieldName = resolveFieldName(article.frontmatter, canonical, taxonomyMapping);
const currentValues = toStringArray(article.frontmatter[fieldName]);
const normalizedExisting = new Set(currentValues.map((value) => normalizeTerm(value)));
const state = {
canonical,
fieldName,
currentValues,
normalizedExisting,
};
taxonomyStates.set(canonical, state);
for (const entry of keywordEntries) {
keywordTasks.push({
canonical,
value: entry.value,
pattern: entry.pattern,
state,
});
}
}
keywordTasks.sort((a, b) => compareKeywords(a.value, b.value));
const urlRanges = collectMarkdownUrlRanges(article.body);
const searchableBody = normalizeTypographyForSearch(article.body);
for (const task of keywordTasks) {
const { state, canonical, value, pattern } = task;
const regex = new RegExp(pattern, "gu");
const matchRange = findAvailableMatchRange(regex, searchableBody, occupiedRanges, urlRanges);
if (!matchRange) {
continue;
}
if (shouldSkipSingleWordMatch(value, article.body, matchRange)) {
occupiedRanges.push(matchRange);
continue;
}
occupiedRanges.push(matchRange);
const normalized = normalizeTerm(value);
if (state.normalizedExisting.has(normalized)) {
continue;
}
if (isIgnoredKeyword(canonical, normalized, ignoredKeywords)) {
continue;
}
state.currentValues.push(value);
state.normalizedExisting.add(normalized);
mutated = true;
article.frontmatter[state.fieldName] = state.currentValues;
if (!additions.has(canonical)) {
additions.set(canonical, []);
}
additions.get(canonical).push(value);
}
if (mutated) {
if (!options.dryRun) {
writeArticle(article);
}
changes.push({ path: article.path, additions });
}
}
return changes;
}
function resolveFieldName(frontmatter, canonicalName, taxonomyMapping) {
const candidateSet = taxonomyMapping.canonicalToFields.get(canonicalName);
if (candidateSet) {
for (const key of Object.keys(frontmatter)) {
if (candidateSet.has(key)) {
return key;
}
}
}
return canonicalName;
}
function writeArticle(article) {
const yamlContent = yaml.dump(article.frontmatter, { lineWidth: 120, sortKeys: false });
const finalBody = article.body || "";
const next = `---\n${yamlContent}---\n${finalBody}`;
fs.writeFileSync(article.path, next, "utf8");
}
function toStringArray(value) {
if (Array.isArray(value)) {
return value
.map((entry) => transformToString(entry))
.filter((entry) => entry.length > 0);
}
const single = transformToString(value);
return single.length > 0 ? [single] : [];
}
function transformToString(value) {
if (value === null || value === undefined) {
return "";
}
if (typeof value === "string") {
return value.trim();
}
if (typeof value === "number") {
return String(value);
}
return "";
}
function normalizeTerm(value) {
return transformToString(value).normalize("NFKC").toLocaleLowerCase("fr");
}
function compareKeywords(a, b) {
const diff = b.length - a.length;
if (diff !== 0) {
return diff;
}
return collator.compare(a, b);
}
function escapeRegExp(value) {
return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}
function buildKeywordPattern(value) {
const keyword = transformToString(value);
if (!keyword) {
return null;
}
const characters = Array.from(keyword);
if (characters.length === 0) {
return null;
}
const firstChar = characters[0];
const restChars = characters.slice(1);
const firstPattern = buildFirstCharacterPattern(firstChar);
const restPattern = buildRemainingPattern(restChars);
return `(?<![\\p{L}\\p{N}_])${firstPattern}${restPattern}(?![\\p{L}\\p{N}_])`;
}
function buildFirstCharacterPattern(char) {
if (!/\p{L}/u.test(char)) {
return escapeRegExp(char);
}
const variants = new Set([char, char.toLocaleLowerCase("fr"), char.toLocaleUpperCase("fr")]);
const entries = Array.from(variants)
.filter((variant) => variant.length > 0)
.map((variant) => ({
raw: variant,
escaped: escapeRegExp(variant),
runeLength: Array.from(variant).length,
}));
if (entries.length === 1) {
return entries[0].escaped;
}
if (entries.every((entry) => entry.runeLength === 1)) {
return `[${entries.map((entry) => entry.escaped).join("")}]`;
}
return `(?:${entries.map((entry) => entry.escaped).join("|")})`;
}
function buildRemainingPattern(characters) {
if (characters.length === 0) {
return "";
}
let pattern = "";
let previousWasWhitespace = false;
for (const char of characters) {
if (/\s/u.test(char)) {
if (!previousWasWhitespace) {
pattern += "\\s+";
previousWasWhitespace = true;
}
continue;
}
pattern += escapeRegExp(char);
previousWasWhitespace = false;
}
return pattern;
}
function findAvailableMatchRange(regex, text, occupiedRanges, urlRanges) {
regex.lastIndex = 0;
let match;
while ((match = regex.exec(text)) !== null) {
const start = match.index;
const end = start + match[0].length;
if (rangeOverlaps(urlRanges, start, end)) {
continue;
}
if (!overlapsExistingRange(occupiedRanges, start, end)) {
return [start, end];
}
}
return null;
}
function overlapsExistingRange(ranges, start, end) {
for (const [existingStart, existingEnd] of ranges) {
if (start === existingStart && end === existingEnd) {
continue;
}
if (start < existingEnd && end > existingStart) {
return true;
}
}
return false;
}
function collectMarkdownUrlRanges(markdown) {
const ranges = [];
if (!markdown) {
return ranges;
}
const linkPattern = /\[[^\]]*\]\(([^)]+)\)/g;
let match;
while ((match = linkPattern.exec(markdown)) !== null) {
const relativeParen = match[0].indexOf("(");
if (relativeParen === -1) {
continue;
}
const urlStart = match.index + relativeParen + 1;
const urlEnd = urlStart + (match[1] ? match[1].length : 0);
ranges.push([urlStart, urlEnd]);
}
return ranges;
}
function rangeOverlaps(ranges, start, end) {
for (const [rangeStart, rangeEnd] of ranges) {
if (start < rangeEnd && end > rangeStart) {
return true;
}
}
return false;
}
function normalizeTypographyForSearch(text) {
if (!text) {
return "";
}
return text.replace(/[*_]/g, " ");
}
function shouldSkipSingleWordMatch(keyword, body, range) {
if (!keyword || /\s/.test(keyword)) {
return false;
}
const [, end] = range;
const lookahead = body.slice(end);
return /^\s+[A-Z\u00C0-\u017F]\./u.test(lookahead);
}
function extractIgnoredKeywords(rawFrontmatter, taxonomyMapping) {
const ignoreMap = new Map();
if (!rawFrontmatter) {
return ignoreMap;
}
const lines = rawFrontmatter.split(/\r?\n/);
let currentField = null;
for (const line of lines) {
const trimmed = line.trim();
if (trimmed.length === 0) {
continue;
}
const fieldMatch = line.match(/^([A-Za-z0-9_-]+):\s*(.*)$/);
if (fieldMatch && !line.trimStart().startsWith("#")) {
const fieldName = fieldMatch[1];
const remainder = fieldMatch[2];
if (remainder.trim().length === 0) {
currentField = fieldName;
} else {
currentField = null;
}
continue;
}
if (!currentField) {
continue;
}
const commentMatch = line.match(/^\s*#\s*-\s*(.+?)\s*$/);
if (!commentMatch) {
continue;
}
let value = commentMatch[1].trim();
if (!value) {
continue;
}
if ((value.startsWith('"') && value.endsWith('"')) || (value.startsWith("'") && value.endsWith("'"))) {
value = value.slice(1, -1).trim();
}
if (!value) {
continue;
}
const canonical = taxonomyMapping.fieldToCanonical.get(currentField);
if (!canonical) {
continue;
}
const normalized = normalizeTerm(value);
if (!normalized) {
continue;
}
if (!ignoreMap.has(canonical)) {
ignoreMap.set(canonical, new Set());
}
ignoreMap.get(canonical).add(normalized);
}
return ignoreMap;
}
function isIgnoredKeyword(canonical, normalizedValue, ignoreMap) {
if (!canonical || !normalizedValue) {
return false;
}
const values = ignoreMap.get(canonical);
if (!values) {
return false;
}
return values.has(normalizedValue);
}
main();
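A quick way to see why `compareKeywords` in the script above sorts longer keywords first (so multi-word terms win over their substrings when scanning an article body) is a standalone sketch. The comparator and collator are re-declared exactly as above; the sample terms are illustrative assumptions:

```javascript
// Same collator and comparator as in the script above.
const collator = new Intl.Collator("fr", { sensitivity: "base", usage: "sort" });

function compareKeywords(a, b) {
  const diff = b.length - a.length; // longer keywords sort first
  if (diff !== 0) {
    return diff;
  }
  return collator.compare(a, b); // ties broken alphabetically, French rules
}

const terms = ["IA", "intelligence artificielle", "Nix", "NixOS"];
terms.sort(compareKeywords);
console.log(terms);
// → ["intelligence artificielle", "NixOS", "Nix", "IA"]
```

Matching "intelligence artificielle" before "IA" prevents the shorter term from consuming a range inside the longer one.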


@@ -1,508 +0,0 @@
#!/usr/bin/env node
const path = require("node:path");
const { spawnSync } = require("node:child_process");
const readline = require("node:readline/promises");
const { stdin, stdout } = require("node:process");
const { DateTime } = require("luxon");
const { listArchiveCaptures } = require("./lib/archive");
const {
resolveExternalLinksReportPath,
loadExternalLinksReport,
getLinksByStatus,
} = require("./lib/external_links_report");
const { findUrlOccurrences, replaceUrlInFiles } = require("./lib/url_replacements");
const PROJECT_ROOT = path.resolve(__dirname, "..");
const DEFAULT_CONTENT_DIR = path.join(PROJECT_ROOT, "content");
const DEFAULT_STATUS_CODE = 404;
/**
 * Converts a user answer into a simple boolean.
 * @param {string} answer Raw answer.
 * @returns {boolean} true if the answer means yes.
 */
function isYes(answer) {
if (typeof answer !== "string") {
return false;
}
const normalized = answer.trim().toLowerCase();
return normalized === "o" || normalized === "oui" || normalized === "y" || normalized === "yes";
}
/**
 * Resolves a CLI path relative to the project root.
 * @param {string} value Value supplied as an argument.
 * @returns {string} Absolute path.
 */
function resolveCliPath(value) {
if (typeof value !== "string" || !value.trim()) {
throw new Error("Le chemin fourni est vide.");
}
const trimmed = value.trim();
if (path.isAbsolute(trimmed)) {
return trimmed;
}
return path.resolve(PROJECT_ROOT, trimmed);
}
/**
 * Parses the command-line arguments.
 * @param {string[]} argv Raw arguments.
 * @returns {{ help: boolean, refresh: boolean, contentDir: string, reportPath: string|null }}
 */
function parseArgs(argv) {
const options = {
help: false,
refresh: true,
contentDir: DEFAULT_CONTENT_DIR,
reportPath: null,
};
for (const arg of argv.slice(2)) {
if (arg === "--help" || arg === "-h") {
options.help = true;
continue;
}
if (arg === "--no-refresh") {
options.refresh = false;
continue;
}
if (arg.startsWith("--content-dir=")) {
options.contentDir = resolveCliPath(arg.slice("--content-dir=".length));
continue;
}
if (arg.startsWith("--report-path=")) {
options.reportPath = resolveCliPath(arg.slice("--report-path=".length));
continue;
}
throw new Error(`Argument non pris en charge : ${arg}`);
}
return options;
}
/**
 * Prints the script usage.
 */
function showUsage() {
console.log(`Utilisation : node tools/manage_dead_links.js [options]
Options
--no-refresh Réutilise le rapport existant au lieu de relancer la vérification
--content-dir=<path> Racine à parcourir pour les remplacements
--report-path=<path> Rapport YAML à lire
--help, -h Affiche cette aide
Notes
- Par défaut, le script relance tools/check_external_links.js avant le traitement.
- Les remplacements portent sur les fichiers .md, .markdown, .json, .yaml et .yml.
- L'action de suppression est réservée pour plus tard et n'est pas encore implémentée.`);
}
/**
 * Ensures incompatible options are not combined.
 * @param {{ refresh: boolean, contentDir: string, reportPath: string|null }} options Selected options.
 */
function validateOptions(options) {
const usesCustomContentDir = path.resolve(options.contentDir) !== path.resolve(DEFAULT_CONTENT_DIR);
const usesCustomReportPath = options.reportPath !== null;
if (!options.refresh) {
return;
}
if (usesCustomContentDir || usesCustomReportPath) {
throw new Error(
"Les options --content-dir et --report-path nécessitent --no-refresh, car le contrôleur de liens actuel travaille sur le projet courant."
);
}
}
/**
 * Regenerates the external links report.
 */
function refreshExternalLinksReport() {
const scriptPath = path.join(PROJECT_ROOT, "tools", "check_external_links.js");
const result = spawnSync(process.execPath, [scriptPath], {
cwd: PROJECT_ROOT,
stdio: "inherit",
});
if (result.error) {
throw result.error;
}
if (result.status !== 0) {
throw new Error("La mise à jour du rapport des liens externes a échoué.");
}
}
/**
 * Formats an Archive.org timestamp.
 * @param {string} timestamp Raw timestamp.
 * @returns {string} Human-readable representation.
 */
function formatArchiveTimestamp(timestamp) {
if (typeof timestamp !== "string") {
return "horodatage inconnu";
}
const date = DateTime.fromFormat(timestamp, "yyyyLLddHHmmss", { zone: "utc" });
if (!date.isValid) {
return timestamp;
}
return date.setLocale("fr").toFormat("dd/LL/yyyy HH:mm:ss 'UTC'");
}
/**
 * Makes a path more readable for console output.
 * @param {string} filePath Absolute or relative path.
 * @returns {string} Displayed path.
 */
function formatFilePath(filePath) {
const resolvedPath = path.resolve(filePath);
const relativePath = path.relative(PROJECT_ROOT, resolvedPath);
if (relativePath && !relativePath.startsWith("..")) {
return relativePath;
}
return resolvedPath;
}
/**
 * Prints the known locations of a dead link.
 * @param {{ locations?: Array<{ file: string, line: number|null }> }} link Current link.
 */
function showLocations(link) {
let locations = [];
if (Array.isArray(link.locations)) {
locations = link.locations;
}
if (locations.length === 0) {
console.log("Emplacements connus : aucun");
return;
}
console.log("Emplacements connus :");
for (const location of locations) {
let label = ` - ${location.file}`;
if (typeof location.line === "number" && Number.isFinite(location.line)) {
label += `:${location.line}`;
}
console.log(label);
}
}
/**
 * Prints the actions available for the current link.
 * @param {boolean} allowArchive Whether Archive.org is still offered.
 */
function showActions(allowArchive) {
console.log("Actions :");
if (allowArchive) {
console.log(" 1. Remplacer par une capture Archive.org");
}
console.log(" 2. Remplacer par une autre URL");
console.log(" 3. Supprimer le lien (non implémenté)");
console.log(" s. Passer au lien suivant");
console.log(" q. Quitter");
}
/**
 * Asks which action to perform for a link.
 * @param {readline.Interface} rl Readline interface.
 * @param {boolean} allowArchive Whether the Archive.org action is available.
 * @returns {Promise<"archive"|"custom"|"remove"|"skip"|"quit">}
 */
async function promptAction(rl, allowArchive) {
while (true) {
showActions(allowArchive);
const answer = (await rl.question("> ")).trim().toLowerCase();
if (answer === "1" && allowArchive) {
return "archive";
}
if (answer === "2") {
return "custom";
}
if (answer === "3") {
return "remove";
}
if (answer === "s") {
return "skip";
}
if (answer === "q") {
return "quit";
}
console.log("Choix invalide.");
}
}
/**
 * Offers the user a selection of Wayback captures.
 * @param {readline.Interface} rl Readline interface.
 * @param {string} deadUrl Original URL.
 * @returns {Promise<{ type: "selected", replacementUrl: string }|{ type: "unavailable" }|{ type: "cancel" }>}
 */
async function promptArchiveReplacement(rl, deadUrl) {
const captures = await listArchiveCaptures(deadUrl, { limit: 10 });
if (captures.length === 0) {
console.log("Aucune capture Archive.org exploitable n'a été trouvée pour cette URL.");
return { type: "unavailable" };
}
if (captures.length === 1) {
const capture = captures[0];
console.log("Une capture Archive.org a été trouvée :");
console.log(` - ${formatArchiveTimestamp(capture.timestamp)}`);
console.log(` ${capture.url}`);
const confirm = await rl.question("Utiliser cette capture ? (o/N) ");
if (isYes(confirm)) {
return { type: "selected", replacementUrl: capture.url };
}
return { type: "cancel" };
}
console.log("Captures Archive.org disponibles (10 plus récentes) :");
for (const [index, capture] of captures.entries()) {
console.log(` ${index + 1}. ${formatArchiveTimestamp(capture.timestamp)}`);
console.log(` ${capture.url}`);
}
while (true) {
const answer = (
await rl.question(`Choisissez une capture (1-${captures.length}) ou Entrée pour revenir au menu : `)
).trim();
if (!answer) {
return { type: "cancel" };
}
const selectedIndex = Number.parseInt(answer, 10);
if (Number.isNaN(selectedIndex)) {
console.log("Sélection invalide.");
continue;
}
if (selectedIndex < 1 || selectedIndex > captures.length) {
console.log("Sélection hors plage.");
continue;
}
const capture = captures[selectedIndex - 1];
return { type: "selected", replacementUrl: capture.url };
}
}
/**
 * Asks for a manual replacement URL.
 * @param {readline.Interface} rl Readline interface.
 * @param {string} deadUrl URL being replaced.
 * @returns {Promise<string|null>} Chosen URL, or null if aborted.
 */
async function promptCustomReplacement(rl, deadUrl) {
while (true) {
const answer = (await rl.question("Nouvelle URL (laisser vide pour revenir au menu) : ")).trim();
if (!answer) {
return null;
}
if (!URL.canParse(answer)) {
console.log("Cette URL n'est pas valide.");
continue;
}
const parsed = new URL(answer);
const protocol = parsed.protocol.toLowerCase();
if (protocol !== "http:" && protocol !== "https:") {
console.log("Seules les URL http et https sont acceptées.");
continue;
}
if (answer === deadUrl) {
console.log("La nouvelle URL est identique à l'ancienne.");
continue;
}
return answer;
}
}
/**
 * Shows the replacement plan before confirmation.
 * @param {{ url: string }} link Link being processed.
 * @param {string} replacementUrl Destination URL.
 * @param {Array<{ filePath: string, occurrences: number }>} matches Affected files.
 * @param {readline.Interface} rl Readline interface.
 * @returns {Promise<boolean>} true if the user confirms.
 */
async function confirmReplacement(link, replacementUrl, matches, rl) {
let totalOccurrences = 0;
for (const match of matches) {
totalOccurrences += match.occurrences;
}
console.log(`Remplacement prévu : ${link.url}`);
console.log(` vers ${replacementUrl}`);
console.log(`Occurrences : ${totalOccurrences} dans ${matches.length} fichier(s)`);
for (const match of matches.slice(0, 10)) {
console.log(` - ${formatFilePath(match.filePath)} (${match.occurrences})`);
}
if (matches.length > 10) {
console.log(` - ... ${matches.length - 10} fichier(s) supplémentaire(s)`);
}
const answer = await rl.question("Confirmer le remplacement ? (o/N) ");
return isYes(answer);
}
/**
 * Applies a replacement that has already been chosen.
 * @param {{ url: string }} link Current link.
 * @param {string} replacementUrl Replacement URL.
 * @param {string} contentDir Search root.
 * @param {readline.Interface} rl Readline interface.
 * @returns {Promise<boolean>} true if any files were modified.
 */
async function applyReplacement(link, replacementUrl, contentDir, rl) {
const matches = await findUrlOccurrences(contentDir, link.url);
if (matches.length === 0) {
console.log("Aucune occurrence exacte n'a été trouvée dans le contenu cible.");
return false;
}
const confirmed = await confirmReplacement(link, replacementUrl, matches, rl);
if (!confirmed) {
console.log("Remplacement annulé.");
return false;
}
const result = await replaceUrlInFiles(contentDir, link.url, replacementUrl, { matches });
console.log(
`${result.totalOccurrences} occurrence(s) remplacée(s) dans ${result.changedFiles.length} fichier(s).`
);
return result.changedFiles.length > 0;
}
/**
 * Handles a 404 link in an interactive loop.
 * @param {readline.Interface} rl Readline interface.
 * @param {{ url: string, locations?: Array<{ file: string, line: number|null }> }} link Link to handle.
 * @param {number} index Human-friendly index.
 * @param {number} total Total number of links.
 * @param {string} contentDir Content root.
 * @returns {Promise<"changed"|"skipped"|"quit">}
 */
async function processLink(rl, link, index, total, contentDir) {
let allowArchive = true;
while (true) {
console.log("");
console.log(`[${index}/${total}] Lien 404`);
console.log(link.url);
showLocations(link);
const action = await promptAction(rl, allowArchive);
if (action === "quit") {
return "quit";
}
if (action === "skip") {
return "skipped";
}
if (action === "remove") {
console.log("La suppression n'est pas encore implémentée. Retour au menu.");
continue;
}
if (action === "custom") {
const replacementUrl = await promptCustomReplacement(rl, link.url);
if (!replacementUrl) {
continue;
}
const changed = await applyReplacement(link, replacementUrl, contentDir, rl);
if (changed) {
return "changed";
}
continue;
}
if (action === "archive") {
const archiveSelection = await promptArchiveReplacement(rl, link.url);
if (archiveSelection.type === "unavailable") {
allowArchive = false;
continue;
}
if (archiveSelection.type === "cancel") {
continue;
}
const changed = await applyReplacement(link, archiveSelection.replacementUrl, contentDir, rl);
if (changed) {
return "changed";
}
continue;
}
}
}
/**
* Charge la liste des liens 404 à traiter.
* @param {{ reportPath: string|null }} options Options du script.
* @returns {Promise<Array<{ url: string, status: number|null, locations: Array<{ file: string, line: number|null, page: string|null }> }>>}
*/
async function load404Links(options) {
let reportPath = options.reportPath;
if (!reportPath) {
reportPath = await resolveExternalLinksReportPath(PROJECT_ROOT);
}
const report = loadExternalLinksReport(reportPath);
return getLinksByStatus(report, DEFAULT_STATUS_CODE);
}
async function main() {
const options = parseArgs(process.argv);
if (options.help) {
showUsage();
return;
}
validateOptions(options);
if (options.refresh) {
console.log("Actualisation du rapport des liens externes...");
refreshExternalLinksReport();
}
const links = await load404Links(options);
if (links.length === 0) {
console.log("Aucun lien 404 à traiter.");
return;
}
const rl = readline.createInterface({ input: stdin, output: stdout });
let changedCount = 0;
let skippedCount = 0;
const interactiveRun = async () => {
for (const [index, link] of links.entries()) {
const outcome = await processLink(rl, link, index + 1, links.length, options.contentDir);
if (outcome === "quit") {
break;
}
if (outcome === "changed") {
changedCount += 1;
continue;
}
skippedCount += 1;
}
};
await interactiveRun().finally(() => rl.close());
console.log("");
console.log(`Traitement terminé : ${changedCount} lien(s) traité(s), ${skippedCount} lien(s) ignoré(s).`);
if (changedCount > 0 && options.refresh) {
console.log("Régénération du rapport après modifications...");
refreshExternalLinksReport();
}
if (changedCount > 0 && !options.refresh) {
console.log("Le rapport n'a pas été régénéré car --no-refresh est actif.");
}
}
main().catch((error) => {
console.error("Erreur lors de la gestion des liens morts :", error.message);
process.exitCode = 1;
});

View File

@@ -1,226 +0,0 @@
#!/usr/bin/env node
const fs = require("node:fs");
const path = require("node:path");
const { Pool } = require("pg");
const { loadEnv } = require("./lib/env");
const { loadToolsConfig } = require("./lib/config");
const { readFrontmatterFile } = require("./lib/frontmatter");
const { isEffectivelyPublished } = require("./lib/publication");
const {
resolveBundlePath,
ensureBundleExists,
ensureWithinContent,
splitRelativeParts,
resolveDestination,
moveBundle,
addAlias,
cleanupEmptyParents,
findDateSegments,
} = require("./lib/article_move");
const {
normalizeLemmyConfig,
createLemmyClient,
buildArticleUrl,
buildCommunityDescriptor,
ensureCommunity,
isYearSegment,
isMonthSegment,
isDaySegment,
} = require("./lib/lemmy");
const CONTENT_ROOT = path.join(__dirname, "..", "content");
const FRONTMATTER_COMMENT_FIELD = "comments_url";
const DEFAULT_DATABASE_URL = "postgres:///lemmy?host=/run/postgresql&user=richard";
main().then(
() => {
process.exit(0);
},
(error) => {
console.error(`❌ Déplacement interrompu : ${error.message}`);
process.exit(1);
}
);
/**
* Point d'entrée : déplace le bundle et synchronise Lemmy si possible.
*/
async function main() {
loadEnv();
const args = process.argv.slice(2);
if (args.length < 2) {
throw new Error("Usage: node tools/move_article.js <chemin_source> <chemin_destination>");
}
const sourceInput = args[0];
const destinationInput = args[1];
const sourceBundle = resolveBundlePath(sourceInput);
ensureBundleExists(sourceBundle);
ensureWithinContent(sourceBundle, CONTENT_ROOT);
const sourceRelativeParts = splitRelativeParts(sourceBundle, CONTENT_ROOT);
const sourceSlug = sourceRelativeParts[sourceRelativeParts.length - 1];
const sourceDate = findDateSegments(sourceRelativeParts.slice(0, -1), isDateSegment);
const destination = resolveDestination(
destinationInput,
sourceSlug,
sourceDate,
CONTENT_ROOT,
isDateSegment
);
ensureWithinContent(destination.bundleDir, CONTENT_ROOT);
if (fs.existsSync(destination.bundleDir)) {
throw new Error(`Le bundle ${destination.bundleDir} existe déjà.`);
}
const sourceFrontmatter = readFrontmatterFile(path.join(sourceBundle, "index.md"));
if (!sourceFrontmatter) {
throw new Error(`Frontmatter introuvable pour ${sourceBundle}.`);
}
moveBundle(sourceBundle, destination.bundleDir);
addAlias(destination.bundleDir, sourceRelativeParts);
cleanupEmptyParents(path.dirname(sourceBundle), CONTENT_ROOT);
await updateLemmyIfNeeded(sourceFrontmatter.data, destination.bundleDir);
}
/**
* Met à jour Lemmy si un comments_url est présent.
* @param {object} frontmatterData Données du frontmatter.
* @param {string} bundleDir Chemin du bundle après déplacement.
*/
async function updateLemmyIfNeeded(frontmatterData, bundleDir) {
if (isEffectivelyPublished(frontmatterData) === false) {
return;
}
const commentsUrl = extractCommentsUrl(frontmatterData);
if (!commentsUrl) {
return;
}
const postId = extractPostId(commentsUrl);
if (!postId) {
console.warn("⚠️ comments_url invalide, mise à jour Lemmy ignorée.");
return;
}
const toolsConfig = await loadToolsConfig(path.join(__dirname, "config", "config.json"));
const lemmyConfig = normalizeLemmyConfig(toolsConfig.lemmy);
const client = await createLemmyClient(lemmyConfig);
const databaseUrl = resolveDatabaseUrl();
const pool = new Pool({ connectionString: databaseUrl });
const bundleParts = splitRelativeParts(bundleDir, CONTENT_ROOT);
const descriptor = buildCommunityDescriptor(bundleParts, lemmyConfig.community);
const community = await ensureCommunity(client, descriptor, lemmyConfig.community);
const communityId = community.id;
await updatePostForMove(pool, postId, communityId, buildArticleUrl(lemmyConfig.siteUrl, bundleParts));
await pool.end();
}
/**
* Extrait une URL de commentaires depuis les données du frontmatter.
* @param {object} frontmatterData Données du frontmatter.
* @returns {string} URL des commentaires ou chaîne vide.
*/
function extractCommentsUrl(frontmatterData) {
if (!frontmatterData) {
return "";
}
if (typeof frontmatterData[FRONTMATTER_COMMENT_FIELD] !== "string") {
return "";
}
return frontmatterData[FRONTMATTER_COMMENT_FIELD].trim();
}
/**
* Extrait l'identifiant numérique d'un comments_url Lemmy.
* @param {string} url URL issue du frontmatter.
* @returns {number|null} Identifiant ou null si non reconnu.
*/
function extractPostId(url) {
const trimmed = url.trim();
if (!trimmed) {
return null;
}
const normalized = trimmed.replace(/\/+$/, "");
const match = normalized.match(/\/(?:post|c\/[^/]+\/post)\/(\d+)(?:$|\?)/i);
if (!match) {
return null;
}
return Number.parseInt(match[1], 10);
}
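Le motif d'URL accepté par `extractPostId` (soit `/post/<id>`, soit `/c/<communauté>/post/<id>`, avec une éventuelle chaîne de requête derrière l'identifiant) peut se vérifier avec une expression équivalente ; l'esquisse ci-dessous le retranscrit en Python à titre d'illustration (le nom `check_post_id` est de notre cru, il ne fait pas partie de l'outillage) :

```python
import re

# Même motif que extractPostId : /post/<id> ou /c/<communauté>/post/<id>,
# après suppression des barres obliques finales, avec une chaîne de
# requête optionnelle derrière l'identifiant.
POST_ID_RE = re.compile(r"/(?:post|c/[^/]+/post)/(\d+)(?:$|\?)", re.IGNORECASE)

def check_post_id(url):
    """Retourne l'identifiant numérique du post, ou None si l'URL ne correspond pas."""
    normalized = url.strip().rstrip("/")
    match = POST_ID_RE.search(normalized)
    return int(match.group(1)) if match else None
```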
/**
* Met à jour le post Lemmy après déplacement.
* @param {Pool} pool Pool Postgres.
* @param {number} postId Identifiant du post.
* @param {number} communityId Communauté cible.
* @param {string} newUrl Nouvelle URL Hugo.
*/
async function updatePostForMove(pool, postId, communityId, newUrl) {
await pool.query("update post set community_id = $1, url = $2 where id = $3", [
communityId,
newUrl,
postId,
]);
const hasAggregates = await tableHasColumn(pool, "post_aggregates", "community_id");
if (hasAggregates) {
await pool.query("update post_aggregates set community_id = $1 where post_id = $2", [
communityId,
postId,
]);
}
}
/**
* Indique si une table possède une colonne donnée.
* @param {Pool} pool Pool Postgres.
* @param {string} tableName Nom de la table.
* @param {string} columnName Nom de la colonne.
* @returns {Promise<boolean>} true si la colonne existe.
*/
async function tableHasColumn(pool, tableName, columnName) {
const result = await pool.query(
"select column_name from information_schema.columns where table_name = $1 and column_name = $2 limit 1",
[tableName, columnName]
);
return result.rowCount === 1;
}
/**
* Résout l'URL de connexion Postgres pour Lemmy.
* @returns {string} URL de connexion.
*/
function resolveDatabaseUrl() {
if (typeof process.env.LEMMY_DATABASE_URL === "string" && process.env.LEMMY_DATABASE_URL.trim()) {
return process.env.LEMMY_DATABASE_URL.trim();
}
return DEFAULT_DATABASE_URL;
}
/**
* Détermine si des segments forment une date au format année/mois/jour.
* @param {string[]} parts Segments du chemin.
* @param {number} index Position de départ.
* @returns {string[]|null} Segments de date ou null.
*/
function isDateSegment(parts, index) {
if (!isYearSegment(parts[index])) {
return null;
}
if (!isMonthSegment(parts[index + 1])) {
return null;
}
if (!isDaySegment(parts[index + 2])) {
return null;
}
return [parts[index], parts[index + 1], parts[index + 2]];
}

View File

@@ -1,125 +0,0 @@
#!/usr/bin/env node
const fs = require("fs/promises");
const path = require("path");
const {
extractRawDate,
readFrontmatter,
writeFrontmatter,
} = require("./lib/weather/frontmatter");
const { parseFrontmatterDate, formatDateTime, getHugoTimeZone } = require("./lib/datetime");
const CONTENT_ROOT = path.join(process.cwd(), "content");
/**
* Liste récursivement tous les fichiers Markdown d'un dossier.
* @param {string} root Dossier racine à parcourir.
* @returns {Promise<string[]>} Chemins absolus des fichiers trouvés.
*/
async function listMarkdownFiles(root) {
const entries = await fs.readdir(root, { withFileTypes: true });
const results = [];
for (const entry of entries) {
const fullPath = path.join(root, entry.name);
if (entry.isDirectory()) {
const nested = await listMarkdownFiles(fullPath);
results.push(...nested);
continue;
}
if (entry.isFile() && entry.name.endsWith(".md")) {
results.push(fullPath);
}
}
return results;
}
/**
* Harmonise la date d'un fichier Markdown si nécessaire.
* @param {string} filePath Chemin absolu du fichier.
* @returns {Promise<"updated"|"unchanged"|"skipped"|"invalid">} Statut de traitement.
*/
async function normalizeFileDate(filePath) {
const frontmatter = await readFrontmatter(filePath);
if (!frontmatter) {
return "skipped";
}
const rawDate = extractRawDate(frontmatter.frontmatterText);
const dateValue = frontmatter.doc.get("date");
if (!rawDate && (dateValue === undefined || dateValue === null)) {
return "skipped";
}
const sourceValue = rawDate !== null ? rawDate : dateValue;
const parsed = parseFrontmatterDate(sourceValue);
if (!parsed) {
return "invalid";
}
let hasTime = false;
if (typeof rawDate === "string") {
const timeMatch = rawDate.match(/[T ](\d{2}):(\d{2})(?::(\d{2}))?/);
if (timeMatch) {
const hour = Number(timeMatch[1]);
const minute = Number(timeMatch[2]);
const second = Number(timeMatch[3] || "0");
hasTime = !(hour === 0 && minute === 0 && second === 0);
}
}
const normalized = hasTime
? parsed.set({ millisecond: 0 })
: parsed.set({ hour: 12, minute: 0, second: 0, millisecond: 0 });
const formatted = formatDateTime(normalized);
const current = typeof dateValue === "string" ? dateValue.trim() : rawDate;
const quotedFormatted = `"${formatted}"`;
const currentComparable = typeof current === "string" ? current.trim() : "";
if (currentComparable === formatted || currentComparable === quotedFormatted) {
return "unchanged";
}
frontmatter.doc.set("date", formatted);
await writeFrontmatter(filePath, frontmatter.doc, frontmatter.body);
const rewritten = await fs.readFile(filePath, "utf8");
const normalizedContent = rewritten.replace(/^date:\s*.+$/m, `date: ${quotedFormatted}`);
if (rewritten !== normalizedContent) {
await fs.writeFile(filePath, normalizedContent, "utf8");
}
return "updated";
}
/**
* Point d'entrée du script.
*/
async function main() {
const timezone = getHugoTimeZone();
console.log(`Fuseau horaire Hugo : ${timezone}`);
const files = await listMarkdownFiles(CONTENT_ROOT);
let updated = 0;
let unchanged = 0;
let skipped = 0;
let invalid = 0;
for (const file of files) {
const status = await normalizeFileDate(file);
if (status === "updated") updated += 1;
else if (status === "unchanged") unchanged += 1;
else if (status === "invalid") {
invalid += 1;
const relative = path.relative(process.cwd(), file);
console.warn(`Date invalide : ${relative}`);
} else {
skipped += 1;
}
}
console.log(
`Terminé. ${updated} mis à jour, ${unchanged} inchangés, ${skipped} ignorés, ${invalid} invalides.`
);
}
main();

View File

@@ -1,32 +0,0 @@
#!/usr/bin/env node
const { loadArticles } = require("../lib/stats/articles");
function computeAveragePerMonth(articles) {
let first = null;
let last = null;
for (const article of articles) {
if (!article.date) continue;
if (!first || article.date < first) first = article.date;
if (!last || article.date > last) last = article.date;
}
if (!first || !last) {
return { average: 0, months: 0 };
}
const monthSpan = Math.max(1, Math.round(last.diff(first.startOf("month"), "months").months) + 1);
const total = articles.filter((a) => a.date).length;
const average = total / monthSpan;
return { average, months: monthSpan };
}
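L'empan de mois inclusif calculé par `computeAveragePerMonth` (du premier mois de publication au dernier) se réduit à une arithmétique calendaire simple ; esquisse en Python, en supposant des paires `(année, mois)` plutôt que des objets Luxon :

```python
def month_span(first, last):
    """Nombre de mois calendaires, bornes incluses, entre deux paires (année, mois)."""
    (fy, fm), (ly, lm) = first, last
    return max(1, (ly - fy) * 12 + (lm - fm) + 1)
```

Avec novembre 2023 et février 2024, on obtient un empan de 4 mois, ce qui devrait correspondre au `monthSpan` calculé via Luxon ci-dessus.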
async function run({ contentDir }) {
const articles = await loadArticles(contentDir || "content");
const { average, months } = computeAveragePerMonth(articles);
return { value: average.toFixed(2), meta: { months } };
}
module.exports = { run };

View File

@@ -1,81 +0,0 @@
#!/usr/bin/env python3
import sys
import json
import os
from collections import defaultdict
CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
PARENT_DIR = os.path.abspath(os.path.join(CURRENT_DIR, os.pardir))
if CURRENT_DIR not in sys.path:
sys.path.append(CURRENT_DIR)
if PARENT_DIR not in sys.path:
sys.path.append(PARENT_DIR)
from common import load_articles, MONTH_LABELS, write_result # noqa: E402
def main():
try:
payload = json.load(sys.stdin)
except Exception as exc: # noqa: BLE001
print(f"Failed to read JSON: {exc}", file=sys.stderr)
sys.exit(1)
content_dir = payload.get("contentDir") or "content"
output_path = payload.get("outputPath")
public_path = payload.get("publicPath")
articles = load_articles(content_dir)
counts = defaultdict(int)
years = set()
first = None
last = None
for article in articles:
date = article.get("date")
if not date:
continue
year = date.year
month = date.month
years.add(year)
counts[(year, month)] += 1
if not first or date < first:
first = date
if not last or date > last:
last = date
month_numbers = list(range(1, 13))
labels = [MONTH_LABELS[m - 1] for m in month_numbers]
sorted_years = sorted(years)
series = []
for year in sorted_years:
values = [counts.get((year, m), 0) for m in month_numbers]
series.append({"label": str(year), "values": values})
# Render via shared renderer
try:
from lib.render_stats_charts import render_articles_per_month, setup_rcparams
except ImportError as exc: # noqa: BLE001
print(f"Failed to import renderer: {exc}", file=sys.stderr)
sys.exit(1)
setup_rcparams()
render_articles_per_month({"labels": labels, "series": series, "title": "Articles par mois"}, output_path)
write_result(
{
"image": public_path,
"meta": {
"from": first.isoformat() if first else None,
"to": last.isoformat() if last else None,
"months": len(labels),
},
}
)
if __name__ == "__main__":
main()

View File

@@ -1,69 +0,0 @@
#!/usr/bin/env python3
import sys
import json
import os
from collections import defaultdict
CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
PARENT_DIR = os.path.abspath(os.path.join(CURRENT_DIR, os.pardir))
if CURRENT_DIR not in sys.path:
sys.path.append(CURRENT_DIR)
if PARENT_DIR not in sys.path:
sys.path.append(PARENT_DIR)
from common import load_articles, write_result # noqa: E402
def main():
try:
payload = json.load(sys.stdin)
except Exception as exc: # noqa: BLE001
print(f"Failed to read JSON: {exc}", file=sys.stderr)
sys.exit(1)
content_dir = payload.get("contentDir") or "content"
output_path = payload.get("outputPath")
public_path = payload.get("publicPath")
articles = load_articles(content_dir)
counts = defaultdict(int)
for article in articles:
section = article.get("section") or "root"
counts[section] += 1
entries = sorted(counts.items(), key=lambda item: item[1], reverse=True)
max_slices = 21
top = entries[:max_slices]
rest = entries[max_slices:]
if rest:
rest_sum = sum(v for _, v in rest)
top.append(("Others", rest_sum))
labels = [label for label, _ in top]
values = [value for _, value in top]
try:
from lib.render_stats_charts import render_articles_per_section, setup_rcparams
except ImportError as exc: # noqa: BLE001
print(f"Failed to import renderer: {exc}", file=sys.stderr)
sys.exit(1)
setup_rcparams()
render_articles_per_section({"labels": labels, "values": values, "title": "Articles par section"}, output_path)
write_result(
{
"image": public_path,
"meta": {
"sections": len(entries),
},
}
)
if __name__ == "__main__":
main()

View File

@@ -1,73 +0,0 @@
#!/usr/bin/env python3
import sys
import json
import os
from collections import defaultdict
CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
PARENT_DIR = os.path.abspath(os.path.join(CURRENT_DIR, os.pardir))
if CURRENT_DIR not in sys.path:
sys.path.append(CURRENT_DIR)
if PARENT_DIR not in sys.path:
sys.path.append(PARENT_DIR)
from common import load_articles, write_result # noqa: E402
def main():
try:
payload = json.load(sys.stdin)
except Exception as exc: # noqa: BLE001
print(f"Failed to read JSON: {exc}", file=sys.stderr)
sys.exit(1)
content_dir = payload.get("contentDir") or "content"
output_path = payload.get("outputPath")
public_path = payload.get("publicPath")
articles = load_articles(content_dir)
counts = defaultdict(int)
first = None
last = None
for article in articles:
date = article.get("date")
if not date:
continue
year = date.year
counts[year] += 1
if not first or date < first:
first = date
if not last or date > last:
last = date
sorted_years = sorted(counts.keys())
labels = [str(y) for y in sorted_years]
values = [counts[y] for y in sorted_years]
try:
from lib.render_stats_charts import render_articles_per_year, setup_rcparams
except ImportError as exc: # noqa: BLE001
print(f"Failed to import renderer: {exc}", file=sys.stderr)
sys.exit(1)
setup_rcparams()
render_articles_per_year({"labels": labels, "values": values, "title": "Articles par an"}, output_path)
write_result(
{
"image": public_path,
"meta": {
"from": first.isoformat() if first else None,
"to": last.isoformat() if last else None,
"years": len(labels),
},
}
)
if __name__ == "__main__":
main()

View File

@@ -1,174 +0,0 @@
#!/usr/bin/env python3
import os
import re
import json
import yaml
from datetime import datetime, date, timezone
MONTH_LABELS = ["Jan", "Fev", "Mar", "Avr", "Mai", "Jun", "Jul", "Aou", "Sep", "Oct", "Nov", "Dec"]
def find_markdown_files(root):
files = []
for dirpath, dirnames, filenames in os.walk(root):
for filename in filenames:
if not filename.lower().endswith(".md"):
continue
if filename == "_index.md":
continue
files.append(os.path.join(dirpath, filename))
return files
def collect_section_dirs(root):
section_dirs = set()
for dirpath, dirnames, filenames in os.walk(root):
if "_index.md" in filenames:
section_dirs.add(os.path.abspath(dirpath))
return section_dirs
def leaf_sections(section_dirs):
leaves = set()
for section in section_dirs:
is_leaf = True
for other in section_dirs:
if other == section:
continue
if other.startswith(section + os.sep):
is_leaf = False
break
if is_leaf:
leaves.add(section)
return leaves
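L'idée de `leaf_sections` — une section est une feuille si aucune autre section n'est imbriquée sous elle, ce que teste le préfixe `section + os.sep` — tient en une compréhension ; esquisse autonome (séparateurs POSIX supposés dans l'exemple) :

```python
import os

def leaf_dirs(section_dirs):
    """Ne garde que les sections qui n'en contiennent aucune autre."""
    return {
        section
        for section in section_dirs
        if not any(
            other != section and other.startswith(section + os.sep)
            for other in section_dirs
        )
    }
```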
def parse_frontmatter(path):
with open(path, "r", encoding="utf-8") as handle:
content = handle.read()
if content.startswith("---"):
parts = content.split("---", 2)
if len(parts) >= 3:
fm_text = parts[1]
body = parts[2]
else:
return {}, content
else:
return {}, content
try:
data = yaml.safe_load(fm_text) or {}
except Exception:
data = {}
return data, body
def parse_date(value):
if not value:
return None
dt = None
if isinstance(value, datetime):
dt = value
elif isinstance(value, date):
dt = datetime.combine(value, datetime.min.time())
elif isinstance(value, (int, float)):
try:
dt = datetime.fromtimestamp(value)
except Exception:
dt = None
elif isinstance(value, str):
# try ISO-like formats
for fmt in (
"%Y-%m-%dT%H:%M:%S%z",
"%Y-%m-%dT%H:%M:%S",
"%Y-%m-%d %H:%M:%S",
"%Y-%m-%d %H:%M",
"%Y-%m-%d",
"%Y/%m/%d",
"%d/%m/%Y",
):
try:
dt = datetime.strptime(value, fmt)
break
except Exception:
continue
if dt is None:
try:
dt = datetime.fromisoformat(value)
except Exception:
dt = None
if dt is None:
return None
if dt.tzinfo is not None:
dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
return dt
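La cascade de `parse_date` (formats `strptime` explicites d'abord, puis repli sur `fromisoformat`) peut se condenser en quelques lignes :

```python
from datetime import datetime

# Liste de formats réduite pour l'exemple ; parse_date en essaie davantage.
FORMATS = ("%Y-%m-%dT%H:%M:%S", "%Y-%m-%d", "%d/%m/%Y")

def parse_loose(value):
    """Essaie les formats connus dans l'ordre, puis se replie sur fromisoformat."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None
```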
WORD_RE = re.compile(r"[\w'-]+", re.UNICODE)
def count_words(text):
if not text:
return 0
words = WORD_RE.findall(text)
return len(words)
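L'expression `[\w'-]+` utilisée par `count_words` conserve apostrophes et traits d'union à l'intérieur d'un même jeton, ce qui convient mieux au français que `\w+` seul ; petite illustration autonome :

```python
import re

# Même expression que WORD_RE ci-dessus.
WORD_RE = re.compile(r"[\w'-]+", re.UNICODE)

def count_words(text):
    """Compte les mots, apostrophes et traits d'union inclus dans un même jeton."""
    return len(WORD_RE.findall(text)) if text else 0
```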
def resolve_section(file_path, content_root, leaf_dirs):
content_root = os.path.abspath(content_root)
current = os.path.abspath(os.path.dirname(file_path))
best = None
while current.startswith(content_root):
if current in leaf_dirs:
best = current
break
parent = os.path.dirname(current)
if parent == current:
break
current = parent
if not best:
return None
rel = os.path.relpath(best, content_root)
return rel.replace(os.sep, "/") if rel != "." else "."
def load_articles(content_root):
files = find_markdown_files(content_root)
section_dirs = collect_section_dirs(content_root)
leaf_dirs = leaf_sections(section_dirs)
articles = []
for file_path in files:
fm, body = parse_frontmatter(file_path)
date = parse_date(fm.get("date"))
title = fm.get("title") or os.path.splitext(os.path.basename(file_path))[0]
word_count = count_words(body)
rel_path = os.path.relpath(file_path, content_root)
section = resolve_section(file_path, content_root, leaf_dirs)
weather = fm.get("weather") if isinstance(fm, dict) else None
articles.append(
{
"path": file_path,
"relativePath": rel_path,
"title": title,
"date": date,
"wordCount": word_count,
"section": section,
"weather": weather,
}
)
return articles
def write_result(data):
import sys
json.dump(data, sys.stdout)
sys.stdout.flush()

View File

@@ -1,95 +0,0 @@
#!/usr/bin/env python3
import sys
import json
import os
from collections import defaultdict
CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
PARENT_DIR = os.path.abspath(os.path.join(CURRENT_DIR, os.pardir))
if CURRENT_DIR not in sys.path:
sys.path.append(CURRENT_DIR)
if PARENT_DIR not in sys.path:
sys.path.append(PARENT_DIR)
from common import load_articles, write_result # noqa: E402
def month_key(dt):
return f"{dt.year}-{dt.month:02d}"
def main():
try:
payload = json.load(sys.stdin)
except Exception as exc: # noqa: BLE001
print(f"Failed to read JSON: {exc}", file=sys.stderr)
sys.exit(1)
content_dir = payload.get("contentDir") or "content"
output_path = payload.get("outputPath")
public_path = payload.get("publicPath")
articles = load_articles(content_dir)
monthly_articles = defaultdict(int)
monthly_words = defaultdict(int)
months_set = set()
for article in articles:
date = article.get("date")
if not date:
continue
key = month_key(date)
monthly_articles[key] += 1
monthly_words[key] += article.get("wordCount") or 0
months_set.add(key)
sorted_months = sorted(months_set)
cum_articles = []
cum_words = []
labels = []
total_a = 0
total_w = 0
for key in sorted_months:
total_a += monthly_articles.get(key, 0)
total_w += monthly_words.get(key, 0)
labels.append(key)
cum_articles.append(total_a)
cum_words.append(total_w)
try:
from lib.render_stats_charts import render_cumulative, setup_rcparams
except ImportError as exc: # noqa: BLE001
print(f"Failed to import renderer: {exc}", file=sys.stderr)
sys.exit(1)
setup_rcparams()
render_cumulative(
{
"labels": labels,
"articles": cum_articles,
"words": cum_words,
"title": "Cumul articles / mots",
},
output_path,
)
write_result(
{
"image": public_path,
"meta": {
"months": len(labels),
"articles": total_a,
"words": total_w,
},
}
)
if __name__ == "__main__":
main()

View File

@@ -1,45 +0,0 @@
#!/usr/bin/env node
const { fetchGoAccessJson, aggregateLastNDays } = require("../lib/stats/goaccess");
const { loadToolsConfig } = require("../lib/config");
let cache = null;
async function loadData(url) {
if (!cache) {
cache = await fetchGoAccessJson(url);
}
return cache;
}
function latestMonthEntry(months) {
if (!months || months.length === 0) return null;
return months[months.length - 1];
}
async function run({ stat }) {
const toolsConfig = await loadToolsConfig();
const url = stat.url || toolsConfig.goaccess?.url;
if (!url) {
throw new Error("URL GoAccess manquante (GOACCESS_URL ou goaccess.url dans tools/config/config.json)");
}
const metric = stat.metric || "hits";
const windowDays = Number.isFinite(stat.days) ? stat.days : 30;
const data = await loadData(url);
const window = aggregateLastNDays(data, windowDays, { adjustCrawlers: true });
if (!window || !window.to) return { value: 0 };
const value = metric === "visitors" ? window.visitors : window.hits;
return {
value,
meta: {
from: window.from || null,
to: window.to || null,
days: windowDays,
metric,
raw: value,
},
};
}
module.exports = { run };

View File

@@ -1,36 +0,0 @@
#!/usr/bin/env node
const { DateTime } = require("luxon");
const { loadArticles } = require("../lib/stats/articles");
async function computeMostProlificMonth(contentDir) {
const articles = await loadArticles(contentDir);
const counts = new Map();
for (const article of articles) {
if (!article.date) continue;
const monthKey = article.date.toFormat("yyyy-MM");
counts.set(monthKey, (counts.get(monthKey) || 0) + 1);
}
if (counts.size === 0) {
return null;
}
const sorted = Array.from(counts.entries()).sort((a, b) => {
if (b[1] !== a[1]) return b[1] - a[1];
return a[0] < b[0] ? -1 : 1;
});
const [monthKey, count] = sorted[0];
const label = DateTime.fromISO(`${monthKey}-01`).setLocale("fr").toFormat("LLLL yyyy");
return { value: `${label} (${count})`, month: monthKey, count };
}
async function run({ contentDir }) {
const result = await computeMostProlificMonth(contentDir || "content");
return result || { value: null };
}
module.exports = { run };

View File

@@ -1,131 +0,0 @@
#!/usr/bin/env python3
import sys
import json
import os
import urllib.request
CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
TOOLS_DIR = os.path.abspath(os.path.join(CURRENT_DIR, os.pardir))
ROOT_DIR = os.path.abspath(os.path.join(TOOLS_DIR, os.pardir))
if CURRENT_DIR not in sys.path:
sys.path.append(CURRENT_DIR)
if TOOLS_DIR not in sys.path:
sys.path.append(TOOLS_DIR)
def load_env(env_path=None):
path = env_path or os.path.join(ROOT_DIR, ".env")
if not os.path.exists(path):
return
try:
with open(path, "r", encoding="utf-8") as handle:
for line in handle:
stripped = line.strip()
if not stripped or stripped.startswith("#") or "=" not in stripped:
continue
key, value = stripped.split("=", 1)
key = key.strip()
if not key or key in os.environ:
continue
os.environ[key] = value
except Exception as exc: # noqa: BLE001
print(f"Failed to load .env: {exc}", file=sys.stderr)
def load_config():
cfg_path = os.path.join(ROOT_DIR, "tools", "config", "config.json")
try:
with open(cfg_path, "r", encoding="utf-8") as handle:
return json.load(handle)
except Exception:
return {}
def fetch_goaccess(url, timeout=10):
with urllib.request.urlopen(url, timeout=timeout) as resp:
data = resp.read().decode("utf-8")
return json.loads(data)
def crawler_ratios(data):
browsers = (data.get("browsers") or {}).get("data") or []
crawler = next((entry for entry in browsers if entry.get("data") == "Crawlers"), None)
if not crawler:
return {"hits": 0.0, "visitors": 0.0}
def total(field):
return sum((entry.get(field, {}) or {}).get("count", 0) for entry in browsers)
total_hits = total("hits")
total_visitors = total("visitors")
return {
"hits": min(1.0, (crawler.get("hits", {}) or {}).get("count", 0) / total_hits) if total_hits else 0.0,
"visitors": min(1.0, (crawler.get("visitors", {}) or {}).get("count", 0) / total_visitors)
if total_visitors
else 0.0,
}
def adjust(value, ratio):
return max(0, round(value * (1 - ratio)))
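La correction appliquée par `adjust` est une simple mise à l'échelle proportionnelle : un ratio de 0,25 retire un quart du comptage brut, sans jamais descendre sous zéro. Esquisse autonome :

```python
def adjust(value, ratio):
    """Réduit un comptage brut selon le ratio estimé de robots d'indexation."""
    return max(0, round(value * (1 - ratio)))
```

Avec 25 % du trafic attribué aux robots, 1000 hits deviennent ainsi 750.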
def main():
try:
payload = json.load(sys.stdin)
except Exception as exc: # noqa: BLE001
print(f"Failed to read JSON: {exc}", file=sys.stderr)
sys.exit(1)
output_path = payload.get("outputPath")
public_path = payload.get("publicPath")
url = payload.get("stat", {}).get("url")
load_env()
cfg = load_config()
goaccess_url = url or os.environ.get("GOACCESS_URL") or (cfg.get("goaccess") or {}).get("url")
if not goaccess_url:
print("Missing GoAccess URL (set GOACCESS_URL or goaccess.url in tools/config/config.json)", file=sys.stderr)
sys.exit(1)
try:
data = fetch_goaccess(goaccess_url)
except Exception as exc: # noqa: BLE001
print(f"Failed to fetch GoAccess JSON: {exc}", file=sys.stderr)
sys.exit(1)
ratios = crawler_ratios(data)
reqs = (data.get("requests") or {}).get("data") or []
# Chaque entrée expose .data (le chemin demandé) ainsi que hits.count et visitors.count.
cleaned = []
for entry in reqs:
path = entry.get("data") or ""
hits = (entry.get("hits") or {}).get("count", 0)
if not path or hits <= 0:
continue
cleaned.append((path, adjust(hits, ratios["hits"])))
cleaned.sort(key=lambda item: item[1], reverse=True)
top = cleaned[:10]
labels = [item[0] for item in top]
values = [item[1] for item in top]
try:
from lib.render_stats_charts import render_top_requests, setup_rcparams
except ImportError as exc: # noqa: BLE001
print(f"Failed to import renderer: {exc}", file=sys.stderr)
sys.exit(1)
setup_rcparams()
render_top_requests({"labels": labels, "values": values, "title": "Top 10 requêtes (hors crawlers)"}, output_path)
json.dump({"image": public_path}, sys.stdout)
sys.stdout.flush()
if __name__ == "__main__":
main()

View File

@@ -1,78 +0,0 @@
#!/usr/bin/env python3
import sys
import json
import os
CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
PARENT_DIR = os.path.abspath(os.path.join(CURRENT_DIR, os.pardir))
if CURRENT_DIR not in sys.path:
sys.path.append(CURRENT_DIR)
if PARENT_DIR not in sys.path:
sys.path.append(PARENT_DIR)
from common import load_articles, write_result # noqa: E402
def main():
try:
payload = json.load(sys.stdin)
except Exception as exc: # noqa: BLE001
print(f"Failed to read JSON: {exc}", file=sys.stderr)
sys.exit(1)
content_dir = payload.get("contentDir") or "content"
output_path = payload.get("outputPath")
public_path = payload.get("publicPath")
articles = load_articles(content_dir)
temps = []
hums = []
presses = []
for article in articles:
weather = article.get("weather") or {}
try:
t = float(weather.get("temperature"))
h = float(weather.get("humidity"))
except Exception:
continue
temps.append(t)
hums.append(h)
try:
p = float(weather.get("pressure"))
presses.append(p)
except Exception:
presses.append(None)
# Pressions optionnelles : série abandonnée si aucune valeur n'est fournie,
# sinon on retire les valeurs manquantes (le rendu plus bas n'utilise la série
# que si sa longueur correspond encore à celle de temps).
if all(p is None for p in presses):
presses = []
else:
presses = [p for p in presses if p is not None]
try:
from lib.render_stats_charts import render_weather_hexbin, setup_rcparams
except ImportError as exc: # noqa: BLE001
print(f"Failed to import renderer: {exc}", file=sys.stderr)
sys.exit(1)
setup_rcparams()
render_weather_hexbin(
{
"temps": temps,
"hums": hums,
"presses": presses if len(presses) == len(temps) else [],
"title": "Conditions météo à la publication",
},
output_path,
)
write_result({"image": public_path, "meta": {"points": len(temps)}})
if __name__ == "__main__":
main()

View File

@@ -1,65 +0,0 @@
#!/usr/bin/env python3
import sys
import json
import os
CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
PARENT_DIR = os.path.abspath(os.path.join(CURRENT_DIR, os.pardir))
if CURRENT_DIR not in sys.path:
sys.path.append(CURRENT_DIR)
if PARENT_DIR not in sys.path:
sys.path.append(PARENT_DIR)
from common import load_articles, write_result # noqa: E402
WEEKDAY_LABELS = ["Lun", "Mar", "Mer", "Jeu", "Ven", "Sam", "Dim"]
def main():
try:
payload = json.load(sys.stdin)
except Exception as exc: # noqa: BLE001
print(f"Failed to read JSON: {exc}", file=sys.stderr)
sys.exit(1)
content_dir = payload.get("contentDir") or "content"
output_path = payload.get("outputPath")
public_path = payload.get("publicPath")
articles = load_articles(content_dir)
counts = [0] * 7
words = [0] * 7
for article in articles:
dt = article.get("date")
if not dt:
continue
weekday = dt.weekday() # Monday=0
counts[weekday] += 1
words[weekday] += article.get("wordCount") or 0
try:
from lib.render_stats_charts import render_weekday_activity, setup_rcparams
except ImportError as exc: # noqa: BLE001
print(f"Failed to import renderer: {exc}", file=sys.stderr)
sys.exit(1)
setup_rcparams()
render_weekday_activity(
{
"labels": WEEKDAY_LABELS,
"articles": counts,
"words": words,
"title": "Articles et mots par jour de semaine",
},
output_path,
)
write_result({"image": public_path})
if __name__ == "__main__":
main()


@@ -1,59 +0,0 @@
#!/usr/bin/env python3
import sys
import json
import os
CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
PARENT_DIR = os.path.abspath(os.path.join(CURRENT_DIR, os.pardir))
if CURRENT_DIR not in sys.path:
sys.path.append(CURRENT_DIR)
if PARENT_DIR not in sys.path:
sys.path.append(PARENT_DIR)
from common import load_articles, write_result # noqa: E402
def main():
try:
payload = json.load(sys.stdin)
except Exception as exc: # noqa: BLE001
print(f"Failed to read JSON: {exc}", file=sys.stderr)
sys.exit(1)
content_dir = payload.get("contentDir") or "content"
output_path = payload.get("outputPath")
public_path = payload.get("publicPath")
articles = load_articles(content_dir)
values = [a.get("wordCount") or 0 for a in articles if a.get("wordCount")]
try:
from lib.render_stats_charts import render_words_histogram, setup_rcparams
except ImportError as exc:
print(f"Failed to import renderer: {exc}", file=sys.stderr)
sys.exit(1)
setup_rcparams()
render_words_histogram(
{
"values": values,
"title": "Distribution des longueurs d'article",
"bins": 20,
},
output_path,
)
write_result(
{
"image": public_path,
"meta": {
"articles": len(values),
},
}
)
if __name__ == "__main__":
main()


@@ -1,88 +0,0 @@
#!/usr/bin/env python3
import sys
import json
import os
from collections import defaultdict
CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
PARENT_DIR = os.path.abspath(os.path.join(CURRENT_DIR, os.pardir))
if CURRENT_DIR not in sys.path:
sys.path.append(CURRENT_DIR)
if PARENT_DIR not in sys.path:
sys.path.append(PARENT_DIR)
from common import load_articles, MONTH_LABELS, write_result # noqa: E402
def main():
try:
payload = json.load(sys.stdin)
except Exception as exc: # noqa: BLE001
print(f"Failed to read JSON: {exc}", file=sys.stderr)
sys.exit(1)
content_dir = payload.get("contentDir") or "content"
output_path = payload.get("outputPath")
public_path = payload.get("publicPath")
articles = load_articles(content_dir)
buckets = defaultdict(lambda: {"words": 0, "count": 0})
years = set()
total_words = 0
total_articles = 0
for article in articles:
date = article.get("date")
if not date:
continue
year = date.year
month = date.month
key = (year, month)
years.add(year)
buckets[key]["words"] += article.get("wordCount") or 0
buckets[key]["count"] += 1
total_words += article.get("wordCount") or 0
total_articles += 1
month_numbers = list(range(1, 13))
labels = [MONTH_LABELS[m - 1] for m in month_numbers]
sorted_years = sorted(years)
series = []
for year in sorted_years:
values = []
for month in month_numbers:
entry = buckets.get((year, month))
if not entry or entry["count"] == 0:
values.append(0)
else:
values.append(round(entry["words"] / entry["count"]))
series.append({"label": str(year), "values": values})
average = round(total_words / total_articles) if total_articles > 0 else 0
try:
from lib.render_stats_charts import render_words_per_article, setup_rcparams
except ImportError as exc:
print(f"Failed to import renderer: {exc}", file=sys.stderr)
sys.exit(1)
setup_rcparams()
render_words_per_article({"labels": labels, "series": series, "title": "Moyenne de mots par article (par mois)"}, output_path)
write_result(
{
"image": public_path,
"meta": {
"average": average,
"articles": total_articles,
},
}
)
if __name__ == "__main__":
main()

File diff suppressed because it is too large


@@ -1,719 +0,0 @@
#!/usr/bin/env node
/**
* Synchronise Hugo critique frontmatter with Wikidata metadata.
*
* The script:
* 1. Reads the critique bundle (expects an index.md with YAML frontmatter).
* 2. Uses the frontmatter title (or --query) plus the bundle type (film, série, etc.)
* to search Wikidata and lets the user confirm the right entity.
* 3. Fetches structured data (genres, cast, crew, companies...) according to the
* Hugo taxonomies currently available in config/_default/taxonomies.yaml.
* 4. Adds the missing taxonomy terms to the frontmatter without removing anything.
*/
const fs = require("node:fs");
const path = require("node:path");
const yaml = require("js-yaml");
const readline = require("node:readline/promises");
const { stdin, stdout } = require("node:process");
const LANGUAGE_FALLBACK = ["fr", "fr-ca", "fr-fr", "en", "en-gb", "en-ca"];
const MAX_WIKIDATA_IDS_PER_REQUEST = 50;
const WIKIDATA_ID_FIELD = "wikidata_id";
const PROJECT_ROOT = path.resolve(__dirname, "..");
const DEFAULT_CRITIQUES_ROOT = path.join(PROJECT_ROOT, "content", "critiques");
const WIKIPEDIA_PREFERRED_LANGS = ["fr", "en"];
const OFFICIAL_SITE_PROPERTY = "P856";
const WIKIDATA_ID_PATTERN = /^Q\d+$/i;
const TYPE_CONFIG = {
films: {
label: "film",
queryAugment: "film",
descriptionHints: ["film", "movie"],
taxonomyMap: {
genres: ["P136"],
personnalites: ["P57", "P58", "P162", "P86", "P161"],
entreprises: ["P272", "P750"],
},
roleQualifiers: [{ claim: "P161", qualifier: "P453" }],
},
series: {
label: "série TV",
queryAugment: '"série télévisée"',
descriptionHints: ["série", "tv series", "télévisée", "television"],
taxonomyMap: {
genres: ["P136"],
personnalites: ["P57", "P58", "P162", "P86", "P161"],
entreprises: ["P272", "P449", "P750"],
},
roleQualifiers: [{ claim: "P161", qualifier: "P453" }],
},
"jeux-video": {
label: "jeu vidéo",
queryAugment: '"jeu vidéo"',
descriptionHints: ["jeu vidéo", "video game", "jeu-vidéo", "jeu video"],
taxonomyMap: {
genres: ["P136"],
personnalites: ["P57", "P58", "P162", "P86", "P161"],
entreprises: ["P178", "P123", "P750"],
},
roleQualifiers: [{ claim: "P161", qualifier: "P453" }],
},
livres: {
label: "livre",
queryAugment: "livre",
descriptionHints: ["roman", "novel", "book", "livre", "comic", "nouvelle", "script"],
taxonomyMap: {
genres: ["P136"],
personnalites: ["P50", "P110"],
entreprises: ["P123"],
},
},
};
function parseArgs(argv) {
const args = argv.slice(2);
const options = {
language: "fr",
limit: 8,
query: null,
autoSelect: false,
};
const targets = [];
for (const arg of args) {
if (arg.startsWith("--lang=")) {
options.language = arg.slice("--lang=".length).trim() || options.language;
} else if (arg.startsWith("--limit=")) {
const value = Number.parseInt(arg.slice("--limit=".length), 10);
if (!Number.isNaN(value) && value > 0) {
options.limit = value;
}
} else if (arg.startsWith("--query=")) {
options.query = arg.slice("--query=".length).trim() || null;
} else if (arg === "--auto" || arg === "--yes") {
options.autoSelect = true;
} else if (arg === "--help" || arg === "-h") {
options.help = true;
} else if (arg.length > 0) {
targets.push(arg);
}
}
return { options, targets };
}
function showUsage() {
console.log(`Usage: node tools/sync_wiki_metadata.js <critique_path>
Options
--lang=fr Primary language used for Wikidata labels (default: fr)
--limit=8 Max number of Wikidata search results to show
--query="..." Override the query derived from the frontmatter title
--auto Skip the interactive prompt and pick the first result
Notes
• Without arguments, every critique bundle under content/critiques is processed.
• Provide one or more bundle paths to limit the scope manually.
Examples
node tools/sync_wiki_metadata.js content/critiques/films/crocodile-dunde-ii
node tools/sync_wiki_metadata.js --lang=en --query="Galaxy Quest film" content/critiques/films/galaxy-quest
`);
}
function resolveArticleDir(targetPath) {
const absolute = path.resolve(targetPath);
if (!fs.existsSync(absolute)) {
throw new Error(`Path not found: ${absolute}`);
}
const stats = fs.statSync(absolute);
if (stats.isDirectory()) {
return absolute;
}
if (stats.isFile()) {
if (path.basename(absolute) !== "index.md") {
throw new Error(`Expected an index.md, got ${absolute}`);
}
return path.dirname(absolute);
}
throw new Error(`Unsupported path type: ${absolute}`);
}
function getIndexPath(articleDir) {
const indexPath = path.join(articleDir, "index.md");
if (!fs.existsSync(indexPath)) {
throw new Error(`Missing index.md in ${articleDir}`);
}
return indexPath;
}
function readFrontmatter(indexPath) {
const raw = fs.readFileSync(indexPath, "utf8");
const match = raw.match(/^---\n([\s\S]+?)\n---\n?([\s\S]*)$/);
if (!match) {
throw new Error(`No valid frontmatter found in ${indexPath}`);
}
const data = yaml.load(match[1]) || {};
const body = match[2] || "";
return { data, body };
}
function detectCritiqueType(articleDir) {
const normalized = articleDir.split(path.sep);
const idx = normalized.lastIndexOf("critiques");
if (idx === -1 || idx + 1 >= normalized.length) {
return null;
}
return normalized[idx + 1];
}
function collectCritiqueBundles(rootDir) {
if (!fs.existsSync(rootDir)) {
return [];
}
const bundles = [];
const stack = [rootDir];
while (stack.length > 0) {
const currentDir = stack.pop();
const entries = fs.readdirSync(currentDir, { withFileTypes: true });
const hasIndex = entries.some((entry) => entry.isFile() && entry.name === "index.md");
if (hasIndex) {
bundles.push(currentDir);
continue;
}
for (const entry of entries) {
if (!entry.isDirectory()) {
continue;
}
if (entry.name.startsWith(".")) {
continue;
}
const nextPath = path.join(currentDir, entry.name);
stack.push(nextPath);
}
}
return bundles.sort((a, b) => a.localeCompare(b, "fr"));
}
function buildLanguageOrder(primary) {
const order = [primary, ...LANGUAGE_FALLBACK];
return order.filter((value, index) => order.indexOf(value) === index);
}
async function wikidataApi(params) {
const url = new URL("https://www.wikidata.org/w/api.php");
for (const [key, value] of Object.entries(params)) {
if (value !== undefined && value !== null) {
url.searchParams.set(key, value);
}
}
url.searchParams.set("format", "json");
const response = await fetch(url);
if (!response.ok) {
throw new Error(`Wikidata API error ${response.status}: ${response.statusText}`);
}
return response.json();
}
async function searchEntities(term, typeConfig, options) {
const queries = [];
if (typeConfig.queryAugment) {
queries.push(`${term} ${typeConfig.queryAugment}`);
}
queries.push(term);
for (const query of queries) {
const data = await wikidataApi({
action: "wbsearchentities",
search: query,
language: options.language,
uselang: options.language,
type: "item",
limit: String(options.limit),
strictlanguage: false,
origin: "*",
});
if (!data.search || data.search.length === 0) {
continue;
}
let results = data.search;
if (typeConfig.descriptionHints && typeConfig.descriptionHints.length > 0) {
const hints = typeConfig.descriptionHints.map((hint) => hint.toLowerCase());
const filtered = results.filter((entry) => {
if (!entry.description) {
return false;
}
const desc = entry.description.toLowerCase();
return hints.some((hint) => desc.includes(hint));
});
if (filtered.length > 0) {
results = filtered;
}
}
return results;
}
return [];
}
async function fetchEntity(entityId, languages) {
const data = await wikidataApi({
action: "wbgetentities",
ids: entityId,
props: "labels|descriptions|claims|sitelinks",
languages: languages.join("|"),
origin: "*",
});
if (!data.entities || !data.entities[entityId]) {
throw new Error(`Unable to load Wikidata entity ${entityId}`);
}
return data.entities[entityId];
}
function collectClaimIds(entity, property) {
const claims = entity.claims?.[property];
if (!claims) {
return [];
}
const ids = [];
for (const claim of claims) {
const value = claim.mainsnak?.datavalue?.value;
if (value && typeof value === "object" && value.id) {
ids.push(value.id);
}
}
return ids;
}
function collectClaimUrls(entity, property) {
const claims = entity.claims?.[property];
if (!claims) {
return [];
}
const urls = [];
for (const claim of claims) {
const value = claim.mainsnak?.datavalue?.value;
if (typeof value === "string") {
urls.push(value);
}
}
return [...new Set(urls)];
}
function collectRoleIds(entity, roleConfig) {
const ids = [];
if (!roleConfig) {
return ids;
}
for (const { claim, qualifier } of roleConfig) {
const claims = entity.claims?.[claim];
if (!claims) {
continue;
}
for (const entry of claims) {
const qualifiers = entry.qualifiers?.[qualifier];
if (!qualifiers) {
continue;
}
for (const qual of qualifiers) {
const value = qual.datavalue?.value;
if (value && typeof value === "object" && value.id) {
ids.push(value.id);
} else if (value && typeof value === "object" && value.text) {
ids.push(value.text);
}
}
}
}
return ids;
}
async function fetchLabels(ids, languages) {
const uniqueIds = [...new Set(ids.filter((value) => typeof value === "string" && value.startsWith("Q")))];
const labelMap = {};
if (uniqueIds.length === 0) {
return labelMap;
}
for (let i = 0; i < uniqueIds.length; i += MAX_WIKIDATA_IDS_PER_REQUEST) {
const chunk = uniqueIds.slice(i, i + MAX_WIKIDATA_IDS_PER_REQUEST);
const data = await wikidataApi({
action: "wbgetentities",
ids: chunk.join("|"),
props: "labels",
languages: languages.join("|"),
origin: "*",
});
for (const [id, entity] of Object.entries(data.entities || {})) {
labelMap[id] = pickLabel(entity.labels, languages) || id;
}
}
return labelMap;
}
function pickLabel(labels = {}, languages) {
for (const lang of languages) {
if (labels[lang]) {
return labels[lang].value;
}
}
const fallback = Object.values(labels)[0];
return fallback ? fallback.value : null;
}
function pickWikipediaLink(sitelinks = {}) {
for (const lang of WIKIPEDIA_PREFERRED_LANGS) {
const key = `${lang}wiki`;
const link = sitelinks[key];
if (!link) {
continue;
}
if (link.url) {
return { lang, url: link.url };
}
if (link.title) {
const encodedTitle = encodeURIComponent(link.title.replace(/ /g, "_"));
return { lang, url: `https://${lang}.wikipedia.org/wiki/${encodedTitle}` };
}
}
return null;
}
function inferLanguageFromUrl(rawUrl) {
try {
const { hostname, pathname } = new URL(rawUrl);
const host = hostname.toLowerCase();
if (host.endsWith(".fr") || host.includes(".fr.")) {
return "fr";
}
if (host.endsWith(".de")) {
return "de";
}
if (host.endsWith(".es")) {
return "es";
}
if (host.endsWith(".it")) {
return "it";
}
if (host.endsWith(".pt") || host.endsWith(".br")) {
return "pt";
}
if (host.endsWith(".co.uk") || host.endsWith(".uk") || host.endsWith(".us") || host.endsWith(".com")) {
// only infer English when explicitly present in the path
if (pathname.toLowerCase().includes("/en/") || pathname.toLowerCase().startsWith("/en")) {
return "en";
}
return null;
}
if (pathname.toLowerCase().startsWith("/fr/") || pathname.toLowerCase().includes("/fr/")) {
return "fr";
}
return null;
} catch {
return null;
}
}
function buildExternalLinks(entity) {
const links = [];
const seen = new Set();
const addLink = (entry) => {
if (!entry || !entry.url || seen.has(entry.url)) {
return;
}
const normalized = {
name: entry.name || "Lien",
url: entry.url,
};
if (entry.lang) {
normalized.lang = entry.lang;
}
links.push(normalized);
seen.add(entry.url);
};
const wikiLink = pickWikipediaLink(entity.sitelinks);
if (wikiLink) {
addLink({
name: "Page Wikipédia",
url: wikiLink.url,
lang: wikiLink.lang,
});
}
const officialUrls = collectClaimUrls(entity, OFFICIAL_SITE_PROPERTY);
for (const url of officialUrls) {
const link = {
name: "Site officiel",
url,
};
const detectedLang = inferLanguageFromUrl(url);
if (detectedLang) {
link.lang = detectedLang;
}
addLink(link);
}
return links;
}
function buildTaxonomyValues(entity, typeConfig, labelLookup) {
const taxonomyData = {};
const addValue = (taxonomy, value) => {
if (!value) {
return;
}
if (!taxonomyData[taxonomy]) {
taxonomyData[taxonomy] = new Set();
}
taxonomyData[taxonomy].add(value);
};
for (const [taxonomy, properties] of Object.entries(typeConfig.taxonomyMap)) {
for (const property of properties) {
const ids = collectClaimIds(entity, property);
for (const id of ids) {
const label = labelLookup(id);
if (!label || WIKIDATA_ID_PATTERN.test(label)) {
continue;
}
addValue(taxonomy, label);
}
}
}
if (typeConfig.roleQualifiers) {
const roleIds = collectRoleIds(entity, typeConfig.roleQualifiers);
for (const roleId of roleIds) {
if (typeof roleId === "string" && WIKIDATA_ID_PATTERN.test(roleId)) {
const resolved = labelLookup(roleId);
if (!resolved || WIKIDATA_ID_PATTERN.test(resolved)) {
continue;
}
addValue("personnages_de_fiction", resolved);
} else {
addValue("personnages_de_fiction", roleId);
}
}
}
return Object.fromEntries(
Object.entries(taxonomyData).map(([taxonomy, values]) => [taxonomy, [...values]])
);
}
function mergeFrontmatter(frontmatter, newValues) {
let updated = false;
for (const [taxonomy, values] of Object.entries(newValues)) {
if (!values || values.length === 0) {
continue;
}
const list = Array.isArray(frontmatter[taxonomy])
? [...frontmatter[taxonomy]]
: frontmatter[taxonomy]
? [frontmatter[taxonomy]]
: [];
const existing = new Set(list);
let added = 0;
for (const value of values) {
if (!existing.has(value)) {
list.push(value);
existing.add(value);
added += 1;
}
}
if (added > 0) {
list.sort((a, b) => a.localeCompare(b, "fr"));
frontmatter[taxonomy] = list;
updated = true;
console.log(` ↳ Added ${added} value(s) to "${taxonomy}"`);
}
}
return updated;
}
function mergeLinks(frontmatter, linksToAdd) {
if (!linksToAdd || linksToAdd.length === 0) {
return false;
}
if (Object.prototype.hasOwnProperty.call(frontmatter, "links")) {
return false;
}
frontmatter.links = [...linksToAdd];
console.log(` ↳ Added ${linksToAdd.length} link(s) to "links"`);
return true;
}
async function promptForSelection(results, rl) {
if (results.length === 1) {
const only = results[0];
const answer = await rl.question(
`Found a single match: ${only.label} - ${only.description || "sans description"} [${only.id}]. Use it? (Y/n) `
);
if (answer.trim() === "" || /^y(es)?$/i.test(answer.trim())) {
return only;
}
return null;
}
console.log("Sélectionnez l'œuvre correspondante :");
results.forEach((result, index) => {
console.log(
` ${index + 1}. ${result.label} - ${result.description || "sans description"} [${result.id}]`
);
});
console.log(" 0. Annuler");
while (true) {
const answer = await rl.question("Choix : ");
const choice = Number.parseInt(answer, 10);
if (!Number.isNaN(choice)) {
if (choice === 0) {
return null;
}
if (choice >= 1 && choice <= results.length) {
return results[choice - 1];
}
}
console.log("Veuillez saisir un numéro valide ou 0 pour annuler.");
}
}
async function processCritique(target, options, rl) {
const articleDir = resolveArticleDir(target);
const typeKey = detectCritiqueType(articleDir);
if (!typeKey) {
console.log(`⚠️ Impossible de déduire le type pour ${articleDir}. Ignoré.`);
return;
}
const typeConfig = TYPE_CONFIG[typeKey];
if (!typeConfig) {
console.log(`⚠️ Type "${typeKey}" non pris en charge pour ${articleDir}.`);
return;
}
const indexPath = getIndexPath(articleDir);
const { data: frontmatter, body } = readFrontmatter(indexPath);
const storedEntityId = frontmatter[WIKIDATA_ID_FIELD];
const searchTerm = options.query || frontmatter?.title;
if (!storedEntityId && !searchTerm) {
console.log(`⚠️ Aucun titre trouvé dans ${indexPath}.`);
return;
}
console.log(`\n📄 ${indexPath}`);
console.log(` Type détecté : ${typeKey}`);
let entityId = storedEntityId;
let selection = null;
if (entityId) {
console.log(` 🆔 Identifiant Wikidata déjà enregistré : ${entityId}`);
} else {
console.log(` Recherche Wikidata : "${searchTerm}"`);
const results = await searchEntities(searchTerm, typeConfig, options);
if (!results.length) {
console.log(" ❌ Aucun résultat Wikidata trouvé.");
return;
}
if (options.autoSelect) {
selection = results[0];
console.log(
` ⚙️ Mode automatique: sélection du premier résultat (${selection.label} - ${selection.description || "sans description"})`
);
} else {
selection = await promptForSelection(results, rl);
}
if (!selection) {
console.log(" ❎ Sélection annulée.");
return;
}
entityId = selection.id;
}
const languages = buildLanguageOrder(options.language);
const entity = await fetchEntity(entityId, languages);
const entityLabel = pickLabel(entity.labels, languages) || selection?.label || entityId;
if (storedEntityId) {
console.log(` ✔ Entité chargée : ${entityLabel} (${entityId})`);
} else {
console.log(` ✔ Entité sélectionnée : ${entityLabel} (${entityId})`);
}
const idsToResolve = new Set();
for (const properties of Object.values(typeConfig.taxonomyMap)) {
for (const property of properties) {
collectClaimIds(entity, property).forEach((id) => idsToResolve.add(id));
}
}
if (typeConfig.roleQualifiers) {
collectRoleIds(entity, typeConfig.roleQualifiers)
.filter((id) => typeof id === "string" && id.startsWith("Q"))
.forEach((id) => idsToResolve.add(id));
}
const labelMap = await fetchLabels([...idsToResolve], languages);
const lookup = (id) => labelMap[id] || id;
const taxonomyValues = buildTaxonomyValues(entity, typeConfig, lookup);
const externalLinks = buildExternalLinks(entity);
let updated = mergeFrontmatter(frontmatter, taxonomyValues);
if (mergeLinks(frontmatter, externalLinks)) {
updated = true;
}
if (frontmatter[WIKIDATA_ID_FIELD] !== entityId) {
frontmatter[WIKIDATA_ID_FIELD] = entityId;
updated = true;
console.log(` ↳ Champ ${WIKIDATA_ID_FIELD} ajouté/mis à jour`);
}
if (!updated) {
console.log(" Aucun nouveau terme à ajouter.");
return;
}
const newFrontmatter = yaml.dump(frontmatter, { lineWidth: -1 });
fs.writeFileSync(indexPath, `---\n${newFrontmatter}---\n${body}`, "utf8");
console.log(" 💾 Frontmatter mis à jour.");
}
async function main() {
const { options, targets: cliTargets } = parseArgs(process.argv);
if (options.help) {
showUsage();
return;
}
let targets = [...cliTargets];
if (targets.length === 0) {
console.log(`🔄 Recherche de critiques dans ${DEFAULT_CRITIQUES_ROOT}...`);
targets = collectCritiqueBundles(DEFAULT_CRITIQUES_ROOT);
if (targets.length === 0) {
console.log("Aucune critique à traiter. Veuillez fournir un chemin explicite.");
return;
}
console.log(`${targets.length} critique(s) détectée(s).`);
}
const rl = options.autoSelect ? null : readline.createInterface({ input: stdin, output: stdout });
try {
for (const target of targets) {
await processCritique(target, options, rl);
}
} catch (error) {
console.error(`Erreur: ${error.message}`);
process.exitCode = 1;
} finally {
if (rl) {
rl.close();
}
}
}
main();


@@ -1,21 +0,0 @@
const { getArchiveUrl, saveToArchive } = require("../lib/archive");
(async () => {
const testUrl = "https://richard-dern.fr";
console.log(`🔍 Checking Archive.org for: ${testUrl}`);
let archiveUrl = await getArchiveUrl(testUrl);
if (archiveUrl) {
console.log(`✔ Archive found: ${archiveUrl}`);
} else {
console.log(`❌ No archive found, requesting a new one...`);
archiveUrl = await saveToArchive(testUrl);
if (archiveUrl) {
console.log(`✔ URL successfully archived: ${archiveUrl}`);
} else {
console.log(`❌ Failed to archive the URL.`);
}
}
})();


@@ -1,42 +0,0 @@
const test = require("node:test");
const assert = require("node:assert/strict");
const fs = require("node:fs/promises");
const os = require("node:os");
const path = require("node:path");
const { findLatestBundle, resolveBundlePath } = require("../lib/bundles");
/**
* Creates a minimal Hugo bundle for tests.
* @param {string} rootDir Temporary root directory.
* @param {string} relativePath Relative path of the bundle.
* @returns {Promise<string>} Absolute path of the created bundle.
*/
async function createBundle(rootDir, relativePath) {
const bundleDir = path.join(rootDir, relativePath);
await fs.mkdir(bundleDir, { recursive: true });
await fs.writeFile(path.join(bundleDir, "index.md"), "---\ntitle: Test\ndate: 2026-03-20T12:00:00+01:00\n---\n", "utf8");
return bundleDir;
}
test("resolveBundlePath accepte un dossier de bundle ou un index.md", () => {
const bundleDir = path.resolve("content/example/bundle");
const indexPath = path.join(bundleDir, "index.md");
assert.equal(resolveBundlePath(bundleDir), bundleDir);
assert.equal(resolveBundlePath(indexPath), bundleDir);
});
test("findLatestBundle retourne le bundle modifie le plus recemment", async () => {
const tempRoot = await fs.mkdtemp(path.join(os.tmpdir(), "bundles-test-"));
const olderBundle = await createBundle(tempRoot, "alpha/article-a");
const newerBundle = await createBundle(tempRoot, "beta/article-b");
await fs.utimes(olderBundle, new Date("2026-03-20T10:00:00Z"), new Date("2026-03-20T10:00:00Z"));
await fs.utimes(newerBundle, new Date("2026-03-20T11:00:00Z"), new Date("2026-03-20T11:00:00Z"));
const latestBundle = await findLatestBundle(tempRoot);
assert.equal(latestBundle, newerBundle);
await fs.rm(tempRoot, { recursive: true, force: true });
});


@@ -1,93 +0,0 @@
const test = require("node:test");
const assert = require("node:assert/strict");
const fs = require("node:fs/promises");
const os = require("node:os");
const path = require("node:path");
const { findUrlOccurrences, replaceUrlInFiles } = require("../lib/url_replacements");
const { loadExternalLinksReport, getLinksByStatus } = require("../lib/external_links_report");
/**
* Writes a text file, creating its parent directory first.
* @param {string} filePath Absolute path of the file.
* @param {string} content Content to write.
*/
async function writeFixture(filePath, content) {
await fs.mkdir(path.dirname(filePath), { recursive: true });
await fs.writeFile(filePath, content, "utf8");
}
test("replaceUrlInFiles remplace les occurrences exactes dans markdown, yaml et json", async () => {
const tempRoot = await fs.mkdtemp(path.join(os.tmpdir(), "dead-links-replace-"));
const contentRoot = path.join(tempRoot, "content");
const deadUrl = "https://dead.example.com/page";
const replacementUrl = "https://archive.example.com/page";
await writeFixture(
path.join(contentRoot, "article", "index.md"),
`Lien [mort](${deadUrl})\nEncore ${deadUrl}\n`
);
await writeFixture(
path.join(contentRoot, "article", "data", "meta.yaml"),
`source: "${deadUrl}"\n`
);
await writeFixture(
path.join(contentRoot, "stats", "data.json"),
JSON.stringify({ url: deadUrl, untouched: "https://ok.example.com" }, null, 2)
);
const matches = await findUrlOccurrences(contentRoot, deadUrl);
assert.deepStrictEqual(
matches.map((match) => [path.relative(contentRoot, match.filePath), match.occurrences]),
[
["article/data/meta.yaml", 1],
["article/index.md", 2],
["stats/data.json", 1],
]
);
const result = await replaceUrlInFiles(contentRoot, deadUrl, replacementUrl, { matches });
assert.equal(result.totalOccurrences, 4);
assert.equal(result.changedFiles.length, 3);
const markdown = await fs.readFile(path.join(contentRoot, "article", "index.md"), "utf8");
const yaml = await fs.readFile(path.join(contentRoot, "article", "data", "meta.yaml"), "utf8");
const json = await fs.readFile(path.join(contentRoot, "stats", "data.json"), "utf8");
assert.ok(markdown.includes(replacementUrl));
assert.ok(!markdown.includes(deadUrl));
assert.ok(yaml.includes(replacementUrl));
assert.ok(json.includes(replacementUrl));
await fs.rm(tempRoot, { recursive: true, force: true });
});
test("loadExternalLinksReport retourne correctement les liens 404 du cache YAML", async () => {
const tempRoot = await fs.mkdtemp(path.join(os.tmpdir(), "dead-links-cli-"));
const reportPath = path.join(tempRoot, "external_links.yaml");
const deadUrl = "https://dead.example.com/article";
await writeFixture(
reportPath,
[
"generatedAt: '2026-03-25T21:00:00.000Z'",
"links:",
` - url: ${deadUrl}`,
" status: 404",
" locations:",
" - file: content/demo/index.md",
" line: 5",
" page: /demo",
"",
].join("\n")
);
const report = loadExternalLinksReport(reportPath);
const deadLinks = getLinksByStatus(report, 404);
assert.equal(report.generatedAt, "2026-03-25T21:00:00.000Z");
assert.equal(deadLinks.length, 1);
assert.equal(deadLinks[0].url, deadUrl);
assert.equal(deadLinks[0].locations[0].file, "content/demo/index.md");
assert.equal(deadLinks[0].locations[0].line, 5);
await fs.rm(tempRoot, { recursive: true, force: true });
});


@@ -1,96 +0,0 @@
const test = require("node:test");
const assert = require("node:assert/strict");
const { Readable } = require("node:stream");
const {
collectMarkdownLinksFromStream,
extractLinksFromText,
sanitizeUrlCandidate,
} = require("../lib/markdown_links");
test("extractLinksFromText returns sanitized external URLs only once", () => {
const input =
"See [example](https://example.com) and <https://foo.com>. " +
"Autolink https://bar.com/path).\nDuplicate https://example.com!";
const urls = extractLinksFromText(input);
assert.deepStrictEqual(urls, ["https://example.com", "https://foo.com", "https://bar.com/path"]);
});
test("extractLinksFromText does not extend a markdown destination past the closing parenthesis", () => {
const input = "J'ai eu mon lot d'installations du couple [anope](https://www.anope.org/)/epona.";
const urls = extractLinksFromText(input);
assert.deepStrictEqual(urls, ["https://www.anope.org/"]);
});
test("collectMarkdownLinksFromStream preserves line numbers", async () => {
const content = [
"Intro line with no link",
"Markdown [link](https://docs.example.org/page).",
"Plain link https://news.example.net/article.",
"Trailing <https://portal.example.com/path> punctuation.",
"Markdown [link](https://docs.example.org/page(with more valid content)).",
"Le **[baume du Canada](https://fr.wikipedia.org/wiki/Baume_du_Canada)**",
"(_Theropoda [incertae sedis](https://fr.wikipedia.org/wiki/Incertae_sedis)_)",
"[CDN](https://fr.wikipedia.org/wiki/Réseau_de_diffusion_de_contenu)[^2].",
"(heu... [oui](https://github.com/opencart/opencart/tree/master/upload/system/storage/vendor)...)"
].join("\n");
const stream = Readable.from([content]);
const links = await collectMarkdownLinksFromStream(stream);
assert.deepStrictEqual(links, [
{ url: "https://docs.example.org/page", line: 2 },
{ url: "https://news.example.net/article", line: 3 },
{ url: "https://portal.example.com/path", line: 4 },
{ url: "https://docs.example.org/page(with more valid content)", line: 5 },
{ url: "https://fr.wikipedia.org/wiki/Baume_du_Canada", line: 6 },
{ url: "https://fr.wikipedia.org/wiki/Incertae_sedis", line: 7 },
{ url: "https://fr.wikipedia.org/wiki/Réseau_de_diffusion_de_contenu", line: 8 },
{ url: "https://github.com/opencart/opencart/tree/master/upload/system/storage/vendor", line: 9 },
]);
});
test("collectMarkdownLinksFromStream ignores inline code, fenced code blocks and indented code blocks", async () => {
const content = [
"Visible https://visible.example.com.",
"Inline code `https://inline.example.com` and normal https://normal.example.com.",
"",
"```yaml",
"uses: https://github.com/easingthemes/ssh-deploy@main",
"```",
"",
" https://indented.example.com",
"After code https://after.example.com.",
].join("\n");
const stream = Readable.from([content]);
const links = await collectMarkdownLinksFromStream(stream);
assert.deepStrictEqual(links, [
{ url: "https://visible.example.com", line: 1 },
{ url: "https://normal.example.com", line: 2 },
{ url: "https://after.example.com", line: 9 },
]);
});
test("collectMarkdownLinksFromStream ignores URLs in front matter entirely", async () => {
const content = [
"---",
"links:",
" # url: https://ignored.example.com",
" - url: https://included.example.com",
"---",
"Body with https://body.example.com link.",
].join("\n");
const stream = Readable.from([content]);
const links = await collectMarkdownLinksFromStream(stream);
assert.deepStrictEqual(links, [
{ url: "https://body.example.com", line: 6 },
]);
});
test("sanitizeUrlCandidate removes spurious trailing punctuation", () => {
const cases = [
["https://example.com).", "https://example.com"],
["https://example.com!\"", "https://example.com"],
["<https://example.com>", "https://example.com"],
];
for (const [input, expected] of cases) {
assert.equal(sanitizeUrlCandidate(input), expected);
}
});


@@ -1,63 +0,0 @@
const test = require("node:test");
const assert = require("node:assert/strict");
const {
parseBoolean,
isDraftValue,
isEffectivelyPublished,
isEffectivelyPublishedDocument,
} = require("../lib/publication");
test("parseBoolean converts common boolean representations", () => {
assert.equal(parseBoolean(true), true);
assert.equal(parseBoolean(false), false);
assert.equal(parseBoolean("true"), true);
assert.equal(parseBoolean("TRUE"), true);
assert.equal(parseBoolean("1"), true);
assert.equal(parseBoolean("on"), true);
assert.equal(parseBoolean("false"), false);
assert.equal(parseBoolean("0"), false);
assert.equal(parseBoolean("off"), false);
assert.equal(parseBoolean(""), null);
assert.equal(parseBoolean("unknown"), null);
assert.equal(parseBoolean(1), null);
});
test("isDraftValue returns true only for explicit draft values", () => {
assert.equal(isDraftValue(true), true);
assert.equal(isDraftValue("true"), true);
assert.equal(isDraftValue("yes"), true);
assert.equal(isDraftValue(false), false);
assert.equal(isDraftValue("false"), false);
assert.equal(isDraftValue(undefined), false);
});
test("isEffectivelyPublished excludes draft frontmatter", () => {
assert.equal(isEffectivelyPublished({ draft: true }), false);
assert.equal(isEffectivelyPublished({ draft: "true" }), false);
assert.equal(isEffectivelyPublished({ draft: false }), true);
assert.equal(isEffectivelyPublished({ title: "Article" }), true);
assert.equal(isEffectivelyPublished(null), true);
});
test("isEffectivelyPublishedDocument supports YAML-like get()", () => {
const docDraft = {
get(key) {
if (key === "draft") {
return true;
}
return null;
},
};
const docPublished = {
get(key) {
if (key === "draft") {
return false;
}
return null;
},
};
assert.equal(isEffectivelyPublishedDocument(docDraft), false);
assert.equal(isEffectivelyPublishedDocument(docPublished), true);
assert.equal(isEffectivelyPublishedDocument(null), true);
});

@@ -1,13 +0,0 @@
const { scrapePage } = require("../lib/puppeteer");
const path = require("path");
(async () => {
const testUrl = "https://richard-dern.fr";
const screenshotPath = path.join(__dirname, "test_screenshot.png");
console.log(`🔍 Testing Puppeteer module on: ${testUrl}`);
const metadata = await scrapePage(testUrl, screenshotPath);
console.log("📄 Page metadata:");
console.log(metadata);
})();

@@ -1,65 +0,0 @@
const test = require("node:test");
const assert = require("node:assert/strict");
const {
extractFileTitleFromUrl,
extractAssetFromApiResponse,
sanitizeMetadataText,
} = require("../lib/wikimedia");
test("extractFileTitleFromUrl supports Commons URLs and Wikipedia media fragments", () => {
const commonsUrl = "https://commons.wikimedia.org/wiki/File:IBookG3_Palourde2.png";
const wikipediaMediaUrl = "https://en.wikipedia.org/wiki/IBook#/media/File:IBookG3_Palourde2.png";
assert.equal(extractFileTitleFromUrl(commonsUrl), "File:IBookG3_Palourde2.png");
assert.equal(extractFileTitleFromUrl(wikipediaMediaUrl), "File:IBookG3_Palourde2.png");
});
test("sanitizeMetadataText decodes Commons HTML", () => {
const rawValue = "No machine-readable author provided. <a href=\"//commons.wikimedia.org/wiki/User:Ocmey\">Ocmey</a> assumed &amp; credited.";
assert.equal(
sanitizeMetadataText(rawValue),
"No machine-readable author provided. Ocmey assumed & credited."
);
});
test("extractAssetFromApiResponse rebuilds the attribution and description", () => {
const response = {
query: {
pages: {
"903939": {
title: "File:IBookG3 Palourde2.png",
imageinfo: [
{
url: "https://upload.wikimedia.org/wikipedia/commons/b/b3/IBookG3_Palourde2.png",
descriptionurl: "https://commons.wikimedia.org/wiki/File:IBookG3_Palourde2.png",
descriptionshorturl: "https://commons.wikimedia.org/w/index.php?curid=903939",
extmetadata: {
ImageDescription: {
value: "iBook G3 Open and Closed",
},
Credit: {
value: "No machine-readable source provided. Own work assumed (based on copyright claims).",
},
Artist: {
value: "No machine-readable author provided. <a href=\"//commons.wikimedia.org/wiki/User:Ocmey\" title=\"User:Ocmey\">Ocmey</a> assumed (based on copyright claims).",
},
LicenseShortName: {
value: "Public domain",
},
},
},
],
},
},
},
};
const asset = extractAssetFromApiResponse(response);
assert.equal(asset.fileName, "IBookG3_Palourde2.png");
assert.equal(asset.description, "iBook G3 Open and Closed");
assert.equal(
asset.attribution,
"By No machine-readable author provided. Ocmey assumed (based on copyright claims). - No machine-readable source provided. Own work assumed (based on copyright claims)., Public Domain, https://commons.wikimedia.org/w/index.php?curid=903939"
);
});

@@ -1,624 +0,0 @@
#!/usr/bin/env node
/**
 * Updates the publication date and title of Lemmy posts based on the Hugo articles.
 * Prerequisite: write access to the Lemmy Postgres database
 * (for example via LEMMY_DATABASE_URL=postgres:///lemmy?host=/run/postgresql&user=lemmy
 * and running as the lemmy system user).
 *
 * Rules:
 * - The article must contain a valid frontmatter with a date field.
 * - The article must contain a comments_url pointing to /post/{id}.
 * - The date is applied to post.published (timestamp with the Hugo timezone).
 * - The Lemmy title is aligned with the Hugo title (post.name and, if available, post.embed_title).
 */
const path = require("node:path");
const fs = require("node:fs");
const crypto = require("node:crypto");
const sharp = require("sharp");
const { LemmyHttp } = require("lemmy-js-client");
const { Pool } = require("pg");
const { collectBundles } = require("./lib/content");
const { parseFrontmatterDate } = require("./lib/datetime");
const { readFrontmatterFile } = require("./lib/frontmatter");
const { loadEnv } = require("./lib/env");
const { loadToolsConfig } = require("./lib/config");
const { isEffectivelyPublished } = require("./lib/publication");
const CONTENT_ROOT = path.join(__dirname, "..", "content");
const DEFAULT_DATABASE_URL = "postgres:///lemmy?host=/run/postgresql&user=richard";
const TOOLS_CONFIG_PATH = path.join(__dirname, "config", "config.json");
const THUMBNAIL_CACHE_DIR = path.join(__dirname, "cache", "lemmy_thumbnails");
const THUMBNAIL_FORMAT = "png";
const MAX_THUMBNAIL_WIDTH = 320;
const MAX_THUMBNAIL_HEIGHT = 240;
const THUMBNAIL_QUALITY = 82;
const FRONTMATTER_COVER_FIELD = "cover";
main().then(
() => {
process.exit(0);
},
(error) => {
console.error(`❌ Mise à jour interrompue : ${error.message}`);
process.exit(1);
}
);
/**
 * Entry point: collects the articles, connects to Postgres, applies the dates.
 */
async function main() {
loadEnv();
const databaseUrl = resolveDatabaseUrl();
const pool = new Pool({ connectionString: databaseUrl });
const hasEmbedTitle = await detectEmbedTitleColumn(pool);
const bundles = await collectBundles(CONTENT_ROOT);
const articles = collectArticlesWithPostId(bundles);
if (articles.length === 0) {
console.log("Aucun article muni d'un comments_url et d'une date valide.");
await pool.end();
return;
}
let updated = 0;
let unchanged = 0;
let missing = 0;
let lemmyClient = null;
let lemmyConfig = null;
for (const article of articles) {
const targetDate = article.publication.set({ millisecond: 0 });
const iso = targetDate.toISO();
const row = await fetchPost(pool, article.postId, hasEmbedTitle);
if (!row) {
missing += 1;
console.warn(`⚠️ Post ${article.postId} introuvable pour ${article.bundle.relativePath}`);
continue;
}
const currentIso = new Date(row.published).toISOString();
const expectedUtcIso = targetDate.toUTC().toISO();
const expectedTitle = article.title;
const currentTitle = typeof row.name === "string" ? row.name.trim() : "";
const needsDateUpdate = currentIso !== expectedUtcIso;
const needsTitleUpdate = currentTitle !== expectedTitle;
let needsEmbedTitleUpdate = false;
if (hasEmbedTitle) {
const currentEmbedTitle =
typeof row.embed_title === "string" ? row.embed_title.trim() : "";
needsEmbedTitleUpdate = currentEmbedTitle !== expectedTitle;
}
const currentThumbnailUrl =
typeof row.thumbnail_url === "string" ? row.thumbnail_url.trim() : "";
let needsThumbnailCreation = false;
let coverPath = null;
const currentUrl = typeof row.url === "string" ? row.url.trim() : "";
const urlLooksLikePictrsImage = currentUrl.includes("/pictrs/image/");
if (!currentThumbnailUrl || urlLooksLikePictrsImage) {
coverPath = resolveCoverPath(article.bundle, article.frontmatter.data);
if (coverPath) {
needsThumbnailCreation = true;
}
}
if (!needsDateUpdate && !needsTitleUpdate && !needsEmbedTitleUpdate && !needsThumbnailCreation) {
unchanged += 1;
continue;
}
if (needsDateUpdate || needsTitleUpdate || needsEmbedTitleUpdate) {
await applyPostUpdates(
pool,
article.postId,
needsDateUpdate ? iso : null,
needsTitleUpdate ? expectedTitle : null,
needsEmbedTitleUpdate ? expectedTitle : null,
hasEmbedTitle
);
}
let createdThumbnailDescription = null;
if (needsThumbnailCreation && coverPath) {
if (!lemmyClient) {
const lemmySetup = await createLemmyClientFromConfig();
lemmyClient = lemmySetup.client;
lemmyConfig = lemmySetup.config;
}
const thumbnail = await buildThumbnailAsset(coverPath, article.bundle);
console.log(`Miniature générée : ${thumbnail.cachePath}`);
const thumbnailUrl = await uploadThumbnail(
lemmyClient,
thumbnail.buffer,
{
cachePath: thumbnail.cachePath,
bundlePath: article.bundle.relativePath,
},
lemmyConfig.instanceUrl
);
const articleUrl = buildArticleUrl(lemmyConfig.siteUrl, article.bundle.parts);
await attachThumbnailToPost(
lemmyClient,
article.postId,
expectedTitle,
currentUrl || null,
articleUrl,
thumbnailUrl
);
cleanupThumbnail(thumbnail.cachePath);
createdThumbnailDescription = "miniature";
}
updated += 1;
const operations = [];
if (needsDateUpdate) {
operations.push(`date ${iso}`);
}
if (needsTitleUpdate || needsEmbedTitleUpdate) {
operations.push("titre");
}
if (createdThumbnailDescription) {
operations.push(createdThumbnailDescription);
}
const details = operations.length > 0 ? ` (${operations.join(" + ")})` : "";
console.log(`✅ Post ${article.postId} mis à jour${details} (${article.bundle.relativePath})`);
}
await pool.end();
console.log("");
console.log("Résumé des ajustements Lemmy");
console.log(`Posts mis à jour : ${updated}`);
console.log(`Posts déjà alignés : ${unchanged}`);
console.log(`Posts introuvables : ${missing}`);
}
/**
 * Determines the Postgres connection URL.
 * @returns {string} Connection string.
 */
function resolveDatabaseUrl() {
if (typeof process.env.LEMMY_DATABASE_URL === "string" && process.env.LEMMY_DATABASE_URL.trim()) {
return process.env.LEMMY_DATABASE_URL.trim();
}
if (typeof process.env.DATABASE_URL === "string" && process.env.DATABASE_URL.trim()) {
return process.env.DATABASE_URL.trim();
}
return DEFAULT_DATABASE_URL;
}
/**
 * Builds the list of eligible articles with a Lemmy post identifier.
 * @param {Array<object>} bundles Hugo bundles.
 * @returns {Array<object>} Articles ready to be processed.
 */
function collectArticlesWithPostId(bundles) {
const articles = [];
for (const bundle of bundles) {
const frontmatter = readFrontmatterFile(bundle.indexPath);
if (!frontmatter) {
continue;
}
if (isEffectivelyPublished(frontmatter.data) === false) {
continue;
}
const publication = parseFrontmatterDate(frontmatter.data?.date);
if (!publication) {
continue;
}
const title = typeof frontmatter.data?.title === "string" ? frontmatter.data.title.trim() : "";
if (!title) {
continue;
}
const commentsUrl =
typeof frontmatter.data?.comments_url === "string" ? frontmatter.data.comments_url.trim() : "";
if (!commentsUrl) {
continue;
}
const postId = extractPostId(commentsUrl);
if (postId === null) {
continue;
}
articles.push({
bundle,
publication,
title,
frontmatter,
postId,
});
}
articles.sort((a, b) => {
const diff = a.publication.toMillis() - b.publication.toMillis();
if (diff !== 0) {
return diff;
}
return a.bundle.relativePath.localeCompare(b.bundle.relativePath);
});
return articles;
}
/**
 * Indicates whether the embed_title column is present on the post table.
 * @param {Pool} pool Postgres pool.
 * @returns {Promise<boolean>} true if embed_title is available.
 */
async function detectEmbedTitleColumn(pool) {
const result = await pool.query(
"select column_name from information_schema.columns where table_name = 'post' and column_name = 'embed_title' limit 1"
);
return result.rowCount === 1;
}
/**
 * Normalizes a simple URL by trimming whitespace and removing trailing slashes.
 * @param {string|null|undefined} url Raw URL.
 * @returns {string|null} Normalized URL or null.
 */
function normalizeUrl(url) {
if (typeof url !== "string") {
return null;
}
const trimmed = url.trim();
if (!trimmed) {
return null;
}
return trimmed.replace(/\/+$/, "");
}
/**
 * Normalizes the Lemmy configuration extracted from the tools/config files.
 * @param {object} rawConfig Raw configuration.
 * @returns {{ instanceUrl: string, siteUrl: string, auth: { jwt: string|null, username: string|null, password: string|null } }} Ready-to-use configuration.
 */
function normalizeLemmyConfig(rawConfig) {
if (!rawConfig || typeof rawConfig !== "object") {
throw new Error("La configuration Lemmy est manquante (tools/config/config.json).");
}
const instanceUrl = normalizeUrl(rawConfig.instanceUrl);
if (!instanceUrl) {
throw new Error(
"lemmy.instanceUrl doit être renseigné dans tools/config/config.json ou via l'environnement."
);
}
const siteUrl = normalizeUrl(rawConfig.siteUrl);
if (!siteUrl) {
throw new Error("lemmy.siteUrl doit être défini pour construire les URLs des articles.");
}
const auth = rawConfig.auth || {};
const hasJwt = typeof auth.jwt === "string" && auth.jwt.trim().length > 0;
const hasCredentials =
typeof auth.username === "string" &&
auth.username.trim().length > 0 &&
typeof auth.password === "string" &&
auth.password.length > 0;
if (!hasJwt && !hasCredentials) {
throw new Error("lemmy.auth.jwt ou lemmy.auth.username + lemmy.auth.password doivent être fournis.");
}
return {
instanceUrl,
siteUrl,
auth: {
jwt: hasJwt ? auth.jwt.trim() : null,
username: hasCredentials ? auth.username.trim() : null,
password: hasCredentials ? auth.password : null,
},
};
}
/**
 * Creates a Lemmy client configured from tools/config/config.json.
 * @returns {Promise<{ client: LemmyHttp, config: { instanceUrl: string, siteUrl: string, auth: object } }>}
 */
async function createLemmyClientFromConfig() {
const toolsConfig = await loadToolsConfig(TOOLS_CONFIG_PATH);
const rawLemmy = toolsConfig.lemmy || {};
const config = normalizeLemmyConfig(rawLemmy);
const client = new LemmyHttp(config.instanceUrl);
if (config.auth.jwt) {
client.setHeaders({ Authorization: `Bearer ${config.auth.jwt}` });
return { client, config };
}
const loginResponse = await client.login({
username_or_email: config.auth.username,
password: config.auth.password,
});
client.setHeaders({ Authorization: `Bearer ${loginResponse.jwt}` });
return { client, config };
}
/**
 * Builds the public URL of an article from its Hugo path.
 * @param {string} siteUrl Site domain.
 * @param {string[]} parts Segments of the bundle path.
 * @returns {string} Final URL.
 */
function buildArticleUrl(siteUrl, parts) {
const relative = parts.join("/");
return `${siteUrl}/${relative}`;
}
/**
 * Extracts the numeric identifier from a Lemmy comments_url.
 * @param {string} url URL taken from the frontmatter.
 * @returns {number|null} Identifier, or null if not recognized.
 */
function extractPostId(url) {
const trimmed = url.trim();
if (!trimmed) {
return null;
}
const normalized = trimmed.replace(/\/+$/, "");
const match = normalized.match(/\/(?:post|c\/[^/]+\/post)\/(\d+)(?:$|\?)/i);
if (!match) {
return null;
}
return Number.parseInt(match[1], 10);
}
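A self-contained sketch of how the comments_url parsing above behaves on the two accepted URL shapes (the instance domain and post ids below are hypothetical):

```javascript
// Standalone copy of extractPostId's logic, for illustration only.
function extractPostId(url) {
  const trimmed = url.trim();
  if (!trimmed) {
    return null;
  }
  // Drop trailing slashes so "/post/42/" still matches.
  const normalized = trimmed.replace(/\/+$/, "");
  // Accept both /post/{id} and /c/{community}/post/{id}.
  const match = normalized.match(/\/(?:post|c\/[^/]+\/post)\/(\d+)(?:$|\?)/i);
  if (!match) {
    return null;
  }
  return Number.parseInt(match[1], 10);
}

console.log(extractPostId("https://lemmy.example.org/post/123/"));       // 123
console.log(extractPostId("https://lemmy.example.org/c/blog/post/456")); // 456
console.log(extractPostId("https://lemmy.example.org/u/someone"));       // null
```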
/**
 * Fetches a Lemmy post by identifier.
 * @param {Pool} pool Postgres pool.
 * @param {number} postId Post identifier.
 * @param {boolean} withEmbedTitle true to also fetch embed_title when available.
 * @returns {Promise<object|null>} Record or null.
 */
async function fetchPost(pool, postId, withEmbedTitle) {
const columns = withEmbedTitle
? "id, published, name, embed_title, url, thumbnail_url"
: "id, published, name, url, thumbnail_url";
const result = await pool.query(`select ${columns} from post where id = $1`, [postId]);
if (result.rowCount !== 1) {
return null;
}
return result.rows[0];
}
/**
 * Applies the given date and title to the targeted post.
 * @param {Pool} pool Postgres pool.
 * @param {number} postId Post identifier.
 * @param {string|null} isoDate ISO timestamp (null to leave unchanged).
 * @param {string|null} title Expected title (null to leave unchanged).
 * @param {string|null} embedTitle Expected embedded title (null to leave unchanged).
 * @param {boolean} withEmbedTitle true if embed_title can be updated.
 */
async function applyPostUpdates(pool, postId, isoDate, title, embedTitle, withEmbedTitle) {
const fields = [];
const values = [];
let index = 1;
if (isoDate !== null) {
fields.push(`published = $${index}`);
values.push(isoDate);
index += 1;
}
if (title !== null) {
fields.push(`name = $${index}`);
values.push(title);
index += 1;
}
if (withEmbedTitle && embedTitle !== null) {
fields.push(`embed_title = $${index}`);
values.push(embedTitle);
index += 1;
}
if (fields.length === 0) {
return;
}
values.push(postId);
const sql = `update post set ${fields.join(", ")} where id = $${index}`;
await pool.query(sql, values);
}
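The dynamic UPDATE assembled above can be previewed in isolation; this sketch reuses the same field/placeholder bookkeeping without touching Postgres (function name and sample values are hypothetical):

```javascript
// Mirrors applyPostUpdates' SQL assembly, minus the database call.
function buildPostUpdateSql(isoDate, title, embedTitle, withEmbedTitle) {
  const fields = [];
  const values = [];
  let index = 1;
  if (isoDate !== null) {
    fields.push(`published = $${index}`);
    values.push(isoDate);
    index += 1;
  }
  if (title !== null) {
    fields.push(`name = $${index}`);
    values.push(title);
    index += 1;
  }
  if (withEmbedTitle && embedTitle !== null) {
    fields.push(`embed_title = $${index}`);
    values.push(embedTitle);
    index += 1;
  }
  // The post id always takes the last placeholder.
  return { sql: `update post set ${fields.join(", ")} where id = $${index}`, values };
}

const preview = buildPostUpdateSql("2023-09-01T10:00:00+02:00", "New title", null, true);
console.log(preview.sql); // update post set published = $1, name = $2 where id = $3
```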
/**
 * Determines the absolute path to the declared cover image.
 * @param {object} bundle Bundle being processed.
 * @param {object} frontmatterData Frontmatter data.
 * @returns {string|null} Absolute path, or null if it does not exist.
 */
function resolveCoverPath(bundle, frontmatterData) {
const cover =
typeof frontmatterData?.[FRONTMATTER_COVER_FIELD] === "string"
? frontmatterData[FRONTMATTER_COVER_FIELD].trim()
: "";
if (!cover) {
return null;
}
const normalized = cover.replace(/^\.?\//, "").replace(/\/{2,}/g, "/");
if (!normalized) {
return null;
}
const absolute = path.join(bundle.dir, normalized);
if (!fs.existsSync(absolute)) {
console.warn(`⚠️ ${bundle.relativePath} : couverture ${normalized} introuvable.`);
return null;
}
const stats = fs.statSync(absolute);
if (!stats.isFile()) {
console.warn(`⚠️ ${bundle.relativePath} : ${normalized} n'est pas un fichier image.`);
return null;
}
return absolute;
}
/**
 * Generates a thumbnail and saves it to disk for later inspection if needed.
 * @param {string} coverPath Absolute path of the source image.
 * @param {object} bundle Bundle concerned.
 * @returns {Promise<{ buffer: Buffer, cachePath: string }>} Ready-to-use thumbnail.
 */
async function buildThumbnailAsset(coverPath, bundle) {
const buffer = await createThumbnail(coverPath, THUMBNAIL_FORMAT);
const cachePath = writeThumbnailToCache(buffer, bundle, coverPath, THUMBNAIL_FORMAT);
return { buffer, cachePath };
}
/**
 * Builds the thumbnail from the cover image.
 * @param {string} absolutePath Absolute path to the source image.
 * @param {string} format Output format (jpeg|png).
 * @returns {Promise<Buffer>} Resized image data in the requested format.
 */
async function createThumbnail(absolutePath, format = "jpeg") {
const base = sharp(absolutePath).resize({
width: MAX_THUMBNAIL_WIDTH,
height: MAX_THUMBNAIL_HEIGHT,
fit: "inside",
withoutEnlargement: true,
});
if (format === "png") {
return base
.png({
compressionLevel: 9,
palette: true,
effort: 5,
})
.toBuffer();
}
return base.jpeg({ quality: THUMBNAIL_QUALITY, mozjpeg: true }).toBuffer();
}
/**
 * Uploads a thumbnail to Lemmy and returns the public URL.
 * @param {LemmyHttp} client Lemmy client.
 * @param {Buffer} thumbnailBuffer Image data.
 * @param {object} info Context information used in error messages.
 * @param {string|null|undefined} instanceUrl URL of the Lemmy instance (used to harmonize the URL scheme).
 * @returns {Promise<string>} URL of the thumbnail hosted by Lemmy.
 */
async function uploadThumbnail(client, thumbnailBuffer, info, instanceUrl) {
const label = info?.bundlePath ? ` pour ${info.bundlePath}` : "";
const location = info?.cachePath ? ` (miniature : ${info.cachePath})` : "";
const upload = await client.uploadImage({ image: thumbnailBuffer }).catch((error) => {
const reason = typeof error?.message === "string" && error.message.trim() ? error.message : "erreur inconnue";
throw new Error(`Téléversement Lemmy échoué${label}${location} : ${reason}`);
});
if (!upload) {
throw new Error(`Miniature rejetée${label}${location} : réponse vide`);
}
if (upload.error) {
throw new Error(`Miniature rejetée${label}${location} : ${upload.error}`);
}
if (upload.msg !== "ok" || !upload.url) {
const details = JSON.stringify(upload);
throw new Error(`Miniature rejetée${label}${location} : réponse inattendue ${details}`);
}
let url = upload.url;
if (typeof url !== "string") {
throw new Error(`Miniature rejetée${label}${location} : URL de réponse invalide`);
}
url = url.trim();
if (!url) {
throw new Error(`Miniature rejetée${label}${location} : URL de réponse vide`);
}
if (
typeof instanceUrl === "string" &&
instanceUrl.startsWith("https://") &&
url.startsWith("http://")
) {
url = `https://${url.slice("http://".length)}`;
}
return url;
}
/**
 * Attaches a thumbnail to an existing Lemmy post through the API.
 * @param {LemmyHttp} client Lemmy client.
 * @param {number} postId Post identifier.
 * @param {string} title Expected title.
 * @param {string|null} currentUrl Current URL of the post.
 * @param {string} articleUrl Hugo URL of the article.
 * @param {string} thumbnailUrl URL of the hosted thumbnail.
 * @returns {Promise<void>} Promise resolved once the edit is done.
 */
async function attachThumbnailToPost(client, postId, title, currentUrl, articleUrl, thumbnailUrl) {
const normalizedCurrentUrl =
typeof currentUrl === "string" && currentUrl.trim().length > 0 ? currentUrl.trim() : null;
const payload = {
post_id: postId,
name: title,
custom_thumbnail: thumbnailUrl,
};
let targetUrl = normalizedCurrentUrl;
if (!targetUrl || targetUrl.includes("/pictrs/image/")) {
targetUrl = articleUrl;
}
if (targetUrl) {
payload.url = targetUrl;
}
await client.editPost(payload);
}
/**
 * Writes the generated thumbnail to disk under tools/cache for inspection.
 * @param {Buffer} buffer Ready thumbnail.
 * @param {object} bundle Bundle concerned.
 * @param {string} coverPath Path of the original cover image.
 * @param {string} format Thumbnail format (jpeg|png).
 * @returns {string} Path of the written file.
 */
function writeThumbnailToCache(buffer, bundle, coverPath, format) {
const targetPath = computeThumbnailCachePath(bundle, coverPath, format);
fs.mkdirSync(path.dirname(targetPath), { recursive: true });
fs.writeFileSync(targetPath, buffer);
return targetPath;
}
/**
 * Deletes the cached thumbnail once the Lemmy post has been updated.
 * @param {string|null} cachePath Possible path of the thumbnail.
 */
function cleanupThumbnail(cachePath) {
if (!cachePath) {
return;
}
if (fs.existsSync(cachePath)) {
fs.unlinkSync(cachePath);
}
}
/**
 * Builds a stable, sanitized file name for the cached thumbnail.
 * @param {object} bundle Bundle concerned.
 * @param {string} coverPath Cover path.
 * @param {string} format Thumbnail format.
 * @returns {string} Target path under tools/cache/lemmy_thumbnails.
 */
function computeThumbnailCachePath(bundle, coverPath, format = "jpeg") {
const base = `${bundle.relativePath}__${path.basename(coverPath)}`;
const safeBase = base
.replace(/[^a-zA-Z0-9._-]+/g, "_")
.replace(/_{2,}/g, "_")
.replace(/^_|_$/g, "");
const hash = crypto.createHash("sha1").update(base).digest("hex").slice(0, 10);
const extension = format === "png" ? "png" : "jpg";
const name = `${safeBase || "thumbnail"}_${hash}.${extension}`;
return path.join(THUMBNAIL_CACHE_DIR, name);
}