Goal: keep the full `skills/**` archive browsable on GitHub, while keeping pipeline authority and automation in `registry-core`.
`registry-core` contains:
- Crawlers + build scripts (`scripts/`, `crawler/`)
- Source lists (`sources/`)
- Schemas (`schema/`)
- Published metadata (`registry.json`)
- GitHub Pages site sources (`docs/`)
- Authoritative workflows and sync policy
Does not commit:
- The expanded skill archive `skills/**` (stored in `registry-data`)
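The exclusion above can be enforced with an ignore entry in core; a minimal sketch (the repo's actual `.gitignore` is not shown in this document):

```shell
# Sketch (assumed, not shown in this doc): ignore the checked-out data
# tree in registry-core so skills/** never enters core's history.
echo 'skills/' >> .gitignore
```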
`registry-data` contains:
- Archived skill contents (category folders like `development/`, `documents/`, `data/`, etc.)
The `main` repo contains:
- A merged tree built from core + data for browsing/compatibility consumers
Policy:
- `main` is not canonical for pipeline behavior.
- If docs/workflows conflict between repos, `core` wins.
- Only `core` runs scheduled discovery/download/sync jobs.
- `core` checks out `registry-data` into `./skills/` during CI and updates archive/index outputs.
- `core` pushes archive changes to `registry-data`, and index/site outputs within `core`.
- `core` can trigger a `main` publish workflow using pinned `core_sha` + `data_sha`; `main` rebuilds merged outputs from those SHAs for reproducibility.
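The checkout-at-a-pinned-SHA step can be sketched as below. Variable names and paths here are assumptions for illustration, not taken from the actual workflows, and a local stand-in repo replaces the real `registry-data` for the demo:

```shell
# Sketch of core's sync step (names assumed): check out registry-data
# at a pinned SHA into ./skills/ so the archive rebuild is reproducible.
# Demoed against a local stand-in repo instead of the real remote.
work=$(mktemp -d)
git init -q "$work/registry-data"
git -C "$work/registry-data" -c user.email=ci@example.com -c user.name=ci \
  commit -q --allow-empty -m 'stand-in data commit'
DATA_SHA=$(git -C "$work/registry-data" rev-parse HEAD)  # pinned by the triggering workflow
git clone -q "$work/registry-data" "$work/skills"        # in real CI: clone registry-data into ./skills/
git -C "$work/skills" checkout -q --detach "$DATA_SHA"   # detach at the pinned commit
```

Pinning to a SHA rather than a branch tip is what makes a later `main` rebuild from `core_sha` + `data_sha` reproducible.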
Avoid:
- Running crawler/sync schedules in `main`
- Letting `main` write to `data`
- Publishing metrics without clear raw vs. deduplicated labels
Required settings (repository variables and secrets):
- Repository variable: `REGISTRY_DATA_REPO` (e.g. `yourname/claude-skill-registry-data`)
- Secret: `DATA_REPO_TOKEN` (a PAT with `repo` scope for a private data repo, or `public_repo` for a public one)
- Repository variable: `REGISTRY_MAIN_REPO` (e.g. `yourname/claude-skill-registry`)
- Secret: `MAIN_REPO_TOKEN` (a token that can dispatch workflows in the main repo)
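These settings can be provisioned with the GitHub CLI. A one-time setup sketch; the repo slugs are placeholders, and `gh secret set` prompts for the token value:

```shell
# One-time provisioning sketch (placeholder values) for the settings above.
gh variable set REGISTRY_DATA_REPO --body "yourname/claude-skill-registry-data"
gh variable set REGISTRY_MAIN_REPO --body "yourname/claude-skill-registry"
gh secret set DATA_REPO_TOKEN   # paste the PAT when prompted
gh secret set MAIN_REPO_TOKEN
```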
Use the merge script to rebuild the main repo from core + data:

```bash
bash scripts/sync_main_repo.sh \
  --core ../claude-skill-registry-core \
  --data ../claude-skill-registry-data \
  --main ../claude-skill-registry
```
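For orientation, a rough sketch of what such a merge amounts to. This is assumed behavior, not the actual `sync_main_repo.sh`: main gets core's published metadata plus data's skill tree, demoed here with scratch directories:

```shell
# Hypothetical merge sketch (not the real script): rebuild main from
# core's published metadata plus data's archived skill tree.
CORE=$(mktemp -d); DATA=$(mktemp -d); MAIN=$(mktemp -d)
echo '{}' > "$CORE/registry.json"                      # stand-in core metadata
mkdir -p "$DATA/development"
echo '# demo skill' > "$DATA/development/SKILL.md"     # stand-in data content
mkdir -p "$MAIN/skills"
cp "$CORE/registry.json" "$MAIN/registry.json"         # metadata comes from core
cp -R "$DATA/." "$MAIN/skills/"                        # archive tree comes from data
```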