mirror of
https://github.com/webrecorder/browsertrix-crawler.git
synced 2025-12-25 11:20:18 +00:00
- hashes stored in separate crawl specific entries, h:<crawlid>
- wacz files stored in crawl specific list, c:<crawlid>:wacz
- hashes committed to 'alldupes' hashset when crawl is complete, crawls added to 'allcrawls' set
- store filename, crawlId in related.requires list entries for each wacz
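
The key layout described in the commit message above can be sketched with an in-memory model. This is only an illustration under stated assumptions, not the crawler's actual code: the class `DedupeIndexSketch`, its method names, the `RequiresEntry` type, and the "first committed entry wins" merge policy are all hypothetical, and plain `Map`/`Set` objects stand in for the Redis store. Only the key names (`h:<crawlid>`, `c:<crawlid>:wacz`, `alldupes`, `allcrawls`) and the `filename`/`crawlId` fields come from the commit message.

```typescript
// Hypothetical shape of one related.requires entry (fields named in the commit message).
type RequiresEntry = { filename: string; crawlId: string };

// In-memory stand-in for the key layout; Maps and Sets replace the real store.
class DedupeIndexSketch {
  private crawlHashes = new Map<string, Map<string, string>>(); // h:<crawlid>
  private waczLists = new Map<string, string[]>(); // c:<crawlid>:wacz
  private allDupes = new Map<string, string>(); // 'alldupes'
  private allCrawls = new Set<string>(); // 'allcrawls'

  // Record a hash in the crawl-specific entry only; nothing is shared yet.
  addHash(crawlId: string, hash: string, value: string): void {
    const key = `h:${crawlId}`;
    let entries = this.crawlHashes.get(key);
    if (!entries) {
      entries = new Map<string, string>();
      this.crawlHashes.set(key, entries);
    }
    entries.set(hash, value);
  }

  // Append a WACZ filename to the crawl-specific list.
  addWacz(crawlId: string, filename: string): void {
    const key = `c:${crawlId}:wacz`;
    const files = this.waczLists.get(key) ?? [];
    files.push(filename);
    this.waczLists.set(key, files);
  }

  // On crawl completion, copy the crawl's hashes into 'alldupes' and
  // register the crawl in 'allcrawls'.
  // Assumption: an already committed hash is kept, not overwritten.
  commitCrawl(crawlId: string): void {
    const entries = this.crawlHashes.get(`h:${crawlId}`);
    if (entries) {
      entries.forEach((value, hash) => {
        if (!this.allDupes.has(hash)) this.allDupes.set(hash, value);
      });
    }
    this.allCrawls.add(crawlId);
  }

  // True once the hash has been committed by some completed crawl.
  isCommitted(hash: string): boolean {
    return this.allDupes.has(hash);
  }

  // True once the crawl has been committed.
  hasCrawl(crawlId: string): boolean {
    return this.allCrawls.has(crawlId);
  }

  // Build one { filename, crawlId } entry per WACZ file of the crawl.
  requiresFor(crawlId: string): RequiresEntry[] {
    const files = this.waczLists.get(`c:${crawlId}:wacz`) ?? [];
    return files.map((filename) => ({ filename, crawlId }));
  }
}
```

In this sketch, hashes from an unfinished crawl stay invisible to other crawls until `commitCrawl` runs, which is one plausible reason for keeping them in separate crawl-specific entries first.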
4.6 KiB