From 760ee9d1b905e5e79b7a601f9f2a83c992665edb Mon Sep 17 00:00:00 2001
From: root
Date: Wed, 12 Mar 2025 08:59:30 +0100
Subject: [PATCH] fix roadmap

---
 README.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/README.md b/README.md
index 8693159..799a26c 100644
--- a/README.md
+++ b/README.md
@@ -125,6 +125,13 @@ V1.2.0 SimpleX Crawler:
 - V1.2.2: crawler.py: make the script categorize the onion links into "onion websites", the simplex chatroom invite links into "simplex chatrooms", and the simplex servers smp and xftp links into "simplex servers" categories, AND in unverified.csv directly
 
 
+V1.3.0 Onion Crawler:
+- V1.3.0: crawler.py: make the script iterate over every onion link in verified.csv, and from the page itself it should find every other a href html/php/txt file on that link directly (recursively), however it should have a limit to prevent crawling endlessly (make it configurable, for now it should crawl up to 10 sub-pages per onion site by default).
+- V1.3.1: crawler.py: Make it download those webpages in a temporary folder "onioncrawling/{onionwebsitename1.onion,onionwebsitename2.onion}/{index.html,links.php}" Once a website has been crawled, make it delete the entire folder and mark it as crawled in onion-crawl.csv (columns: link (http://blahlbahadazdazaz.onion), crawled (y/n))
+- V1.3.2: crawler.py: in each crawled html/php/txt file, make it find every simplex chatroom link, simplex server link, and every onion link.
+- V1.3.3: crawler.py: with every link found, make sure it is properly categorized just like in v1.2.2, directly into unverified.csv
+
+
 V1.4.0+ PGP support:
 - csv+php+py: implement PGP support to list public pgp keys for verified websites
 - csv+php: figure out how to expand the software to include simplex chatrooms (maybe add another column ?)
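
The V1.3.0 to V1.3.3 items describe a bounded per-site crawl followed by link categorization. A minimal sketch of how that could look is given below. It is not the project's crawler.py: the names `crawl_site`, `extract_links`, `categorize` and the injected `fetch` callable are hypothetical, the link patterns (56-character v3 onion hosts, `smp://` / `xftp://` server addresses, `simplex.chat` contact or invitation links) are assumptions, and `max_pages` is assumed to count every fetched page including the start page. Fetching over Tor, the temporary folder and the CSV bookkeeping are left out.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse

# Assumed link patterns; adjust them to whatever crawler.py already uses.
ONION_HOST = re.compile(r"(^|\.)[a-z2-7]{56}\.onion$")  # v3 onion hostname
SIMPLEX_SERVER = re.compile(r"^(smp|xftp)://\S+$")
SIMPLEX_CHAT = re.compile(r"^(https://simplex\.chat/|simplex:/)(contact|invitation)\b")
CRAWLABLE = (".html", ".php", ".txt")


class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def extract_links(base_url, html):
    """Return every <a href> in `html`, resolved against `base_url`."""
    parser = HrefCollector()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.hrefs]


def categorize(link):
    """Map a link to one of the roadmap's three categories, or None."""
    if SIMPLEX_SERVER.match(link):
        return "simplex servers"
    if SIMPLEX_CHAT.match(link):
        return "simplex chatrooms"
    if ONION_HOST.search(urlparse(link).hostname or ""):
        return "onion websites"
    return None


def crawl_site(start_url, fetch, max_pages=10):
    """Breadth-first crawl of one site, stopping after `max_pages` pages.

    `fetch(url)` is a hypothetical callable returning the page text, or
    None on failure. Returns (pages_fetched, {link: category}).
    """
    site = urlparse(start_url).hostname
    queue, seen, found = deque([start_url]), {start_url}, {}
    pages = 0
    while queue and pages < max_pages:
        url = queue.popleft()
        body = fetch(url)
        if body is None:
            continue
        pages += 1
        for link in extract_links(url, body):
            category = categorize(link)
            if category:
                found.setdefault(link, category)
            # Only follow html/php/txt pages that live on the same site.
            page = urldefrag(link).url
            parsed = urlparse(page)
            if parsed.hostname == site and parsed.path.endswith(CRAWLABLE) and page not in seen:
                seen.add(page)
                queue.append(page)
    return pages, found
```

Passing `fetch` in as a parameter keeps the sketch independent of how pages are actually retrieved, so the limit and the categorization can be exercised with a plain dictionary of canned pages before any network code exists.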