73 Commits (master)

Author SHA1 Message Date
arkiver daab40aa6e Version 20240216.01. Use fixed minimum Wget version 1.21.3-at.20231213.03. Use TLSv1.2. Fix check on svc comment content check. 3 months ago
arkiver e350e69f89 Version 20231020.01. Use gnutls. Support new method of serving Reddit comments. 7 months ago
arkiver 3add4f891c Version 20230910.04. Install lua utf8 library. Fix converting unicode codepoint to utf8 character support. 9 months ago
arkiver 8a46824231 Version 20230910.02. Remove old Lua files. 9 months ago
arkiver a2ffd1f671 Version 20230910.01. Use cjson instead of JSON.lua. 9 months ago
arkiver bb6198cc1a Version 20230627.01. Queue outlinks directly to the urls project. 11 months ago
arkiver f1ef7d1697 Version 20230619.02. Accept 404 on mediaembed URL. 11 months ago
arkiver d2571cde06 Version 20230619.01. Primitive fix to user post verification problems. 12 months ago
masterX244 520e8b95d6 Ignore some garbage URLs that 404. Wget guesses too much and generates bad URLs, so an ignore is needed. 12 months ago
arkiver bea971f375 Version 20230614.03. Better check for level error page on svc URL. 12 months ago
arkiver be6e32cba5 Version 20230614.02. Extra validity checks. 12 months ago
arkiver e84e804fc5 Version 20230614.01. Fix check for valid data. 12 months ago
arkiver 4936505b0f Version 20230612.02. Add Reddit problem check for /comments/.../comment/ URL. 12 months ago
arkiver 57adbb381c Version 20230612.01. Kill grab when reddit seems to have problems. 12 months ago
arkiver a974b81618 Version 20230611.01. Extra very simple check on validity of old.reddit.com returned body. 12 months ago
arkiver 15a0a1a6f5 Version 20230607.06. Ignore discovered /r/FIFA URL if coming from a /r/EASportFC parent URL. 12 months ago
arkiver fe17191306 Version 20230607.05. Better checking for video. Abort item if no post is found (during blackout for example). 12 months ago
arkiver 7bb5c39419 Version 20230607.04. Abort on video for now. 12 months ago
arkiver f63c8ab696 Version 20230607.03. Prevent getting URL ending with /". Ignore /message/compose URLs. 12 months ago
arkiver 393407520b Version 20230607.02. Very simple content checks to check if response is complete. Properly prevent writing to WARC in cases and do not abort all items when finding a problematic URL. 12 months ago
arkiver 48b24323c6 Version 20230530.01. Queue discovered outlinks to urls-stash-reddit. 1 year ago
arkiver a3b5bcecc1 Version 20230529.01. Correctly extract more comment pages from comment pages in the new design. Print debug information for comment pages on old design. 1 year ago
arkiver b2654e9317 Version 20230509.01. Support for new design. 1 year ago
arkiver 7f4db17348 Version 20221021.01. Ignore /tailwind-build.css URL from comment in HTML. 2 years ago
arkiver 8a27002fd3 Version 20221005.01. Max tries for backfeed to 10. 2 years ago
arkiver 35e31af37f Queue redditstatic.com URLs as outlinks. 2 years ago
arkiver bab4b4dcd2 Version 20220729.05. Fix aborting item on bad status code on url: item. Keep old retry code otherwise. 2 years ago
arkiver 8c45a263aa Version 20220729.04. Queue extra found URLs on media URLs to backfeed. 2 years ago
arkiver e8fe03fbd0 Version 20220729.03. Add url: prefix to url item. 2 years ago
arkiver f81b2ce97e Version 20220729.01. Queue media URLs back to reddit project and download individually. 2 years ago
arkiver cc83009a94 Version 20220605.01. Support GNU Wget 1.21.3-at.20220503.02. Fix killing crawl when items cannot be queued. 2 years ago
arkiver 0ce1c59ca4 Version 20220415.01. Do not queue /r/undefined/ URLs. 2 years ago
arkiver da28d3c902 Version 20220323.03. Fix items to maxtries variable name. Fix backfeed key name. 2 years ago
arkiver 8944cf1fc6 Version 20220323.02. Fix items to maxtries variable name. 2 years ago
arkiver 10eaa7c50c Version 20220323.01. Fix backfeed. Fix maxtries use. 2 years ago
arkiver 28f132a052 Version 20220312.01. Fix backfeed. 2 years ago
arkiver 4f50a0d699 Version 20220311.01. Use new backfeed endpoint for queuing. 2 years ago
arkiver 383c101aef Version 20220109.02. Cut off URL at space when found between brackets without href= in front. 2 years ago
arkiver df35317e0c Version 20220109.01. Add codepoint to utf8 support. Percent encode outlinks correctly. 2 years ago
arkiver ed80cb5a9d Version 20210707.01. Do not get media for cross posts. 3 years ago
arkiver 4b976e2ea7 Version 20210521.01. Use TLS 1.2. 3 years ago
arkiver 6e15841550 Version 20210407.01. Improve video archiving. Detect if video is still being processed by reddit. 3 years ago
arkiver 1b3690d994 Version 20210330.04. Only decode unicode characters in URLs on v.redd.it URLs. 3 years ago
arkiver ce7fff480d Version 20210330.03. Unescape unicode characters. Do not HLS for video. 3 years ago
arkiver ad04f45d4f Fix typo. 3 years ago
arkiver adc7f9c6fb Version 20210330.02. Skip images that are only in JSON and not on web page. 3 years ago
arkiver 07ed16c44b Version 20210330.01. Handle 403 on v.redd.it on deleted post. 3 years ago
arkiver 8849165130 Version 20210321.01. Do not get all video sizes. 3 years ago
arkiver d3b6659419 Version 20210312.01. Get URLs with utm_* and context params. 3 years ago
arkiver 11d5777391 Version 20210130.01. Support & in URL. Properly abort selected items. 3 years ago