Commit Graph

215 Commits (main)

Author SHA1 Message Date
Aloïs Micard 8a4515d364
Release 1.0.0 3 years ago
Aloïs Micard a1b997530d
Release 1.0.0-rc2 3 years ago
Aloïs Micard 6f79af997f
s/creekorful/darkspot(-org)/ 3 years ago
Aloïs Micard 61bbc6d4be
Release 1.0.0-rc1 3 years ago
Aloïs Micard a2ffa21dce
Allow proper cache configuration 3 years ago
Aloïs Micard 1f5dc1e45d
Allow to customize redis password 3 years ago
Aloïs Micard 02a0dca6ed
Add process description 3 years ago
Aloïs Micard 96c75a85b1
Rename project 3 years ago
Aloïs Micard b657123d91
Bump version 3 years ago
Aloïs Micard 6c678478a1
Implement better blacklist config 3 years ago
Aloïs Micard 871daf1bcc
scheduler: hash url before caching it 3 years ago
Aloïs Micard 69352f7237
indexer: sort headers to have deterministic output 3 years ago
Aloïs Micard afed403e6a
Remove useless regex 3 years ago
Aloïs Micard 4e33813b21
Merge remote-tracking branch 'origin/develop' into 124-improve-scheduler-speed 3 years ago
Aloïs Micard 7820820fa9
scheduler: add batch support for dialing with cache 3 years ago
Aloïs Micard 9b46dc205e
Indexer: support buffered indexing 3 years ago
Aloïs Micard 71f82d4aad
process: Rework whole flags system
- Turn the flag into Feature system to allow easier configuration.
- Add prefetch flag to event feature
3 years ago
Aloïs Micard 829afcbb6a
Release 0.10.0 3 years ago
Aloïs Micard ec3357be5d
Big improvements
- Reduce debug noise
- Create scripts to blacklist 'famous' legit hostnames
- Make blacklister more resilient
- Merge archiver & indexer together
- Better prefix for cache key
- Rework scheduling process
- Update architecture.png
- Remove trandoshanctl
- Improve testing
3 years ago
Aloïs Micard 8da1f29a43
little fixes 3 years ago
Aloïs Micard d0dffb9928
Implement new blacklister 3 years ago
Aloïs Micard 2133a1aeb5
bump app versions 3 years ago
Aloïs Micard a27092fd13
Use new storage format 3 years ago
Aloïs Micard cc3c0d62d6
remove hacky check 3 years ago
Aloïs Micard c8352d3299
Use url cache to determinate if crawling should be done 3 years ago
Aloïs Micard e245e5d79a
last fixes 3 years ago
Aloïs Micard 60a23f7182
Fix ttl 3 years ago
Aloïs Micard 12362e0100
Fix tests case 3 years ago
Aloïs Micard 4a0fbd0b9b
add configapi key prefix 3 years ago
Aloïs Micard 0aba4fa4f9
Finalize redis cache impl 3 years ago
Aloïs Micard 387a93b7b9
Create new flags for cache 3 years ago
Aloïs Micard d826fe73b6
Refactor configapi to use new cache 3 years ago
Aloïs Micard 477092316b
Implement cache logic 3 years ago
Aloïs Micard 55ae36f3b9
s/database/index 3 years ago
Aloïs Micard 2d6beb26ce
elastic: pre-declare index mapping 3 years ago
Aloïs Micard 15bae2143d
improve test cases 3 years ago
Aloïs Micard 4bddf39335
make scheduler use whitelisting instead of blacklisting 3 years ago
Aloïs Micard 188df77541
improve logging 3 years ago
Aloïs Micard 2eb416845e
improve logging 3 years ago
Aloïs Micard ad808e6b31
indexer: do not publish duplicate URLs 3 years ago
Aloïs Micard 797c3df9a5
move api client into appropriate package 3 years ago
Aloïs Micard 4d250b6cb0
Finalize refactoring 3 years ago
Aloïs Micard a996bf2d5b
Turn API into indexer 3 years ago
Aloïs Micard 9e75f85923
Rename handler 3 years ago
Aloïs Micard db7d204724
Fix comment 3 years ago
Aloïs Micard f4099838e1
Make API publish new index event 3 years ago
Aloïs Micard a70a958ee4
Create constraint package 3 years ago
Aloïs Micard d62af0889e
extractor: check forbidden hostnames 3 years ago
Aloïs Micard 2f6a569349
api: fix tests 3 years ago
Aloïs Micard 17e4e0cd61
crawler: skip forbidden hostnames 3 years ago