Aloïs Micard
|
e07ed8156e
|
scheduler: hash url before caching it
|
3 years ago |
Aloïs Micard
|
cae3bb514f
|
Merge pull request #128 from creekorful/114-fix-tests
indexer: sort headers to have deterministic output
|
3 years ago |
Aloïs Micard
|
faee8b48c1
|
indexer: sort headers to have deterministic output
|
3 years ago |
Aloïs Micard
|
8297dc7616
|
Merge pull request #127 from creekorful/124-improve-scheduler-speed
scheduler: improve performance
|
3 years ago |
Aloïs Micard
|
84a28c5be0
|
scheduler: increase event prefetch
|
3 years ago |
Aloïs Micard
|
afed403e6a
|
Remove useless regex
|
3 years ago |
Aloïs Micard
|
4e33813b21
|
Merge remote-tracking branch 'origin/develop' into 124-improve-scheduler-speed
|
3 years ago |
Aloïs Micard
|
7820820fa9
|
scheduler: add batch support for dialing with cache
|
3 years ago |
Aloïs Micard
|
de50ed02e3
|
Merge pull request #126 from creekorful/125-indexer-bulk-indexation
Indexer: implement bulk indexation
|
3 years ago |
Aloïs Micard
|
9b46dc205e
|
Indexer: support buffered indexing
|
3 years ago |
Aloïs Micard
|
71f82d4aad
|
process: Rework whole flags system
- Turn the flag into Feature system to allow easier configuration.
- Add prefetch flag to event feature
|
3 years ago |
Aloïs Micard
|
4075dfc98a
|
Merge pull request #121 from creekorful/develop
Release 0.10.0
|
3 years ago |
Aloïs Micard
|
829afcbb6a
|
Release 0.10.0
|
3 years ago |
Aloïs Micard
|
ec3357be5d
|
Big improvements
- Reduce debug noise
- Create scripts to blacklist 'famous' legit hostnames
- Make blacklister more resilient
- Merge archiver & indexer together
- Better prefix for cache key
- Rework scheduling process
- Update architecture.png
- Remove trandoshanctl
- Improve testing
|
3 years ago |
Aloïs Micard
|
2d7499f7e2
|
Merge pull request #118 from creekorful/106-improve-blacklister
Implement new blacklister
|
3 years ago |
Aloïs Micard
|
8da1f29a43
|
little fixes
|
3 years ago |
Aloïs Micard
|
d0dffb9928
|
Implement new blacklister
|
3 years ago |
Aloïs Micard
|
6c4ecc1a7d
|
Merge pull request #117 from creekorful/develop
Release 0.9.0
|
3 years ago |
Aloïs Micard
|
2133a1aeb5
|
bump app versions
|
3 years ago |
Aloïs Micard
|
46a7a05e4a
|
Merge pull request #116 from creekorful/110-archiver-new-format
Implement new storage format
|
3 years ago |
Aloïs Micard
|
a27092fd13
|
Use new storage format
|
3 years ago |
Aloïs Micard
|
1ac5c1e036
|
Merge remote-tracking branch 'origin/develop' into 110-archiver-new-format
|
3 years ago |
Aloïs Micard
|
571b1e2628
|
Merge pull request #115 from creekorful/111-prevent-duplicates-urls
Prevent duplicate URLs in crawlingQueue
|
3 years ago |
Aloïs Micard
|
cc3c0d62d6
|
remove hacky check
|
3 years ago |
Aloïs Micard
|
c8352d3299
|
Use URL cache to determine if crawling should be done
|
3 years ago |
Aloïs Micard
|
e245e5d79a
|
last fixes
|
3 years ago |
Aloïs Micard
|
60a23f7182
|
Fix ttl
|
3 years ago |
Aloïs Micard
|
12362e0100
|
Fix test cases
|
3 years ago |
Aloïs Micard
|
4a0fbd0b9b
|
add configapi key prefix
|
3 years ago |
Aloïs Micard
|
0aba4fa4f9
|
Finalize redis cache impl
|
3 years ago |
Aloïs Micard
|
387a93b7b9
|
Create new flags for cache
|
3 years ago |
Aloïs Micard
|
d826fe73b6
|
Refactor configapi to use new cache
|
3 years ago |
Aloïs Micard
|
477092316b
|
Implement cache logic
|
3 years ago |
Aloïs Micard
|
55ae36f3b9
|
s/database/index
|
3 years ago |
Aloïs Micard
|
87a2fb246f
|
Add new hostname to blacklist
|
3 years ago |
Aloïs Micard
|
38a0a36de0
|
Merge pull request #113 from creekorful/109-pre-declared-mapping
elastic: pre-declare index mapping
|
3 years ago |
Aloïs Micard
|
2d6beb26ce
|
elastic: pre-declare index mapping
|
3 years ago |
Aloïs Micard
|
33ba6b4e7d
|
Merge pull request #112 from creekorful/101-scheduler-whitelisting
make scheduler use whitelisting instead of blacklisting
|
3 years ago |
Aloïs Micard
|
15bae2143d
|
improve test cases
|
3 years ago |
Aloïs Micard
|
4bddf39335
|
make scheduler use whitelisting instead of blacklisting
|
3 years ago |
Aloïs Micard
|
d5eb551d82
|
Merge pull request #104 from creekorful/103-turn-api-into-indexer
Turn api into indexer
|
3 years ago |
Aloïs Micard
|
188df77541
|
improve logging
|
3 years ago |
Aloïs Micard
|
039f8cb76c
|
update architecture.png
|
3 years ago |
Aloïs Micard
|
2eb416845e
|
improve logging
|
3 years ago |
Aloïs Micard
|
c5bd0b3b87
|
remove old CD stuff
|
3 years ago |
Aloïs Micard
|
ad808e6b31
|
indexer: do not publish duplicate URLs
|
3 years ago |
Aloïs Micard
|
797c3df9a5
|
move api client into appropriate package
|
3 years ago |
Aloïs Micard
|
4d250b6cb0
|
Finalize refactoring
|
3 years ago |
Aloïs Micard
|
c42cb26a11
|
remove extractor
|
3 years ago |
Aloïs Micard
|
a996bf2d5b
|
Turn API into indexer
|
3 years ago |