Aloïs Micard
|
ec3357be5d
|
Big improvements
- Reduce debug noise
- Create scripts to blacklist 'famous' legit hostnames
- Make blacklister more resilient
- Merge archiver & indexer together
- Better prefix for cache key
- Rework scheduling process
- Update architecture.png
- Remove trandoshanctl
- Improve testing
|
3 years ago |
Aloïs Micard
|
2d7499f7e2
|
Merge pull request #118 from creekorful/106-improve-blacklister
Implement new blacklister
|
3 years ago |
Aloïs Micard
|
8da1f29a43
|
little fixes
|
3 years ago |
Aloïs Micard
|
d0dffb9928
|
Implement new blacklister
|
3 years ago |
Aloïs Micard
|
2133a1aeb5
|
bump app versions
|
3 years ago |
Aloïs Micard
|
46a7a05e4a
|
Merge pull request #116 from creekorful/110-archiver-new-format
Implement new storage format
|
3 years ago |
Aloïs Micard
|
a27092fd13
|
Use new storage format
|
3 years ago |
Aloïs Micard
|
1ac5c1e036
|
Merge remote-tracking branch 'origin/develop' into 110-archiver-new-format
|
3 years ago |
Aloïs Micard
|
571b1e2628
|
Merge pull request #115 from creekorful/111-prevent-duplicates-urls
Prevent duplicates urls in crawlingQueue
|
3 years ago |
Aloïs Micard
|
cc3c0d62d6
|
remove hacky check
|
3 years ago |
Aloïs Micard
|
c8352d3299
|
Use url cache to determinate if crawling should be done
|
3 years ago |
Aloïs Micard
|
e245e5d79a
|
last fixes
|
3 years ago |
Aloïs Micard
|
60a23f7182
|
Fix ttl
|
3 years ago |
Aloïs Micard
|
12362e0100
|
Fix tests case
|
3 years ago |
Aloïs Micard
|
4a0fbd0b9b
|
add configapi key prefix
|
3 years ago |
Aloïs Micard
|
0aba4fa4f9
|
Finalize redis cache impl
|
3 years ago |
Aloïs Micard
|
387a93b7b9
|
Create new flags for cache
|
3 years ago |
Aloïs Micard
|
d826fe73b6
|
Refactor configapi to use new cache
|
3 years ago |
Aloïs Micard
|
477092316b
|
Implement cache logic
|
3 years ago |
Aloïs Micard
|
55ae36f3b9
|
s/database/index
|
3 years ago |
Aloïs Micard
|
87a2fb246f
|
Add new hostname to blacklist
|
3 years ago |
Aloïs Micard
|
38a0a36de0
|
Merge pull request #113 from creekorful/109-pre-declared-mapping
elastic: pre-declare index mapping
|
3 years ago |
Aloïs Micard
|
2d6beb26ce
|
elastic: pre-declare index mapping
|
3 years ago |
Aloïs Micard
|
33ba6b4e7d
|
Merge pull request #112 from creekorful/101-scheduler-whitelisting
make scheduler use whitelisting instead of blacklisting
|
3 years ago |
Aloïs Micard
|
15bae2143d
|
improve test cases
|
3 years ago |
Aloïs Micard
|
4bddf39335
|
make scheduler use whitelisting instead of blacklisting
|
3 years ago |
Aloïs Micard
|
d5eb551d82
|
Merge pull request #104 from creekorful/103-turn-api-into-indexer
Turn api into indexer
|
3 years ago |
Aloïs Micard
|
188df77541
|
improve logging
|
3 years ago |
Aloïs Micard
|
039f8cb76c
|
update architecture.png
|
3 years ago |
Aloïs Micard
|
2eb416845e
|
improve logging
|
3 years ago |
Aloïs Micard
|
c5bd0b3b87
|
remove old CD stuff
|
3 years ago |
Aloïs Micard
|
ad808e6b31
|
indexer: do not publish duplicate URLs
|
3 years ago |
Aloïs Micard
|
797c3df9a5
|
move api client into appropriate package
|
3 years ago |
Aloïs Micard
|
4d250b6cb0
|
Finalize refactoring
|
3 years ago |
Aloïs Micard
|
c42cb26a11
|
remove extractor
|
3 years ago |
Aloïs Micard
|
a996bf2d5b
|
Turn API into indexer
|
3 years ago |
Aloïs Micard
|
ace3b4a0fc
|
Merge pull request #102 from creekorful/100-archiver-rework
Make API publish new index event
|
3 years ago |
Aloïs Micard
|
9e75f85923
|
Rename handler
|
3 years ago |
Aloïs Micard
|
db7d204724
|
Fix comment
|
3 years ago |
Aloïs Micard
|
f4099838e1
|
Make API publish new index event
|
3 years ago |
Aloïs Micard
|
f855f51a8e
|
add missing flag to extractor
|
3 years ago |
Aloïs Micard
|
1efc3d5263
|
Merge pull request #99 from creekorful/replicated-checks
Replicated checks
|
3 years ago |
Aloïs Micard
|
a70a958ee4
|
Create constraint package
|
3 years ago |
Aloïs Micard
|
d62af0889e
|
extractor: check forbidden hostnames
|
3 years ago |
Aloïs Micard
|
9fb1861d62
|
Merge pull request #98 from creekorful/replicated-checks
Replicated checks
|
3 years ago |
Aloïs Micard
|
2f6a569349
|
api: fix tests
|
3 years ago |
Aloïs Micard
|
17e4e0cd61
|
crawler: skip forbidden hostnames
|
3 years ago |
Aloïs Micard
|
ab9bf3d4c3
|
api: do not save forbidden hostname
|
3 years ago |
Aloïs Micard
|
40470b2221
|
release.sh: assume release commit exist
|
3 years ago |
Aloïs Micard
|
592a1621d6
|
Bump app versions
|
3 years ago |