61 Commits (ec3357be5d402205d562500ee378dc36a330384c)

Author SHA1 Message Date
Aloïs Micard ec3357be5d
Big improvements
- Reduce debug noise
- Create scripts to blacklist 'famous' legit hostnames
- Make blacklister more resilient
- Merge archiver & indexer together
- Better prefix for cache key
- Rework scheduling process
- Update architecture.png
- Remove trandoshanctl
- Improve testing
3 years ago
Aloïs Micard a70a958ee4
Create constraint package 3 years ago
Aloïs Micard 17e4e0cd61
crawler: skip forbidden hostnames 3 years ago
Aloïs Micard 0cca8b0037
make crawler publish event in case crawling has failed 3 years ago
Aloïs Micard 275bad8a6e
Finalize refactoring 4 years ago
Aloïs Micard 8555c5eb05
Refactor configapi 4 years ago
Aloïs Micard 1feda6e3b9
Refactor archiver 4 years ago
Aloïs Micard d1633844c4
Refactor crawler using new process library 4 years ago
Aloïs Micard aacb28ea1c
Start refactoring crawler 4 years ago
Aloïs Micard 18bb162ac7
event: Create new SubscribeAll function 4 years ago
Aloïs Micard 2d86123c4a
SubscribeAsync -> Subscribe 4 years ago
Aloïs Micard a9e1d44e6c
Rework event system: add RawMessage 4 years ago
Aloïs Micard 54ff87a130
event: add PublishJson method 4 years ago
Aloïs Micard bf884d16c2
Bump app versions 4 years ago
Aloïs Micard 5b37a4aeb0
Add time to new resource event 4 years ago
Aloïs Micard ca47be907f
Fix state visibility 4 years ago
Aloïs Micard 4ff76ed552
Refactor crawler 4 years ago
Aloïs Micard 91a0dbb0ba
Move http folder into crawler 4 years ago
Aloïs Micard 3a78e26ee2
Merge remote-tracking branch 'origin/master' into develop 4 years ago
Aloïs Micard ae31e70c42
Rename event-srv -> hub 4 years ago
Aloïs Micard db983c584b
Merge remote-tracking branch 'origin/develop' into rabbitmq-refactoring 4 years ago
Aloïs Micard 5f657cfc74
Add mock http client & response 4 years ago
Aloïs Micard 59aa2cf86f
Crawl headers 4 years ago
Aloïs Micard 0dc70f63f7
Refactor to use RabbitMQ 4 years ago
Aloïs Micard 0c4013f0c1
Bump app versions 4 years ago
Aloïs Micard e4a01a1876
Harmonize error management 4 years ago
Aloïs Micard f297c9eab5
Harmonize logging messages 4 years ago
Aloïs Micard c752eb95e0
Release 0.5.1 4 years ago
Aloïs Micard d55e0e4609
Release 0.5.0 4 years ago
Aloïs Micard a4f86fbee9
Finalize usage of authentication for components 4 years ago
Aloïs Micard f24f86fa6e
Fix build (disable goreportcard-action) 4 years ago
Aloïs Micard fd32c66774
Implement API authentication
Also split source code into new architecture + start writing tests
4 years ago
Aloïs Micard cf92b5ff31
s/process/component 4 years ago
Aloïs Micard 7582af03f2
Release 0.4.0 4 years ago
Aloïs Micard fa348dca5d
Last cleanups
- API: implement pagination for search endpoints
- Crawler: do not save body when code > 302
- Scripts: add stop.sh
4 years ago
Aloïs Micard 0e6477dd0a
Now follow redirect 4 years ago
Aloïs Micard 5b220de671
Move messaging into internal package 4 years ago
Aloïs Micard 1c8368704c
Delete pkg/ package and split it 4 years ago
Aloïs Micard a0be5160dc
Start implementing new architecture 4 years ago
Aloïs Micard 8eedbdd572
Release 0.3.0 4 years ago
Aloïs Micard 325c6ef175
Migrate to zerolog
Closes: #20
4 years ago
Aloïs Micard c043ad86f7
Release 0.2.0 4 years ago
Aloïs Micard a635722690
Fix docker image build 4 years ago
Aloïs Micard 82a4a9c527
Add tdsh- prefix to executables.
API#searchResources:

- Serialize date
- Do not return body in get
4 years ago
Aloïs Micard 5c739b5809
Release 0.1.0 4 years ago
Aloïs Micard 1413680121
[#9] Prevent from crawling binary, image, etc... 4 years ago
Aloïs Micard b5b58a8d19
Cleanup code 4 years ago
Aloïs Micard 05df5c56a4
Name apps, write test 4 years ago
Aloïs Micard 56cb94258f
Crawler: Allow to customize user agent 4 years ago
Aloïs Micard 6b28f074d1
Implement API
The persister process will now use the API to save resource content.
The scheduler will also use the API to get a resource by URL, and will later
determine whether scheduling should be done based on its own algorithm
4 years ago