215 Commits (main)

Author SHA1 Message Date
Aloïs Micard 9e2186b97a
Add missing comments 3 years ago
Aloïs Micard 5b37a4aeb0
Add time to new resource event 3 years ago
Aloïs Micard ac983b25ef
scheduler: fix errors perms 3 years ago
Aloïs Micard dea2cfe7b0
Refactor API 3 years ago
Aloïs Micard 6b54772ac4
Improve scheduler error management 3 years ago
Aloïs Micard 8516c8a00c
Refactor scheduler 3 years ago
Aloïs Micard ca47be907f
Fix state visibility 3 years ago
Aloïs Micard 73cb76e1f7
Refactor extractor 3 years ago
Aloïs Micard 4ff76ed552
Refactor crawler 3 years ago
Aloïs Micard 5ad83d57a0
Rewrite event support 3 years ago
Aloïs Micard 91a0dbb0ba
Move http folder into crawler 3 years ago
Aloïs Micard 972c76a383
Remove unusued regex 3 years ago
Aloïs Micard 4235145591
Create package for API 4 years ago
Aloïs Micard 3a78e26ee2
Merge remote-tracking branch 'origin/master' into develop 4 years ago
Aloïs Micard 39769e724f
Make message persistent by default 4 years ago
Aloïs Micard 180182482c
Final (tested) fixes to the api search method 4 years ago
Aloïs Micard fa542b4bcb
Fix ES db query 4 years ago
Aloïs Micard de4779724f
Prevent duplicates (enough this time?) 4 years ago
Aloïs Micard 8417539395
Fix RabbitMQ consumer 4 years ago
Aloïs Micard 1eb45d82b8
Merge pull request #53 from creekorful/rabbitmq-refactoring
Refactor to use RabbitMQ
4 years ago
Aloïs Micard 29ed1f2f5f
Put headers & meta in lowercase 4 years ago
Aloïs Micard ae31e70c42
Rename event-srv -> hub 4 years ago
Aloïs Micard db983c584b
Merge remote-tracking branch 'origin/develop' into rabbitmq-refactoring 4 years ago
Aloïs Micard 3b320d49c7
Merge remote-tracking branch 'origin/develop' into 54-extract-http-headers 4 years ago
Aloïs Micard 5f657cfc74
Add mock http client & response 4 years ago
Aloïs Micard 59aa2cf86f
Crawl headers 4 years ago
Aloïs Micard 82868521ab
extractor: prevent from publishing duplicates URLs 4 years ago
Aloïs Micard b365954e31
scheduler: filter protocol 4 years ago
Aloïs Micard 0dc70f63f7
Refactor to use RabbitMQ 4 years ago
Aloïs Micard 0c4013f0c1
Bump app versions 4 years ago
Aloïs Micard fec9d5c506
Make things more readable 4 years ago
Aloïs Micard c3d387b545
Finalize ACL implementation 4 years ago
Aloïs Micard 5e3bc78ae1
Improve TestHandleMessage 4 years ago
Aloïs Micard 4bcbcfaefd
Add todo about allow search by meta 4 years ago
Aloïs Micard 1f106dca49
Add meta & description to resource 4 years ago
Aloïs Micard e4a01a1876
Harmonize error management 4 years ago
Aloïs Micard f297c9eab5
Harmonize logging messages 4 years ago
Aloïs Micard e1c0320a7b
Allow to skip scheduling for url with forbidden extensions
Closes: #42
4 years ago
Aloïs Micard 0e05349f05
scheduler: reduce log noise 4 years ago
Aloïs Micard c752eb95e0
Release 0.5.1 4 years ago
carter a80360a8ce
Fixed doubling-up of URLs due to URL structure miss-match 4 years ago
Aloïs Micard 8233880fb8
Fix api test 4 years ago
Aloïs Micard 1e238a34d8
[#34] Improve search resources 4 years ago
Aloïs Micard 2301a25dff
Move database in api folder 4 years ago
Aloïs Micard d55e0e4609
Release 0.5.0 4 years ago
Aloïs Micard 8750830a62
Use resty client 4 years ago
Aloïs Micard a4f86fbee9
Finalize usage of authentication for components 4 years ago
Aloïs Micard 4633cc7695
Unit test scheduler 4 years ago
Aloïs Micard 73f52703f1
Unit test extractor 4 years ago
Aloïs Micard 27e7c9d2fa
Implement basic user registration system 4 years ago
Aloïs Micard f24f86fa6e
Fix build (disable goreportcard-action) 4 years ago
Aloïs Micard fd32c66774
Implement API authentication
Also split source code into new architecture + start writing tests
4 years ago
Aloïs Micard c2e501d0c2
Merge remote-tracking branch 'origin/master' into feature/api-authentication 4 years ago
Aloïs Micard cf92b5ff31
s/process/component 4 years ago
Aloïs Micard 87cff914b6
Start implement authentication endpoints 4 years ago
Aloïs Micard a1b17d7196
Merge remote-tracking branch 'origin/master' into feature/api-authentication 4 years ago
Aloïs Micard 362133bb23
Improve search rendering on trandoshanctl 4 years ago
Aloïs Micard 25d6452c65
Implement authorization trough JWT 4 years ago
Aloïs Micard 646da7dfdf
API contract: now pagine-able! 4 years ago
Aloïs Micard 7582af03f2
Release 0.4.0 4 years ago
Aloïs Micard b85e9944a2
API: fix startDate/endDate query param 4 years ago
Aloïs Micard 62b54bf385
[#12] Allow duplicate resource crawling 4 years ago
Aloïs Micard e61dc42d3c
Some cleanup 4 years ago
Aloïs Micard fa348dca5d
Last cleanups
- API: implement pagination for search endpoints
- Crawler: do not save body when code > 302
- Scripts: add stop.sh
4 years ago
Aloïs Micard e0dfc648b6
Implement search in trandoshanctl 4 years ago
Aloïs Micard 8d9d9524a7
Finalize search endpoint 4 years ago
Aloïs Micard cacf4f1236
Improve api search endpoint 4 years ago
Aloïs Micard 0e6477dd0a
Now follow redirect 4 years ago
Aloïs Micard 742ccbaa79
Finalize whole implementation 4 years ago
Aloïs Micard 6081a6a7c2
Move url extraction logic to extractor 4 years ago
Aloïs Micard f2b8984356
Little cleanup 4 years ago
Aloïs Micard ae5812c566
Make extractor publish found URLs 4 years ago
Aloïs Micard 560d7cb846
Implement extractor 4 years ago
Aloïs Micard 5b220de671
Move messaging into internal package 4 years ago
Aloïs Micard 20f67edd28
Create API client 4 years ago
Aloïs Micard 1c8368704c
Delete pkg/ package and split it 4 years ago
Aloïs Micard 11f04b1ca3
Add missing comments 4 years ago
Aloïs Micard a0be5160dc
Start implementing new architecture 4 years ago
Aloïs Micard 8eedbdd572
Release 0.3.0 4 years ago
Aloïs Micard 325c6ef175
Migrate to zerolog
Closes: #20
4 years ago
Aloïs Micard 42ee930160
Unit test scheduler 4 years ago
Aloïs Micard c043ad86f7
Release 0.2.0 4 years ago
Aloïs Micard a635722690
Fix docker image build 4 years ago
Aloïs Micard 82a4a9c527
Add tdsh- prefix to executables.
API#searchResources:

- Serialize date
- Do not return body in get
4 years ago
Aloïs Micard 5c739b5809
Release 0.1.0 4 years ago
Aloïs Micard 1413680121
[#9] Prevent from crawling binary, image, etc... 4 years ago
Aloïs Micard 45a9848395
Lint source code 4 years ago
Aloïs Micard b5b58a8d19
Cleanup code 4 years ago
Aloïs Micard 8ae38445cf
Move ResourceDto to proto package 4 years ago
Aloïs Micard 482dde3e17
[#8] Handle case no ES collection yet 4 years ago
Aloïs Micard 6519672b13
Api#addUrl: Fix sent message 4 years ago
Aloïs Micard ed7ea4596b
[#7] API should publish to URLFoundSubject
this will allow scheduler to approve or not.
4 years ago
Aloïs Micard 945651b93a
[#7] Fix publish URL 4 years ago
Aloïs Micard 05df5c56a4
Name apps, write test 4 years ago
Aloïs Micard 56cb94258f
Crawler: Allow to customize user agent 4 years ago
Aloïs Micard 599e6ef4d3
Fix wrong endpoint being used by scheduler
Also b64 encode the URL.

Closes #6
4 years ago
Aloïs Micard 75fa6724c9
Allow to submit new URL trough the API
Closes #4
4 years ago
Aloïs Micard 68ddf09aaa
Use logrus everywhere 4 years ago
Aloïs Micard 8a32bbe5fa
Finx lint issues 4 years ago
Aloïs Micard 6b28f074d1
Implement API
Now persister process will use API to save resource content.
Scheduler will also use the API to get resource by URL, and will later
determinate if scheduling should be done based on his own algorithm
4 years ago
Aloïs Micard 680eccef96
Fix wrong usage of logrus in trandoshan-api 4 years ago
Aloïs Micard 317a4eabbd
Add api process 4 years ago
Aloïs Micard 5f1dd4bec8
Implement persister
Add kibana & elasticsearch dependencies.
4 years ago
Aloïs Micard 7d2e666ba9
Refactor nats logic into natsutil 4 years ago
Aloïs Micard 82250b46ae
Centralize logging initialization 4 years ago
Aloïs Micard fd9d2e2b9e
Add persister process 4 years ago
Aloïs Micard 4e49e0aca9
Crawler: now publish message with resource body 4 years ago
Aloïs Micard 29da7859b4
Scheduler: normalized received URLs 4 years ago
Aloïs Micard cf8c2875cb
Run gofmt over the project 4 years ago
Aloïs Micard 28f32042c6
Setup crawler to use tor proxy to reach hidden services 4 years ago
Aloïs Micard 33269f7ffa
Centralize ReadJSON into natsutil 4 years ago
Aloïs Micard 6d349da3a6
Fix lint errors 4 years ago
Aloïs Micard c6a857f45b
Add basic scheduler implementation
- Create separate proto package to store Trandoshan related protocol implementation
4 years ago
Aloïs Micard 06f31f8d9c
Implement crawler process
- Also change module URL.
- Create natsutil package
4 years ago
Aloïs Micard 2f17ee088a
Initial commit 4 years ago