Aleksa Sarai 6f1b70e5eb util.utf8: improve CJK character detection
Previously the CJK character detection defined only characters in the
range U+4000..U+AFFF as "CJK characters". This excludes an incredibly
large number of CJK characters within the BMP, let alone the whole two
planes dedicated to rarer CJK characters (the SIP and TIP). As a result,
a very large number of Chinese, Japanese, and Korean characters were not
detected as being CJK characters.

While slightly less elegant-looking, it is far more accurate to compute
the codepoint from the utf8 character and then see if it falls within
one of the defined CJK blocks. This is not future-proof against CJK
ideograph extensions in later Unicode versions, but there is no real way
to accurately predict such changes, so this is the best we can do
without accidentally treating characters explicitly defined as being
non-CJK in Unicode as CJK.
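
The approach described above (decode the codepoint from the UTF-8 bytes, then test it against a table of blocks) can be sketched as follows. This is a minimal illustration in Python rather than the Lua code the commit touches. The names `CJK_BLOCKS`, `utf8_to_codepoint` and `is_cjk_char` are hypothetical, and the block table is only a small subset of Unicode's CJK-related blocks, not the list the commit actually uses.

```python
# Illustrative subset of Unicode blocks (inclusive ranges).
# Assumption: this is NOT the exact table used by the commit.
CJK_BLOCKS = [
    (0x3000, 0x303F),    # CJK Symbols and Punctuation
    (0x3040, 0x309F),    # Hiragana
    (0x30A0, 0x30FF),    # Katakana
    (0x3400, 0x4DBF),    # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0xAC00, 0xD7AF),    # Hangul Syllables
    (0xF900, 0xFAFF),    # CJK Compatibility Ideographs
    (0x20000, 0x2A6DF),  # CJK Unified Ideographs Extension B (in the SIP)
]


def utf8_to_codepoint(raw: bytes) -> int:
    """Decode a single UTF-8 encoded character (1 to 4 bytes) to its codepoint."""
    first = raw[0]
    if first < 0x80:              # 0xxxxxxx: ASCII
        cp, extra = first, 0
    elif first >> 5 == 0b110:     # 110xxxxx: 2-byte sequence
        cp, extra = first & 0x1F, 1
    elif first >> 4 == 0b1110:    # 1110xxxx: 3-byte sequence
        cp, extra = first & 0x0F, 2
    elif first >> 3 == 0b11110:   # 11110xxx: 4-byte sequence
        cp, extra = first & 0x07, 3
    else:
        raise ValueError("invalid UTF-8 lead byte")
    if len(raw) != extra + 1:
        raise ValueError("wrong number of bytes for this lead byte")
    for byte in raw[1:]:
        if byte >> 6 != 0b10:     # continuation bytes look like 10xxxxxx
            raise ValueError("invalid UTF-8 continuation byte")
        cp = (cp << 6) | (byte & 0x3F)
    return cp


def is_cjk_char(char: str) -> bool:
    """Return True if the character's codepoint falls in one of CJK_BLOCKS."""
    cp = utf8_to_codepoint(char.encode("utf-8"))
    return any(low <= cp <= high for low, high in CJK_BLOCKS)
```

With this table, `is_cjk_char("あ")` (U+3042) and `is_cjk_char("한")` (U+D55C) are both true, whereas a single U+4000..U+AFFF range check would reject both.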

While we're at it, copy Lua 5.3's utf8.charpattern constant definition
so that we can more easily write utf8 iterators with string.gmatch (at
least in the interim until there is a rework of utf8 handling in
KOReader and everything is rebuilt on top of utf8proc).
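
To show what such a pattern-based iterator does, here is a rough Python analogue of iterating with `string.gmatch` and `utf8.charpattern`. It assumes the Lua 5.3 definition of the pattern, `"[\0-\x7F\xC2-\xF4][\x80-\xBF]*"`, and the names `UTF8_CHARPATTERN` and `iter_utf8_chars` are hypothetical. It is a sketch of the idea, not the code added by the commit.

```python
import re

# Byte-level equivalent of the Lua 5.3 pattern: one lead byte (ASCII, or
# 0xC2..0xF4) followed by zero or more continuation bytes (0x80..0xBF).
# Like the Lua pattern, it assumes the input is already valid UTF-8.
UTF8_CHARPATTERN = re.compile(rb"[\x00-\x7F\xC2-\xF4][\x80-\xBF]*")


def iter_utf8_chars(data: bytes):
    """Yield each UTF-8 encoded character in data as its own bytes object."""
    for match in UTF8_CHARPATTERN.finditer(data):
        yield match.group(0)
```

For example, `[c.decode("utf-8") for c in iter_utf8_chars("aあ한".encode("utf-8"))]` splits the input into the three characters `a`, `あ` and `한`, regardless of how many bytes each one occupies.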

Some unit tests are added for Korean and Japanese text, and the existing
unit tests needed a minor adjustment to handle the fact that
isSplittable now correctly detects CJK punctuation as a character to
compare against the forbidden split rules.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
3 years ago
.ci CI: Update to Busted 2.0.0 3 years ago
.circleci [CI] Fix certificate issue with quick koreader/koappimage:0.1.8 (#8305) 3 years ago
.github Bug report changes (#7709) 3 years ago
base@cf317f2fc5 bump crengine: upstream sync, page splitting fix (#8360) 3 years ago
doc Fix docs, evernote → exporter (#8158) 3 years ago
frontend util.utf8: improve CJK character detection 3 years ago
l10n@4ea307af6e Update translations for 2021.10 (#8334) 3 years ago
metadata/en-US F-Droid description: tame expectations (#8178) 3 years ago
platform Bump android-luajit-launcher 3 years ago
plugins DocSettings/Purge .sdr: reword, don't purge other books (#8348) 3 years ago
resources Quick start guide: revamp text and look (#7985) 3 years ago
spec/unit util.utf8: improve CJK character detection 3 years ago
test@86eeb0b43d various test/coverage optimization 8 years ago
tools wbuilder: use correct call for BookStatus widget (#8342) 3 years ago
.busted Travis update 9 years ago
.codecov.yml [CI] Add .codecov.yml (#4695) 5 years ago
.editorconfig experimental port to Mac OSX 8 years ago
.gitignore Add macOS target 4 years ago
.gitmodules add basic metadata for F-Droid 3 years ago
.luacheckrc Update UI layout code to use new SVG icons 3 years ago
.luacov [CI] Also run coverage on plugins (#3447) 7 years ago
.shellcheckrc [CI] Add curly braces check (#5809) 4 years ago
.travis.yml quickstart fix (#2804) 7 years ago
COPYING switch license to AGPLv3 10 years ago
Makefile Bump android backend (#7813) 3 years ago
README.md Readme: Tame down expectations (#8177) 3 years ago
datastorage.lua [CI] Mac OS app (#6955) 3 years ago
defaults.lua Raise DocumentCache hard cap to 512MB 3 years ago
kodev Bump android backend (#7813) 3 years ago
reader.lua FileManager/ReaderUI: Clarify the current instance accessor (#7658) 3 years ago
setupkoenv.lua Truly silence the attempt at loading SDL2 3 years ago

README.md

KOReader

KOReader is a document viewer primarily aimed at e-ink readers.


Download • User guide • Wiki • Developer docs

Main features

  • portable: runs on embedded devices (Cervantes, Kindle, Kobo, PocketBook, reMarkable), Android and Linux computers. Developers can run a KOReader emulator on Linux and macOS.

  • multi-format documents: supports fixed page formats (PDF, DjVu, CBT, CBZ) and reflowable e-book formats (EPUB, FB2, Mobi, DOC, CHM, TXT). Scanned PDF/DjVu documents can also be reflowed with the built-in K2pdfopt library.

  • full-featured reading: multi-lingual user interface with a highly customizable reader view and many typesetting options. You can set arbitrary page margins, override line spacing and choose external fonts and styles. It has multi-lingual hyphenation dictionaries bundled into the application.

  • integrated with calibre (search metadata, receive ebooks wirelessly, browse library via OPDS), Wallabag, Wikipedia, Google Translate and other content providers.

  • optimized for e-ink devices: custom UI without animation, with paginated menus, adjustable text contrast, and easy zoom to fit content or page in paged media.

  • extensible: via plugins

  • fast: on some older devices, it has been measured to have less than half the page-turn delay of the built-in reading software.

  • and much more: look up words with StarDict dictionaries / Wikipedia, add your own online OPDS catalogs and RSS feeds, over-the-air software updates, an FTP client, an SSH server, …

Please check the user guide and the wiki to discover more features and to help us document them.

Screenshots

Installation

Please follow the model-specific steps for your device:

Android • Cervantes • Kindle • Kobo • Linux • PocketBook • reMarkable

Development

Setting up a build environment • Collaborating with Git • Building targets • Porting • Developer docs

Support

KOReader is developed and supported by volunteers all around the world. There are many ways you can help:

Right now we only support Liberapay donations, but you can also create a bounty to motivate others to work on a specific bug or feature request.

Contributors
