Architectures like SPARC do not allow unaligned accesses. Avoid them by
memcpy()ing the data to an aligned buffer. On x86 systems, where
unaligned loads are fast, an optimizing compiler will normally compile
the memcpy() away and generate the same code as before.
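
For illustration, a minimal sketch of the memcpy() approach. The helper names `load_u32` and `store_u32` are hypothetical and do not come from this change; the 32-bit width is likewise an assumption made for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address that may be
 * unaligned. memcpy() has no alignment requirement on its arguments, so
 * this is safe on strict-alignment targets such as SPARC. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;                 /* local variable, suitably aligned */
    memcpy(&v, p, sizeof v);    /* copy the bytes instead of casting p */
    return v;
}

/* Hypothetical helper: write a 32-bit value to a possibly unaligned address. */
static void store_u32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

Callers would replace a cast such as `*(const uint32_t *)p` with `load_u32(p)`. The value is read in host byte order, the same as the cast it replaces.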
- `__APPLE__` vs `__OS2__` - does this actually work? Would someone please test on Apple and give feedback?
- pthread via `AX_PTHREAD`
- version via `PACKAGE_VERSION` in `config.h`
- large file support via `AC_SYS_LARGEFILE`