3.6. TODO

3.6.1. For next release

  • nice progress bars for large uploads and downloads (in client + server log)

  • more joedbc code generation:

    • Split Database with Database_Storage parent

    • Readonly_Interpreted_Database

    • Reflection macro: tables/<table_name>/reflection.h

  • Blob cache:

    • keep blob translation index in a joedb file (erasable)

    • write blobs to another file with max size

    • when max size reached, start again from the start (evict overwritten entries)

  • proper handling of unique_index with more than one column:

    • joedbc produces a function to update multiple values simultaneously. Index columns cannot be updated individually.

    • do not allow more than one unique index with the same last column.

    • when reading the file, update index only when last column is updated

    • This may break old files.

    • No more than one unique index per table?

  • allow reading dropped fields in custom functions that are invoked before the drop. Store data in a column vector, and clear the vector at the time of the drop. Make sure field id is not reused. (make access function private, and custom functions are friends)

  • Add support for vcpkg

  • non-durable transactions that do not break durability:

    • switch checkpoints only after a durable transaction

    • use a negative value for a non-durable checkpoint

    • when opening a file: if the non-durable checkpoint is equal to the file size, accept it by default (configurable by an option)

    • client option to run a durable transaction every n seconds
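
The sign convention for non-durable checkpoints could be sketched roughly as below. This is only an illustration: encode_checkpoint and resolve_checkpoint are hypothetical names, not existing joedb functions, and rejecting a durable checkpoint that lies beyond the end of the file is an assumption of the sketch.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical encoding: a durable checkpoint is stored as-is, a
// non-durable one as the negated position. Position 0 would be
// ambiguous; the sketch assumes positions are always > 0.
inline std::int64_t encode_checkpoint(std::int64_t position, bool durable)
{
  return durable ? position : -position;
}

// Returns the position to resume from, or nothing if the stored
// checkpoint cannot be trusted.
inline std::optional<std::int64_t> resolve_checkpoint
(
  std::int64_t stored,
  std::int64_t file_size,
  bool accept_non_durable = true
)
{
  if (stored >= 0)
  {
    // Durable checkpoint: assumed valid if it lies within the file.
    if (stored <= file_size)
      return stored;
    return std::nullopt;
  }

  // Non-durable checkpoint: accepted only if it matches the file
  // size exactly, and only if the caller allows it.
  const std::int64_t position = -stored;
  if (accept_non_durable && position == file_size)
    return position;
  return std::nullopt;
}
```

With this convention, a reader that insists on durability passes accept_non_durable = false and treats a missing result as "fall back to an earlier durable checkpoint", which the sketch does not model.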

3.6.2. New Operations and Types

  • Add an undo operation to the log. This way, it is possible to keep all branches of history.

  • Use diff for large-string update

  • Differentiate between “storage type” and “usage type”:

    • remove bool type and use int8 instead, with bool usage

    • usages: bool(int8), date(int64).

    • uint8, uint16, uint32, uint64

    • custom usage label: ip address(int32), URL(string), PNG file(string), UTF8(string) (use base64 instead for json output), …?
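
One possible shape for the storage/usage split is sketched below. All names (Storage_Type, Usage, Field_Type, is_valid, json_needs_base64) are hypothetical. The base64 rule is one reading of the note above: only strings labelled utf8 are written as plain JSON strings.

```cpp
// How a value is stored in the file (hypothetical subset).
enum class Storage_Type {int8, int32, int64, string};

// How the application is meant to interpret the stored value.
enum class Usage {none, boolean, date, ip_address, url, png_file, utf8};

struct Field_Type
{
  Storage_Type storage;
  Usage usage;
};

// A usage label is valid only on top of its expected storage type.
constexpr bool is_valid(Field_Type t)
{
  switch (t.usage)
  {
    case Usage::none: return true;
    case Usage::boolean: return t.storage == Storage_Type::int8;
    case Usage::date: return t.storage == Storage_Type::int64;
    case Usage::ip_address: return t.storage == Storage_Type::int32;
    case Usage::url:
    case Usage::png_file:
    case Usage::utf8: return t.storage == Storage_Type::string;
  }
  return false;
}

// Assumption of this sketch: any string not labelled utf8 is
// base64-encoded in JSON output.
constexpr bool json_needs_base64(Field_Type t)
{
  return t.storage == Storage_Type::string && t.usage != Usage::utf8;
}
```

Keeping the usage as a label next to the storage type means the file format only needs to know the storage types, while tools such as JSON export can consult the label.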

3.6.3. Blobs

  • network protocol extension to handle local blob cache without downloading everything

  • zero-copy access to blob data using memory-mapped file
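
The bounded blob cache listed under "For next release" (a blob file with a maximum size that restarts from the beginning when full) might behave like the in-memory sketch below. Blob_Ring_Cache is a hypothetical name; a std::vector stands in for the blob file and two std::map objects stand in for the translation index.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

class Blob_Ring_Cache
{
 private:
  std::vector<char> storage; // stands in for the bounded blob file
  // offset -> (blob id, size), for finding overwritten entries
  std::map<std::size_t, std::pair<std::int64_t, std::size_t>> by_offset;
  std::map<std::int64_t, std::size_t> offset_of; // blob id -> offset
  std::size_t write_position = 0;

  // Evict every cached blob that overlaps [begin, end).
  void evict_range(std::size_t begin, std::size_t end)
  {
    auto it = by_offset.lower_bound(begin);
    if (it != by_offset.begin())
    {
      auto prev = std::prev(it);
      if (prev->first + prev->second.second > begin)
        it = prev;
    }
    while (it != by_offset.end() && it->first < end)
    {
      offset_of.erase(it->second.first);
      it = by_offset.erase(it);
    }
  }

 public:
  explicit Blob_Ring_Cache(std::size_t max_size): storage(max_size) {}

  bool put(std::int64_t id, const std::string &blob)
  {
    if (blob.empty() || blob.size() > storage.size())
      return false;

    // Replacing an id: forget its previous location first.
    const auto old = offset_of.find(id);
    if (old != offset_of.end())
    {
      by_offset.erase(old->second);
      offset_of.erase(old);
    }

    // Max size reached: start again from the start.
    if (write_position + blob.size() > storage.size())
      write_position = 0;

    evict_range(write_position, write_position + blob.size());
    std::copy
    (
      blob.begin(),
      blob.end(),
      storage.begin() + static_cast<std::ptrdiff_t>(write_position)
    );
    by_offset[write_position] = {id, blob.size()};
    offset_of[id] = write_position;
    write_position += blob.size();
    return true;
  }

  std::optional<std::string> get(std::int64_t id) const
  {
    const auto it = offset_of.find(id);
    if (it == offset_of.end())
      return std::nullopt;
    const std::size_t size = by_offset.at(it->second).second;
    return std::string(storage.data() + it->second, size);
  }
};
```

A cache miss (an empty result from get) is where a client would fall back to fetching the blob from the server.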

3.6.4. On-disk Storage

  • In a directory

  • A checkpoint file (2 copies, valid if identical)

  • A subdirectory for each table

  • One file per column vector

  • One file for string data (string column = size + start_index)

  • Use memory-mapped files (is there a portable way?)
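
Two of the ideas above, the string column stored as (size, start_index) pairs into one shared string-data file, and the checkpoint that is valid only if its two copies are identical, can be sketched in memory as follows. String_Ref, String_Column and read_checkpoint are hypothetical names, and in-memory containers stand in for the files.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// One fixed-size entry of a string column.
struct String_Ref
{
  std::uint64_t size;
  std::uint64_t start_index;
};

class String_Column
{
 private:
  std::string data;             // stands in for the string-data file
  std::vector<String_Ref> refs; // stands in for the column-vector file

 public:
  // Appends a string and returns its row index in the column.
  std::size_t append(const std::string &s)
  {
    refs.push_back({s.size(), data.size()});
    data += s;
    return refs.size() - 1;
  }

  std::string get(std::size_t row) const
  {
    const String_Ref &ref = refs.at(row);
    return data.substr(ref.start_index, ref.size);
  }
};

// The checkpoint is trusted only if both stored copies agree.
inline std::optional<std::int64_t> read_checkpoint
(
  std::int64_t copy_a,
  std::int64_t copy_b
)
{
  if (copy_a == copy_b)
    return copy_a;
  return std::nullopt;
}
```

Because every String_Ref has the same size, the column file can be indexed directly by row, which is what makes a memory-mapped layout attractive.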

3.6.5. Compiler

  • check that vector range is OK in constructor of vector update

  • modularize code generation

    • Each module should have:

      • required include files

      • data structure for storing data

      • additional hidden table fields?

      • triggers (after/before insert/update/delete)

      • public methods

    • Possible to modularize:

      • indexes

      • sort functions

      • referential integrity

      • safety checks

      • incrementally updated group-by queries

  • use std::set and std::multiset for indexes? Might be better for strings.

  • Table options:

    • no_delete: allows more efficient indexing (+smaller code)

    • last N (for web access log) (last 0 = none)

  • Allow the user to write custom event-processing functions and store information in custom data structures (for instance: collect statistics from web access log without storing whole log in RAM).

  • Compiler utilities:

    • referential integrity

    • queries (SQL compiler?)

    • incrementally-updated group-by queries (OLAP, hypercube, …)

3.6.6. Better Freedom_Keeper

  • index returned by public methods of Freedom_Keeper should be record ids.

  • No need to maintain a linked list of individual records

  • A linked list of intervals instead, to unify everything

  • Let joedb_merge fuse intervals to remove holes (100% update_vector)

  • joedb_to_json can also become more efficient

  • Get ready for “last-N” storage, and no_delete option (force single interval).
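
The interval-based bookkeeping suggested above could look like the following sketch. Interval_Keeper is a hypothetical name, not the existing Freedom_Keeper interface, and a std::map of used intervals is used here instead of a linked list, purely for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

class Interval_Keeper
{
 private:
  // first id -> last id (inclusive); intervals are disjoint and
  // never adjacent, so each maximal run of used ids is one entry.
  std::map<std::int64_t, std::int64_t> used;

 public:
  bool is_used(std::int64_t id) const
  {
    auto it = used.upper_bound(id);
    if (it == used.begin())
      return false;
    --it;
    return id <= it->second;
  }

  void use(std::int64_t id)
  {
    if (is_used(id))
      return;

    std::int64_t first = id;
    std::int64_t last = id;
    auto next = used.upper_bound(id);

    // Merge with the interval that ends just before id, if any.
    if (next != used.begin())
    {
      auto prev = std::prev(next);
      if (prev->second + 1 == id)
      {
        first = prev->first;
        used.erase(prev);
      }
    }

    // Merge with the interval that starts just after id, if any.
    if (next != used.end() && next->first == id + 1)
    {
      last = next->second;
      used.erase(next);
    }

    used[first] = last;
  }

  void free(std::int64_t id)
  {
    auto it = used.upper_bound(id);
    if (it == used.begin())
      return;
    --it;
    if (id > it->second)
      return;

    // Split the interval containing id into at most two pieces.
    const std::int64_t first = it->first;
    const std::int64_t last = it->second;
    used.erase(it);
    if (first < id)
      used[first] = id - 1;
    if (id < last)
      used[id + 1] = last;
  }

  std::size_t interval_count() const {return used.size();}
};
```

In this representation, a table with no holes is a single interval, which corresponds to the "force single interval" case mentioned for the no_delete option.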

3.6.7. Concurrency

  • joedb_server:

  • SHA-256: option for either none, fast or full.

  • Connection_Multiplexer for multiple parallel backup servers? Complicated: requires asynchronous client code.

  • Do not crash on write error, continue to allow reading?

  • Notifications from server to client, in a second channel:

    • when another client makes a push

    • when the lock times out

    • when the server is interrupted

    • ping

  • SQLite connection (store checkpoint and lock in DB + fail on pull if anything to be pulled)

3.6.8. Use case: log with safe real-time remote backup

  • log rotation, ability to delete or compress early part of the log:

    • multi-part file

    • keeps a table with all parts

    • keep first part as schema definition + checkpoint

    • skip deleted parts when reading

    • option to compress a part at rotation time

  • Asynchronous Server Connection (for tamper-proof log backup)

    • does not wait for confirmation after push

    • can batch frequent pushes (do not send new push until after receiving the previous push confirmation)

    • keeps working even if server dies
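
The batching rule for the asynchronous connection (no new push until the previous one is confirmed) can be illustrated with a small state machine. Push_Batcher and its methods are hypothetical names; the send callback stands in for the network layer, and reconnection after a server failure is left out of the sketch.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

class Push_Batcher
{
 public:
  // Called with the byte range [from, until) to push to the server.
  using Send = std::function<void(std::int64_t from, std::int64_t until)>;

 private:
  Send send;
  std::int64_t local_size = 0;     // size of the local log
  std::int64_t sent_size = 0;      // covered by the last push sent
  std::int64_t confirmed_size = 0; // confirmed by the server
  bool in_flight = false;

  // Sends at most one push at a time, covering everything written
  // since the previous push.
  void try_send()
  {
    if (!in_flight && local_size > sent_size)
    {
      const std::int64_t from = sent_size;
      sent_size = local_size;
      in_flight = true;
      send(from, sent_size);
    }
  }

 public:
  explicit Push_Batcher(Send send): send(std::move(send)) {}

  // The writer never waits: it records the new size and returns.
  void on_local_write(std::int64_t new_size)
  {
    local_size = new_size;
    try_send();
  }

  // Called when the server confirms the push currently in flight.
  void on_confirmation()
  {
    confirmed_size = sent_size;
    in_flight = false;
    try_send();
  }

  std::int64_t get_confirmed_size() const {return confirmed_size;}
};
```

If the server stops answering, on_confirmation is never called and local writes simply accumulate, so the writer keeps working; everything pending would go out as one batch once a connection is available again.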

3.6.9. Performance

3.6.10. joedb_admin

  • serve with boost::beast.

  • work as a client to a joedb_server.

  • customizable GUI, similar to the icga database editor.

3.6.11. Other Ideas

  • One separate class for each exception, like joedb::exception::Out_Of_Date.

  • Is it possible to replace macros by templates?

  • ability to indicate minimum joedb version in joedbc (and joedbi?)

  • better readable interface:

    • a separate table abstraction (that could be used for query output)

    • cursors on tables

  • compiled Readable

  • Deal properly with inf and nan everywhere (logdump, joedb_admin, …)

  • Note that SQL does not support inf and nan. Use NULL instead.

  • Raw commands in interpreter?

  • import from SQL

  • namespace for each subdir?