Rewrite README

Andrea Cavalli 2021-02-03 14:08:32 +01:00
parent 5c98465637
commit 09ec134b51
3 changed files with 61 additions and 48 deletions


@@ -1,55 +1,68 @@
CavalliumDB Engine CavalliumDB Engine
================== ==================
A very simple wrapper for RocksDB and Lucene, with gRPC and direct connections. A very simple reactive wrapper for RocksDB and Lucene.
This is not a database, but only a wrapper for Lucene Core and RocksDB, with a bit of abstraction. This is not a database, but only a wrapper for Lucene Core and RocksDB, with a bit of abstraction.
# Features # Features
## RocksDB Key-Value NoSQL database engine - **RocksDB key-value database engine**
- Snapshots - Snapshots
- Multi-column databases - Multi-column database
- WAL and corruption recovery strategies - Write-ahead log and corruption recovery
- Multiple data types: - Multiple data types:
- Bytes (Singleton) - Single value (Singleton)
- Maps of bytes (Dictionary) - Map (Dictionary)
- Maps of maps of bytes (Deep dictionary) - Composable nested map (Deep dictionary)
- Sets of bytes (Dictionary without values) - Customizable data serializers
- Maps of sets of bytes (Deep dictionary without values) - Values codecs
- Update-on-write value versioning using versioned codecs
## Apache Lucene Core indexing library - **Apache Lucene Core indexing library**
- Documents structure - Snapshots
- Sorting - Documents structure
- Sorting
- Ascending and descending - Ascending and descending
- Numeric or non-numeric - Numeric or non-numeric
- Searching - Searching
- Nested search terms - Nested search terms
- Combined search terms - Combined search terms
- Fuzzy text search - Fuzzy text search
- Coordinates, integers, longs, strings, text - Coordinates, integers, longs, strings, text
- Indicization and analysis - Indicization and analysis
- N-gram - N-gram
- Edge N-gram - Edge N-gram
- English words - English words
- Stemming - Stemming
- Stopwords removal - Stopwords removal
- Results filtering - Results filtering
- Snapshots
# F.A.Q. # F.A.Q.
## Why is it so difficult? - **Why is it so difficult to use?**
This is not a DMBS.
This is an engine on which a DBMS can be built upon. For this reason it's very difficult to use it directly without using it through abstraction layers. This is not a DBMS.
## Can I use objects in the database? This is an engine on which a DBMS can be built upon; for this reason it's very difficult to use directly without building another abstraction layer on top.
Yes you must serialize/deserialize them using a library of your choice.
## Why there is a snapshot function for each database part? - **Can I use objects instead of byte arrays?**
Since RocksDB and lucene indices are different instances, every instance has its own snapshot function.
To have a single snapshot you must implement it as a collection of sub-snapshots in your DBMS. Yes, you must serialize/deserialize them using a library of your choice.
## Is CavalliumDB Engine suitable for your project? CodecSerializer allows you to implement versioned data using a codec for each data version.
No. Note that it uses 1 to 4 bytes more for each value to store the version.
This engine is largely undocumented, and it doesn't provide extensive tests on its methods.
- **Why there is a snapshot function for each database part?**
Since RocksDB and lucene indices are different software, you can't take a snapshot of everything in the same instant.
A single snapshot must be implemented as a collection of all the snapshots.
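As a rough illustration of the "collection of all the snapshots" idea in the answer above, here is a minimal sketch. The class name, the method name and the use of `LongSupplier` as a stand-in for "take a snapshot of this part and return its id" are all assumptions made for the example, not CavalliumDB Engine types. The sub-snapshots are taken one after another, so they are not captured at the same instant.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

public class CompositeSnapshotSketch {

    /**
     * Takes one sub-snapshot per database part, in iteration order, and
     * returns a read-only map from part name to sub-snapshot id.
     * Each LongSupplier is a hypothetical "take snapshot" call for one part.
     */
    public static Map<String, Long> takeCompositeSnapshot(Map<String, LongSupplier> parts) {
        Map<String, Long> ids = new LinkedHashMap<>();
        for (Map.Entry<String, LongSupplier> part : parts.entrySet()) {
            // Sequential, not atomic: a write may land between two sub-snapshots.
            ids.put(part.getKey(), part.getValue().getAsLong());
        }
        return Collections.unmodifiableMap(ids);
    }

    public static void main(String[] args) {
        Map<String, LongSupplier> parts = new LinkedHashMap<>();
        parts.put("rocksdb", () -> 1L); // placeholder ids, for illustration only
        parts.put("lucene", () -> 7L);
        System.out.println(takeCompositeSnapshot(parts));
    }
}
```

A layer built on top would keep the returned map as its single snapshot handle and pass the matching id to each part when reading.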
- **Is CavalliumDB Engine suitable for your project?**
No.
This engine is largely undocumented, and it doesn't provide extensive tests.
# Examples
In `src/example/java` you can find some quick implementations of each core feature.
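
The versioned-codec idea from the F.A.Q. (one codec per data version, with a small version prefix stored in front of each value) can be sketched as follows. This is not the actual `CodecSerializer` API, which is not shown in this commit view. The class and method names are hypothetical, and the base-128 varint prefix is an assumption: it takes 1 byte for versions below 128 and up to 4 bytes for versions below 2^28, which is compatible with the "1 to 4 bytes" note but may differ from the real encoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class VersionedCodecSketch {

    /** Prefixes the payload with its version, written as a base-128 varint. */
    static byte[] encode(int version, byte[] payload) {
        if (version < 0) throw new IllegalArgumentException("version must be >= 0");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int v = version;
        while ((v & ~0x7F) != 0) {
            out.write((v & 0x7F) | 0x80); // low 7 bits, continuation bit set
            v >>>= 7;
        }
        out.write(v); // last byte, continuation bit clear
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }

    /** Reads the version prefix and dispatches to the decoder registered for it. */
    static <T> T decode(byte[] stored, Map<Integer, Function<byte[], T>> decoders) {
        int version = 0;
        int shift = 0;
        int pos = 0;
        byte b;
        do {
            b = stored[pos++];
            version |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        Function<byte[], T> decoder = decoders.get(version);
        if (decoder == null) throw new IllegalArgumentException("Unknown version " + version);
        return decoder.apply(Arrays.copyOfRange(stored, pos, stored.length));
    }

    public static void main(String[] args) {
        // Hypothetical history: version 0 stored Latin-1 text, version 1 stores UTF-8.
        Map<Integer, Function<byte[], String>> decoders = new HashMap<>();
        decoders.put(0, bytes -> new String(bytes, StandardCharsets.ISO_8859_1));
        decoders.put(1, bytes -> new String(bytes, StandardCharsets.UTF_8));

        byte[] oldValue = encode(0, "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1));
        String text = decode(oldValue, decoders);
        // Update-on-write: whatever version was read, re-encode with the newest codec.
        byte[] upgraded = encode(1, text.getBytes(StandardCharsets.UTF_8));
        System.out.println(decode(upgraded, decoders).equals(text));
    }
}
```

Old values stay readable because their decoder remains registered, and they migrate to the newest version only when they are next written.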


@@ -48,13 +48,13 @@ public class ParallelCollectorStreamSearcher implements LuceneStreamSearcher {
if (!realFields.isEmpty()) { if (!realFields.isEmpty()) {
logger.error("Present fields:"); logger.error("Present fields:");
for (IndexableField field : realFields) { for (IndexableField field : realFields) {
logger.error(" - " + field.name()); logger.error(" - {}", field.name());
} }
} }
} else { } else {
var field = d.getField(keyFieldName); var field = d.getField(keyFieldName);
if (field == null) { if (field == null) {
logger.error("Can't get key of document docId:" + docId); logger.error("Can't get key of document docId: {}", docId);
} else { } else {
resultsConsumer.accept(new LLKeyScore(field.stringValue(), score)); resultsConsumer.accept(new LLKeyScore(field.stringValue(), score));
} }


@@ -41,9 +41,9 @@ public class SimpleStreamSearcher implements LuceneStreamSearcher {
logger.error("The document docId: {}, score: {} is empty.", docId, score); logger.error("The document docId: {}, score: {} is empty.", docId, score);
var realFields = indexSearcher.doc(docId).getFields(); var realFields = indexSearcher.doc(docId).getFields();
if (!realFields.isEmpty()) { if (!realFields.isEmpty()) {
System.err.println("Present fields:"); logger.error("Present fields:");
for (IndexableField field : realFields) { for (IndexableField field : realFields) {
System.err.println(" - " + field.name()); logger.error(" - {}", field.name());
} }
} }
} else { } else {