Rewrite README

Andrea Cavalli 2021-02-03 14:08:32 +01:00
parent 5c98465637
commit 09ec134b51
3 changed files with 61 additions and 48 deletions

README.md (101 lines changed)

@@ -1,55 +1,68 @@
CavalliumDB Engine
==================
A very simple wrapper for RocksDB and Lucene, with gRPC and direct connections.
A very simple reactive wrapper for RocksDB and Lucene.
This is not a database, but only a wrapper for Lucene Core and RocksDB, with a bit of abstraction.
# Features
## RocksDB Key-Value NoSQL database engine
- Snapshots
- Multi-column databases
- WAL and corruption recovery strategies
- Multiple data types:
- Bytes (Singleton)
- Maps of bytes (Dictionary)
- Maps of maps of bytes (Deep dictionary)
- Sets of bytes (Dictionary without values)
- Maps of sets of bytes (Deep dictionary without values)
## Apache Lucene Core indexing library
- Documents structure
- Sorting
- Ascending and descending
- Numeric or non-numeric
- Searching
- Nested search terms
- Combined search terms
- Fuzzy text search
- Coordinates, integers, longs, strings, text
- Indicization and analysis
- N-gram
- Edge N-gram
- English words
- Stemming
- Stopwords removal
- Results filtering
- Snapshots
- **RocksDB key-value database engine**
- Snapshots
- Multi-column database
- Write-ahead log and corruption recovery
- Multiple data types:
- Single value (Singleton)
- Map (Dictionary)
- Composable nested map (Deep dictionary)
- Customizable data serializers
- Values codecs
- Update-on-write value versioning using versioned codecs
- **Apache Lucene Core indexing library**
- Snapshots
- Documents structure
- Sorting
- Ascending and descending
- Numeric or non-numeric
- Searching
- Nested search terms
- Combined search terms
- Fuzzy text search
- Coordinates, integers, longs, strings, text
- Indicization and analysis
- N-gram
- Edge N-gram
- English words
- Stemming
- Stopwords removal
- Results filtering
# F.A.Q.
## Why is it so difficult?
This is not a DMBS.
- **Why is it so difficult to use?**
This is not a DBMS.
This is an engine on which a DBMS can be built upon; for this reason it's very difficult to use directly without building another abstraction layer on top.
- **Can I use objects instead of byte arrays?**
Yes, you must serialize/deserialize them using a library of your choice.
CodecSerializer allows you to implement versioned data using a codec for each data version.
Note that it uses 1 to 4 bytes more for each value to store the version.
- **Why there is a snapshot function for each database part?**
Since RocksDB and lucene indices are different software, you can't take a snapshot of everything in the same instant.
A single snapshot must be implemented as a collection of all the snapshots.
- **Is CavalliumDB Engine suitable for your project?**
No.
This engine is largely undocumented, and it doesn't provide extensive tests.
This is an engine on which a DBMS can be built upon. For this reason it's very difficult to use it directly without using it through abstraction layers.
# Examples
## Can I use objects in the database?
Yes you must serialize/deserialize them using a library of your choice.
## Why there is a snapshot function for each database part?
Since RocksDB and lucene indices are different instances, every instance has its own snapshot function.
To have a single snapshot you must implement it as a collection of sub-snapshots in your DBMS.
## Is CavalliumDB Engine suitable for your project?
No.
This engine is largely undocumented, and it doesn't provide extensive tests on its methods.
In `src/example/java` you can find some quick implementations of each core feature.
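
The F.A.Q. entry on objects mentions versioned data with one codec per data version and a small version prefix on every stored value. The following is a minimal sketch of that idea only, not the engine's `CodecSerializer` API: the `User` type, the `V1`/`V2` layouts, and the method names are all hypothetical, and the sketch uses a fixed one-byte version prefix, whereas the README states 1 to 4 bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class VersionedCodecSketch {

    /** Hypothetical value type; not part of CavalliumDB Engine. */
    static final class User {
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Assumed layouts, chosen for illustration only.
    static final byte V1 = 1; // [version][name as UTF-8]
    static final byte V2 = 2; // [version][age as int32][name as UTF-8]

    /** Always writes the newest layout, prefixed with its version byte. */
    static byte[] serialize(User user) {
        byte[] name = user.name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(1 + Integer.BYTES + name.length)
                .put(V2)
                .putInt(user.age)
                .put(name)
                .array();
    }

    /** Reads the version byte and dispatches to the matching layout. */
    static User deserialize(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        byte version = in.get();
        switch (version) {
            case V1:
                // The age field did not exist in V1, so default it to 0.
                return new User(readString(in), 0);
            case V2: {
                int age = in.getInt();
                return new User(readString(in), age);
            }
            default:
                throw new IllegalArgumentException("Unknown version: " + version);
        }
    }

    /** Consumes the remaining bytes of the buffer as a UTF-8 string. */
    private static String readString(ByteBuffer in) {
        byte[] bytes = new byte[in.remaining()];
        in.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A value stored with the old layout...
        byte[] legacy = {V1, 'A', 'n', 'n'};
        User user = deserialize(legacy);
        // ...is upgraded to the newest layout the next time it is written.
        byte[] rewritten = serialize(user);
        System.out.println("version tag after rewrite: " + rewritten[0]);
    }
}
```

Because old values are only upgraded when they are written again, every decoder for a version that may still be on disk has to stay available.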


@@ -48,13 +48,13 @@ public class ParallelCollectorStreamSearcher implements LuceneStreamSearcher {
if (!realFields.isEmpty()) {
logger.error("Present fields:");
for (IndexableField field : realFields) {
logger.error(" - " + field.name());
logger.error(" - {}", field.name());
}
}
} else {
var field = d.getField(keyFieldName);
if (field == null) {
logger.error("Can't get key of document docId:" + docId);
logger.error("Can't get key of document docId: {}", docId);
} else {
resultsConsumer.accept(new LLKeyScore(field.stringValue(), score));
}


@@ -41,9 +41,9 @@ public class SimpleStreamSearcher implements LuceneStreamSearcher {
logger.error("The document docId: {}, score: {} is empty.", docId, score);
var realFields = indexSearcher.doc(docId).getFields();
if (!realFields.isEmpty()) {
System.err.println("Present fields:");
logger.error("Present fields:");
for (IndexableField field : realFields) {
System.err.println(" - " + field.name());
logger.error(" - {}", field.name());
}
}
} else {