diff --git a/README.md b/README.md
new file mode 100644
index 0000000..d13037d
--- /dev/null
+++ b/README.md
@@ -0,0 +1,99 @@
+JDBC Queue (for lack of a better name)
+======================================
+
+JDBC queue is a library for writing transactional, messaging-oriented software. It does this by managing `tasks` in a
+set of `queues` in a plain SQL database. It has currently only been tested with PostgreSQL but should be fairly portable.
+
+It consists of three major parts:
+
+* A core **queue** part which implements CRUD access to the queues, tasks and configuration.
+* An **async** part that works on top of a single queue. It controls a consumer thread and dispatches tasks to a normal
+ Java Executor.
+* A **spring** layer that integrates connection and transaction handling with the standard Spring tools.
+
+The **queue** interface is intended to be used by:
+
+* Management code that wants to get queue statistics or reconfigure the queues
+* Cron jobs that want to consume everything that has been scheduled
+* Applications that are run just to insert a small number of tasks
+
+The **async** layer provides a JMS-like interface for each queue. It creates a consumer thread that polls the database
+at a specified interval, marks each task it finds for processing and passes it on to the executor. By using a
+multi-threaded executor it can scale up quite easily.
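+
+The poll, mark and dispatch cycle described above can be sketched roughly as follows. This is only an illustration
+under stated assumptions: `TaskState`, `Task`, `InMemoryTaskStore` and `PollingConsumer` are hypothetical names that
+are not part of the library, and an in-memory list stands in for the task table so the example is self-contained.
+
+```java
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.concurrent.ExecutorService;
+import java.util.function.Consumer;
+
+// Hypothetical names throughout; none of this is the library's actual API.
+enum TaskState { NEW, PROCESSING, OK }
+
+class Task {
+    final long id;
+    TaskState state = TaskState.NEW;
+
+    Task(long id) { this.id = id; }
+}
+
+// Stand-in for the task table; a real implementation would issue SQL here.
+class InMemoryTaskStore {
+    private final List<Task> tasks = new ArrayList<>();
+
+    synchronized void insert(Task task) { tasks.add(task); }
+
+    // Finds the oldest NEW task and marks it PROCESSING in one atomic step.
+    synchronized Optional<Task> claimNext() {
+        for (Task t : tasks) {
+            if (t.state == TaskState.NEW) {
+                t.state = TaskState.PROCESSING;
+                return Optional.of(t);
+            }
+        }
+        return Optional.empty();
+    }
+
+    synchronized void markOk(Task task) { task.state = TaskState.OK; }
+}
+
+class PollingConsumer {
+    private final InMemoryTaskStore store;
+    private final ExecutorService executor;
+    private final Consumer<Task> handler;
+
+    PollingConsumer(InMemoryTaskStore store, ExecutorService executor, Consumer<Task> handler) {
+        this.store = store;
+        this.executor = executor;
+        this.handler = handler;
+    }
+
+    // One poll cycle: claim every available task and hand it to the executor.
+    // Returns the number of tasks that were dispatched.
+    int pollOnce() {
+        int dispatched = 0;
+        Optional<Task> next;
+        while ((next = store.claimNext()).isPresent()) {
+            Task task = next.get();
+            executor.execute(() -> {
+                handler.accept(task);
+                store.markOk(task);
+            });
+            dispatched++;
+        }
+        return dispatched;
+    }
+}
+```
+
+In a running system `pollOnce` could be scheduled with `ScheduledExecutorService.scheduleWithFixedDelay` to get the
+fixed polling interval, and passing a larger thread pool as the executor is what lets the processing scale.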
+
+The **spring** layer makes sure that the parts play along nicely with the existing JDBC/JPA/Hibernate code that you
+already have.
+
+Features
+========
+
+Transactionality: each task is performed in an SQL transaction ensuring consistency between the task table and the
+other tables used when processing the task.
+
+A task has:
+
+* state
+* parent
+* created_date
+* last_updated
+* completed_date
+
+Each task has an optional parent reference: this allows you to trace messages through your system and see what
+effects each task had.
+
+Queue system: multiple independent "queue systems" can be run in a single JVM.
+
+Push: Intra-JVM notification of new elements on a queue for instant processing.
+
+Implementation
+==============
+
+
+
+Performance
+===========
+
+Use this library if you want correctness and manageability over speed.
+
+Possible improvements
+=====================
+
+**Batch processing of tasks in a single transaction**: let the consumer thread fetch a batch of N tasks, set all of
+them to PROCESSING in a transaction and send the batch to a processor thread which will process all of them in one
+transaction.
+
+This will significantly reduce the number of transactions required, thus increasing speed. A possible issue is that if
+one of the tasks fails it will abort the entire transaction. If this happens consistently it can keep all of the tasks
+from completing, so some sort of mechanism to only pick tasks that haven't failed before might be useful.
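+
+One possible shape for such a mechanism is sketched below. It is not implemented in the library: `BatchProcessor`,
+`selectBatch` and `processBatch` are hypothetical names, tasks are reduced to their ids, and the transaction is only
+indicated by comments. The point is just that a task which aborted a batch is left out of later batches.
+
+```java
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+import java.util.function.Consumer;
+
+// Hypothetical sketch; not part of the library.
+class BatchProcessor {
+    // Ids of tasks that have previously caused a batch to abort.
+    private final Set<Long> failedBefore = new HashSet<>();
+    private final Consumer<Long> handler;
+
+    BatchProcessor(Consumer<Long> handler) { this.handler = handler; }
+
+    // Picks up to batchSize pending ids, skipping tasks that failed before.
+    List<Long> selectBatch(List<Long> pending, int batchSize) {
+        List<Long> batch = new ArrayList<>();
+        for (Long id : pending) {
+            if (batch.size() == batchSize) {
+                break;
+            }
+            if (!failedBefore.contains(id)) {
+                batch.add(id);
+            }
+        }
+        return batch;
+    }
+
+    // Processes the whole batch as one unit of work.
+    // Returns true if every task succeeded, false if the batch was aborted.
+    boolean processBatch(List<Long> batch) {
+        for (Long id : batch) {
+            try {
+                handler.accept(id);
+            } catch (RuntimeException e) {
+                failedBefore.add(id);
+                return false; // a real implementation would roll back the transaction here
+            }
+        }
+        return true; // a real implementation would commit the transaction here
+    }
+}
+```
+
+With pending tasks 1, 2 and 3 where task 2 always fails, the first batch aborts, and the second batch contains only
+tasks 1 and 3 and can complete. Task 2 would then have to be retried on its own or handled by some other strategy.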
+
+**Error handling strategies**: Currently there is no retrying or anything smart around tasks that fail. This definitely
+needs to be improved.
+
+A generic class that re-schedules a task for execution and can be used as a TimerTask might be useful.
+
+**Support locking rows instead of extra states**: This might significantly improve performance and reduce write
+pressure on the db.
+
+**Configurable state machine**: Right now the possible states a task can be in are hard-coded.
+
+**Utilities to do routing**: this library does not intend to compete with normal JMS servers or specialized tools like
+Apache Camel but it might still be useful to have some tools with the package:
+
+* A consumer that can be configured to replicate the task to a set of other queues creating a classic MQ topic.
+* A consumer that can be configured to replicate the task from this database to another. As this will span two
+ transactions the operation has to be idempotent, but that should be doable. It might be useful to add some fields to
+ a task that points to the remote task.
+* A consumer that takes tasks that have failed too many times and moves them to a dead letter queue.
+
+**Optional push notification between JVMs**: use a simple MQ with in-memory storage to provide push notification after
+new tasks have been committed to the database. This will allow the system to behave like an RPC system, just with
+proper transactional semantics. The normal database poller can be set to poll much less frequently, only to pick up
+old messages whose notification was lost.
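+
+On the consumer side, combining a notification with a slow fallback poll could look roughly like this. `WakeupSignal`
+is a hypothetical helper, not something the library provides, and how the notification travels between JVMs is left
+out entirely; the sketch only shows the waiting logic.
+
+```java
+import java.util.concurrent.Semaphore;
+import java.util.concurrent.TimeUnit;
+
+// Hypothetical helper; not part of the library.
+class WakeupSignal {
+    private final Semaphore permits = new Semaphore(0);
+
+    // Called after a transaction that inserted tasks has been committed.
+    void notifyNewTasks() {
+        permits.release();
+    }
+
+    // Blocks until a notification arrives or the fallback interval elapses.
+    // Returns true if woken by a notification, false on timeout. In both
+    // cases the caller should go on to poll the database.
+    boolean awaitWork(long timeout, TimeUnit unit) throws InterruptedException {
+        boolean notified = permits.tryAcquire(timeout, unit);
+        permits.drainPermits(); // collapse a burst of notifications into one wake-up
+        return notified;
+    }
+}
+```
+
+Because the consumer polls the database after every wake-up, a lost notification only delays a task until the next
+timeout; it does not lose it.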
+
+**Schema dependent features**: JDBC queue does not depend on a very specific schema; it mainly requires two tables
+with a certain set of columns. Features like the parent reference might not be useful for all applications so it might
+be useful for a queue system to look in the task database to see if the column is there and fail if someone tries to
+create a task with a parent reference that is not valid.
+
+This might also be implemented in a simpler fashion when creating the QueueSystem so the app doesn't have to
+discover anything.