From 7caa5b1f1e08f99cfe4465f091f47e2966d78aa7 Mon Sep 17 00:00:00 2001
From: Trygve Laugstøl
Date: Sun, 23 Jun 2013 09:37:57 +0200
Subject: o Initial import of JDBC queue.

---
 README.md | 99 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)
 create mode 100644 README.md

(limited to 'README.md')

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..d13037d
--- /dev/null
+++ b/README.md
@@ -0,0 +1,99 @@
+JDBC Queue (for lack of a better name)
+======================================
+
+JDBC queue is a library for writing transactional, messaging-oriented software. It does this by managing `tasks` in a
+set of `queues` in a plain SQL database. It has only been tested with PostgreSQL so far but should be fairly portable.
+
+It consists of three major parts:
+
+* A core **queue** part which implements CRUD access to the queues, tasks and configuration.
+* An **async** part that works on top of a single queue. It controls a consumer thread and dispatches tasks to a normal
+  Java Executor.
+* A **spring** layer that integrates connection and transaction handling with the standard Spring tools.
+
+The **queue** interface is intended to be used by:
+
+* Management code that wants to get queue statistics or reconfigure the queues
+* Cron jobs that want to consume everything that has been scheduled
+* Applications that are run just to insert a small number of tasks
+
+The **async** layer provides a JMS-like interface for each queue. It creates a consumer thread that polls the database
+at a specified interval, marks each task for processing and passes it on to the executor. By using a multi-threaded
+executor it can scale up quite easily.
+
+The **spring** layer makes sure that the parts play along nicely with the existing JDBC/JPA/Hibernate code that you
+already have.
+
+Features
+========
+
+Transactionality: each task is performed in an SQL transaction, ensuring consistency between the task table and the
+other tables used when processing the task.
+
+A task has:
+
+* state
+* parent
+* created_date
+* last_updated
+* completed_date
+
+Each task has an optional parent reference: this allows you to trace the messages around in your system to see what
+effects each task had.
+
+"Queue system": allows multiple queue systems to be run in a single JVM.
+
+Push: Intra-JVM notification of new elements on a queue for instant processing.
+
+Implementation
+==============
+
+
+
+Performance
+===========
+
+Use this library if you want correctness and manageability over speed.
+
+Possible improvements
+=====================
+
+**Batch processing of tasks in a single transaction**: let the consumer thread fetch a batch of N tasks, set all of
+them to PROCESSING in a transaction and send the batch to a processor thread which will process all of them in one
+transaction.
+
+This will significantly reduce the number of transactions required, thus increasing speed. A possible issue is that if
+one of the tasks fails it will abort the entire transaction. If this happens consistently it can keep all of the tasks
+from completing, so some sort of mechanism to only pick tasks that haven't failed before might be useful.
+
+**Error handling strategies**: Currently there is no retrying or anything smart around tasks that fail. This definitely
+needs to be improved.
+
+A generic class that re-schedules a task for execution and can be used as a TimerTask might be useful.
+
+**Support locking rows instead of extra states**: This might significantly improve performance and reduce write
+pressure on the database.
+
+**Configurable state machine**: Right now the possible states a task can be in are hard-coded.
+
+**Utilities to do routing**: this library does not intend to compete with normal JMS servers or specialized tools like
+Apache Camel, but it might still be useful to have some tools with the package:
+
+* A consumer that can be configured to replicate the task to a set of other queues, creating a classic MQ topic.
+* A consumer that can be configured to replicate the task from this database to another. As this will span two
+  transactions the operation has to be idempotent, but that should be doable. It might be useful to add some fields to
+  a task that point to the remote task.
+* A consumer that takes tasks that have failed too many times and moves them to a dead letter queue.
+
+**Optional push notification between JVMs**: use a simple MQ with in-memory storage to provide push notification after
+new tasks have been committed to the database. This will allow the system to behave like an RPC system, just with
+proper transactional semantics. The normal database poller can be set to poll at a much lower frequency to pick up
+old messages whose notification was lost.
+
+**Schema-dependent features**: JDBC queue does not depend on a very specific schema; it mainly requires two tables
+with a certain set of columns. Features like the parent reference might not be useful for all applications, so it might
+be useful for a queue system to look in the task database to see if the column is there and fail if someone tries to
+create a task with a parent reference that is not valid.
+
+This might also be implemented in a simpler fashion when creating the QueueSystem so the app doesn't have to
+discover anything.
--
cgit v1.2.3
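
The README in this patch describes the async layer as a consumer thread that polls the database, marks tasks for processing and passes them on to an executor. A minimal Java sketch of that poll, claim and dispatch cycle might look like the block below. It is not taken from the JDBC queue code base: `PollingConsumerSketch`, `TaskStore`, `claim`, `pollOnce` and every state name other than PROCESSING are hypothetical, and an in-memory list stands in for the task table so the example runs without a database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: none of these names come from the JDBC queue code base.
class PollingConsumerSketch {
    // Only PROCESSING is named in the README; the other states are assumptions.
    enum State { NEW, PROCESSING, OK, FAILED }

    static final class Task {
        final long id;
        State state = State.NEW; // only read or written while holding the TaskStore lock
        Task(long id) { this.id = id; }
    }

    // In-memory stand-in for the task table. A real implementation would run
    // SQL inside a transaction instead of relying on synchronized methods.
    static final class TaskStore {
        private final List<Task> tasks = new ArrayList<>();

        synchronized void insert(Task task) { tasks.add(task); }

        // Finds up to `limit` NEW tasks and marks them PROCESSING in one atomic
        // step, so a later poll cannot pick the same task twice.
        synchronized List<Task> claim(int limit) {
            List<Task> claimed = new ArrayList<>();
            for (Task task : tasks) {
                if (claimed.size() == limit) break;
                if (task.state == State.NEW) {
                    task.state = State.PROCESSING;
                    claimed.add(task);
                }
            }
            return claimed;
        }

        synchronized void finish(Task task, State state) { task.state = state; }

        synchronized long count(State state) {
            return tasks.stream().filter(t -> t.state == state).count();
        }
    }

    // One poll: claim a batch and hand each task to the executor.
    // Returns the number of tasks that were dispatched.
    static int pollOnce(TaskStore store, ExecutorService executor,
                        Consumer<Task> handler, int batchSize) {
        List<Task> claimed = store.claim(batchSize);
        for (Task task : claimed) {
            executor.execute(() -> {
                try {
                    handler.accept(task);
                    store.finish(task, State.OK);
                } catch (RuntimeException e) {
                    store.finish(task, State.FAILED);
                }
            });
        }
        return claimed.size();
    }

    public static void main(String[] args) throws InterruptedException {
        TaskStore store = new TaskStore();
        for (long id = 1; id <= 5; id++) store.insert(new Task(id));

        // The handler fails for task 3 to exercise the FAILED path.
        Consumer<Task> handler = task -> {
            if (task.id == 3) throw new IllegalStateException("simulated failure");
        };

        ExecutorService executor = Executors.newFixedThreadPool(2);
        // A real consumer thread would sleep for the configured interval
        // between polls; here we poll until nothing is left to claim.
        int claimed;
        do {
            claimed = pollOnce(store, executor, handler, 2);
        } while (claimed > 0);

        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("OK=" + store.count(State.OK) + " FAILED=" + store.count(State.FAILED));
    }
}
```

Running `main` prints `OK=4 FAILED=1`. The point of the sketch is the ordering: a task is moved to PROCESSING before it reaches the executor, which is what keeps two polls from dispatching the same task. In a database-backed version that claim step would presumably be an `UPDATE` committed in its own transaction, and the handler plus the final state change would run together in a second one.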