A white-paper: The LiveDist (live data-base distributor) product

Galiel-3.14 Ltd. has intellectual property rights relating to implementations of the technology described in this white-paper. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form, or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written consent of the publisher.

This presentation is provided “AS IS” without warranty of any kind, either express or implied, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. This presentation could include technical inaccuracies or typographical errors. Changes are periodically added to the information herein. These changes will be incorporated in new editions of the presentation. Galiel-3.14 Ltd. may make improvements and/or changes in any technology, product, or program described in this presentation at any time.

Highlights:

· Delivers information to all required clients, in less than 0.5 seconds.

· Can scale to support a practically unlimited number of clients.

· Preserves transactional integrity.

· Enables new clients to join the network automatically and swiftly.

· Dynamic change of subscription.

· No administration.

· The marginal cost of server side hardware for each additional 300 clients is about $1000.

· For additional information please visit our site:
http://www.galiel314.com or http://www.livedist.com

General description:

LiveDist is a robust C# web-application product for swiftly and selectively distributing database modifications to a large number of clients while preserving transactional integrity.
The updates made by each committed transaction are distributed within 0.2 to 0.5 seconds to a practically unlimited number of interested clients.

LiveDist is written in C#. Its servers run on dedicated low-cost machines rather than on the application servers.

LiveDist intercepts database modifications through one of two mechanisms:

· By using database triggers (currently supported only for Oracle), which accumulate a log of insert, update, and delete operations in a shadow-table. The data is periodically fetched from the shadow-table, packed into a single buffer, and passed to LiveDist (by inserting the buffer into LiveDist’s log-table). This mechanism is transparent to the application, but incurs the overhead of executing the triggers and writing into the shadow-table. The overhead does not depend on the number of clients.

· When the application has access to the log of all database changes in the current transaction, the application can pass that log directly to LiveDist (via a C# API), bypassing the triggers and the shadow-table.
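The trigger-based mechanism can be illustrated with a small sketch. This is not LiveDist code: SQLite stands in for the real database, Python is used only for brevity, and every table, column, and function name (quotes, shadow_table, log_table, drain_shadow_table) is a hypothetical chosen for the example. It shows triggers recording each change in a shadow table, and a periodic step that packs the pending rows into a single buffer, appends it to a log table, and clears the shadow table.

```python
import json
import sqlite3

# SQLite stands in for the real database; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE quotes (id INTEGER PRIMARY KEY, symbol TEXT, price REAL);
CREATE TABLE shadow_table (op TEXT, id INTEGER, symbol TEXT, price REAL);
CREATE TABLE log_table (seq INTEGER PRIMARY KEY AUTOINCREMENT, buffer TEXT);

CREATE TRIGGER quotes_ins AFTER INSERT ON quotes BEGIN
    INSERT INTO shadow_table VALUES ('I', NEW.id, NEW.symbol, NEW.price);
END;
CREATE TRIGGER quotes_upd AFTER UPDATE ON quotes BEGIN
    INSERT INTO shadow_table VALUES ('U', NEW.id, NEW.symbol, NEW.price);
END;
CREATE TRIGGER quotes_del AFTER DELETE ON quotes BEGIN
    INSERT INTO shadow_table VALUES ('D', OLD.id, OLD.symbol, OLD.price);
END;
""")


def drain_shadow_table(conn):
    """Pack all pending shadow rows into one buffer and append it to the log."""
    with conn:  # the drain runs in its own transaction
        rows = conn.execute(
            "SELECT op, id, symbol, price FROM shadow_table ORDER BY rowid"
        ).fetchall()
        if not rows:
            return 0
        conn.execute(
            "INSERT INTO log_table (buffer) VALUES (?)", (json.dumps(rows),)
        )
        conn.execute("DELETE FROM shadow_table")
    return len(rows)


# The application works as usual; the triggers record each change.
with conn:
    conn.execute("INSERT INTO quotes VALUES (1, 'ABC', 10.0)")
    conn.execute("UPDATE quotes SET price = 10.5 WHERE id = 1")
    conn.execute("DELETE FROM quotes WHERE id = 1")

packed = drain_shadow_table(conn)
```

Note that the cost of this step depends only on the number of modifications, not on how many clients later receive them.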

LiveDist clients receive distributed data through one of two mechanisms:

· LiveDist clients use HTTP to access the distribution servers. The delta information is XML encoded (the column names are configurable), and thus the client can be written in any language and on any platform. Browsers, palm computers, and cellular phones can also be clients of LiveDist.

· Alternatively, the delta information can use a more efficient binary encoding, in which case the client uses a C# API (provided with the product) to receive the distribution data and decode the binary data.
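Because the XML mechanism is language-neutral, a client needs little more than an XML parser. The sketch below, in Python, parses one delta document into a list of row changes. The payload layout is an assumption made for illustration: the element and attribute names (delta, row, col, transaction, op) are invented here and are not the product's actual XML format, whose column names are configurable.

```python
import xml.etree.ElementTree as ET

# Hypothetical delta payload: element and attribute names are invented
# for illustration and are not the product's actual XML format.
SAMPLE_DELTA = """
<delta transaction="1042">
  <row table="quotes" op="update">
    <col name="id">1</col>
    <col name="price">10.5</col>
  </row>
  <row table="quotes" op="delete">
    <col name="id">7</col>
  </row>
</delta>
"""


def parse_delta(xml_text):
    """Return (transaction id, list of row changes) from one delta document."""
    root = ET.fromstring(xml_text.strip())
    changes = []
    for row in root.findall("row"):
        columns = {col.get("name"): col.text for col in row.findall("col")}
        changes.append((row.get("table"), row.get("op"), columns))
    return root.get("transaction"), changes


txn_id, changes = parse_delta(SAMPLE_DELTA)
```

Grouping all rows of one transaction in a single document is one way a client could apply them together, in keeping with the transactional integrity described above.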

Main features of the LiveDist product:

· Fully scalable, can grow to support an unlimited number of clients, with no reduction in performance:

Most of the LiveDist software is deployed as C# web components (HTTP handlers). At each site, the web components can be deployed on any number of machines behind ordinary HTTP load balancing. The machines have symmetrical functionality, so the system scales linearly as machines are added. Additionally, adding users places no extra load on the application servers or the database servers.

· Runs on low-cost hardware with no expensive third-party software:

The LiveDist servers require only CPU power and memory, with no use of internal databases or large files. Thus they are not required to run on expensive server machines. They can run on low-cost hardware, each machine supporting 300-1000 users (depending on load). Additionally, no expensive third-party software is required (only standard C# is used). The hosting container is IIS.
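The sizing figures quoted in this paper lend themselves to a quick back-of-the-envelope calculation. The Python sketch below uses only the numbers stated here (300-1000 users per machine, and about $1000 of server hardware per additional 300 clients, from the Highlights); the deployment sizes and function names are hypothetical.

```python
import math

COST_PER_300_CLIENTS_USD = 1000  # "about $1000", from the Highlights section


def machines_needed(clients, clients_per_machine):
    """Machines required when each one serves `clients_per_machine` clients."""
    return math.ceil(clients / clients_per_machine)


def marginal_hardware_cost(extra_clients):
    """Approximate hardware cost of supporting `extra_clients` more clients."""
    return math.ceil(extra_clients / 300) * COST_PER_300_CLIENTS_USD


clients = 10_000  # hypothetical deployment size
heavy_load = machines_needed(clients, 300)    # worst case: 300 users per machine
light_load = machines_needed(clients, 1000)   # best case: 1000 users per machine
cost = marginal_hardware_cost(3_000)          # cost of growing by 3000 clients
```

For 10,000 clients this gives between 10 and 34 machines depending on load, and growing by 3,000 clients costs roughly $10,000 in server hardware.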

· Supported databases:

The current release of the product supports the SQL Server and Oracle database engines.

· Short latency in delta-distribution:

Database modifications are distributed, to all connected clients, in less than 0.5 seconds (when the communication bandwidth is not a bottleneck). There is no degradation in latency when some of the clients are loading snapshots (as long as the communication bandwidth is not too small). Also, there is no degradation in latency when more clients are added. When the communication to a client fails, other clients do not suffer degradation in performance.
The product automatically recovers from all possible failures.

· Fast and concurrent cached snapshots:

When a new client first joins the system, a snapshot of a portion of the database (covering all topics of interest declared by the client) is downloaded to the client machine. After the snapshot has finished loading, normal delta distribution begins (starting from the point in time at which the snapshot began). Clients with a local database can bypass the snapshot. When a client changes its topic subscription, a partial snapshot for the newly acquired topics of interest is downloaded to the client. The partial snapshot is synchronized with normal delta distribution, to preserve integrity.

The cache servers hold live, in-memory copies of customer-chosen database tables. When a client needs to download a snapshot of cached tables, it is already prepared in the memory of the cache servers. Thus no load is imposed on the application servers or on the database to prepare the snapshot, and the lengthy process of fetching all snapshot rows from the database is avoided.
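One way to picture the hand-over from snapshot to delta distribution is with sequence numbers. The Python sketch below is an illustration under stated assumptions, not the product's protocol: it assumes every delta carries an increasing sequence number and that the snapshot is tagged with the sequence number at which it began. The class and method names are hypothetical.

```python
class ClientReplica:
    """Client-side copy of the subscribed rows (hypothetical sketch)."""

    def __init__(self):
        self.rows = {}
        self.last_seq = 0

    def load_snapshot(self, snapshot_rows, snapshot_seq):
        """Install a snapshot taken at sequence number `snapshot_seq`."""
        self.rows = dict(snapshot_rows)
        self.last_seq = snapshot_seq

    def apply_delta(self, seq, op, key, value=None):
        """Apply one change; skip it if the snapshot already reflects it."""
        if seq <= self.last_seq:
            return False
        if op == "delete":
            self.rows.pop(key, None)
        else:  # insert or update
            self.rows[key] = value
        self.last_seq = seq
        return True


replica = ClientReplica()
replica.load_snapshot({1: "a", 2: "b"}, snapshot_seq=5)
skipped = replica.apply_delta(4, "update", 1, "stale")  # older than the snapshot
applied = replica.apply_delta(6, "update", 2, "c")
removed = replica.apply_delta(7, "delete", 1)
```

Under these assumptions, deltas that arrive while the snapshot is loading can be buffered and applied afterwards, and any delta older than the snapshot is simply ignored.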

· Small overhead on the application servers and the database servers:

When using triggers, the overhead on the database servers is that of writing into the shadow-table. That overhead depends on the update rate. Each modification fires a trigger, which executes an insert into the shadow-table (a small table without any indices). Periodically, and in a separate transaction, all rows from the shadow-table are read, packed into a single buffer, inserted into a log-table, and removed from the shadow-table.

When not using triggers, there is almost no overhead on the database servers. Just before the end of each transaction, a buffer containing all the modifications done in the current transaction is appended (in a single INSERT operation) to the log-table.

In both cases, there is almost no overhead on the application servers. All newly appended rows are periodically passed to the delta server, and that is the entire overhead.

Everything else is done inside the delta servers and the cache servers (running on other machines).

· No locking:

No database rows are ever locked by LiveDist.

· No 2-phase commit is required:

The passing of delta information to LiveDist does not require 2-phase commit transactions (in either mechanism). Note: using 2-phase commit causes transactions to take longer.

· Selective distribution using topic-subscription:

Each row can contain a field that holds a comma-separated list of topics (alternatively, the triggers can call application-supplied routines that calculate the list of topics from the values of the row's columns). The row is distributed (published) to all the listed topics. The application fills in the value of this field, and can update the topic-list by adding new topics or removing existing ones. Each client is, at any time, subscribed to a list of topics-of-interest. Every row published to at least one of a client's topics-of-interest is distributed to that client. It is also possible to distribute all rows of a table to the same topic-list, in which case no column or function is required.
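The matching rule above amounts to a set intersection between the row's topics and each client's subscription. A minimal Python sketch, with hypothetical client names, topic names, and function names:

```python
def parse_topics(field):
    """Split a comma-separated topic field into a set of topic names."""
    return {t.strip() for t in field.split(",") if t.strip()}


def recipients(row_topics_field, subscriptions):
    """Return the clients subscribed to at least one of the row's topics."""
    row_topics = parse_topics(row_topics_field)
    return sorted(
        client
        for client, topics in subscriptions.items()
        if row_topics & topics
    )


# Hypothetical subscriptions: client name -> topics-of-interest.
subscriptions = {
    "alice": {"fx", "bonds"},
    "bob": {"equities"},
    "carol": {"bonds"},
}

fx_targets = recipients("fx", subscriptions)
mixed_targets = recipients("bonds, equities", subscriptions)
no_targets = recipients("commodities", subscriptions)
```

A real server would more likely keep an index from topic to subscribers rather than scan every client, but the delivery rule is the same.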

· New clients can join at any time:

The LiveDist product does not assume a stable network. On the contrary, it assumes that new clients can join the network at any time, and that known clients can leave the network or become disconnected at any time. The product guarantees stable performance to existing clients, even though the topology is continuously changing.

· Clients frequently change their topic subscriptions:

It is quite normal for clients to change their topics-of-interest from time to time. The cache server makes these changes fast and non-disruptive.

· Supports multi-site distribution (a hierarchy-of-sites model):

LiveDist supports a hierarchy of sites. Information is passed only once from a mother-site to every child-site. In each site we can run the servers on any number of machines, and thus each site is separately scalable. The support of this topology enables better utilization of the physical network.
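The benefit of the hierarchy can be seen by counting inter-site transfers. In the Python sketch below, the site names and the tree layout are hypothetical; the function lists the transfers needed so that each site receives an update exactly once, from its mother-site.

```python
def fan_out(root, children):
    """Return the (mother, child) transfers that reach every site once."""
    transfers = []
    pending = [root]
    while pending:
        site = pending.pop(0)
        for child in children.get(site, []):
            transfers.append((site, child))
            pending.append(child)
    return transfers


# Hypothetical topology: mother-site -> list of child-sites.
SITES = {
    "hq": ["london", "tokyo"],
    "london": ["paris", "berlin"],
}

transfers = fan_out("hq", SITES)
```

Here the update crosses the link from hq to london once, even though london, paris, and berlin all need it; each site then serves its own local clients.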

· Highly available, guaranteed delivery, and automatic recovery from all possible failures:

When the delta servers and the cache servers are deployed on multiple machines, the machines back each other up. When any machine fails, the others continue to function, and the clients continue to be served without noticeable interruption. When a new (or repaired) machine is added to the cluster, it automatically joins the load-balancing pool. If all LiveDist servers fail, recovery is performed automatically from the log-table in the database.

· Pluggable security framework:

The LiveDist product integrates with the application's security subsystem to authenticate clients, to encrypt messages between clients and servers, and to determine which topics each client is allowed to receive. LiveDist does not provide a security solution of its own; instead, it interacts with the security solution used by the application.

· No code generation, fully configured by XML file:

The product does not need to change when the data-dictionary changes. The data-dictionary, which defines which columns from which tables to distribute, is specified in an XML configuration file. No code generation is required when the data-dictionary (schema) is changed.
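To make the idea concrete, the Python sketch below loads a data-dictionary from XML and uses it to select the columns of a row that should be distributed. The file layout is an assumption: the element and attribute names (data-dictionary, table, column, name) are invented for this example and do not describe the product's actual configuration schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration: the real file's schema is not shown in this paper.
SAMPLE_DICTIONARY = """
<data-dictionary>
  <table name="quotes">
    <column name="id"/>
    <column name="price"/>
  </table>
</data-dictionary>
"""


def load_dictionary(xml_text):
    """Map each configured table name to its list of distributed columns."""
    root = ET.fromstring(xml_text.strip())
    return {
        table.get("name"): [col.get("name") for col in table.findall("column")]
        for table in root.findall("table")
    }


def project(dictionary, table, row):
    """Keep only the columns the dictionary marks for distribution."""
    return {name: row[name] for name in dictionary.get(table, []) if name in row}


dictionary = load_dictionary(SAMPLE_DICTIONARY)
row = {"id": 1, "price": 10.5, "internal_note": "not distributed"}
distributed = project(dictionary, "quotes", row)
```

Because the column list is data rather than generated code, changing which columns are distributed only requires editing the XML.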

· Supports on-the-fly data-dictionary modifications:

When the data-dictionary is changed, the XML configuration-file is updated. The LiveDist product senses that change (at a single location), and automatically adapts to the new data-dictionary. The system needs no reboot, or even suspension.

· Zero administration:

No administrator or operator is required when using LiveDist. The software components are deployed once on all participating machines. After deployment, the LiveDist servers can be treated as black boxes that require no administration.

Technical requirements:

Database:

· Oracle 8i or above

· SQL Server 2000 or above

On Web-server machines (one for every 300-1000 users):

· .NET 2.0 or above.

· IIS.

· 2 GB of memory.

On client machines:

· Nothing, when the client is receiving XML buffers (over an HTTP connection).

· .NET runtime environment version 2.0 or above (when the client is receiving C# objects).

· No special memory or CPU requirements.