Databus provides a timeline-consistent stream of change capture events for a database. It enables applications to watch a database and to view and process its updates in near real time. Databus provides a complete after-image of every new or changed record, as well as deletes, while maintaining timeline consistency and transactional boundaries. The application integration is decoupled from the source database, and each application integration is isolated from the others, which allows for parallel development and rapid innovation.
Databus has a few key parts:
To use Databus, the consuming application simply maintains a high watermark and periodically requests all changes since that point in time using the Databus client. Each consuming application maintains its own high watermark, which isolates the applications from one another.
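The high-watermark pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions: `ChangeEvent`, `InMemoryChangeLog`, and `Consumer` are hypothetical names invented for this example and are not part of the Databus client API. The sketch also assumes that events carry a monotonically increasing sequence number and that a delete is represented by a missing after-image.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ChangeEvent:
    """One change capture event (hypothetical shape, for illustration only)."""
    scn: int                                # sequence number; orders events on the timeline
    key: str                                # primary key of the changed record
    after_image: Optional[Dict[str, Any]]   # full record after the change; None marks a delete


class InMemoryChangeLog:
    """Hypothetical stand-in for the change stream; not the Databus API."""

    def __init__(self) -> None:
        self._events: List[ChangeEvent] = []

    def append(self, key: str, after_image: Optional[Dict[str, Any]]) -> int:
        # Assign the next sequence number and record the event.
        scn = len(self._events) + 1
        self._events.append(ChangeEvent(scn, key, after_image))
        return scn

    def changes_since(self, watermark: int) -> List[ChangeEvent]:
        # Return every event strictly newer than the caller's watermark, in order.
        return [e for e in self._events if e.scn > watermark]


class Consumer:
    """A consuming application that tracks its own high watermark."""

    def __init__(self, log: InMemoryChangeLog) -> None:
        self._log = log
        self.high_watermark = 0              # nothing consumed yet
        self.view: Dict[str, Dict[str, Any]] = {}

    def poll(self) -> int:
        """Apply all changes since the high watermark; return how many were applied."""
        events = self._log.changes_since(self.high_watermark)
        for event in events:
            if event.after_image is None:
                self.view.pop(event.key, None)        # delete
            else:
                self.view[event.key] = event.after_image  # insert or update
            self.high_watermark = event.scn           # advance only after applying
        return len(events)


# Usage: two independent consumers read the same log at their own pace.
log = InMemoryChangeLog()
search_index = Consumer(log)
recommendations = Consumer(log)

log.append("member:1", {"title": "Engineer"})
log.append("member:2", {"title": "Designer"})
search_index.poll()        # search_index catches up; recommendations is untouched
log.append("member:1", None)
search_index.poll()
recommendations.poll()     # catches up later, from its own watermark
```

Because each consumer stores its own watermark, a slow or restarted consumer never affects the others; it simply resumes from the last sequence number it applied.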
We use Databus extensively at LinkedIn to propagate profile, connection, and company updates, as well as changes from many other databases. For example, if a member adds a position, the standardization service will generate a canonical version of the company, which will be added to the profile and the people search index. Connection and group updates are propagated into recommendation systems.
There are many other examples across the site, as our data is heavily inter-connected!