Delivery Guarantees
Source Connector
The source connector guarantees "at least once" delivery of the latest version of each document.
In some situations, the source connector may "rewind" the document stream and restart from an earlier point in history.
Except for rewinding, the source connector preserves the order of changes within the same Couchbase partition (or "virtual bucket"), and changes to a document are published in the order they occurred.
The order between changes in different Couchbase partitions is not guaranteed.
When a document is removed from Couchbase due to expiration, the connector does not publish the deletion until the expired document is purged from Couchbase.
The source connector does not guarantee every version of a document will be reflected in the Kafka topic.
For example, if a Couchbase document value changes rapidly from V1 → V2 → V3, or if the connector is not running when the change occurs, the connector is only guaranteed to publish the latest value (V3).
In other words, changes to the same document may be "deduplicated", in which case only the latest version is published.
For use cases that require every document version be reflected in the event stream, some possible options are:
- Configure the source bucket and/or collections to preserve historical document versions (requires Couchbase Server 7.2 or later). The connector publishes every historical document version it encounters. Be aware that Couchbase Server does not store historical document versions forever. Document changes that are evicted from history become eligible for deduplication. See the Change History page.
- Alternatively, treat the documents as immutable, and write a new document for each application event. In other words, make V1, V2, and V3 separate documents (see the sketch after this list).
- A third option is to include a version history inside each document. For example, the first version of a document would be just [V1]. The second version would add V2, resulting in [V1,V2]. After adding V3, it would be [V1,V2,V3].
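As a minimal sketch of the second option, here is how an application might write each event as its own immutable document using the Couchbase Java SDK. The connection details, bucket name, document ID scheme, and event fields are all hypothetical placeholders.

```java
import java.util.Map;
import java.util.UUID;

import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.Collection;

public class ImmutableEventWriter {

    public static void main(String[] args) {
        // Hypothetical connection details.
        Cluster cluster = Cluster.connect("couchbase://127.0.0.1", "username", "password");
        Collection collection = cluster.bucket("events").defaultCollection();

        // Instead of mutating one document from V1 -> V2 -> V3,
        // write each version as its own document. A unique ID per
        // event means deduplication never hides an intermediate version.
        String documentId = "order::1234::event::" + UUID.randomUUID();
        collection.insert(documentId, Map.of(
                "orderId", "1234",
                "type", "statusChanged",
                "newStatus", "SHIPPED"
        ));

        cluster.disconnect();
    }
}
```

Because every event gets a fresh document ID, there is never a second version of the same document for the connector to deduplicate.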
Sink Connector
The sink connector guarantees "at least once" delivery of each record in the Kafka topic. In other words, the sink connector guarantees it will write every Kafka record to Couchbase, unless the record is intentionally discarded by a custom sink handler.
To improve the likelihood that every write survives a Couchbase Server failover event, consider configuring the couchbase.durability config property.
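For example, to ask Couchbase Server to acknowledge each write on a majority of replicas before reporting success, the connector configuration might include an entry like the following. MAJORITY is one of the durability levels Couchbase Server supports; pick the level that matches your cluster's replica count and latency budget.

```properties
# Consider a write successful only after a majority of
# replicas have acknowledged it.
couchbase.durability=MAJORITY
```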
If the sink connector fails to write a record to Couchbase, it will retry up to the timeout specified by the couchbase.retry.timeout config property.
If the timeout expires before the write succeeds, the connector terminates.
When you restart the connector, it resumes reading from the Kafka topic at an offset prior to where it stopped. As a result, some records might get written to Couchbase more than once.
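As an illustration, a configuration entry that gives each failed write up to two minutes of retries before the connector terminates might look like the following; the duration value is illustrative, so check the connector's configuration reference for the exact syntax it accepts.

```properties
# Retry a failed write for up to two minutes before giving up
# and terminating the connector.
couchbase.retry.timeout=2m
```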
Unless the sink connector is reprocessing records after a restart, it preserves the order of changes within the same Kafka partition, and records are written to Couchbase in the order they were published to the topic.
The order between writes for records in different Kafka partitions is not guaranteed.
Avoid random partition assignment
To guarantee documents in Couchbase are eventually consistent with the records in the Kafka topic, avoid random Kafka partition assignment when publishing to the topic.
Always use the default partition assignment strategy, or some other strategy that assigns records with the same key to the same partition.
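As a minimal sketch, assuming a plain Java producer publishing to the topic the sink connector reads from (the topic name, key scheme, and payload are illustrative), keying each record by the target document ID keeps all changes to that document in one partition under the default partitioner:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KeyedPublisher {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The default partitioner hashes the record key, so every
            // record keyed "order::1234" lands in the same partition and
            // reaches the sink connector in publication order.
            producer.send(new ProducerRecord<>("example-topic", "order::1234", "{\"status\":\"SHIPPED\"}"));
        }
    }
}
```

Records with different keys may still interleave across partitions, but all changes to any single document arrive in order, which is what the sink connector needs for Couchbase to converge on the topic's latest state.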