System functions v5.6
Perform PGD management primarily by using functions you call from SQL.
All functions in PGD are exposed in the bdr schema. Schema-qualify any calls to these
functions instead of putting bdr in the search_path.
Version information functions
bdr.bdr_version
This function retrieves the textual representation of the version of the BDR extension currently in use.
bdr.bdr_version_num
This function retrieves the version number of the BDR extension that is currently in use. Version numbers are monotonically increasing, allowing this value to be used for less-than and greater-than comparisons.
The following formula returns the version number consisting of major version, minor version, and patch release into a single numerical value:
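As an illustration, assuming the encoding is MAJOR_VERSION * 10000 + MINOR_VERSION * 100 + PATCH_RELEASE (an assumption to confirm against your installation; under it, 5.6.0 would be 50600), a version check might look like this sketch:

```sql
-- Assumed encoding: major * 10000 + minor * 100 + patch (verify before relying on it).
SELECT bdr.bdr_version()     AS version_text,
       bdr.bdr_version_num() AS version_num,
       bdr.bdr_version_num() >= 5 * 10000 + 6 * 100 + 0 AS at_least_5_6_0;
```

Because the numeric form increases monotonically, a single comparison is enough to gate version-dependent logic.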
System information functions
bdr.get_relation_stats
Returns the relation information.
bdr.get_subscription_stats
Returns the current subscription statistics.
System and progress information parameters
PGD exposes some parameters that you can query directly in SQL using, for example,
SHOW or the current_setting() function. You can also use PQparameterStatus
(or equivalent) from a client application.
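For example, the two SQL forms can be used interchangeably. This sketch reads bdr.local_node_id, one of the parameters described in this section:

```sql
-- Both statements read the same session-level parameter.
SHOW bdr.local_node_id;
SELECT current_setting('bdr.local_node_id') AS local_node_id;
```

The current_setting() form is convenient when the value needs to be used inside a larger query.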
bdr.local_node_id
When you initialize a session, this is set to the node id the client is connected to. This allows an application to figure out the node it's connected to, even behind a transparent proxy.
It's also used with Connection pools and proxies.
bdr.last_committed_lsn
After every COMMIT
of an asynchronous transaction, this parameter is updated to
point to the end of the commit record on the origin node. Combining it with bdr.wait_for_apply_queue
,
allows applications
to perform causal reads across multiple nodes, that is, to wait until a transaction
becomes remotely visible.
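A causal read might be sketched as follows. The node name node1 and the LSN literal are hypothetical placeholders, and the two-argument call shape follows the parameter table for bdr.wait_for_apply_queue later in this section:

```sql
-- Session on the origin node (hypothetically named node1), right after COMMIT:
SHOW bdr.last_committed_lsn;          -- suppose this reports 0/3000158

-- Session on the node you want to read from:
-- block until everything from node1 up to that LSN is applied locally.
SELECT bdr.wait_for_apply_queue('node1', '0/3000158');
-- Reads on this node now see the committed transaction.
```

The application is responsible for carrying the LSN from the writing session to the reading session.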
transaction_id
If a CAMO transaction is in progress, transaction_id is updated to show
the assigned transaction id. You can query this parameter only by
using PQparameterStatus or equivalent. See Application use
for a usage example.
bdr.is_node_connected
Synopsis
Returns boolean by checking if the walsender for a given peer is active on this node.
bdr.is_node_ready
Synopsis
Returns boolean by checking if the lag is lower than the given span or,
if no span is given, lower than the timeout for TO ASYNC.
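As a sketch, assuming both functions take the peer's node name as the first argument and bdr.is_node_ready accepts an optional interval for the span (the argument shapes and the node name node2 are assumptions, not taken from the synopses):

```sql
-- Is the walsender for the peer (hypothetical name: node2) active on this node?
SELECT bdr.is_node_connected('node2');

-- Is the peer's lag below an explicit span of 5 seconds?
SELECT bdr.is_node_ready('node2', interval '5 seconds');
```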
Consensus function
bdr.consensus_disable
Disables the consensus worker on the local node until server restart or until
it's reenabled using bdr.consensus_enable
(whichever happens first).
Warning
Disabling consensus disables some features of PGD and affects availability of the EDB Postgres Distributed cluster if left disabled for a long time. Use this function only when working with Technical Support.
bdr.consensus_enable
Reenables the disabled consensus worker on the local node.
bdr.consensus_proto_version
Returns the consensus protocol version currently used by the local node.
Needed by the PGD group reconfiguration internal mechanisms.
bdr.consensus_snapshot_export
Synopsis
Generate a new PGD consensus snapshot from the currently committed-and-applied state of the local node and return it as bytea.
By default, a snapshot for the highest supported Raft version is
exported. But you can override that by passing an explicit version
number.
The exporting node doesn't have to be the current Raft leader, and it doesn't
need to be completely up to date with the latest state on the leader. However, bdr.consensus_snapshot_import()
might not accept such a snapshot.
The new snapshot isn't automatically stored to the local node's
bdr.local_consensus_snapshot
table. It's only returned to the caller.
The generated snapshot can be passed to bdr.consensus_snapshot_import()
on any other node in the same PGD node group that's behind the exporting node's
Raft log position.
The local PGD consensus worker must be disabled for this function to work. Typical usage is:
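A sketch of that sequence in psql follows. The output file name is a hypothetical placeholder, and \copy is a psql meta-command, so this form assumes you run it from psql:

```sql
-- Stop the local consensus worker; keep this window as short as possible.
SELECT bdr.consensus_disable();

-- Export the snapshot (bytea) to a client-side file.
\copy (SELECT * FROM bdr.consensus_snapshot_export()) TO 'my_node_consensus_snapshot.data'

-- Reenable consensus as soon as the export finishes.
SELECT bdr.consensus_enable();
```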
While the PGD consensus worker is disabled:
- DDL locking attempts on the node fail or time out.
- galloc sequences don't get new values.
- Eager and CAMO transactions pause or error.
- Other functionality that needs the distributed consensus system is disrupted.

The required downtime is generally very brief.
Depending on the use case, it might be practical to extract a snapshot that
already exists from the snapshot
field of the bdr.local_consensus_snapshot
table and use that instead. Doing so doesn't require you to stop the consensus worker.
bdr.consensus_snapshot_import
Synopsis
Import a consensus snapshot that was exported by
bdr.consensus_snapshot_export(), usually from another node in the same PGD
node group.
It's also possible to use a snapshot extracted directly from the snapshot
field of the bdr.local_consensus_snapshot
table on another node.
This function is useful for resetting a PGD node's catalog state to a known good state in case of corruption or user error.
You can import the snapshot if the importing node's apply_index is less than
or equal to the snapshot-exporting node's commit_index when the
snapshot was generated. (See bdr.get_raft_status().) A node that can't accept
the snapshot because its log is already too far ahead raises an error
and makes no changes. The imported snapshot doesn't have to be completely
up to date, as once the snapshot is imported the node fetches the remaining
changes from the current leader.
The PGD consensus worker must be disabled on the importing node for this
function to work. See notes on bdr.consensus_snapshot_export()
for details.
It's possible to use this function to force the local node to generate a new Raft snapshot by running:
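A minimal sketch, wrapped in the disable and enable calls because both functions require the local consensus worker to be stopped:

```sql
SELECT bdr.consensus_disable();

-- Export the local state and immediately import it again.
SELECT bdr.consensus_snapshot_import(bdr.consensus_snapshot_export());

SELECT bdr.consensus_enable();
```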
This approach might also truncate the Raft logs up to the current applied log position.
bdr.consensus_snapshot_verify
Synopsis
Verify the given consensus snapshot that was exported by
bdr.consensus_snapshot_export(). The snapshot header contains the
version with which it was generated, and the node tries to verify it
against the same version.
The snapshot might have been exported on the same node or any other node in the cluster. If the node verifying the snapshot doesn't support the version of the exported snapshot, then an error is raised.
bdr.get_consensus_status
Returns status information about the current consensus (Raft) worker.
bdr.get_raft_status
Returns status information about the current consensus (Raft) worker.
Alias for bdr.get_consensus_status.
bdr.raft_leadership_transfer
Synopsis
Request the node identified by node_name
to be the Raft leader. The
request can be initiated from any of the PGD nodes and is
internally forwarded to the current leader to transfer the leadership to
the designated node. The designated node must be an ACTIVE PGD node
with full voting rights.
If wait_for_completion is false, the request is served on
a best-effort basis. If the node can't become the leader within the
bdr.raft_global_election_timeout period, then some other capable node
becomes the leader. Leadership can also change over time,
per the Raft protocol. A true return result indicates
only that the request was submitted successfully.
If wait_for_completion is true, then the function waits until
the given node becomes the new leader and possibly waits indefinitely if
the requested node fails to become Raft leader (for example, due to network
issues). We therefore recommend that you always set a statement_timeout
with wait_for_completion to prevent an indefinite wait.
The node_group_name is optional and can be used to specify the name of the node group where the
leadership transfer happens. If not specified, it defaults to NULL, which
is interpreted as the top-level group in the cluster. If the node_group_name is
specified, the function transfers leadership only within the specified node
group.
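A bounded leadership transfer might be sketched like this. The node name node2 and the 30-second limit are hypothetical, and the named-argument call assumes the parameter names described above:

```sql
-- Bound the wait so the call can't block forever.
SET statement_timeout = '30s';

SELECT bdr.raft_leadership_transfer(
    node_name           := 'node2',   -- hypothetical target node
    wait_for_completion := true
);

RESET statement_timeout;
```

Omitting node_group_name leaves it at NULL, so the transfer applies to the top-level group.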
Utility functions
bdr.wait_slot_confirm_lsn
Allows you to wait until the last write on this session was replayed to one or all nodes.
Waits until a slot passes a certain LSN. If no position is supplied, the current write position is used on the local node.
If no slot name is passed, it waits until all PGD slots pass the LSN.
The function polls every 1000 ms for changes from other nodes.
If a slot is dropped concurrently, the wait ends for that slot.
If a node is currently down and isn't updating its slot, then the wait continues.
You might want to set statement_timeout
to complete earlier in that case.
If you are using Optimized Topology, we recommend using bdr.wait_node_confirm_lsn instead.
Synopsis
Notes
Requires bdr_application
privileges to use.
Parameters
Parameter | Description |
---|---|
slot_name | Name of the replication slot to wait for. If NULL, waits for all PGD slots. |
target_lsn | LSN to wait for. If NULL, uses the current write LSN on the local node. |
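As a sketch, waiting for every PGD slot to pass the current local write position, with a hypothetical 60-second limit in case a node is down. The positional order (slot_name, then target_lsn) is assumed from the parameter table:

```sql
SET statement_timeout = '60s';

-- NULL slot_name: wait for all PGD slots.
-- NULL target_lsn: use the current write LSN on the local node.
SELECT bdr.wait_slot_confirm_lsn(NULL, NULL);

RESET statement_timeout;
```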
bdr.wait_node_confirm_lsn
Wait until a node passes a certain LSN.
This function allows you to wait until the last write on this session was replayed to one or all nodes.
Upon being called, the function waits for a node to pass a certain LSN.
If no LSN is supplied, the current wal_flush_lsn position (as returned by the pg_current_wal_flush_lsn()
function) on the local node is used.
Supplying a node name parameter tells the function to wait for that node to pass the LSN.
If no node name is supplied (by passing NULL), the function waits until all the nodes pass the LSN.
If you are using Optimized Topology, we recommend using this function instead of bdr.wait_slot_confirm_lsn.
In an Optimized Topology, not all nodes have replication slots, so bdr.wait_slot_confirm_lsn
might not work as expected. bdr.wait_node_confirm_lsn
is designed to work with nodes that don't have replication slots, using alternative strategies to determine the progress of a node.
If a node is currently down, isn't updating, or can't be connected to, the wait continues indefinitely. To avoid this, set statement_timeout to the maximum amount of time you are prepared to wait.
Synopsis
Parameters
Parameter | Description |
---|---|
node_name | Name of the node to wait for. If NULL, waits for all nodes. |
target_lsn | LSN to wait for. If NULL, uses the current wal_flush_lsn on the local node. |
Notes
Requires bdr_application
privileges to use.
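A sketch of waiting for a single node, using the parameter names from the table above. The node name node2 and the 60-second limit are hypothetical:

```sql
SET statement_timeout = '60s';

SELECT bdr.wait_node_confirm_lsn(
    node_name  := 'node2',                      -- hypothetical node name
    target_lsn := pg_current_wal_flush_lsn()    -- same position used when NULL is passed
);

RESET statement_timeout;
```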
bdr.wait_for_apply_queue
The function bdr.wait_for_apply_queue
allows a PGD node to wait for
the local application of certain transactions originating from a given
PGD node. It returns only after all transactions from that peer
node are applied locally. An application or a proxy can use this
function to prevent stale reads.
For convenience, PGD provides a variant of this function for CAMO and the CAMO partner node. See bdr.wait_for_camo_partner_queue.
In case a specific LSN is given, that's the point in the recovery
stream from which the peer waits. You can use this
with bdr.last_committed_lsn
retrieved from that peer node on a
previous or concurrent connection.
If the given target_lsn
is NULL, this function checks the local
receive buffer and uses the LSN of the last transaction received from
the given peer node, effectively waiting for all transactions already
received to be applied. This is especially useful in case the peer
node has failed and it's not known which transactions were sent.
In this case, transactions that are still in transit or
buffered on the sender side aren't waited for.
Synopsis
Parameters
Parameter | Description |
---|---|
peer_node_name | The name of the peer node from which incoming transactions are expected to be queued and to wait for. If NULL, waits for the apply queues of all peer nodes to be consumed. |
target_lsn | The LSN in the replication stream from the peer node to wait for, usually learned by way of bdr.last_committed_lsn from the peer node. |
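For the failed-peer case described above, a sketch that waits only for what was already received. The node name node1 is a hypothetical placeholder:

```sql
-- NULL target_lsn: wait until every transaction already received
-- from node1 has been applied locally.
SELECT bdr.wait_for_apply_queue('node1', NULL);
```

Transactions still in transit or buffered on the sender aren't covered by this call.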
bdr.get_node_sub_receive_lsn
You can use this function on a subscriber to get the last LSN that was received from the given origin. It can be either unfiltered or filtered to take into account only relevant LSN increments for transactions to be applied.
The difference between the output of this function and the output of
bdr.get_node_sub_apply_lsn()
measures the size of the corresponding
apply queue.
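That difference could be computed with PostgreSQL's pg_wal_lsn_diff, as in this sketch. The node name node1 is hypothetical, and calling both functions with only the origin's node name is an assumption about their signatures:

```sql
-- Bytes received from the origin (hypothetical name: node1) but not yet applied.
SELECT pg_wal_lsn_diff(
           bdr.get_node_sub_receive_lsn('node1'),
           bdr.get_node_sub_apply_lsn('node1')
       ) AS apply_queue_bytes;
```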