System Tables
Last updated
Last updated
These are the system tables defined on the dolt
schema. These refer to version control
information that is not schema or table specific, like branches, logs, etc.
dolt.branches
dolt.branches
(also usable as dolt_branches
) contains information about branches known
to the database.
Schema
Example Query
Get all the branches.
remote
and branch
show the remote host and branch that each branch is tracking.
dirty
is TRUE
when there are uncommitted changed on the branch.
dolt.remote_branches
dolt.remote_branches
(also usable as dolt_remote_branches
) contains information about branches on remotes
you have fetched. It has a similar schema as dolt_branches
, but the remote
, branch
, and dirty
columns
don't make sense in this context and are not included. Only remote branches are included in this table.
Schema
Example Query
Get all local and remote branches in a single query. Remote branches
will have the prefix remotes/<remoteName>
in their names.
dolt.docs
dolt.docs
(also usable as dolt_docs
) stores the contents of Dolt docs (LICENSE.md
,README.md
).
Schema
Example Query
dolt.remotes
dolt.remotes
(also usable as dolt_remotes
) returns the remote subcontents of the repo_state.json
, similar
to running dolt remote -v
from the Dolt command line.
Schema
Example Query
dolt.tags
dolt.tags
(also usable as dolt_tags
) shows information for all active tags in the current database.
Schema
Example Query
Create a tag using the dolt_tag()
function.
Get all the tags.
dolt.commit_ancestors
The dolt.commit_ancestors
(also usable as dolt_commit_ancestors
) table records the ancestors for every commit in the database. Each commit has one or two
ancestors, two in the case of a merge commit.
Schema
Each commit hash has one or two entries in the table, depending on whether it has one or two parent commits. The root
commit of the database has a NULL
parent. For merge commits, the merge base will have parent_index
0, and the commit
merged will have parent_index
1.
Example Query
If we want to get the parent of our most recently created commit, we can use the dolt.commit_ancestors
system table.
dolt.commits
The dolt.commits
(also usable as dolt_commits
) system table shows ALL commits in a Dolt database.
Schema
Example Query
We can query for the two commits before December 2nd, 2024, across all commits in the database
(regardless of what is checked out to HEAD
) with this query:
dolt.log
Schema
Example Query
The following query shows the commits reachable from the current checked out head and created by user postgres
since November 27, 2024, 7:30pm:
dolt.diff
The dolt.diff
(also usable as dolt_diff
) system table shows which tables in the current database were changed in each commit reachable from the active branch's HEAD. When multiple tables are changed in a single commit, there is one row in the dolt.diff
system table for each table, all with the same commit hash. Any staged or unstaged changes in the working set are included with the value WORKING
for their commit_hash
. After identifying the tables that changed in a commit, the dolt_diff_$TABLENAME
system tables can be used to determine the data that changed in each table.
Schema
The DOLT.DIFF
system table has the following columns
Query Details
dolt.diff
displays the changes from the current branch HEAD, including any working set changes. If a commit did not
make any changes to tables (e.g. an empty commit), it is not included in the dolt.diff
results.
Example Query
The following query uses the dolt.diff
system table to find all commits, and the tables they changed,
between November 28 and December 3, 2024.
From these results, we can see there were two commits to this database in the above time frame. Commit mc4ogko
was a schema-only change to the public.employees
table, while j5b0a0b
was a data-only change to the same table.
To dig deeper into these changes, we can query
the dolt_diff_$TABLENAME
system tables specific to each of the changed tables, like this:
dolt.column_diff
The dolt.column_diff
(also usable as dolt_column_diff
) system table shows which columns and tables in the current database were changed in each commit
reachable from the active branch's HEAD. When multiple columns are changed in a single commit, there is one row in thedolt.column_diff
system table for each column, all with the same commit hash. Any staged changes in the working set
are included with the value STAGED
for their commit_hash
. Any unstaged changes in the working set are included with
the value WORKING
for their commit_hash
.
Schema
The dolt.column_diff
system table has the following columns:
Query Details
dolt.column_diff
displays the changes from the current branch HEAD, including any working set changes. If a commit did not
make any changes to tables (e.g. an empty commit), it is not included in the dolt.column_diff
results.
Example Query
The following query uses the dolt.column_diff
system table to find commits and tables where the column start_date
was updated.
If we narrow in on the employees
table we can count the number of commits that updated each column
over the course of all our commits.
From these results, we can see that fields describing the employee names are being updated far more frequently than the fields holding start and end date information.
dolt.conflicts
dolt.conflicts
(also usable as dolt_conflicts
) is a system table that has a row for every table in the working set that has an unresolved merge
conflict.
Schema
dolt.schema_conflicts
dolt.schema_conflicts
(also usable as dolt_schema_conflicts
) is a system table that has a row for every table in the working
set that has an unresolved schema conflict.
Schema
Example Query
dolt.merge_status
The dolt.merge_status
system table tells a user if a merge is active.
Schema
Let's create a simple conflict:
Output of SELECT * from dolt.merge_status;
:
dolt.status
dolt.status
(also usable as dolt_status
) returns the status of the database session, analogous to
running dolt status
from the Dolt command line.
Schema
Example Query
dolt.constraint_violations
The dolt.constraint_violations
(also usable as dolt_constraint_violations
) system table contains one row for every table that has a constraint violation
introduced by a merge. Dolt enforces constraints (such as foreign keys) during normal SQL operations, but it's possible
that a merge puts one or more tables in a state where constraints no longer hold. For example, a row deleted in the
merge base could be referenced via a foreign key constraint by an added row in the merged commit. Usedolt.constraint_violations
to discover such violations.
Schema
dolt.rebase
Schema
The action
field can take one of the following rebase actions:
pick
- apply a commit as is and keep its commit message.
drop
- do not apply a commit. The row in the dolt.rebase
table can also be deleted to drop a commit from the rebase plan.
reword
- apply a commit, and use the updated commit message from the commit_message
field in the dolt.rebase
table for its commit message. Note that if you edit the commit_message
but do not use the reword
action, the original commit message will still be used.
squash
- apply a commit, but include its changes in the previous commit instead of creating a new commit. The commit message of the previous commit will be altered to include the previous commit message as well as the commit message from the squashed commit. Note that the rebase plan MUST include a pick
or reword
action in the plan before a squash
action.
fixup
- apply a commit, but include its changes in the previous commit instead of creating a new commit. The commit message of the previous commit will NOT be changed, and the commit message from the fixup commit will be discarded. Note that the rebase plan MUST include a pick
or reword
action in the plan before a fixup
action.
Example Query
To squash all commits into a single commit and include the commit messages from all commits, the following query can be used:
To reword a commit with commit hash '123aef456f', be sure to set the action to reword
and to update the commit_message
field:
To drop the second commit in the rebase plan, you can use the drop
action:
Or you can simply delete that row from the dolt.rebase
table:
These are the system tables that are on each user-defined schema. They refer to version control information that is schema or table specific, like table diffs, procedures, views, etc.
dolt_schemas
dolt_schemas
stores SQL schema fragments for a dolt database that
are versioned alongside the database itself. Certain DDL statements
will modify this table and the value of this table in a SQL session
will affect what database entities exist in the session.
The values in this table are implementation details associated with the storage of certain schema elements. It is recommended to use built-in SQL statements for examining and modifying schemas, rather than using this table directly.
Schema
Currently, all VIEW
, TRIGGER
and EVENT
definitions are stored in the dolt_schemas
table.
The column type
defines whether the fragment is view
, trigger
or event
.
The column name
is the fragment name as supplied in the CREATE
statement.
The column fragment
stores the CREATE
statement of the fragment. The columnjson
is any additional important information such as CreateAt
field
for the fragment.
The values in this table are partly implementation details associated with the implementation of the underlying database objects.
Example Query
Then you can view them in dolt_schemas
:
dolt_statistics
dolt_statistics
includes currently collected
database statistics. This information is stored outside of the
commit graph and is not subject to versioning semantics.
Schema
dolt_blame_$tablename
Schema
The dolt_blame_$tablename
system view has the following columns:
The remaining columns are dependent on the schema of the user table.
Every column from the primary key of your table will be included in the dolt_blame_$tablename
system table.
Query Details
Executing a SELECT *
query for a dolt_blame_$tablename
system view will show you the primary key columns
for every row in the underlying user table and the commit metadata for the last commit that modified that row.
Note that if a table has any uncommitted changes in the working set,
those will not be displayed in the dolt_blame_$tablename
system view.
dolt_blame_$tablename
is only available for tables with a primary key.
Attempting to query dolt_blame_$tablename
for a table without a primary key will return an error message.
Example Query
Consider the following example table employees
:
To find who set the current values, we can query the dolt_blame_employees
table:
dolt_history_$TABLENAME
For every user table named $TABLENAME
, there is a read-only system table named dolt_history_$TABLENAME
that can be queried to find a row's value at every commit in the current branch's history.
Schema
Every Dolt history table contains columns for commit_hash
, committer
, and commit_date
, plus every column
from the user table's schema at the current checked out branch.
Example Schema
Consider a table named mytable
with the following schema:
The schema for dolt_history_states
would be:
Example Query
Assume a database with the mytable
table above and the following commit graph:
When the feature
branch is checked out, the following query returns the results below, showing
the row at every ancestor commit reachable from our current branch.
dolt_commit_diff_$TABLENAME
For every user table named $TABLENAME
, there is a read-only system table named dolt_commit_diff_$TABLENAME
that can be queried to see a diff of the data in the specified table between any two commits in the database.
For example, you can use this system table to view the diff between two commits on different branches.
The schema of the returned data from this system table is based on the schema of the underlying user table
at the currently checked out branch.
You must provide from_commit
and to_commit
in all queries to this system table in order to specify the
to and from points for the diff of your table data. Each returned row describes how a row in the underlying
user table has changed from the from_commit
ref to the to_commit
ref by showing the old and new values.
Schema
The remaining columns are dependent on the schema of the user table at the currently checked out branch.
For every column X
in your table at the currently checked out branch, there are columns in the result set namedfrom_X
and to_X
with the same type as X
in the current schema.
The from_commit
and to_commit
parameters must both be specified in the query, or an error is returned.
Example Schema
Consider a simple example with a table that has one column:
Based on the table's schema above, the schema of the dolt_commit_diff_$TABLENAME
will be:
Query Details
Now consider the following branch structure:
We can use the above table to represent two types of diffs: a two-point diff and a three-point diff. In a two-point diff we want to see the difference in rows between Point C and Point G.
We can also compute a three-point diff using this table. In a three-point diff we want to see how our feature branch has diverged from our common ancestor E, without including the changes from F and G on main.
Additional Notes
dolt_diff_$TABLENAME
For every user table named $TABLENAME
, there is a read-only system
table named dolt_diff_$TABLENAME
that returns a list of diffs showing
how rows have changed over time on the current branch.
Each row in the result set represents a row that has changed between two adjacent
commits on the current branch – if a row has been updated in 10 commits, then 10
individual rows are returned, showing each of the 10 individual updates.
Schema
Every Dolt diff table will have the columns
The remaining columns are dependent on the schema of the user
table at the current branch. For every column X
in your table at the current branch there will
be columns in the result set named from_X
and to_X
with the same type as X
.
Example Schema
For a table named states
with the following schema:
The schema for dolt_diff_states
would be:
Query Details
For each row the field diff_type
will be one of the values added
,modified
, or removed
. You can filter which rows appear in the
result set to one or more of those diff_type
values in order to
limit which types of changes will be returned.
Example Query
The following query will retrieve all commits that modified existing rows in the employees
table.
dolt_conflicts_$TABLENAME
For each table $TABLENAME
in conflict after a merge, there is a
corresponding system table named dolt_conflicts_$TABLENAME
. The
schema of each such table contains three columns for each column in
the actual table, representing each row in conflict for each of ours,
theirs, and base values.
Consider a table mytable
with this schema:
If we attempt a merge that creates conflicts in this table, I can examine them with the following query:
To mark conflicts as resolved, delete them from the corresponding
table. To effectively keep all our
values, I would simply run:
If I wanted to keep all their
values, I would first run this statement:
For convenience, you can also modify the our_
columns of thedolt_conflicts_mytable
to update the corresponding row in mytable
. The above
replace statement can be rewritten as:
And of course you can use any combination of ours
, theirs
andbase
rows in these statements.
dolt_workspace_$TABLENAME
This system table shows you which rows have been changed in your workspace and if they are staged.
It is the union of rows changed from HEAD to STAGED, and STAGED to WORKING. Any table listed indolt_status
table will have a non-empty corresponding dolt_workspace_$TABLENAME
table. Changes
listed are all relative to the HEAD of the current branch.
Schema
The schema of the source table is going to affect the schema of the workspace table. The first three column are always the same, then the schema of the source table is used to create "to_" and "from_" columns.
Each row in the dolt_workspace_$TABLENAME
corresponds to a single row update in the table.
There are two ways you can alter the state of your workspace using these tables.
The staged
column can be toggled for any row. If changing from false to true, the row values will be moved
to staging. If there are already staged changes for that row, they will be overwritten. If changing from true to
false, the row values will be unstaged. If there are other changes in the workspace for that row, the workspace
change will be preserved and the staged change will be dropped.
Any row which has staged = false
can be deleted. This will result in reverting the change to the row in the source table.
Example Query
Notes
The dolt_workspace_$TABLENAME
tables are generated based on the session state when inspected,
so they can not be considered stable on a branch which has multiple editors.
dolt_constraint_violations_$TABLENAME
For each table $TABLENAME
with a constraint violation after a merge, there is a corresponding system table nameddolt_constraint_violations_$TABLENAME
. Each row in the table represents a constraint violation that must be resolved
via INSERT
, UPDATE
, or DELETE
statements. Resolve each constraint violation before committing the result of the
merge that introduced them.
Schema
For a hypothetical table mytable
with the following schema:
dolt_constraint_violations_mytable
will have the following schema:
Each row in the table represents a row in the primary table that is in violation of one or more constraint violations.
The violation_info
field is a JSON payload describing the violation. The violation_type
field is one of these four strings:
"foreign key", "unique index", "check constraint", or "not null".
As with dolt_conflicts
, delete rows from the corresponding dolt_constraint_violations
table to signal to Doltgres that
you have resolved any such violations before committing.
Configuration Tables can be staged and versioned just like user tables. They always exist, even in an empty database.
dolt_ignore
dolt_ignore
stores a list of "table name patterns", and a boolean flag for each pattern indicating whether tables that match the patterns should not be staged for commit.
This only affects the staging of new tables. Tables that have already been staged or committed are not affected the contents of dolt_ignore
, and changes to those tables can still be staged.
Schema
Notes
The format of patterns is a simplified version of gitignore’s patterns:
An asterisk "*" or ampersand "%" matches any number of characters.
The character "?" matches any one character.
All other characters match exactly.
If a table name matches multiple patterns with different values for ignored
, the most specific pattern is chosen (a pattern A is more specific than a pattern B if all names that match A also match pattern B, but not vice versa.) If no pattern is most specific, then attempting to stage that table will result in an error.
Tables that match patterns in dolt_ignore
can be force-committed by passing the --force
flag to SELECT dolt_add()
.
dolt_diff
won't display ignored tables unless the additional --ignored
flag is passed.
Example Query
Because the branch information is global to all clients, not just your
session, dolt.branches
system table is read-only. Branches can be created
or deleted with the .
To find the current active branch use .
dolt.branches
only contains information about local branches. For
branches on a remote you have fetched, see.
The dolt.remotes
table is currently read only. Use the to add, update or delete remotes.
function can be used to INSERT and DELETE tags on the dolt.tags
table.
This is similar, but different from the dolt.log
and the dolt log
.dolt.log
shows you commit history for all commit ancestors reachable from the current HEAD
of the
checked out branch, whereas dolt.commits
shows all commits from the entire database, no matter which branch is checked out.
The dolt.log
(also usable as dolt_log
) system table contains the commit log for all commits reachable from the current HEAD
.
This is the same data returned by the .
Query this table when resolving conflicts in a SQL session. For more information on resolving merge conflicts in SQL, see docs for the tables.
Query this table when resolving schema conflicts in a SQL session. For more information on resolving schema conflicts during merge, see the docs on .
dolt.rebase
(also usable as dolt_rebase
) is only present while an interactive rebase is in progress, and only on the branch where the rebase is being executed. For example, when rebasing the feature1
branch, the rebase will be executed on the dolt_rebase_feature1
branch, and the dolt.rebase
system table will exist on that branch while the rebase is in-progress. The dolt.rebase
system table starts off with the default rebase plan, which is to pick
all of the commits identified for the rebase. Users can adjust the rebase plan by updating the dolt.rebase
table to change the rebase action, reword a commit message, or even add new rows with additional commits to be applied as part of the rebase. For more details about rebasing, see .
For every user table that has a primary key, there is a queryable system view named dolt_blame_$tablename
which can be queried to see the user and commit responsible for the current value of each row.
This is equivalent to the .
Tables without primary keys will not have an associated dolt_blame_$tablename
.
dolt_commit_diff_$TABLENAME
is the analogue of the dolt diff
CLI command.
It represents the
between the two commits provided.
The dolt_diff_$TABLENAME
system table also exposes diff information, but instead of a two-way diff,
it returns a log of individual diffs between all adjacent commits in the history of the current branch.
In other words, if a row was changed in 10 separate commits, dolt_diff_$TABLENAME
will show 10 separate rows –
one for each individual delta. In contrast, dolt_commit_diff_$TABLENAME
would show a single row that combines
all the individual commit deltas into one diff.
The is an alternative to thedolt_commit_diff_$tablename
system table for cases where a table's schema has changed between the to
and from
commits. Consider the DOLT_DIFF()
table function if you need to see the schema from each of those commits,
instead of using the schema from the currently checked out branch.
computes the closest ancestor E between main
and feature
.
There is one special to_commit
value WORKING
which can be used to
see what changes are in the working set that have yet to be committed
to HEAD. It is often useful to use
to get the commit hash of a branch, or an ancestor commit. The above table
requires both from_commit
and to_commit
to be filled.
Compared to the,
the dolt_diff_$TABLENAME
system table focuses on
how a particular row has evolved over time in the current branch's history.
The major differences are that dolt_commit_diff_$TABLENAME
requires specifying from_commit
and to_commit
,
works on any commits in the database (not just the current branch),
and returns a single combined diff for all changes to a row between those two commits. In the example
above where a row is changed 10 times, dolt_commit_diff_$TABLENAME
would only return a single row
showing the diff, instead of the 10 individual deltas.
A SELECT *
query for a diff table will show you every change
that has occurred to each row for every commit in this branch's
history. Using to_commit
or from_commit
will limit the data to
specific commits. There is one special to_commit
value WORKING
which can be used to see what changes are in the working set that have
yet to be committed to HEAD. It is often useful to use the
function to get the commit hash of a branch, or an ancestor
commit. For example, to get the differences between the last commit and its parent
you could use to_commit=HASHOF('HEAD') and from_commit=HASHOF('HEAD^')
.
These tables can be modified in order to update what changes are staged for commit.
The staged
column will be true
when the changes are going to be committed on the next
call to . Changes which have staged = false
are present in your
workspace which means all queries in your session contain them but they will not be recorded
in the event that is executed.