What are the core concepts of the ClearCase version control system every developer should know?
Core concepts?
centralized(-replicated) VCS: ClearCase is halfway between the centralized VCS world (one or several "centralized" repos, or VOBs - Versioned Object Bases - that every developer must access to commit) and the distributed VCS world.
But it also supports a "replicated" mode that lets you replicate a repo at a distant site (ClearCase MultiSite), sending deltas and managing ownership (the license fees attached to that are quite steep, though).
This is not a true "decentralized" model, since it does not allow for parallel concurrent evolutions: the branches are mastered in one VOB or another; you can only check in to the master VOB for the branches mastered there, though you have read-only access to any branch at any replica.
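The mastership rule described above can be pictured with a small model. This is only a sketch: the `Replica` class, its methods and the branch names are hypothetical illustrations, not part of any ClearCase API.

```python
class MastershipError(Exception):
    """Raised when a replica tries to write to a branch it does not master."""


class Replica:
    """Hypothetical model of one VOB replica in a replicated setup."""

    def __init__(self, name, mastered_branches):
        self.name = name
        self.mastered = set(mastered_branches)
        self.versions = {}  # branch name -> list of checked-in contents

    def read(self, branch):
        # Any replica can read any branch it has received.
        return list(self.versions.get(branch, []))

    def checkin(self, branch, content):
        # Writes are only accepted on branches mastered by this replica.
        if branch not in self.mastered:
            raise MastershipError(f"{branch} is not mastered at {self.name}")
        self.versions.setdefault(branch, []).append(content)


paris = Replica("paris", ["/main/paris_dev"])
boston = Replica("boston", ["/main/boston_dev"])
paris.checkin("/main/paris_dev", "v1")
# Simulate the delta sent to the other site: boston gets a read-only copy.
boston.versions["/main/paris_dev"] = paris.read("/main/paris_dev")
```

In this model, `boston.read("/main/paris_dev")` works, while `boston.checkin("/main/paris_dev", ...)` raises `MastershipError`.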
linear version storage: each file and directory has a linear history; there is no direct relationship between them, unlike in a DAG-based VCS (Directed Acyclic Graph), where the history of a file is linked to the history of a directory, itself linked to a commit.
That means
That also means a merge must find a common base contributor (not always the same as a common ancestor) through history exploration (see next point).
(Git is at the opposite end of that spectrum, being both decentralized and DAG-oriented.)
3-way merging: to merge two versions, ClearCase must find a common base contributor in their linear history, which can take a fairly long time for a complex version tree (branch/sub-branch/sub-sub-branch, ...). The basic ClearCase merge command merges a file or a directory, but it is not recursive: it only affects a single file, or a single directory without its files (ct findmerge is recursive).
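To make the role of the common base contributor concrete, here is a deliberately naive 3-way merge sketch. It assumes the three versions have the same number of lines (in-place edits only), which real merge tools do not require; `merge3` is a hypothetical helper, not ClearCase's algorithm.

```python
def merge3(base, ours, theirs):
    """Naive 3-way merge of three equally long lists of lines.

    Returns (merged_lines, conflicting_line_indexes).
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (same change, or no change)
            merged.append(o)
        elif o == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "ours" changed this line
            merged.append(o)
        else:             # both changed it differently: a human must decide
            merged.append(o)
            conflicts.append(i)
    return merged, conflicts


merged, conflicts = merge3(
    base=["a", "b", "c"],
    ours=["a", "B", "c"],
    theirs=["a", "b", "C"],
)
```

Without the base, the tool could not tell which side changed a differing line; that is why the base has to be found first.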
file-centric (as opposed to other recent VCS, which are more repository-centric): the commit is done file by file, not as a "set of modified files"; the transaction is at the file level. A commit of several files is not atomic.
(Almost every other modern tool is "repository centric", with an atomic commit transaction, but first-generation systems like RCS, SCCS, CVS, and most other older systems do not have that feature.)
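The practical consequence of a per-file transaction can be sketched as follows. The `Vob` class and `checkin_all` helper are hypothetical stand-ins for illustration; the point is only that a failure in the middle leaves the earlier files committed.

```python
class Vob:
    """Hypothetical stand-in for a repository that versions each file separately."""

    def __init__(self, locked=()):
        self.history = {}          # path -> list of contents
        self.locked = set(locked)  # paths that refuse check-ins

    def checkin(self, path, content):
        if path in self.locked:
            raise PermissionError(f"{path} is locked")
        self.history.setdefault(path, []).append(content)


def checkin_all(vob, changes):
    """File-by-file check-in: earlier files stay committed if a later one fails."""
    done = []
    for path, content in changes:
        try:
            vob.checkin(path, content)
        except PermissionError:
            break  # nothing rolls back what was already checked in
        done.append(path)
    return done


vob = Vob(locked={"b.c"})
done = checkin_all(vob, [("a.c", "new a"), ("b.c", "new b"), ("c.c", "new c")])
```

Here only `a.c` ends up checked in: the "commit" of three files is left half-applied, which an atomic, repository-level transaction would prevent.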
id-managed: each file and directory has a unique id, meaning they can be renamed at will: their history will not change, since the id remains attached to the "element". Plus, a directory records in its history any addition/removal of a file. When a file is "removed" (rmname), the file itself does not know it: only the directory is notified, and it creates a new version in its history with a list of sub-elements that no longer includes the removed file.
(Create two files with the same size and content: they will get the same id in Git -- a SHA1 key -- and will be stored only once in the Git repo! Not so in ClearCase.
Plus, if two files with the same path and name are created in two different branches, their ids being different means those two files will never be merged: they are called "evil twins".)
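The difference between a content-derived id and an element id can be shown in a few lines. `git_blob_id` reproduces the way Git hashes a blob (a `blob <size>` header, a NUL byte, then the content); `new_element_id` is a hypothetical stand-in that uses a random UUID and says nothing about how ClearCase actually generates its identifiers.

```python
import hashlib
import uuid


def git_blob_id(content: bytes) -> str:
    """SHA-1 of a Git blob: 'blob <size>' + NUL byte + content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def new_element_id() -> str:
    """Stand-in for an element id: assigned at creation, independent of content."""
    return uuid.uuid4().hex


same_content = b"hello\n"
git_a, git_b = git_blob_id(same_content), git_blob_id(same_content)
elem_a, elem_b = new_element_id(), new_element_id()
```

`git_a` and `git_b` are identical, so the content is stored once; `elem_a` and `elem_b` differ even though the content is the same, which is exactly the situation that produces "evil twins".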
branches are first-class citizens: most VCS consider a branch and a tag as the same: a single point in the history from which a new linear history can grow (branch) or from where a description is attached (tag).
Not so for ClearCase, where a branch is a way to reference a version number. Version numbers start at 0 (a version that is only referenced in ClearCase) and continue with 1, 2, 3, and so on. Each branch contains its own list of version numbers (0, 1, 2, 3 again).
This is different from other systems where the version number is unique and always growing (like the revisions in SVN), or is just unique (like the SHA1 keys in Git).
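The per-branch numbering can be modeled in a few lines. The `Element` class below is a hypothetical illustration (its method names merely echo the operations involved), not a ClearCase interface.

```python
class Element:
    """Hypothetical model of one element with per-branch version counters."""

    def __init__(self):
        # Version 0 of /main exists as soon as the element is created.
        self.branches = {"/main": 0}

    def mkbranch(self, parent, name):
        branch = f"{parent}/{name}"
        self.branches[branch] = 0  # numbering restarts at 0 on every branch
        return branch

    def checkin(self, branch):
        self.branches[branch] += 1
        return f"{branch}/{self.branches[branch]}"


elem = Element()
first = elem.checkin("/main")
second = elem.checkin("/main")
dev = elem.mkbranch("/main", "dev")
first_on_dev = elem.checkin(dev)
```

A version is therefore only identified by its number together with its branch path: `/main/1` and `/main/dev/1` are different versions.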
path-accessed: to access a certain version of a file/directory, you need to know its extended path (composed of branches and versions). It is called an "extended path name": myFile@@/main/subBranch/Version.
(Git does refer to everything through id -- SHA1-based --: version [or commit], tree [or version of a directory] and blob [or version of a file, or rather of a content of a file]. So it is "id-accessed" or "id-referenced".
For ClearCase, an id refers to an "element": a directory or a file, whatever its version is.)
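As an illustration of how an extended path name decomposes, here is a small parsing sketch. `parse_extended_path` is a hypothetical helper that only handles the simple `element@@/branch/.../version` shape shown above.

```python
def parse_extended_path(xpn):
    """Split 'element@@/branch/.../version' into (element, branch, version)."""
    element, sep, version_path = xpn.partition("@@")
    if not sep or not version_path.startswith("/"):
        raise ValueError(f"not an extended path name: {xpn!r}")
    # The last path component is the version; everything before it is the branch.
    branch, _, version = version_path.rpartition("/")
    return element, branch, version


element, branch, version = parse_extended_path("myFile@@/main/subBranch/3")
```

The element name alone designates the file or directory regardless of its version; the part after `@@` pins one version on one branch.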
both pessimistic lock and optimistic lock (reserved or unreserved checkouts in ClearCase): even a pessimistic lock (reserved checkout) is not a true pessimistic one, since other users can still check out that file (albeit in "unreserved mode"): they can change it, but will have to wait for the first user to commit the file (checkin) or cancel the checkout. Then they will have to merge their checked-out version of that same file.
(Note: a "reserved" checkout can release its lock and be made unreserved, either by the owner or the administrator)
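The interplay between reserved and unreserved checkouts can be sketched with a toy model. Everything here (`BranchCheckouts`, its methods, the user names) is hypothetical and simplified: it tracks a single branch of a single element.

```python
class CheckoutError(Exception):
    pass


class BranchCheckouts:
    """Hypothetical model of checkouts on one branch of one element."""

    def __init__(self):
        self.latest = 0
        self.reserved_by = None
        self.unreserved = {}  # user -> version the checkout started from

    def checkout(self, user, reserved=True):
        if reserved:
            if self.reserved_by is not None:
                raise CheckoutError("already reserved by " + self.reserved_by)
            self.reserved_by = user
        else:
            self.unreserved[user] = self.latest

    def checkin(self, user):
        """Check in and return True if the user had to merge first."""
        if self.reserved_by == user:
            self.reserved_by = None
            needs_merge = False
        elif user in self.unreserved:
            if self.reserved_by is not None:
                raise CheckoutError("wait for " + self.reserved_by)
            # A merge is needed if the branch moved since the checkout.
            needs_merge = self.unreserved.pop(user) != self.latest
        else:
            raise CheckoutError(user + " has no checkout")
        self.latest += 1
        return needs_merge


f = BranchCheckouts()
f.checkout("alice")                  # reserved
f.checkout("bob", reserved=False)    # still allowed, but unreserved
alice_merged = f.checkin("alice")
bob_merged = f.checkin("bob")
```

In this run, alice checks in without merging, while bob, who started from the same version, must merge because the branch moved on in the meantime.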
cheap branching: a branch does not trigger a copy of all files. It actually triggers nothing: any file that is not checked out stays in its original branch. Only modified files will have their new versions stored in the declared branch.
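Why branching costs nothing can be seen in a simplified model of version selection: a file only appears on the new branch once it has been modified there, and every other file is still read from its original branch. The `select_version` helper and the sample data are assumptions made for illustration.

```python
def select_version(element_branches, wanted_branch, fallback="/main"):
    """Pick the latest version on wanted_branch, else the latest on fallback."""
    branch = wanted_branch if wanted_branch in element_branches else fallback
    return f"{branch}/{element_branches[branch]}"


# branch -> latest version number, per element
vob = {
    "a.c": {"/main": 4, "/main/dev": 2},  # modified on the dev branch
    "b.c": {"/main": 7},                  # never checked out on dev
}
view = {name: select_version(branches, "/main/dev") for name, branches in vob.items()}
```

Only `a.c` carries versions on `/main/dev`; `b.c` was never copied anywhere and is simply selected from `/main`.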
flat-file storage: the VOBs are stored in a proprietary format with simple files. This is not a database with an easy query language.
local or network workspace access:
centralized remote storage: [view] storage is there to keep some data and avoid some or all communication with the central repository.
a workspace can have: .svn subdirectories all over the place
(Git has no "storage" per se. Its .git is actually all the repository!)
(meaning the mechanism is actually weaker than the "properties" system of SVN, where properties can have a history;
Git, on the other hand, is not too keen on meta-data)
(SVN is more like "server-based" protection, where the Apache server can get a first level of protection, but must be completed with hooks to have a finer grain of rights.
Git has no direct rights management and must be controlled by hooks during push or pull between repositories)
hooks available: any ClearCase action can be the target of a hook, called a trigger. It can fire before (pre) or after (post) the operation.
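The pre/post mechanism can be sketched generically. The `Operations` class, the `TriggerVeto` exception and the sample triggers are hypothetical; they model the idea of a hook that can cancel an operation beforehand or react to it afterwards, not the actual trigger interface.

```python
class TriggerVeto(Exception):
    """Raised by a pre-operation trigger to cancel the operation."""


class Operations:
    """Hypothetical dispatcher that runs triggers around each operation."""

    def __init__(self):
        self.pre, self.post, self.log = {}, {}, []

    def add_trigger(self, op, func, when="pre"):
        table = self.pre if when == "pre" else self.post
        table.setdefault(op, []).append(func)

    def run(self, op, target):
        for trig in self.pre.get(op, []):
            trig(target)               # may raise TriggerVeto
        self.log.append((op, target))  # the operation itself
        for trig in self.post.get(op, []):
            trig(target)


def no_temp_files(target):
    if target.endswith(".tmp"):
        raise TriggerVeto("temporary files may not be checked in")


notified = []
ops = Operations()
ops.add_trigger("checkin", no_temp_files, when="pre")
ops.add_trigger("checkin", notified.append, when="post")
ops.run("checkin", "main.c")
```

A call such as `ops.run("checkin", "scratch.tmp")` would raise `TriggerVeto` before anything is logged, and the post trigger would not run.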
CLI managed: cleartool is the Command Line Interface from which all actions can be made.
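Since everything goes through cleartool, scripting ClearCase usually means building cleartool command lines. The wrapper below is a minimal sketch: `cleartool_argv` and `run_cleartool` are hypothetical helper names, `lsvtree` is used as the sample subcommand, and the command is only executed if a `cleartool` binary is found on the PATH.

```python
import shutil
import subprocess


def cleartool_argv(subcommand, *args):
    """Build the argument list for a cleartool invocation."""
    return ["cleartool", subcommand, *args]


def run_cleartool(subcommand, *args):
    """Run cleartool if it is installed; return its stdout, else None."""
    if shutil.which("cleartool") is None:
        return None  # no ClearCase client on this machine
    result = subprocess.run(
        cleartool_argv(subcommand, *args),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


argv = cleartool_argv("lsvtree", "myFile")
```

Passing the arguments as a list avoids shell quoting problems with paths that contain `@@` or spaces.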