What are the basic clearcase concepts every developer should know?

user32262 picture user32262 · Mar 14, 2009 · Viewed 90.2k times · Source

What are the core concepts of the Clearcase version control system every developer should know?

Answer

VonC picture VonC · Mar 14, 2009

Core concepts?

  • centralized(-replicated) VCS: ClearCase is halfway between the centralized VCS world (one or several "centralized" repos or VOBS - Version Object Bases - every developer must access to commit) and the distributed VCS world.
    But it also supports a "replicated" mode allowing you to replicate a repo in a distant site (MultiSite ClearCase), sending deltas, and managing ownership. (the license fees attached with that is quite steep though)
    This is not a true "decentralized" model, since it is does not allow for parallel concurrent evolutions: The branches are mastered in one VOB or another; you can only check-in to the master VOB for the branches mastered there, though you have readonly access to any branch at any replica.

  • linear version storage: each file and directory has a linear history; there is no direct relationship between them like the DAG VCS (Directed Acyclic Graph) where the history of a file is linked to the one of a directory linked to a commit.
    That means

    • when you compare two commits, you have to compare the list of all files and directories to find the delta, since commits are not atomic across files or directories, meaning there is no single name for all the changes to all the files that make up a logical delta.
    • That also means a merge must find a common base contributor (not always the same as a common ancestor) through history exploration (see next point).

      (Git is at the opposite end of that spectrum, being both decentralized and DAG-oriented:

      • if a node of its graph has the same "id" as a node of a different commit, it does not have to explore further: all the sub-graphs are guaranteed to be identical
      • the merge of two branches is actually the merge of the content of two nodes in a DAG: recursive and very quick for finding a common ancestor)

alt text http://publib.boulder.ibm.com/infocenter/cchelp/v7r0m0/topic/com.ibm.rational.clearcase.hlp.doc/cc_main/images/merg_base_contrib.gif

  • 3-way merging: to merge two versions, ClearCase must find a common based contributor in their linear history, which can be fairly long for complex version tree (branch/sub-branch/sub-sub/branch, ...), and the basic ClearCase merge command merges a file or directory, but it is not recursive. It only affects a singe file, or a single directory without its files (ct findmerge is recursive)

  • file-centric (as opposed to the other recent VCS more repository centric): that means the commit is file by file, not "set of modified files": the transaction is at the file level. A commit of several files is not atomic.
    (Almost every other modern tool is "repository centric", with an atomic commit transaction, but first-generation systems like RCS, SCCS, CVS, and most other older systems do not have that feature.)

  • id-managed: each file and directory has a unique id, meaning they can be renamed at will: their history will not change since the id remains for the "element". Plus a directory will detect in its history any addition/suppression of file. When a file is "removed" (rmname), it does not know it: only the directory is notified and creates a new version in its history, with a list of sub-elements not including the file removed.

(Create two files with the same size and content, they will get the same id in Git -- a SHA1 key -- and will be stored only once in the Git repo! Not so in ClearCase.
Plus, If two files with the same path and name are created in two different branches, their id being different means those two files will never be merged: they are called "evil twins")

  • branches are first-class citizens: most VCS consider a branch and a tag as the same: a single point in the history from which a new linear history can grow (branch) or from where a description is attached (tag).
    Not so for ClearCase, where a branch is a way to reference a version number. Any version number starts at 0 (just referenced in ClearCase) to 1, 2, 3, and so on. Each branch can contain a new list of version numbers (0, 1, 2, 3 again).
    This is different from other systems where the version number is unique and always growing (like the revisions in SVN), or is just unique (like the SHA1 keys in Git).

  • path-accessed: to access a certain version of a file/directory, you need to know its extended path (composed of branches and versions). It is called an "extended path name": myFile@@/main/subBranch/Version.

(Git does refer to everything through id -- SHA1-based --: version [or commit], tree [or version of a directory] and blob [or version of a file, or rather of a content of a file]. So it is "id-accessed" or "id-referenced".
For ClearCase, an id refers to an "element": a directory or a file, whatever its version is.)

  • both pessimistic lock and optimistic lock: (reserved or unreserved checkouts in ClearCase): even a pessimistic lock (reserved checkout) is not a true pessimistic one, since other users can still checkout that file (albeit in "unreserved mode"): they can change it but will have to wait for the first user to commit his file (checkin) or cancel the request. Then they will merge their checkout version of that same file.
    (Note: a "reserved" checkout can release its lock and be made unreserved, either by the owner or the administrator)

  • cheap branching: a branch does not trigger a copy of all files. It actually triggers nothing: any file not checkout will stay in its original branch. Only modified files will have their new versions stored in the declared branch.

  • flat-file storage: the VOBs are stored in a proprietary format with simple files. This is not a database with an easy query language.

  • local or network workspace access:

    • local workspace is through checkout to the hard-drive ("update" of a snapshot view).
    • network workspace is through dynamic view, combining versioned files and directories access through network (no local copy, instant access) and local files (the ones which are checked out or the private files). The combination of distant (versioned) and local (private) files allows a dynamic view to appears like a classic hard drive (whereas actually any file "written" is stored in the associated view storage).
  • centralized deported storage: [view] storage is there to keep some data and avoid some or any communication with the central referential.
    a workspace can have:

    • a scattered storage: like the .svn subdirectories all over the place
    • a centralized storage: like the view storage in ClearCase, they contain information about the files displayed by the view, and that storage is unique for a view.
    • a deported storage: the storage is not part of the view/workspace itself, but can be located elsewhere on the computer or even outside on the LAN/WAN

(Git has no "storage" per se. Its .git is actually all the repository!)

  • meta-data oriented: any "key-value" can be attached to a file or a directory, but that couple of data is not historized itself: if the value changes, it overrides the old value.

(meaning the mechanism is actually weaker than the "properties" system of SVN, where properties can have an history;
Git on the other end is not too keen on meta-data)

  • system-based protection: the owner and the rights associated with a file/directory or repositories are based on the rights-management of the underlying system. There is no applicative account in ClearCase, and the group of users are directly based on the Windows or Unix existing group (which is quite challenging in an heterogeneous environment, with Windows clients and a Unix VOB server!)

(SVN is more like "server-based" protection, where the Apache server can get a first level of protection, but must be completed with hooks to have a finer grain of rights.
Git has no direct rights management and must be controlled by hooks during push or pull between repositories)

  • hooks available: any ClearCase action can be the target of a hook, called trigger. It can be a pre or post operation.

  • CLI managed: cleartool is the Command Line Interface from which all actions can be made.