History of version control systems

Logical architecture

This section argues that the only difference in logical architecture between Git and Mercurial is nomenclature.


2.1. History model

One of the first Git lessons is the repository basic object types: blob, tree and commit. These are the building blocks for the history model. Mercurial also builds up history upon the same three concepts, respectively: file, manifest and changeset.

To identify these objects, both systems use a SHA-1 hash value, which Mercurial calls a nodeid. Mercurial additionally provides a local revision number, a simple incrementing integer, for each changeset, in addition to the reverse-count notation that Git also provides (like HEAD~4). (Mercurial also includes a powerful query language for specifying revisions, called revsets.)

From that, Mercurial’s view of history is, just like Git’s, a DAG or Directed Acyclic Graph of changesets. For instance, the graphical representation of history is the same in the two.
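The two kinds of identifier can be illustrated with a small sketch. It assumes the scheme Mercurial's revlog is generally described as using (SHA-1 over the two parent ids in sorted order, followed by the revision content). The History class and the names node_id, commit and rev are hypothetical helpers for this illustration, not Mercurial's API.

```python
import hashlib

NULL_ID = b"\0" * 20  # conventional "no parent" id: 20 zero bytes


def node_id(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
    """Hash the two parent ids (in sorted order) followed by the content."""
    first, second = sorted((p1, p2))
    digest = hashlib.sha1(first)
    digest.update(second)
    digest.update(text)
    return digest.digest()


class History:
    """Toy changeset store: global ids plus local, incrementing numbers."""

    def __init__(self):
        self.order = []    # local revision number -> node id
        self.parents = {}  # node id -> (p1, p2)

    def commit(self, text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
        node = node_id(text, p1, p2)
        self.parents[node] = (p1, p2)
        self.order.append(node)
        return node

    def rev(self, node: bytes) -> int:
        """Local revision number: the position in this particular store."""
        return self.order.index(node)


history = History()
a = history.commit(b"first")
b = history.commit(b"second", a)
print(history.rev(b), b.hex()[:12])  # local number and short global id
```

The global id depends only on content and ancestry, so it is the same in every clone, while the local number depends on the order in which changesets arrived in one particular store.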

2.2. Branch model

As in Git, Mercurial supports branching in different ways. First and foremost, each clone of a repository represents a branch, potentially identical to other clones of the same repository. This way of branching is sometimes referred to as heavy branches and works almost the same in both systems.

Then Git has its famous lightweight branches, which allow switching between development lines within the same clone of a repository. Take the following history graph as an example:

In Git, branches X and Y are simply references to the e and g commits. If a new commit is appended to e, the reference X moves to point to that commit, like this:

In Mercurial, the X and Y branches are called heads and they can be referred to by their changeset identifier: either local (number) or global (SHA-1 hash). In brief, it is like using Git detached heads instead of branch names, but much easier and without the risk of garbage collection (see hg help heads). They can also be referred to by a bookmark, which can be pushed and pulled with the -B/--bookmark option.
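A head is simply a changeset that has no children. The following minimal sketch computes heads from a parent map. The graph is a hypothetical stand-in for the missing figure (a-b-c-d-e on one line, f-g forking off from c), and the heads function is an illustration rather than Mercurial's implementation.

```python
def heads(parents):
    """Return the changesets that no other changeset lists as a parent."""
    has_child = {p for ps in parents.values() for p in ps}
    return sorted(node for node in parents if node not in has_child)


# Hypothetical graph: a-b-c-d-e on one line, f-g forking off from c.
parents = {
    "a": [], "b": ["a"], "c": ["b"], "d": ["c"], "e": ["d"],
    "f": ["c"], "g": ["f"],
}
before = heads(parents)
print(before)  # ['e', 'g']

parents["h"] = ["e"]  # append a new changeset on top of e
after = heads(parents)
print(after)   # ['g', 'h']
```

No reference has to be moved when h is committed: e stops being a head only because it now has a child.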

Finally, Mercurial has another branching functionality called NamedBranches, also known as long-lived branches. This kind of branch does not have a Git equivalent. For more information, see NamedBranches.

2.3. Tag model

As with branches, both Git and Mercurial support two tag levels: local and global. Local tags are only visible where they were created and do not propagate, so they behave practically the same in both systems.

Global tags are one of the aspects where Mercurial really differs from Git. On the surface they serve the same purpose, but they are treated differently. In Git, global tags have a dedicated repository object type; these tags are usually referred to as annotated tags. In Mercurial, though, they are stored in a special text file called .hgtags residing in the root directory of a repository clone. Because the .hgtags file is versioned as a normal file, all modifications to it are stored as part of the repository history.

Two important things need to be remembered about how .hgtags is handled:

  1. The file only grows and should not be edited, except when it generates a merge conflict.
  2. Because it is revision controlled, there is a corresponding revlog. When looking for tags, only the latest revision of .hgtags is parsed, regardless of which revision is checked out.

Although it is questioned by many people new to Mercurial, this design makes it possible to keep track of all global tagging operations. Nevertheless, it can also be confusing because it leads to some puzzling scenarios. For example, consider the following history graph:

In the graph, T is a global tag pointing to changeset c. This tagging action generated changeset d because .hgtags had to be committed. Now, if you clone a new repository using hg clone --rev T, the history graph of the cloned repository would look like this:

Therefore, in the new repository tag T does not exist. The reason is that in the original repository tag T points to changeset c, but the tag is added by commit d, which is a descendant of c. As the clone command limits the history to changeset c and its ancestors, the addition of the tag is not included in the new repository. Things work similarly when tagging a particular revision using hg tag --rev ...
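The scenario can be replayed with a small model. This is a sketch under stated assumptions: single letters stand in for the 40-character hexadecimal ids that real .hgtags lines contain, the linear history a-b-c-d replaces the missing figure, and the helpers ancestors and parse_hgtags are hypothetical names.

```python
def ancestors(parents, node):
    """Return node plus everything reachable through parent links."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def parse_hgtags(text):
    """Parse '<node> <tag>' lines; later lines override earlier ones."""
    tags = {}
    for line in text.splitlines():
        if line.strip():
            node, name = line.split(" ", 1)
            tags[name.strip()] = node
    return tags


# Hypothetical linear history a-b-c-d; d is the commit that records T -> c.
parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}
hgtags_at = {"a": "", "b": "", "c": "", "d": "c T\n"}  # .hgtags per changeset

original_tags = parse_hgtags(hgtags_at["d"])  # tip of the original is d

# "hg clone --rev T" keeps only c and its ancestors, so the clone's tip is c.
cloned = sorted(ancestors(parents, original_tags["T"]))
clone_tags = parse_hgtags(hgtags_at["c"])

print(original_tags, cloned, clone_tags)
```

The clone contains the tagged changeset but not the changeset that recorded the tag, which is why T is unknown there.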

Regarding tag propagation across repositories, Mercurial has very simple semantics. From the history and WireProtocol point of view, the .hgtags file is treated like the rest of the tracked files, which means that any global tagging operation becomes visible to everyone just like any other commit. It also implies that merge conflicts can occur in .hgtags.

The rationale behind Mercurial’s global tags is briefly justified in this thread (January 2009).


Recommendations

3.1. Use a thin shell repository to manage your subrepositories

The most obvious way to construct a project using subrepositories is:

project/      # your main project repository
  somelib/    # your shared library as a nested subrepository

This tends to be suboptimal for a variety of reasons:

  • overly-strict tracking of relationship between project/ and somelib/
  • impossible to check or push project/ if somelib/ source repo becomes unavailable
  • lack of well-defined support for recursive diff, log, and status
  • recursive nature of commit surprising

The recommended structure is of this form:

build/      # thin master repo to manage build environment
  project/  # your main project as a subrepo
  somelib/  # your shared library as a sibling subrepo

Here, all repositories containing ‘real’ code have no subrepositories of their own (i.e. they are leaf nodes). They can thus be treated as completely ordinary repositories and a developer can largely ignore the additional complexities of subrepositories. Work can continue in these repositories even if their siblings become unavailable. Recursive commits in build/ are only needed to synchronize changes between siblings and to tag releases.

3.2. Use ‘trivial’ subrepo paths where possible

Mercurial accepts both complex and absolute subrepo paths but these may cause a variety of issues:

  • Absolute URLs are subject to change and may make old versions of the project difficult to reconstruct
  • Relative paths of the form "foo = ../foo" will not generally allow clones to be cloned
  • Paths containing drive letters, UNC paths, backslashes, or other Windows-isms will generally not be portable

The most reliable scheme is to have all subrepo paths be of the form:

project = project
somelib = somelib

where the source and target are both the same simple directory name.

On hgweb servers, it will be useful to use symlinks or duplicate path entries to allow shared libraries to appear in multiple places.

One workaround with Mercurial 2.0 is to use a [subpaths] section in .hgsub to map "ideal" paths to the flat namespace used by some hosting providers. For example, a project hosted at https://bitbucket.org/kiilerix/subrepodemo/ could have a .hgsub like this:

sub = sub

[subpaths]
https://bitbucket\.org/kiilerix/subrepodemo/sub = https://bitbucket.org/kiilerix/subrepodemo-sub

Similar subpaths magic can be used for pushing to GitHub with hg-git:

^git\+https://github\.com/(.*)/(.*)/(.*)$ = git+https://github.com/\1/\3.git
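A subpaths rule is a regular-expression pattern on the left and a replacement on the right. The sketch below shows how such a rule could rewrite a subrepo source, using the Bitbucket example from this section. The functions parse_hgsub and remap are hypothetical helpers, and the assumption that the nested source is formed by joining the parent URL and the subrepo path is a simplification for illustration.

```python
import re


def parse_hgsub(text):
    """Split .hgsub-style text into subrepo mappings and [subpaths] rules."""
    subrepos, rules, section = {}, [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, value = (part.strip() for part in line.split(" = ", 1))
        if section == "subpaths":
            rules.append((key, value))  # pattern = replacement
        else:
            subrepos[key] = value       # path = source
    return subrepos, rules


def remap(source, rules):
    """Apply every pattern = replacement rule to a subrepo source URL."""
    for pattern, replacement in rules:
        source = re.sub(pattern, replacement, source)
    return source


HGSUB = r"""
sub = sub

[subpaths]
https://bitbucket\.org/kiilerix/subrepodemo/sub = https://bitbucket.org/kiilerix/subrepodemo-sub
"""

subrepos, rules = parse_hgsub(HGSUB)
parent = "https://bitbucket.org/kiilerix/subrepodemo"
resolved = remap(parent + "/" + subrepos["sub"], rules)
print(resolved)
```

The repository keeps the trivial "sub = sub" mapping, and only the rewrite rule knows about the hosting provider's flat namespace.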

Behavioral differences

In most design decisions, Mercurial tries to avoid exposing excessive complexity to the user. This can sometimes lead to the belief that both systems have nothing in common when in practice the difference is subtle, and vice versa. The main difference is that Mercurial does not offer an "undo" for what you did without resorting to commands that the help pages describe as "dangerous", "not what you want", etc.

3.1. Communication between repositories

While the branching model is very similar, moving history between different repositories works slightly differently in the two systems.

Git adds the notion of a tracking branch, a branch that is used to follow changes from another repository. Tracking branches make it possible to selectively pull or push branches from or to a remote repository.

Mercurial keeps things simpler in this aspect:

  • When you pull, you bring all remote heads into your local repository. Then you can decide whether to merge or not. Alternatively, pull and update the working directory in one step using "-u".
  • All pushes that would create new heads (i.e. lightweight branches) stop with a warning, except if the user explicitly forces them.

3.2. Git’s staging area

Git is the only DistributedSCM that exposes the concept of index or staging area. The others may implement and hide it, but in no other case is the user aware or has to deal with it.

Mercurial’s rough equivalent is the DirState, which controls working copy status information to determine the files to be included in the next commit. But in any case, this file is handled automatically. Additionally, it is possible to be more selective at commit time either by specifying the files you want to commit on the command line or by using hg commit --interactive.

If you felt uncomfortable dealing with Git’s index, you are switching for the better.

If you need the index, you can get its behavior (with many additional options) from Mercurial Queues (MQ). Simply adding changes to the index can be imitated by building up a commit with hg commit --amend (optionally with --secret, see phases).

3.3. Bare repositories

Although this is a minor issue, Mercurial can obviously handle a bare repository; that is, a repository without a working copy. In Git you need a configuration option for that, whereas in Hg you only need to check out the null revision, like this:

hg update null

As push and pull operations do not update the working copy by default, not updating the working copy gives you the same effect as a bare repository. In fact, it is the recommended option for some particular hgwebdir.cgi setups.

Versioning

To better understand the versioning problem, consider a designer who has finished working on a project and sent the final version to the client. The designer has a folder that holds the final version of the project:

source/
barbershop_index_final.psd

All is well and the designer has finished the work, but the client replies with revisions. To be able to return to the old version of the project, the designer created a new file, made the changes, and sent it to the client:

source/
barbershop_index_final.psd
barbershop_index_final_2.psd

It did not end there, and eventually the project structure grew to look like this:

source/
barbershop_index_final.psd
barbershop_index_final_2.psd
…
barbershop_index_final_19.psd
…
barbershop_index_latest_final.psd
barbershop_index_latest_final_Final.psd

Many people have probably run into something similar, for example when writing term papers as students. In professional development, creating new files for versioning is bad practice. Developers usually keep many files in a project folder, and several people may work on the same project. If every developer creates a new file for each version, slightly changing the name of the previous one, the project soon descends into chaos and nobody knows which files to open.

Cloning, making changes, merging, pulling and updating


Let’s start with a user Alice, who has a repository that looks like:

Bob clones this repo, and ends up with a complete, independent, local copy of Alice’s store and a clean checkout of the tipmost revision d in his working directory:

Bob can now work independently of Alice. He then commits two changes e and f:

Alice then makes her own change g in parallel, which causes her repository store to diverge from Bob’s, thus creating a branch:

Bob then pulls Alice’s repo to synchronize. This copies all of Alice’s changes into Bob’s repository store (here, it’s just a single change g). Note that Bob’s working directory is not changed by the pull:

Because Alice’s g is the newest head in Bob’s repository, it’s now the tip.

Bob then does a merge, which combines the last change he was working on (f) with the tip in his repository. Now, his working directory has two parent revisions (f and g):

After examining the result of the merge in his working directory and making sure the merge is perfect, Bob commits the result and ends up with a new merge changeset h in his store:

Now if Alice pulls from Bob, she will get Bob’s changes e, f, and h into her store:

Note that Alice’s working directory was not changed by the pull. She has to do an update to synchronize her working directory to the merge changeset h. This changes the parent changeset of her working directory to changeset h and updates the files in her working directory to revision h.

Now Alice and Bob are fully synchronized again.
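The whole exchange can be replayed with a toy model. This is only a sketch: the Repo class and its methods are hypothetical, Alice's starting history is assumed to be the linear chain a-b-c-d (the original figure is not available), and a "store" is reduced to a mapping from changeset name to its parents.

```python
class Repo:
    """Toy repository: a store of changesets plus working-directory parents."""

    def __init__(self, store=None, working=()):
        self.store = dict(store or {})  # changeset -> tuple of parents
        self.working = tuple(working)   # parent(s) of the working directory

    def clone(self):
        return Repo(self.store, self.working)

    def commit(self, name):
        self.store[name] = self.working
        self.working = (name,)

    def pull(self, other):
        """Copy missing changesets; the working directory is left alone."""
        for name, parents in other.store.items():
            self.store.setdefault(name, parents)

    def merge(self, other_head):
        self.working = (self.working[0], other_head)  # two parents until commit

    def update(self, name):
        self.working = (name,)


alice = Repo({"a": (), "b": ("a",), "c": ("b",), "d": ("c",)}, ("d",))
bob = alice.clone()           # full, independent copy, checked out at d
bob.commit("e")
bob.commit("f")
alice.commit("g")             # parallel change: the stores diverge
bob.pull(alice)               # g arrives; Bob's working directory stays at f
bob.merge("g")                # working directory now has parents f and g
bob.commit("h")               # merge changeset h
alice.pull(bob)               # e, f and h arrive; Alice is still at g
alice.update("h")
print(alice.store == bob.store, alice.working, bob.working)
```

Pull only ever touches the store, while merge and update only touch the working directory, which is the separation the walkthrough relies on.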

File contexts

A file context is an object which provides convenient access to various data related to a particular file revision. File contexts can be converted to a string (for printing, etc.; the string representation is "path@shortID"), tested for truth value (False is "nonexistent"), compared for equality, and used as keys in a dictionary.

Some informational methods on file context objects:

  • fctx.filectx(id) — the file context for another revision of the file

  • fctx.filerev() — the revision at which this file was last changed

  • fctx.filenode() — the file ID

  • fctx.fileflags() — the file flags

  • fctx.isexec() — is the file executable

  • fctx.islink() — is the file a symbolic link

  • fctx.filelog() — the file log for the file revision (file logs are not documented here — see the source)

  • fctx.rev() — the revision from which this file context was extracted

  • fctx.changectx() — the change context associated with this file revision

  • fctx.node, fctx.user, fctx.date, fctx.files, fctx.description, fctx.branch, fctx.manifest — the same as the equivalent change context methods, applied to the change context associated with the file revision.

  • fctx.data() — the file data

  • fctx.path() — the file path

  • fctx.size() — the file size

  • fctx.isbinary() — the file is binary

  • fctx.cmp(fctx) — does the file contents differ from another file contents?

  • fctx.annotate(follow=False, linenumber=None) — list of tuples of (ctx, line) for each line in the file, where ctx is the file context of the node where that line was last changed. (The follow and linenumber parameters are not documented here — see the source for details).

Scenarios

The most interesting scenarios are analyzed below.

7.1. Scenario A

The first one is the simplest one, a simple branch.

In this scenario there are two interesting interactions:

$ hg up C
$ hg rebase --dest E

Another syntax that would yield the same result is:

$ hg rebase --dest E --base C

7.1.2. rebase on an intermediate revision

$ hg up C
$ hg rebase -d D

7.2. Scenario B

The second scenario involves something more complicated. In this scenario the user cloned from upstream, then merged several times.

$ hg rebase --dest I --source D

Although D is a merge revision, it has not been skipped in this case, unlike H.

$ hg rebase --dest I --source B

In this case two revisions (D and H) have been skipped.

$ hg rebase --dest B --source C

7.2.4. rebase G onto I

$ hg rebase --dest I --source G

Note: Prior to Mercurial 2.3, you need to add the --detach option in this situation; otherwise you get this result:

7.3. Scenario C

This case represents a quite common situation, a repository with just one (merge) head.

7.3.1. D onto C

$ hg rebase --dest C --source D

Obviously the revision F has been skipped.

7.4. Collapsing

Sometimes it is useful to rebase changesets onto another branch while collapsing them into a single revision.

This can be achieved using the --collapse option.

$ hg rebase --dest B --source C --collapse

or

The --base option could have been used here too:

$ hg rebase --dest B --base E --collapse

Working on an existing Mercurial project

If you have a URL to a browsable project repository (for example https://www.mercurial-scm.org/repo/hg), you can grab a copy like so:

$ hg clone https://www.mercurial-scm.org/repo/hg mercurial-repo
requesting all changes
adding changesets
adding manifests
adding file changes
added 9633 changesets with 19124 changes to 1271 files
updating to branch default
1084 files updated, 0 files merged, 0 files removed, 0 files unresolved

This will create a new directory called mercurial-repo, grab the complete project history, and check out the most recent changeset on the default branch.

The 'summary' command summarizes the state of the working directory. Command names may be abbreviated, so entering just 'hg sum' is enough:

$ hg sum
parent: 9632:16698d87ad20 tip
 util: use sys.argv if $HG is unset and 'hg' is not in PATH
branch: default
commit: (clean)
update: (current)

Here, commit: (clean) means that there are no local changes, and update: (current) means that the checked-out files (in the working directory) are at the newest revision in the repository.

What Mercurial can’t do

Many SVN/CVS users expect to host related projects together in one repository. This is really not what Mercurial was made for, so you should try a different way of working. In particular, this means that you cannot check out only one directory of a repository.

If you absolutely need to host multiple projects in a kind of meta-repository though, you could try the Subrepositories feature that was introduced with Mercurial 1.3 or the older ForestExtension.

For a hands-on introduction to using Mercurial, see the Tutorial.


Communicating with the user

Most extensions will need to perform some interaction with the user. This is the purpose of the ui parameter to an extension function. The ui parameter is an object with a number of useful methods for interacting with the user.

Writing output:

  • ui.write(*msg) - write a message to the standard output (the message arguments are concatenated). This should only be used if you really want to give the user no way of suppressing the output. ui.status (below) is usually better.

  • ui.status(*msg) - write a message at status level (shown unless --quiet is specified)

  • ui.note(*msg) - write a message at note level (shown if --verbose is specified)

  • ui.debug(*msg) - write a message at debug level (shown if --debug is specified)

  • ui.warn(*msg) - write a warning message to the error stream

  • ui.flush() - flush the output and error streams

Accepting input:

  • ui.prompt(msg, default="y") - prompt the user with MSG and read the response. If we are not in an interactive context, just return DEFAULT.

  • ui.promptchoice(prompt, default=0) - prompt the user with a message, read the response, and ensure it matches one of the provided choices. The prompt is formatted as follows:

    "would you like fries with that (Yn)? $$ &Yes $$ &No"

    The index of the choice is returned. Responses are case insensitive. If ui is not interactive, the default is returned.

  • ui.edit(text, user) - open an editor on a file containing TEXT. Return the edited text, with lines starting HG: removed. While the edit is in progress, the HGUSER environment variable is set to USER.
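The promptchoice format packs the message and the choices into one string separated by "$$", with "&" marking the response character of each choice. The following sketch shows one way to interpret such a string. The functions extract_choices and choose are hypothetical illustrations written for this example, not the ui object's own methods.

```python
def extract_choices(prompt):
    """Split 'message $$ &First $$ &Second' into the message and its choices."""
    parts = prompt.split("$$")
    message = parts[0].strip()
    choices = []
    for part in parts[1:]:
        label = part.strip()
        key = label[label.index("&") + 1].lower()  # character after '&'
        choices.append((key, label.replace("&", "", 1)))
    return message, choices


def choose(prompt, response, default=0):
    """Return the index of the matching choice, ignoring case."""
    _message, choices = extract_choices(prompt)
    if not response:
        return default  # empty response: fall back to the default index
    keys = [key for key, _label in choices]
    return keys.index(response.lower())  # raises ValueError if no choice matches


PROMPT = "would you like fries with that (Yn)? $$ &Yes $$ &No"
message, choices = extract_choices(PROMPT)
print(message, choices, choose(PROMPT, "N"))
```

Returning an index rather than the typed character keeps the calling code independent of how the choices are labeled.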

Useful values:

  • ui.geteditor() - the user’s preferred editor

  • ui.username() - the default username to be used in commits

  • ui.shortuser(user) - a short form of user name USER

  • ui.expandpath(loc, default=None) - the location of repository LOC (which may be relative to the CWD, or taken from the [paths] configuration section). If no other value can be found, DEFAULT is returned.

4.1. Collecting output

Output from a ui object usually goes to the standard output, sys.stdout. However, it is possible to "divert" all output and collect it for processing by your code. This involves the ui.pushbuffer() and ui.popbuffer() functions. At the start of the code whose output you want to collect, call ui.pushbuffer(). Then, when that code has finished, call ui.popbuffer(). The popbuffer() call returns all collected output as a string for you to process as you wish (and potentially pass on to ui.write() in some form, if you just want to edit the output and then send it on).

Here is a sample code snippet adapted from http://selenic.com/pipermail/mercurial/2010-February/030231.html:

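from mercurial import ui, hg, commands

# create a ui object and open the repository through it
u = ui.ui()
repo = hg.repository(u, "/path/to/repo")

u.pushbuffer()  # start diverting everything written through u
# command / function to call, for example:
commands.log(u, repo)
output = u.popbuffer()  # stop diverting and return the collected output

# with the API as shown here, the collected output is a plain string
assert isinstance(output, str)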

4.2. Reading configuration files

All relevant configuration values should be represented in the UI object — that is, global configuration (/etc/mercurial/hgrc), user configuration (~/.hgrc) and repository configuration (.hg/hgrc). You can easily read from these using the following methods on the ui object:

  • ui.config(section, name, default=None, untrusted=False) — gets a configuration value, or a default value if none is specified

  • ui.configbool(section, name, default=False, untrusted=False) — convert a config value to boolean (Mercurial accepts several different spellings, like True, false and 0)

  • ui.configlist(section, name, default=None, untrusted=False) — try to make a list from the requested config value. The elements are separated by comma or whitespace.

  • ui.configitems(section, untrusted=False) — return all configuration values in the given section
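The boolean and list conversions can be pictured with a short sketch. It assumes the commonly documented boolean spellings (1/yes/true/on and 0/no/false/off, case insensitive) and a plain comma-or-whitespace split. The real parser may accept more spellings and handle quoting, and the names parse_bool and parse_list are hypothetical.

```python
import re

# Assumed spellings; treat these sets as an illustration, not a specification.
_TRUE = {"1", "yes", "true", "on"}
_FALSE = {"0", "no", "false", "off"}


def parse_bool(value, default=False):
    """Convert a raw config string to a boolean, or return the default."""
    if value is None:
        return default
    lowered = value.strip().lower()
    if lowered in _TRUE:
        return True
    if lowered in _FALSE:
        return False
    raise ValueError("not a boolean: %r" % value)


def parse_list(value, default=None):
    """Split a raw config string on commas and/or whitespace."""
    if value is None:
        return [] if default is None else default
    return [item for item in re.split(r"[,\s]+", value.strip()) if item]


print(parse_bool("True"), parse_bool("0"), parse_list("a, b c,d"))
```

Raising on an unrecognized spelling, rather than silently returning the default, makes a typo in a configuration file visible to the user.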

Introduction

Mercurial is designed to offer a small, safe, and easy to use command set which is powerful enough for most users. Advanced users of Mercurial can be aided with the use of Mercurial extensions. Extensions allow the integration of powerful new features directly into the Mercurial core.

Features in extensions may not conform to Mercurial’s usual standards for safety, reliability, and ease of use.

Built-in help on extensions is available with 'hg help extensions'. To get help about an enabled extension, run 'hg help <extension-name>'.

Note that Mercurial explicitly does not provide a stable API for extension programmers, so it is up to their respective providers/maintainers to adapt them to API changes.

Free services

Following is a list of services that offer hosting at no cost including sites that additionally offer paid services. Among those web sites, most require proprietary JavaScript to register or even to display the content, including code from unrelated third parties (Google reCAPTCHA or similar). Puszcza and Savannah do not require proprietary software to use.

  • Mozdev: Provides free project hosting for Mozilla applications and extensions.

  • OSDN: A free-of-charge service for open source software developers. Some of the features: wiki, bug tracker, shell access and project web site hosting.

  • Puszcza: A hosting service for free (libre) software that uses the software. Operated by Sergey Poznyakoff.

  • Savannah: Free software hosting for people committed to free software, supports Mercurial and the repositories are accessible via hgweb. Uses the software.

  • SourceForge.net: Free hosting for open-source software. Supports Hg and many other project features (wiki, issues, mailing lists, forums, etc)


  • foss.heptapod.net: Free hosting for Free and Open Source Software. A standard Heptapod instance (see below), with shared CI runners and Bitbucket import. Project creation by a Hosting Request process.

Command documentation

As of Mercurial 4.1, here is the official documentation of the rebase command.

move changeset (and descendants) to a different branch

    Rebase uses repeated merging to graft changesets from one part of history
    (the source) onto another (the destination). This can be useful for
    linearizing *local* changes relative to a master development tree.

    Published commits cannot be rebased (see 'hg help phases'). To copy
    commits, see 'hg help graft'.

    If you don't specify a destination changeset ("-d/--dest"), rebase will
    use the same logic as 'hg merge' to pick a destination.  if the current
    branch contains exactly one other head, the other head is merged with by
    default.  Otherwise, an explicit revision with which to merge with must be
    provided.  (destination changeset is not modified by rebasing, but new
    changesets are added as its descendants.)

    Here are the ways to select changesets:

      1. Explicitly select them using "--rev".
      2. Use "--source" to select a root changeset and include all of its
         descendants.
      3. Use "--base" to select a changeset; rebase will find ancestors and
         their descendants which are not also ancestors of the destination.
      4. If you do not specify any of "--rev", "source", or "--base", rebase
         will use "--base ." as above.

    Rebase will destroy original changesets unless you use "--keep". It will
    also move your bookmarks (even if you do).

    Some changesets may be dropped if they do not contribute changes (e.g.
    merges from the destination branch).

    Unlike "merge", rebase will do nothing if you are at the branch tip of a
    named branch with two heads. You will need to explicitly specify source
    and/or destination.

    If you need to use a tool to automate merge/conflict decisions, you can
    specify one with "--tool", see 'hg help merge-tools'. As a caveat: the
    tool will not be used to mediate when a file was deleted, there is no hook
    presently available for this.

    If a rebase is interrupted to manually resolve a conflict, it can be
    continued with --continue/-c or aborted with --abort/-a.

    Returns 0 on success, 1 if nothing to rebase or there are unresolved
    conflicts.
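The difference between "--source" and "--base" in the help text above can be sketched on a small graph. This is a literal reading of the help text, not Mercurial's actual revset implementation. The graph and the functions select_source and select_base are hypothetical.

```python
def ancestors(parents, node):
    """Return node plus everything reachable through parent links."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def descendants(parents, node):
    """Return node plus every changeset that has node among its ancestors."""
    return {n for n in parents if node in ancestors(parents, n)}


def select_source(parents, source):
    """--source: the root changeset and all of its descendants."""
    return descendants(parents, source)


def select_base(parents, base, dest):
    """--base: ancestors of base that are not ancestors of dest, plus descendants."""
    roots = ancestors(parents, base) - ancestors(parents, dest)
    selected = set()
    for root in roots:
        selected |= descendants(parents, root)
    return selected


# Hypothetical graph: A-B-C-D-F on one line, E forking off from B.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "F": ["D"], "E": ["B"]}
by_source = sorted(select_source(parents, "D"))
by_base = sorted(select_base(parents, "D", "E"))
print(by_source, by_base)
```

With D as the source only D and F move, while with D as the base the whole side line C, D, F moves, because C is also an ancestor of D that is not shared with the destination E.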

Repositories

There are a number of different repository types, each defined with its own class name, in its own module. All repository types are subclasses of mercurial.repo.repository.

Protocol       Class Name
local          localrepository
http           httprepository
static-http    statichttprepository
ssh            sshrepository
bundle         bundlerepository

Repository objects should be created using module.instance(ui, path, create) where path is an appropriate path/URL to the repository, and create should be True if a new repository is to be created. You can also use the helper method hg.repository(), which selects the appropriate repository class based on the path or URL passed.

Repositories have many methods and attributes, but not all repository types support all of the various options.

Some key methods of (local) repositories:

  • repo[changeid] - a change context for the changeset changeid. changeid can be a descriptor like a changeset hash, revision number, 'tip', '.', branch name, tag, or anything else that can be resolved to a changeset hash.

  • repo[None] - a change context for the working directory

  • repo.changelog - the repository changelog

  • repo.root - the path of the repository root

  • repo.status() - returns a tuple of files modified, added, removed, deleted, unknown, ignored and clean in the current working directory


Tagging revisions

Use Case

Since you can now code separate features more easily, you might want to mark certain revisions as fit for consumption (or similar). For example you might want to mark releases, or just mark off revisions as reviewed.

For this Mercurial offers tags. Tags add a name to a revision and are part of the history. You can tag a change years after it was committed. The tag includes the information when it was added, and tags can be pulled, pushed and merged just like any other committed change.

Note:

A tag must not contain the character ":", since that character is used for specifying multiple revisions; see "hg help revisions".

Note:

To securely mark a revision, you can use the gpg extension to sign the tag.

Workflow

Let’s assume you want to give revision 3 the name "v0.1".

Add the tag

$ hg tag -r 3 v0.1

See all tags

$ hg tags

When you look at the log, you will now see a line in changeset 3 which marks the tag. If someone wants to update to the tagged revision, they can just use the name of your tag:

$ hg update v0.1

Now they will be at the tagged revision and can work from there.

Project/issue tracking

  • TracMercurial — Provides Mercurial integration for Trac

  • TargetProcess Mercurial Plugin — Mercurial integration for TargetProcess Agile Project Management software

  • Redmine — A flexible project management web application with built-in Mercurial support

  • EmForge — workflow-based project management solution has support for Mercurial repositories (see Mercurial Support for details)

  • InDefero — Clone of Google Code with Mercurial (also Git/Subversion) code browser, wiki, issue tracking and more

  • BugzillaExtension — Automatically updating comments of bugzilla bugs when there’s a reference to a bug id inside changesets

  • HgLab — a Mercurial source control management system for Windows with push and pull servers, repository browser and a whole slew of other goodies.

  • Deveo — a code hosting platform that supports Git, Subversion and Mercurial and has issue tracking and Wiki functionalities. Commits in repositories are automatically linked to issues in a given project.

