
Mailing List Archive: Python: Dev

Mercurial migration: progress report (PEP 385)

 

 



dirkjan at ochtman

Jul 2, 2009, 1:42 PM

Post #1 of 83 (1897 views)
Mercurial migration: progress report (PEP 385)

In response to some rumblings on python-committers and just to request
more feedback, a progress report. I know it's long; I've tried to
keep it concise and chunked, though.

- First of all, I've got the basic conversion down, I've done it a few
times now, with progressively better results. You can view some
results at http://hg.python.org/, which has a preliminary cpython
repository. *** The changeset hashes for that repo will change, so you
won't be able to commit or pull from it in the future.***

- Second of all, some planning. I've thought about it a bit, and I
think we should aim for going live with hg on August 1. Given that I'm
on vacation from 8-18 July (and I'm not sure whether I'll be able to
actually work on it during that time, though I imagine I'll be able to
spend some time on it at least), that's quite ambitious, so I'm going
to say it's okay if it slips by a few days. Putting a deadline out
there is a good thing, anyway.

- Third of all, to make this possible, it would be helpful if I got
more feedback on the PEP. Last time I raised it, there was virtually
nothing. This time, I'll include it inline so there's hopefully less
of a barrier to reviewing it.

- Fourth, Mercurial 1.3 was just released! Bet you didn't see that
coming. It's looking like a pretty good release, with an experimental
version of the much-coveted subrepository support (like
svn:externals). This also means that the latest version of
hgsubversion, the tool I used for the conversion, will be more
accessible for converting other projects. You know you want to!

- Fifth, here's a list of things, off the top of my head, that still need doing:

* Get agreement on branch strategy and branch processing (list of
branches + proposed handling at
http://hg.python.org/pymigr/file/tip/all-branches.txt) <--- PLEASE
REVIEW
* Get agreement on tag processing (first come up with a list)
* Set up hg-ssh infra (should be easy)
* Set up hooks (should be mostly straightforward)
* Set up roundup integration (should be made easier by quick revision
map hgweb extension)
* Write docs

- Sixth (this is the good part), less obvious things that have been
done or don't need doing:

* .hgignore generation (I've been convinced it's too hard, the current
version will do)
* revlog reordering (it's painless and a big win)

I'll get through all of these myself, but obviously any help would be
welcome. For any hg users, writing docs should be an easy start. For
others, please review the PEP (below), the branch map in
http://hg.python.org/pymigr/file/tip/all-branches.txt and the author
map at http://hg.python.org/pymigr/file/tip/author-map (not that much
has changed since the start, so if you've looked at it already, feel
free to skip this part). Right now I'm a little stuck on branch
processing, because it's a long running script that needs a bunch of
debugging, but I'll get going on that again.

I think that's all I can think of for now, I'll update the PEP with
new bits soon. Here it is, ready for your review:

==============================================================

Motivation

After having decided to switch to the Mercurial DVCS, the actual
migration still has to be performed. In the case of an important piece
of infrastructure like the version control system for a large,
distributed project like Python, this is a significant effort. This
PEP is an attempt to describe the steps that must be taken for further
discussion. It's somewhat similar to PEP 347 [1], which discussed the
migration to SVN.

To make the most of hg, I (Dirkjan) would like to make a high-fidelity
conversion, such that (a) as much of the svn metadata as possible is
retained, and (b) all metadata is converted to formats that are common
in Mercurial. This way, tools written for Mercurial can be optimally
used. In order to do this, I want to use the hgsubversion [2] software
to do an initial conversion. This hg extension is focused on providing
high-quality conversion from Subversion to Mercurial for use in
two-way correspondence, meaning it doesn't throw away as much
available metadata as other solutions.

Such a conversion also seems like a good time to reconsider the
contents of the repository and determine if some things are still
valuable. In this spirit, the following sections also propose
discarding some of the older metadata.

Timeline

TBD; needs fully working hgsubversion and consensus on this document.

Transition plan

Branch strategy

Mercurial has two basic ways of using branches: cloned branches, where
each branch is kept in a separate repository, and named branches,
where each revision keeps metadata to note on which branch it belongs.
The former makes it easier to distinguish branches, at the expense of
requiring more disk space on the client. The latter makes it a little
easier to switch between branches, but often has somewhat unintuitive
results for people (though this has been getting better in recent
versions of Mercurial).

I'm still a bit on the fence about whether Python should adopt cloned
branches or named branches. Since it usually makes more sense to tag
releases on the maintenance branch, for example, mainline history
would not contain release tags if we used cloned branches. Also,
Mercurial 1.2 and 1.3 have the necessary tools to make named branches
less painful (because they can be properly closed and closed heads are
no longer considered in relevant cases).

A disadvantage might be that the used clones will be a good bit larger
(since they essentially contain all other branches as well). This can
be mitigated by keeping non-release (feature) branches in separate
clones. Also note that it's still possible to clone a single named
branch from a combined clone, by specifying the branch as in hg clone
http://hg.python.org/main/#2.6-maint. Keeping the py3k history in a
separate clone probably also makes sense.

XXX To do: size comparison for selected separation scenarios.

Converting branches

There are quite a lot of branches in SVN's branches directory. I
propose to clean this up a bit by employing the following
strategy:

* Keep all release (maintenance) branches
* Discard branches that haven't been touched in 18 months, unless
someone indicates there's still interest in such a branch
* Keep branches that have been touched in the last 18 months,
unless someone indicates the branch can be deprecated
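
The three rules above can be sketched as a small decision function. This is only an illustration: the function name, its parameters, and the choice of 18 * 30 days as "18 months" are assumptions, not part of the migration scripts.

```python
from datetime import date, timedelta

# "18 months" approximated as 18 * 30 days; the exact cutoff is an assumption.
STALE_AFTER = timedelta(days=18 * 30)


def branch_action(is_release, last_touched, today,
                  keep_requested=False, deprecated=False):
    """Return 'keep' or 'discard' for one SVN branch (hypothetical helper)."""
    if is_release:
        # Release (maintenance) branches are always kept.
        return "keep"
    if (today - last_touched) > STALE_AFTER:
        # Stale branches go away unless someone asked to keep them.
        return "keep" if keep_requested else "discard"
    # Recently touched branches stay unless someone deprecated them.
    return "discard" if deprecated else "keep"
```

For example, a feature branch last touched in January 2007 would be discarded as of July 2009 unless somebody speaks up for it.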

Converting tags

The SVN tags directory contains a lot of old stuff. Some of these are
not, in fact, full tags, but contain only a smaller subset of the
repository. I think we should keep all release tags, and consider
other tags for inclusion based on requests from the developer
community. I'd like to consider unifying the release tag naming scheme
to make some things more consistent, if people feel that won't create
too many problems. For example, Mercurial itself just uses '1.2.1' as
a tag, where CPython would currently use r121.
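
If the tag names were unified, the renaming could look roughly like the sketch below. It assumes release tags have a single-digit major and minor version (as in r121 or r262), optionally followed by a micro version and an a/b/c/rc suffix; tags that don't fit are rejected rather than guessed at. The function name is made up for this example.

```python
import re

# Assumes single-digit major and minor numbers, as in r121 or r262.
_SVN_TAG = re.compile(r"^r(\d)(\d)(\d*)((?:a|b|c|rc)\d+)?$")


def dotted_tag(svn_tag):
    """Convert an SVN release tag such as 'r262' to '2.6.2'."""
    match = _SVN_TAG.match(svn_tag)
    if match is None:
        raise ValueError("not a recognized release tag: %r" % svn_tag)
    major, minor, micro, suffix = match.groups()
    parts = [major, minor]
    if micro:
        parts.append(micro)
    return ".".join(parts) + (suffix or "")
```

Tags that aren't release tags raise ValueError, which fits the idea of considering other tags case by case.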

Author map

In order to provide user names the way they are common in hg (in the
'First Last <user [at] example>' format), we need an author map to map
cvs and svn user names to real names and their email addresses. I have
a complete version of such a map in my migration tools repository [3].
The email addresses in it might be out of date; that's bound to
happen, although it would be nice to try and have as many people as
possible review it for addresses that are out of date. The current
version also still seems to contain some encoding problems.
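
As a rough sketch of how such a map gets applied, the snippet below parses lines of the form "username = First Last <email>". That line format, the '#' comment handling, and the function names are assumptions for illustration; the actual author-map file may differ.

```python
def parse_author_map(text):
    """Parse 'svnname = First Last <email>' lines into a dict."""
    authors = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        username, sep, identity = line.partition("=")
        if not sep:
            raise ValueError("malformed author map line: %r" % raw)
        authors[username.strip()] = identity.strip()
    return authors


def hg_author(svn_username, authors):
    """Look up the hg-style author, falling back to the bare username."""
    return authors.get(svn_username, svn_username)
```

Falling back to the bare username keeps the conversion going when an entry is missing, at the cost of a less useful author field.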

Generating .hgignore

The .hgignore file can be used in Mercurial repositories to help
ignore files that are not eligible for version control. It does this
by employing several possible forms of pattern matching. The current
Python repository already includes a rudimentary .hgignore file to
help with using the hg mirrors.

It might be useful to have the .hgignore be generated automatically
from svn:ignore properties. This would make sure all historic
revisions also have useful ignore information (though one could argue
ignoring isn't really relevant to just checking out an old revision).
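
To give an idea of what automatic generation would involve, here is a minimal sketch. It assumes the svn:ignore values have already been collected into a dict keyed by directory, handles only the '*' and '?' wildcards, and emits rooted regexps because svn:ignore applies to a single directory only. None of this is what hgsubversion does; the names are invented for the example.

```python
import re


def _glob_to_regex(pattern):
    """Translate '*' and '?' only; every other character matches literally."""
    out = []
    for ch in pattern:
        if ch == "*":
            out.append("[^/]*")
        elif ch == "?":
            out.append("[^/]")
        else:
            out.append(re.escape(ch))
    return "".join(out)


def hgignore_from_svn_ignore(props):
    """Build .hgignore text from {directory: svn:ignore value}.

    The empty string stands for the repository root.
    """
    lines = ["syntax: regexp"]
    for directory in sorted(props):
        prefix = re.escape(directory.strip("/"))
        if prefix:
            prefix += "/"
        # svn:ignore holds one pattern per line; patterns with spaces
        # are not handled in this sketch.
        for pattern in props[directory].split():
            lines.append("^%s%s$" % (prefix, _glob_to_regex(pattern)))
    return "\n".join(lines) + "\n"
```

Doing this for every historic revision is where the real difficulty lies, since the properties change over time.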

Revlog reordering

As an optional optimization technique, we should consider trying a
reordering pass on the revlogs (internal Mercurial files) resulting
from the conversion. In some cases this results in dramatic decreases
in on-disk repository size.

Other repositories

Richard Tew has indicated that he'd like the Stackless repository to
also be converted. What other projects in the svn.python.org
repository should be converted? Do we want to convert the peps
repository? distutils? others?

Infrastructure

hg-ssh

Developers should access the repositories through ssh, similar to the
current setup. Public keys can be used to grant people access to a
shared hg@ account. A hgwebdir instance should also be set up for easy
browsing and read-only access. If we're using ssh, developers should
trivially be able to start new clones (for longer-term features that
profit from a separate branch).

Hooks

A number of hooks are currently in use. The hg equivalents for these
should be developed and deployed. The following hooks are being used:

* check whitespace: a hook to reject commits in case the
whitespace doesn't match the rules for the Python codebase. Should be
straightforward to re-implement from the current version. We can also
offer a whitespace hook for use with client-side repositories that
people can use; it could either warn about whitespace issues and/or
truncate trailing whitespace from changed lines. Open issue: do we
check only the tip after each push, or do we check every commit in a
changegroup?
* commit mails: we can leverage the notify extension for this
* buildbots: both the regular and the community build masters must
be notified. Fortunately buildbot includes support for hg. I've also
implemented this for Mercurial itself, so I don't expect problems
here.
* check contributors: in the current setup, all changesets bear
the username of committers, who must have signed the contributor
agreement. In a DVCS, the committers are not necessarily the same
people who push, and so we can't check if the committer is a
contributor. We could use a hook to check if the committer is a
contributor if we keep a list of registered contributors.
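
The core of a whitespace check could look something like the sketch below. It only covers the file-content test, not the wiring into Mercurial's hook machinery, and it assumes the rules amount to "no trailing whitespace, no tabs in indentation"; the real rules are whatever the current SVN hook enforces. Function names are made up.

```python
def whitespace_problems(text):
    """Return (line_number, message) pairs for a file's contents."""
    problems = []
    for number, line in enumerate(text.split("\n"), start=1):
        if line != line.rstrip(" \t"):
            problems.append((number, "trailing whitespace"))
        # Leading run of spaces and tabs on this line.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            problems.append((number, "tab in indentation"))
    return problems


def strip_trailing_whitespace(text):
    """Client-side variant: truncate trailing whitespace on every line."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))
```

A server-side hook would reject the push if whitespace_problems() returns anything; a client-side hook could warn or call strip_trailing_whitespace(). Note that this sketch strips every line rather than only changed lines.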

hgwebdir

A more or less stock hgwebdir installation should be set up. We might
want to come up with a style to match the Python website. It may also
be useful to build a quick extension to augment the URL rev parser so
that it can also take r[0-9]+ args and come up with the matching hg
revision.
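
The lookup itself is simple once a revision map exists. The sketch below assumes the map is available as a dict from SVN revision number to hg hash; how that map is stored, and how this plugs into hgweb, is left open. The function name is hypothetical.

```python
import re

_SVN_REV = re.compile(r"^r([0-9]+)$")


def resolve_revision(arg, svn_to_hg):
    """Map 'r71600'-style arguments to hg hashes; pass others through."""
    match = _SVN_REV.match(arg)
    if match is None:
        # Not an SVN-style revision; let hg interpret it as usual.
        return arg
    svn_rev = int(match.group(1))
    try:
        return svn_to_hg[svn_rev]
    except KeyError:
        raise LookupError("no hg changeset for svn revision %d" % svn_rev)
```

Anything that doesn't look like r[0-9]+ is handed back untouched, so ordinary hashes, tags and branch names keep working.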

After migration

Where to get code

It needs to be decided where the hg repositories will live. I'd like
to propose to keep the hgwebdir instance at hg.python.org. This is an
accepted standard for many organizations, and an easy parallel to
svn.python.org. The 2.7 (trunk) repo might live at
http://hg.python.org/main/, for example, with py3k at
http://hg.python.org/py3k/. For write access, developers will have to
use ssh, which could be ssh://hg [at] hg/main/. A demo
installation will be set up with a preliminary conversion so people
can experiment and review; it can live at
http://hg.python.org/example/.

code.python.org was also proposed as the hostname. Personally, I think
that using the VCS name in the hostname is good because it prevents
confusion: it should be clear that you can't use svn or bzr for
hg.python.org.

hgwebdir can already provide tarballs for every changeset. I think
this obviates the need for daily snapshots; we can just point users to
tip.tar.gz instead, meaning they will get the latest. If desired, we
could even use buildbot results to point to the last good changeset.

Python-specific documentation

hg comes with good built-in documentation (available through hg help)
and a wiki [4] that's full of useful information and recipes. In
addition to that, the parts of the developer FAQ [5] concerning
version control will gain a section on using hg for Python
development. Some of the text will be dependent on the outcome of
debate about this PEP (for example, the branching strategy).

Think first, commit later?

In recent history, old versions of Python have been maintained by a
select group of people backporting patches from trunk to release
branches. While this may not scale so well as the development pace
grows, it also runs into some problems with the current crop of
distributed versioning tools. These tools (I believe similar problems
would exist for either git, bzr, or hg, though some may cope better
than others) are based on the idea of a Directed Acyclic Graph (or
DAG), meaning they keep track of relations of changesets.

Mercurial itself has a stable branch which is a ''strict'' subset of
the unstable branch. This means that generally all fixes for the
stable branch get committed against the tip of the stable branch, then
they get merged into the unstable branch (which already contains the
parent of the new cset). This provides a largely frictionless
environment for moving changes from stable to unstable branches.
Mistakes, where a change that should go on stable goes on unstable
first, do happen, but they're usually easy to fix. That can be done by
copying the change over to the stable branch, then trivial-merging
with unstable (meaning the merge in fact ignores the parent from the
stable branch).

This strategy means a little more work for regular committers, because
they have to think about whether their change should go on stable or
unstable; they may even have to ask someone else (the RM) before
committing. But it also relieves a dedicated group of committers of
regular backporting duty, in addition to making it easier to work with
the tool.

Now would be a good time to consider changing strategies in this
regard, although it would be relatively easy to switch to such a model
later on.

The future of Subversion

What happens to the Subversion repositories after the migration? Since
the svn server contains a bunch of repositories, not just the CPython
one, it will probably live on for a bit as not every project may want
to migrate or it takes longer for other projects to migrate. To
prevent people from staying behind, we may want to remove migrated
projects from the repository.

Build identification

Python currently provides the sys.subversion tuple to allow Python
code to find out exactly what version of Python it's running against.
The current version looks something like this:

* ('CPython', 'tags/r262', '71600')
* ('CPython', 'trunk', '73128M')

Another value is returned from Py_GetBuildInfo() in the C API, and
available to Python code as part of sys.version:

* 'r262:71600, Jun 2 2009, 09:58:33'
* 'trunk:73128M, Jun 2 2009, 01:24:14'

I propose that the revision identifier will be the short version of
hg's revision hash, for example 'dd3ebf81af43', augmented with '+'
(instead of 'M') if the working directory from which it was built was
modified. This mirrors the output of the hg id command, which is
intended for this kind of usage.

For the tag/branch identifier, I propose that hg will check for tags
on the currently checked out revision, use the tag if there is one
('tip' doesn't count), and use the branch name otherwise.
sys.subversion becomes

* ('CPython', '2.6.2', 'dd3ebf81af43')
* ('CPython', 'default', 'af694c6a888c+')

and the build info string becomes

* '2.6.2:dd3ebf81af43, Jun 2 2009, 09:58:33'
* 'default:af694c6a888c+, Jun 2 2009, 01:24:14'

This reflects that the default branch in hg is called 'default'
instead of Subversion's 'trunk', and reflects the proposed new tag
format.
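
The proposed rules can be written down as a short sketch. It assumes the build process has already obtained the changeset hash, branch name, tag list and a "modified" flag from hg; the function names and parameters are invented for illustration.

```python
def build_identifier(node, branch, tags, modified):
    """Return (tag_or_branch, revision) following the rules proposed above."""
    # 'tip' doesn't count as a tag.
    real_tags = [tag for tag in tags if tag != "tip"]
    name = real_tags[0] if real_tags else branch
    # Short hash (12 hex digits), with '+' for a modified working directory.
    revision = node[:12] + ("+" if modified else "")
    return name, revision


def build_info(node, branch, tags, modified, build_date):
    """Return the build info string, e.g. '2.6.2:dd3ebf81af43, <date>'."""
    name, revision = build_identifier(node, branch, tags, modified)
    return "%s:%s, %s" % (name, revision, build_date)
```

With the values from the examples above, an untagged, modified checkout of the default branch yields ('default', 'af694c6a888c+').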

References
[1] http://www.python.org/dev/peps/pep-0347/
[2] http://bitbucket.org/durin42/hgsubversion/
[3] http://hg.xavamedia.nl/cpython/pymigr/
[4] http://www.selenic.com/mercurial/wiki/
[5] http://www.python.org/dev/faq/#version-control

=====================================================

Cheers,

Dirkjan
_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


skippy.hammond at gmail

Jul 2, 2009, 4:01 PM

Post #2 of 83 (1832 views)
Re: Mercurial migration: progress report (PEP 385) [In reply to]

On 3/07/2009 6:42 AM, Dirkjan Ochtman wrote:
> In response to some rumblings on python-committers and just to request
> more feedback, a progress report. I know it's long; I've tried to
> keep it concise and chunked, though.

Although this has come up in the past, I don't recall a resolution.

What is your plan to handle svn:eol-style? We have some files in the
tree which need that support and it isn't clear to me how that would
work with the existing win32text extension provided with current
mercurial releases. (I've an outstanding patch to hg which should
address some of these issues, but without the 'rules' being versioned I
fear that would still fall short.)

Even more generally, how will you suggest Windows users work? Will
local files, in general, have windows line endings or unix? If the
latter, will there be hooks in-place to prevent editors on Windows
'accidentally' mixing eol styles? If so, this cycles back to the first
question - how would we know which files get treated that way?

Thanks,

Mark


benjamin at python

Jul 2, 2009, 4:26 PM

Post #3 of 83 (1834 views)
Re: Mercurial migration: progress report (PEP 385) [In reply to]

2009/7/2 Dirkjan Ochtman <dirkjan [at] ochtman>:
> In response to some rumblings on python-committers and just to request
> more feedback, a progress report. I know it's long; I've tried to
> keep it concise and chunked, though.

Thanks very much for working on this, Dirkjan. It may seem rather
thankless now, but I'm sure once we switch to Mercurial, the praise will
flow. :)

>
> - Second of all, some planning. I've thought about it a bit, and I
> think we should aim for going live with hg on August 1. Given that I'm
> on vacation from 8-18 July (and I'm not sure whether I'll be able to
> actually work on it during that time, though I imagine I'll be able to
> spend some time on it at least), that's quite ambitious, so I'm going
> to say it's okay if it slips by a few days. Putting a deadline out
> there is a good thing, anyway.

Sounds good.


> - Fifth, here's a list of things, off the top of my head, that still need doing:
>
> * Get agreement on branch strategy and branch processing (list of
> branches + proposed handling at
> http://hg.python.org/pymigr/file/tip/all-branches.txt) <--- PLEASE
> REVIEW

The io-c branch doesn't need to stay.

> * Get agreement on tag processing (first come up with a list)
> * Set up hg-ssh infra (should be easy)
> * Set up hooks (should be mostly straightforward)
> * Set up roundup integration (should be made easier by quick revision
> map hgweb extension)
> * Write docs
>
> - Sixth (this is the good part), less obvious things that have been
> done or don't need doing:

I suppose this includes modifying sys.subversion as described in the PEP?

> The SVN tags directory contains a lot of old stuff. Some of these are
> not, in fact, full tags, but contain only a smaller subset of the
> repository. I think we should keep all release tags, and consider
> other tags for inclusion based on requests from the developer
> community. I'd like to consider unifying the release tag naming scheme
> to make some things more consistent, if people feel that won't create
> too many problems. For example, Mercurial itself just uses '1.2.1' as
> a tag, where CPython would currently use r121.

+1 to unifying tag name style to the current cpython procedure.

> Author map
>
> In order to provide user names the way they are common in hg (in the
> 'First Last <user [at] example>' format), we need an author map to map
> cvs and svn user names to real names and their email addresses. I have
> a complete version of such a map in my migration tools repository [3].
> The email addresses in it might be out of date; that's bound to
> happen, although it would be nice to try and have as many people as
> possible review it for addresses that are out of date. The current
> version also still seems to contain some encoding problems.
> Generating .hgignore

What effect will the encoding problems have? Does hg require ASCII characters?

> Richard Tew has indicated that he'd like the Stackless repository to
> also be converted. What other projects in the svn.python.org
> repository should be converted? Do we want to convert the peps
> repository? distutils? others?

I think everything should be converted unless there's a reason not to
(such as the maintainer indicating she doesn't want to migrate).


> A number of hooks is currently in use. The hg equivalents for these
> should be developed and deployed. The following hooks are being used:
>
>    * check whitespace: a hook to reject commits in case the
> whitespace doesn't match the rules for the Python codebase. Should be
> straightforward to re-implement from the current version. We can also
> offer a whitespace hook for use with client-side repositories that
> people can use; it could either warn about whitespace issues and/or
> truncate trailing whitespace from changed lines. Open issue: do we
> check only the tip after each push, or do we check every commit in a
> changegroup?

It might as well be on every commit because it will have to be normalized
on push anyway.

> code.python.org was also proposed as the hostname. Personally, I think
> that using the VCS name in the hostname is good because it prevents
> confusion: it should be clear that you can't use svn or bzr for
> hg.python.org.

+1 for hg.python.org

>
> Think first, commit later?
>
> In recent history, old versions of Python have been maintained by a
> select group of people backporting patches from trunk to release
> branches. While this may not scale so well as the development pace
> grows, it also runs into some problems with the current crop of
> distributed versioning tools. These tools (I believe similar problems
> would exist for either git, bzr, or hg, though some may cope better
> than others) are based on the idea of a Directed Acyclic Graph (or
> DAG), meaning they keep track of relations of changesets.

The problem is that Python is much more complicated than the average
project. We have many commits that are only applicable to one maintenance
branch, or just 2.x, or just 3.x; the trunk and py3k will never be
subsets of each other. Regardless of where we make commits initially,
we need the ability to manage special cases like this easily.



--
Regards,
Benjamin


dirkjan at ochtman

Jul 3, 2009, 4:28 AM

Post #4 of 83 (1832 views)
Re: Mercurial migration: progress report (PEP 385) [In reply to]

On Fri, Jul 3, 2009 at 01:01, Mark Hammond<skippy.hammond [at] gmail> wrote:
> Although this has come up in the past, I don't recall a resolution.
>
> What is your plan to handle svn:eol-style?  We have some files in the tree
> which need that support and it isn't clear to me how that would work with
> the existing win32text extension provided with current mercurial releases.
>  (I've an outstanding patch to hg which should address some of these issues,
> but without the 'rules' being versioned I fear that would still fall short.)

What files would need what? Are there any files that really need to be
\r\n on Windows and \n on Unix (and possibly \r on Mac)? I remember
one file was discussed separately, but I think the outcome there was
that it could just always be \r\n (since it wasn't used at all on
non-Windows platforms). Anyway, knowing specific requirements (or
where to find them) would help here.

> Even more generally, how will you suggest Windows users work?  Will local
> files, in general, have windows line endings or unix?  If the latter, will
> there be hooks in-place to prevent editors on Windows 'accidently' mixing
> eol styles?  If so, this cycles back to the first question - how would we
> know which files get treated that way?

There will be a server-side hook to check whitespace. People will also
be able to install it for commit-time.

I think just using \n everywhere is a good default (though I almost
always use a Windows client machine, I do nearly all of my
development through a terminal on several Linux boxen), except where
it isn't. People who want to can set up the win32text stuff to get
\r\n on Windows if they feel they need that -- we can provide
information about that in the dev FAQ (although it would be nice if
someone else who was more familiar with it -- like yourself! -- would
write it).
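
For what it's worth, the check such a hook would need is small. The sketch below classifies the line endings in a file's raw bytes and flags anything other than the expected style; treating \n as the expected style follows the default suggested above, while the function names and the idea of a per-file "expected" argument are assumptions.

```python
def eol_styles(data):
    """Return the set of line-ending styles found in a bytes object."""
    styles = set()
    crlf = data.count(b"\r\n")
    if crlf:
        styles.add("CRLF")
    # Any \n or \r beyond those inside \r\n pairs is a bare LF or CR.
    if data.count(b"\n") > crlf:
        styles.add("LF")
    if data.count(b"\r") > crlf:
        styles.add("CR")
    return styles


def has_unexpected_eols(data, expected="LF"):
    """True if the data contains any line ending other than `expected`."""
    return bool(eol_styles(data) - {expected})
```

Files that really must be \r\n could be checked with expected="CRLF"; which files those are is exactly the list of requirements asked for above.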

Cheers,

Dirkjan


stephen at xemacs

Jul 3, 2009, 6:29 AM

Post #5 of 83 (1820 views)
Mercurial migration: progress report (PEP 385) [In reply to]

Dirkjan Ochtman writes:

> Mercurial has two basic ways of using branches: cloned branches, where
> each branch is kept in a separate repository, and named branches,
> where each revision keeps metadata to note on which branch it belongs.
> The former makes it easier to distinguish branches, at the expense of
> requiring more disk space on the client. The latter makes it a little
> easier to switch between branches, but often has somewhat unintuitive
> results for people (though this has been getting better in recent
> versions of Mercurial).

I'll have to try them again now that 1.3 is out, but I found Mercurial
named branches fundamentally unsuited to release management.

> I'm still a bit on the fence about whether Python should adopt cloned
> branches or named branches. Since it usually makes more sense to tag
> releases on the maintenance branch, for example, mainline history
> would not contain release tags if we used cloned branches.

Ditto named branches. The problem is that (unless the internal
implementation has changed very recently) a Mercurial revision can be
on exactly one named branch (or on the trunk).

> A disadvantage might be that the used clones will be a good bit larger
> (since they essentially contain all other branches as well). This can
> be mitigated by keeping non-release (feature) branches in separate
> clones.

Which defeats the purpose of having named branches, really. (I mean
the version control purpose; obviously it still can save disk space.)

> Also note that it's still possible to clone a single named
> branch from a combined clone, by specifying the branch as in hg clone
> http://hg.python.org/main/#2.6-maint.

Unless you're really short on space, though, that's not a big deal.
What would be more important to me (not that I matter for the purpose
of Python, but in XEmacs -- also a Mercurial shop -- I do :-) would be
the other way around: pulling an external branch into a named branch.
I have a feeling that working with such a repository with others would
be a little difficult.

> too many problems. For example, Mercurial itself just uses '1.2.1' as
> a tag, where CPython would currently use r121.

Stick with the CPython notation. At XEmacs, continuity of tags has
made our beta testers happy. (Well, the two of them who bothered to
mention it, anyway. :-)

> code.python.org was also proposed as the hostname. Personally, I think
> that using the VCS name in the hostname is good because it prevents
> confusion: it should be clear that you can't use svn or bzr for
> hg.python.org.

Agreed, although "can't" is a little too strong. It might work (there
are a lot of places where http://ftp.example.com works just fine, for
example), but we don't want people to expect it to, and
"http://REPOHOST.python.org/" should take your browser or your client
to the official repo (which will be the hg repo), not to some index of
repos that happen to live on the same host.

> Mercurial itself has a stable branch which is a ''strict'' subset of
> the unstable branch.

As others (MvL, I think) have commented, this isn't really relevant to
Python which generally has four mainlines going at once. I don't see
why the requirements are going to change with the shift to hg, and I
see no reason why hg won't handle the existing workflow just fine.

Note that PEP 374 was written on the assumption that the existing
workflow will *not* change until the committers have gotten used to
Mercurial, and then it will change in the natural way. I.e., someone
will say, "you know, this bit of red tape isn't needed any more if we
do X, so let's do X", and after a cascade of 100*"+1 " we do X. :-)

Other than that, looks to be shaping up well. (Note: I don't have any
comments on subversion-specific aspects, as XEmacs went directly from
CVS to Mercurial).


mhammond at skippinet

Jul 3, 2009, 6:31 AM

Post #6 of 83 (1832 views)
Re: Mercurial migration: progress report (PEP 385) [In reply to]

On 3/07/2009 9:28 PM, Dirkjan Ochtman wrote:
> On Fri, Jul 3, 2009 at 01:01, Mark Hammond<skippy.hammond [at] gmail> wrote:
>
>> Although this has come up in the past, I don't recall a resolution.
>>
>> What is your plan to handle svn:eol-style? We have some files in the tree
>> which need that support and it isn't clear to me how that would work with
>> the existing win32text extension provided with current mercurial releases.
>> (I've an outstanding patch to hg which should address some of these issues,
>> but without the 'rules' being versioned I fear that would still fall short.)
>>
>
> What files would need what? Are there any files that really need to be
> \r\n on Windows and \n on Unix (and possibly \r on Mac)? I remember
> one file was discussed separately, but I think the outcome there was
> that it could just always be \r\n (since it wasn't used at all on
> non-Windows platforms). Anyway, knowing specific requirements (or
> where to find them) would help here.
>
>
>> Even more generally, how will you suggest Windows users work? Will local
>> files, in general, have windows line endings or unix? If the latter, will
>> there be hooks in-place to prevent editors on Windows 'accidently' mixing
>> eol styles? If so, this cycles back to the first question - how would we
>> know which files get treated that way?
>>
>
> There will be a server-side hook to check whitespace. People will also
> be able to install it for commit-time.
>
> I think just using \n by default everywhere is a good default (though
> I almost always use Windows client machine, I do all nearly all of my
> development through a terminal on several Linux boxen), except where
> it isn't.
So we must work without effective EOL support? I fear we will end up
like the mozilla hg repo with some files in windows line endings and
some with linux. While my editing tools are good enough to preserve
existing EOL styles, I've found myself accidentally checking in new \r\n
terminated files in a repo which otherwise uses \n line endings. IMO,
SVN's EOL support was better than no EOL support.

> People who want to can set up the win32text stuff to get
> \r\n on Windows if they feel they need that -- we can provide
> information about that in the dev FAQ (although it would be nice if
> someone else who was more familiar with it -- like yourself! -- would
> write it).
>
This is exactly why I was asking for your advice - I can't work out how
to work effectively with win32text as it stands myself, so remain stuck
accidentally checking in files with inappropriate line endings and stuck
working out how to move pywin32's CVS repo without abandoning the very
primitive EOL safety it offers...

Cheers,

Mark


dirkjan at ochtman

Jul 3, 2009, 6:43 AM

Post #7 of 83 (1820 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

On Fri, Jul 3, 2009 at 15:31, Mark Hammond<mhammond [at] skippinet> wrote:
> So we must work without effective EOL support?  I fear we will end up like
> the mozilla hg repo with some files in windows line endings and some with
> linux.  While my editing tools are good enough to preserve existing EOL
> styles, I've found myself accidentally checking in new \r\n terminated files
> in a repo which otherwise uses \n line endings.  IMO, SVN's EOL support was
> better than no EOL support.

This is why we'll have hooks -- at the very least to prevent you from
pushing changesets with inappropriate line endings, and, if you're
willing to do a little bit of extra work, to prevent you from
committing them.

> This is exactly why I was asking for your advice - I can't work out how to
> work effectively with win32text as it stands myself, so remain stuck
> accidentally checking in files with inappropriate line endings and stuck
> working out how to move pywin32's CVS repo without abandoning the very
> primitive EOL safety it offers...

As long as the difference between \r\n- and \n-based files is clear
and can be reasoned about, I don't see why having some of both (I'm
assuming an overwhelming majority will have one, and only a few the
other) is a big problem. But feel free to enlighten me!

Cheers,

Dirkjan


dirkjan at ochtman

Jul 3, 2009, 6:49 AM

Post #8 of 83 (1818 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

On Fri, Jul 3, 2009 at 15:29, Stephen J. Turnbull<stephen [at] xemacs> wrote:
> I'll have to try them again now that 1.3 is out, but I found Mercurial
> named branches fundamentally unsuited to release management.

Can you explain why, please? It's not clear from what you say below.

> Ditto named branches.  The problem is that (unless the internal
> implementation has changed very recently) a Mercurial revision can be
> on exactly one named branch (or on the trunk).

That's still true.

> Which defeats the purpose of having named branches, really.  (I mean
> the version control purpose; obviously it still can save disk space.)

Why does it defeat the purpose? What, in your opinion, is the purpose?

> Unless you're really short on space, though, that's not a big deal.
> What would be more important to me (not that I matter for the purpose
> of Python, but in XEmacs -- also a Mercurial shop -- I do :-) would be
> the other way around: pulling an external branch into a named branch.
> I have a feeling that working with such a repository with others would
> be a little difficult.

Can you give an example?

> Stick with the CPython notation.  At XEmacs, continuity of tags has
> made our beta testers happy.  (Well, the two of them who bothered to
> mention it, anyway. :-)

Right; Benjamin also mentioned that processing the tags just to be
consistent with the recent tagging scheme would probably be the best
solution.

> As others (MvL, I think) have commented, this isn't really relevant to
> Python which generally has four mainlines going at once.  I don't see
> why the requirements are going to change with the shift to hg, and I
> see no reason why hg won't handle the existing workflow just fine.

It will handle it, for sure, but I think it would all go easier if we
could work with stricter subset branches (and leave the effective
cherrypicking for the occasional problem).

Cheers,

Dirkjan


ncoghlan at gmail

Jul 3, 2009, 7:04 AM

Post #9 of 83 (1832 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

Mark Hammond wrote:
> So we must work without effective EOL support? I fear we will end up
> like the mozilla hg repo with some files in windows line endings and
> some with linux. While my editing tools are good enough to preserve
> existing EOL styles, I've found myself accidentally checking in new \r\n
> terminated files in a repo which otherwise uses \n line endings. IMO,
> SVN's EOL support was better than no EOL support.

If Mercurial doesn't have automatic whitespace conversion along the lines
of svn:eol-style, then the existing white-space checking script probably
needs to be updated to enforce the appropriate line endings for all files.

If we default to Unix line endings for most files, then the checking
script can be made aware of which files should always have Windows line
endings (I believe the various Visual Studio files need them, batch
files are probably best left with Windows line endings in the
repository, and I expect there are other files in PC and PCbuild that
need them as well).
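
A minimal sketch of such a check, assuming a per-extension policy. The
pattern list and the function names are hypothetical, not taken from the
existing whitespace script, and the real list of CRLF files would have to
be agreed on first:

```python
import fnmatch

# Hypothetical policy: files matching these patterns must use \r\n,
# everything else must use \n. The list is illustrative only.
CRLF_PATTERNS = ["*.bat", "*.sln", "*.vcproj", "*.vsprops"]


def expected_eol(path):
    """Return the line ending a file should use under this policy."""
    name = path.rsplit("/", 1)[-1].lower()
    if any(fnmatch.fnmatchcase(name, pat) for pat in CRLF_PATTERNS):
        return b"\r\n"
    return b"\n"


def eol_problems(path, data):
    """Return the 1-based numbers of lines whose ending breaks the policy."""
    want = expected_eol(path)
    bad = []
    for number, line in enumerate(data.splitlines(keepends=True), start=1):
        if line.endswith(b"\r\n"):
            found = b"\r\n"
        elif line.endswith(b"\n"):
            found = b"\n"
        elif line.endswith(b"\r"):
            found = b"\r"
        else:
            continue  # final line without a terminator
        if found != want:
            bad.append(number)
    return bad
```

A hook could reject any changeset for which eol_problems() returns a
non-empty list for some file, which would also catch files with mixed
line endings.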

However, I expect that would still be painful to work with for Windows
developers, even if it prevented any line ending problems from making
their way into the main repository. I believe that is where the
win32text extensions can help. Looking at the Wiki page for win32text
[1], I believe it would be a matter of configuring the extension to
encode and decode all files with the extensions:

*.py
*.pyw
*.h
*.c
*.in
*.rst
*.asdl

That said, I don't see a way to tell win32text to also translate files
which don't have an extension at all (e.g. NEWS or ACKS), and there
doesn't seem to be a way to tell it to *skip* files matching certain
patterns (if there was, we could just tell it to ignore extensions like
.bat, .sln, .vcproj, .vsprops, .dsp, .dsw, .wse, .ico, .bmp and convert
everything else)
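
To make the "skip list" idea concrete, here is a rough sketch of the
selection rule in plain Python. This is not something win32text offers;
the function name is made up, and the pattern list is only a subset of
the extensions mentioned above:

```python
import fnmatch
import posixpath

# Illustrative subset of extensions that should be left untouched.
SKIP_PATTERNS = ["*.bat", "*.sln", "*.vcproj", "*.vsprops",
                 "*.dsw", "*.ico", "*.bmp"]


def should_translate(path):
    """True if a file's line endings should be converted on Windows."""
    name = posixpath.basename(path).lower()
    return not any(fnmatch.fnmatchcase(name, pat) for pat in SKIP_PATTERNS)
```

Because the rule excludes rather than includes, files without an
extension (NEWS, ACKS) are translated without needing a pattern of their
own.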

Cheers,
Nick.

[1] http://mercurial.selenic.com/wiki/Win32TextExtension

--
Nick Coghlan | ncoghlan [at] gmail | Brisbane, Australia
---------------------------------------------------------------


g.brandl at gmx

Jul 3, 2009, 8:17 AM

Post #10 of 83 (1828 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

Dirkjan Ochtman schrieb:

> - Fifth, here's a list of things, off the top of my head, that still need doing:
>
> * Get agreement on branch strategy and branch processing (list of
> branches + proposed handling at
> http://hg.python.org/pymigr/file/tip/all-branches.txt) <--- PLEASE
> REVIEW

Do you have a key to the second column in that file? E.g. the difference
between "strip" and "discard" is not clear to me. "strip partial"?

Why are there branch names starting with "../"?

[PEP 385]

> ==============================================================
>
> Motivation
>
> After having decided to switch to the Mercurial DVCS, the actual
> migration still has to be performed. In the case of an important piece
> of infrastructure like the version control system for a large,
> distributed project like Python, this is a significant effort. This
> PEP is an attempt to describe the steps that must be taken for further
> discussion. It's somewhat similar to PEP 347 [1], which discussed the
> migration to SVN.
>
> To make the most of hg, I (Dirkjan) would like to make a high-fidelity
> conversion, such that (a) as much of the svn metadata as possible is
> retained, and (b) all metadata is converted to formats that are common
> in Mercurial. This way, tools written for Mercurial can be optimally
> used. In order to do this, I want to use the hgsubversion [2] software
> to do an initial conversion. This hg extension is focused on providing
> high-quality conversion from Subversion to Mercurial for use in
> two-way correspondence, meaning it doesn't throw away as much
> available metadata as other solutions.
>
> Such a conversion also seems like a good time to reconsider the
> contents of the repository and determine if some things are still
> valuable. In this spirit, the following sections also propose
> discarding some of the older metadata.
>
> Timeline
>
> TBD; needs fully working hgsubversion and consensus on this document.
>
> Transition plan
>
> Branch strategy
>
> Mercurial has two basic ways of using branches: cloned branches, where
> each branch is kept in a separate repository, and named branches,
> where each revision keeps metadata to note on which branch it belongs.
> The former makes it easier to distinguish branches, at the expense of
> requiring more disk space on the client. The latter makes it a little
> easier to switch between branches, but often has somewhat unintuitive
> results for people (though this has been getting better in recent
> versions of Mercurial).
>
> I'm still a bit on the fence about whether Python should adopt cloned
> branches or named branches. Since it usually makes more sense to tag
> releases on the maintenance branch, for example, mainline history
> would not contain release tags if we used cloned branches. Also,
> Mercurial 1.2 and 1.3 have the necessary tools to make named branches
> less painful (because they can be properly closed and closed heads are
> no longer considered in relevant cases).
>
> A disadvantage might be that the clones in use will be a good bit larger
> (since they essentially contain all other branches as well). This can
> be mitigated by keeping non-release (feature) branches in separate
> clones. Also note that it's still possible to clone a single named
> branch from a combined clone, by specifying the branch as in hg clone
> http://hg.python.org/main/#2.6-maint. Keeping the py3k history in a
> separate clone probably also makes sense.

* Does it work with "hg pull" etc. too, afterwards?
* Does it support more than one branch?

> XXX To do: size comparison for selected separation scenarios.
>
> Converting branches
>
> There are quite a lot of branches in SVN's branches directory. I
> propose to clean this up a bit, by employing the following
> strategy:
>
> * Keep all release (maintenance) branches
> * Discard branches that haven't been touched in 18 months, unless
> someone indicates there's still interest in such a branch
> * Keep branches that have been touched in the last 18 months,
> unless someone indicates the branch can be deprecated

I would just kill all feature branches unless someone indicates it is
still used. There are very few active feature branches.

(I guess in the case a branch gets killed erroneously it could still be
re-created after the conversion?)

> Converting tags
>
> The SVN tags directory contains a lot of old stuff. Some of these are
> not, in fact, full tags, but contain only a smaller subset of the
> repository. I think we should keep all release tags, and consider
> other tags for inclusion based on requests from the developer
> community. I'd like to consider unifying the release tag naming scheme
> to make some things more consistent, if people feel that won't create
> too many problems. For example, Mercurial itself just uses '1.2.1' as
> a tag, where CPython would currently use r121.

+1 for readable tag names.
+1 for throwing out old questionable tag names.
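
For illustration, a small sketch of how the current tag names could be
mapped to the readable form. It assumes single-digit major and minor
numbers (as in r121) and an optional pre-release suffix; the suffix
handling and the function name are assumptions, not part of the PEP:

```python
import re

# Assumes single-digit major and minor numbers, as in 'r121'. The
# pre-release suffixes (a, b, c, rc) are an assumption for illustration.
_TAG = re.compile(r"^r(\d)(\d)(\d*)((?:a|b|c|rc)\d+)?$")


def readable_tag(svn_tag):
    """Turn a tag like 'r121' into '1.2.1'; leave other names unchanged."""
    match = _TAG.match(svn_tag)
    if match is None:
        return svn_tag
    major, minor, micro, suffix = match.groups()
    version = f"{major}.{minor}"
    if micro:
        version += f".{micro}"
    return version + (suffix or "")
```

Tags that do not fit the pattern come back unchanged, so they can be
reviewed by hand.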

> Generating .hgignore
>
> The .hgignore file can be used in Mercurial repositories to help
> ignore files that are not eligible for version control. It does this
> by employing several possible forms of pattern matching. The current
> Python repository already includes a rudimentary .hgignore file to
> help with using the hg mirrors.
>
> It might be useful to have the .hgignore be generated automatically
> from svn:ignore properties. This would make sure all historic
> revisions also have useful ignore information (though one could argue
> ignoring isn't really relevant to just checking out an old revision).

I guess that's not necessary. People can just add stuff to .hgignore
when they see something that should be there.

> hg-ssh
>
> Developers should access the repositories through ssh, similar to the
> current setup. Public keys can be used to grant people access to a
> shared hg@ account. A hgwebdir instance should also be set up for easy
> browsing and read-only access. If we're using ssh, developers should
> trivially be able to start new clones (for longer-term features that
> profit from a separate branch).

+1.

> Hooks
>
> A number of hooks are currently in use. The hg equivalents for these
> should be developed and deployed. The following hooks are being used:
>
> * check whitespace: a hook to reject commits in case the
> whitespace doesn't match the rules for the Python codebase. Should be
> straightforward to re-implement from the current version. We can also
> offer a whitespace hook for use with client-side repositories that
> people can use; it could either warn about whitespace issues and/or
> truncate trailing whitespace from changed lines. Open issue: do we
> check only the tip after each push, or do we check every commit in a
> changegroup?

Only checking the tip would make it possible for people to revert their
whitespace commits, but then -- if they have the local hook -- they
shouldn't do that anyway.
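
The difference between the two options can be sketched in plain Python.
This models a pushed changegroup as a list of (id, files) pairs rather
than using Mercurial's hook API, so all names here are hypothetical:

```python
def has_trailing_whitespace(text):
    """True if any line of text ends in a space or tab."""
    return any(line != line.rstrip(" \t") for line in text.splitlines())


def rejected_changesets(changegroup, tip_only=False):
    """Return ids of changesets that fail the whitespace check.

    changegroup is a list of (changeset_id, {path: text}) pairs in push
    order, with the tip last.
    """
    to_check = changegroup[-1:] if tip_only else changegroup
    return [cid for cid, files in to_check
            if any(has_trailing_whitespace(t) for t in files.values())]


# A bad commit followed by a commit that cleans it up again.
push = [
    ("a1", {"x.py": "x = 1  \n"}),
    ("b2", {"x.py": "x = 1\n"}),
]
print(rejected_changesets(push))                 # checks every changeset
print(rejected_changesets(push, tip_only=True))  # checks only the tip
```

With tip_only=True the push above is accepted, since only the cleaned-up
state is inspected; checking every changeset rejects it because of the
first commit.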

> * commit mails: we can leverage the notify extension for this

As long as it can send diffs...

> * check contributors: in the current setup, all changesets bear
> the username of committers, who must have signed the contributor
> agreement. In a DVCS, the committers are not necessarily the same
> people who push, and so we can't check if the committer is a
> contributor. We could use a hook to check if the committer is a
> contributor if we keep a list of registered contributors.

That gets very ugly as soon as you start pulling from repos that just
fix a small typo or so.
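
A sketch of what such a check amounts to, again without the Mercurial
API; the names and addresses are invented for the example:

```python
def unknown_authors(changeset_authors, contributors):
    """Return authors in a push who are not registered contributors.

    The result keeps push order and lists each author once.
    """
    seen = set()
    missing = []
    for author in changeset_authors:
        if author not in contributors and author not in seen:
            seen.add(author)
            missing.append(author)
    return missing


# Hypothetical data: one committer pulling a typo fix from a drive-by
# contributor who has not signed the agreement.
contributors = {"Jane Doe <jane@example.com>"}
pushed = [
    "Jane Doe <jane@example.com>",
    "Drive By <driveby@example.com>",
    "Jane Doe <jane@example.com>",
]
print(unknown_authors(pushed, contributors))
```

A strict hook would have to reject the whole push here because of the
single pulled changeset, which is the situation described above.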

> code.python.org was also proposed as the hostname. Personally, I think
> that using the VCS name in the hostname is good because it prevents
> confusion: it should be clear that you can't use svn or bzr for
> hg.python.org.

Yes, and it mirrors svn.python.org.

> Mercurial itself has a stable branch which is a ''strict'' subset of
> the unstable branch. This means that generally all fixes for the
> stable branch get committed against the tip of the stable branch, then
> they get merged into the unstable branch (which already contains the
> parent of the new cset). This provides a largely frictionless
> environment for moving changes from stable to unstable branches.
> Mistakes, where a change that should go on stable goes on unstable
> first, do happen, but they're usually easy to fix. That can be done by
> copying the change over to the stable branch, then trivial-merging
> with unstable (meaning the merge in fact ignores the parent from the
> stable branch).
>
> This strategy means a little more work for regular committers, because
> they have to think about whether their change should go on stable or
> unstable; they may even have to ask someone else (the RM) before
> committing. But it also relieves a dedicated group of committers of
> regular backporting duty, in addition to making it easier to work with
> the tool.

Strong +1 for that.
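
The "subset" property can be stated in terms of the changeset DAG: every
ancestor of the stable head is also an ancestor of the unstable head. A
toy model, with a hand-written parent map standing in for a real
repository (all identifiers are invented):

```python
def ancestors(parents, head):
    """Return all changesets reachable from head, including head itself."""
    seen = set()
    stack = [head]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def is_subset_branch(parents, stable_head, unstable_head):
    """True if everything on stable is already contained in unstable."""
    return ancestors(parents, stable_head) <= ancestors(parents, unstable_head)


# a: common root, s1/s2: fixes on stable, u1: work on unstable,
# m: merge of s1 into unstable.
parents = {
    "a": [],
    "s1": ["a"],
    "u1": ["a"],
    "m": ["u1", "s1"],
    "s2": ["s1"],
}
print(is_subset_branch(parents, "s1", "m"))  # s1 has been merged
print(is_subset_branch(parents, "s2", "m"))  # s2 still needs a merge
```

In this model, committing a fix on stable and then merging stable into
unstable restores the property each time.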

> I propose that the revision identifier will be the short version of
> hg's revision hash, for example 'dd3ebf81af43', augmented with '+'
> (instead of 'M') if the working directory from which it was built was
> modified. This mirrors the output of the hg id command, which is
> intended for this kind of usage.
>
> For the tag/branch identifier, I propose that hg will check for tags
> on the currently checked out revision, use the tag if there is one
> ('tip' doesn't count), and use the branch name otherwise.
> sys.subversion becomes
>
> * ('CPython', '2.6.2', 'dd3ebf81af43')
> * ('CPython', 'default', 'af694c6a888c+')
>
> and the build info string becomes
>
> * '2.6.2:dd3ebf81af43, Jun 2 2009, 09:58:33'
> * 'default:af694c6a888c+, Jun 2 2009, 01:24:14'
>
> This reflects that the default branch in hg is called 'default'
> instead of Subversion's 'trunk', and reflects the proposed new tag
> format.

Looks good to me.
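
The proposed rule is simple enough to write down as a sketch. The
function name and its arguments are hypothetical; it takes the values
that hg id would report as plain arguments instead of calling hg:

```python
def vcs_identifiers(node, branch, tags=(), modified=False):
    """Build the proposed sys.subversion tuple and build-info prefix.

    node is the changeset hash, branch the named branch, tags the tags
    on the checked-out revision, and modified whether the working
    directory has local changes.
    """
    revision = node[:12] + ("+" if modified else "")
    real_tags = [tag for tag in tags if tag != "tip"]  # 'tip' doesn't count
    label = real_tags[0] if real_tags else branch
    return ("CPython", label, revision), f"{label}:{revision}"


print(vcs_identifiers("dd3ebf81af43", "2.6-maint", tags=["2.6.2"]))
print(vcs_identifiers("af694c6a888c", "default", tags=["tip"], modified=True))
```

The build info string in the proposal additionally carries the build
date and time, which this sketch leaves out.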

cheers,
Georg



brett at python

Jul 3, 2009, 11:04 AM

Post #11 of 83 (1815 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

On Thu, Jul 2, 2009 at 13:42, Dirkjan Ochtman <dirkjan [at] ochtman> wrote:

> In response to some rumblings on python-committers and just to request
> more feedback, a progress report. I know it's long, I've tried
> to keep it concise and chunked, though.
>
> - First of all, I've got the basic conversion down, I've done it a few
> times now, with progressively better results. You can view some
> results at http://hg.python.org/, which has a preliminary cpython
> repository. *** The changeset hashes for that repo will change, so you
> won't be able to commit or pull from it in the future.***
>
> - Second of all, some planning. I've thought about it a bit, and I
> think we should aim for going live with hg on August 1. Given that I'm
> on vacation from 8-18 July (and I'm not sure whether I'll be able to
> actually work on it during that time, though I imagine I'll be able to
> spend some time on it at least), that's quite ambitious, so I'm going
> to say it's okay if it slips by a few days. Putting a deadline out
> there is a good thing, anyway.
>

Fine by me as long as people realize that if anything is questionable then
the switch will not happen. Getting this right takes precedence over any
deadline. And obviously we will need to do at least one live conversion on
python.org hardware to make sure everything will work smoothly.


>
> - Third of all, to make this possible, it would be helpful if I got
> more feedback on the PEP. Last time I raised it, there was virtually
> nothing. This time, I'll include it inline so there's hopefully less
> of a barrier to reviewing it.
>
> - Fourth, Mercurial 1.3 was just released! Bet you didn't see that
> coming. It's looking like a pretty good release, with an experimental
> version of the much-coveted subrepository support (like
> svn:externals). This also means that the latest version of
> hgsubversion, the tool I used for the conversion, will be more
> accessible for converting other projects. You know you want to!
>

And will make the idea of splitting out the standard library and tests a
reasonable thing to do.


>
> - Fifth, here's a list of things, off the top of my head, that still need
> doing:
>
> * Get agreement on branch strategy and branch processing (list of
> branches + proposed handling at
> http://hg.python.org/pymigr/file/tip/all-branches.txt) <--- PLEASE
> REVIEW
> * Get agreement on tag processing (first come up with a list)
> * Set up hg-ssh infra (should be easy)
> * Set up hooks (should be mostly straightforward)
> * Set up roundup integration (should be made easier by quick revision
> map hgweb extension)
> * Write docs
>
> - Sixth (this is the good part), less obvious things that have been
> done or don't need doing:
>
> * .hgignore generation (I've been convinced it's too hard, the current
> version will do)


Yeah, we can do this manually.


>
> * revlog reordering (it's painless and a big win)
>
> I'll get through all of these myself, but obviously any help would be
> welcome. For any hg users, writing docs should be an easy start. For
> others, please review the PEP (below), the branch map in
> http://hg.python.org/pymigr/file/tip/all-branches.txt and the author
> map at http://hg.python.org/pymigr/file/tip/author-map (not that much
> has changed since the start, so if you've looked at it already, feel
> free to skip this part). Right now I'm a little stuck on branch
> processing, because it's a long running script that needs a bunch of
> debugging, but I'll get going on that again.
>
> I think that's all I can think of for now, I'll update the PEP with
> new bits soon. Here it is, ready for your review:
>
> ==============================================================
>
> Motivation
>
> After having decided to switch to the Mercurial DVCS, the actual
> migration still has to be performed. In the case of an important piece
> of infrastructure like the version control system for a large,
> distributed project like Python, this is a significant effort. This
> PEP is an attempt to describe the steps that must be taken for further
> discussion. It's somewhat similar to PEP 347 [1], which discussed the
> migration to SVN.
>
> To make the most of hg, I (Dirkjan) would like to make a high-fidelity
> conversion, such that (a) as much of the svn metadata as possible is
> retained, and (b) all metadata is converted to formats that are common
> in Mercurial. This way, tools written for Mercurial can be optimally
> used. In order to do this, I want to use the hgsubversion [2] software
> to do an initial conversion. This hg extension is focused on providing
> high-quality conversion from Subversion to Mercurial for use in
> two-way correspondence, meaning it doesn't throw away as much
> available metadata as other solutions.
>
> Such a conversion also seems like a good time to reconsider the
> contents of the repository and determine if some things are still
> valuable. In this spirit, the following sections also propose
> discarding some of the older metadata.
>
> Timeline
>
> TBD; needs fully working hgsubversion and consensus on this document.
>
> Transition plan
>
> Branch strategy
>
> Mercurial has two basic ways of using branches: cloned branches, where
> each branch is kept in a separate repository, and named branches,
> where each revision keeps metadata to note on which branch it belongs.
> The former makes it easier to distinguish branches, at the expense of
> requiring more disk space on the client. The latter makes it a little
> easier to switch between branches, but often has somewhat unintuitive
> results for people (though this has been getting better in recent
> versions of Mercurial).
>
> I'm still a bit on the fence about whether Python should adopt cloned
> branches or named branches. Since it usually makes more sense to tag
> releases on the maintenance branch, for example, mainline history
> would not contain release tags if we used cloned branches. Also,
> Mercurial 1.2 and 1.3 have the necessary tools to make named branches
> less painful (because they can be properly closed and closed heads are
> no longer considered in relevant cases).
>
> A disadvantage might be that the clones in use will be a good bit larger
> (since they essentially contain all other branches as well). This can
> be mitigated by keeping non-release (feature) branches in separate
> clones. Also note that it's still possible to clone a single named
> branch from a combined clone, by specifying the branch as in hg clone
> http://hg.python.org/main/#2.6-maint. Keeping the py3k history in a
> separate clone probably also makes sense.
>

While I really like the idea of using named branches for each release so
that there is a single py3k branch that contains all relevant history for
every release, I think we should start simple and go with cloned branches.
That way the workflow does not radically shift from what we do now for svn
to start. Once the conversion is done and people are comfortable with hg we
can then discuss moving towards a named branch approach.


>
> XXX To do: size comparison for selected separation scenarios.
>
> Converting branches
>
> There are quite a lot of branches in SVN's branches directory. I
> propose to clean this up a bit, by employing the following
> strategy:
>
> * Keep all release (maintenance) branches
> * Discard branches that haven't been touched in 18 months, unless
> someone indicates there's still interest in such a branch
> * Keep branches that have been touched in the last 18 months,
> unless someone indicates the branch can be deprecated
>

Sounds reasonable to me. We can just make a list and send it to
python-committers to make final decisions of what should stick around.


>
> Converting tags
>
> The SVN tags directory contains a lot of old stuff. Some of these are
> not, in fact, full tags, but contain only a smaller subset of the
> repository. I think we should keep all release tags, and consider
> other tags for inclusion based on requests from the developer
> community. I'd like to consider unifying the release tag naming scheme
> to make some things more consistent, if people feel that won't create
> too many problems. For example, Mercurial itself just uses '1.2.1' as
> a tag, where CPython would currently use r121.


I don't use tags so I don't really care, but in the name of easy transition
I say we don't change the naming scheme (although I have no issue dropping
obscure tags).


>
> Author map
>
> In order to provide user names the way they are common in hg (in the
> 'First Last <user [at] example>' format), we need an author map to map
> cvs and svn user names to real names and their email addresses. I have
> a complete version of such a map in my migration tools repository [3].
> The email addresses in it might be out of date; that's bound to
> happen, although it would be nice to try and have as many people as
> possible review it for addresses that are out of date. The current
> version also still seems to contain some encoding problems.


Something else that can go out to python-committers before the switch.
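
For anyone reviewing the map, this is roughly what it does. The sketch
assumes one "svnuser = First Last <email>" entry per line, which may not
match the actual file exactly, and the names used are invented:

```python
def parse_author_map(text):
    """Parse 'user = First Last <email>' lines into a dict."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        user, sep, full = line.partition("=")
        if not sep:
            continue  # not a mapping line
        mapping[user.strip()] = full.strip()
    return mapping


def hg_author(mapping, svn_user):
    """Return the hg-style author, falling back to the bare user name."""
    return mapping.get(svn_user, svn_user)


# Hypothetical entries, for illustration only.
example = """\
# svn user = real name and address
jdoe = Jane Doe <jane@example.com>
rroe = Richard Roe <richard@example.com>
"""
authors = parse_author_map(example)
print(hg_author(authors, "jdoe"))
print(hg_author(authors, "unknown"))
```

An entry with an outdated address only needs its right-hand side
changed; user names missing from the map fall through unchanged.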


>
> Generating .hgignore
>
> The .hgignore file can be used in Mercurial repositories to help
> ignore files that are not eligible for version control. It does this
> by employing several possible forms of pattern matching. The current
> Python repository already includes a rudimentary .hgignore file to
> help with using the hg mirrors.
>
> It might be useful to have the .hgignore be generated automatically
> from svn:ignore properties. This would make sure all historic
> revisions also have useful ignore information (though one could argue
> ignoring isn't really relevant to just checking out an old revision).


Don't bother with anything automatic. We can change the .hgignore file by
hand. We all know glob and regex syntax. =)


>
> Revlog reordering
>
> As an optional optimization technique, we should consider trying a
> reordering pass on the revlogs (internal Mercurial files) resulting
> from the conversion. In some cases this results in dramatic decreases
> in on-disk repository size.


Fine by me.


>
> Other repositories
>
> Richard Tew has indicated that he'd like the Stackless repository to
> also be converted. What other projects in the svn.python.org
> repository should be converted? Do we want to convert the peps
> repository? distutils? others?


I don't think there is a single project we host -- all two of them -- that
has not said they want to convert. So I say convert everything and let's
turn off the svn server by the end of the year.


>
> Infrastructure
>
> hg-ssh
>
> Developers should access the repositories through ssh, similar to the
> current setup. Public keys can be used to grant people access to a
> shared hg@ account. A hgwebdir instance should also be set up for easy
> browsing and read-only access. If we're using ssh, developers should
> trivially be able to start new clones (for longer-term features that
> profit from a separate branch).
>
> Hooks
>
> A number of hooks are currently in use. The hg equivalents for these
> should be developed and deployed. The following hooks are being used:
>
> * check whitespace: a hook to reject commits in case the
> whitespace doesn't match the rules for the Python codebase. Should be
> straightforward to re-implement from the current version. We can also
> offer a whitespace hook for use with client-side repositories that
> people can use; it could either warn about whitespace issues and/or
> truncate trailing whitespace from changed lines. Open issue: do we
> check only the tip after each push, or do we check every commit in a
> changegroup?
> * commit mails: we can leverage the notify extension for this
> * buildbots: both the regular and the community build masters must
> be notified. Fortunately buildbot includes support for hg. I've also
> implemented this for Mercurial itself, so I don't expect problems
> here.
> * check contributors: in the current setup, all changesets bear
> the username of committers, who must have signed the contributor
> agreement. In a DVCS, the committers are not necessarily the same
> people who push, and so we can't check if the committer is a
> contributor. We could use a hook to check if the committer is a
> contributor if we keep a list of registered contributors.


Can we check these scripts into the repository itself? That way there is a
chance of reuse as hg commands, e.g. ``hg pydev-ci`` as a replacement for
``make patchcheck``.


>
>
> hgwebdir
>
> A more or less stock hgwebdir installation should be set up. We might
> want to come up with a style to match the Python website. It may also
> be useful to build a quick extension to augment the URL rev parser so
> that it can also take r[0-9]+ args and come up with the matching hg
> revision.
>
> After migration
>
> Where to get code
>
> It needs to be decided where the hg repositories will live. I'd like
> to propose to keep the hgwebdir instance at hg.python.org. This is an
> accepted standard for many organizations, and an easy parallel to
> svn.python.org. The 2.7 (trunk) repo might live at
> http://hg.python.org/main/, for example, with py3k at
> http://hg.python.org/py3k/. For write access, developers will have to
> use ssh, which could be ssh://hg@hg.python.org/main/. A demo
> installation will be set up with a preliminary conversion so people
> can experiment and review; it can live at
> http://hg.python.org/example/.
>
> code.python.org was also proposed as the hostname. Personally, I think
> that using the VCS name in the hostname is good because it prevents
> confusion: it should be clear that you can't use svn or bzr for
> hg.python.org.
>

How about hg.python.org for the official branches and we keep
code.python.org for personal branches of the developers like we have done
with the bzr experiments?


>
> hgwebdir can already provide tarballs for every changeset. I think
> this obviates the need for daily snapshots; we can just point users to
> tip.tar.gz instead, meaning they will get the latest. If desired, we
> could even use buildbot results to point to the last good changeset.


I like the stable buildbot tarball idea.


>
> Python-specific documentation
>
> hg comes with good built-in documentation (available through hg help)
> and a wiki [4] that's full of useful information and recipes. In
> addition to that, the parts of the developer FAQ [5] concerning
> version control will gain a section on using hg for Python
> development. Some of the text will be dependent on the outcome of
> debate about this PEP (for example, the branching strategy).
>
> Think first, commit later?
>
> In recent history, old versions of Python have been maintained by a
> select group of people backporting patches from trunk to release
> branches. While this may not scale so well as the development pace
> grows, it also runs into some problems with the current crop of
> distributed versioning tools. These tools (I believe similar problems
> would exist for either git, bzr, or hg, though some may cope better
> than others) are based on the idea of a Directed Acyclic Graph (or
> DAG), meaning they keep track of relations of changesets.
>
> Mercurial itself has a stable branch which is a *strict* subset of
> the unstable branch. This means that generally all fixes for the
> stable branch get committed against the tip of the stable branch, then
> they get merged into the unstable branch (which already contains the
> parent of the new cset). This provides a largely frictionless
> environment for moving changes from stable to unstable branches.
> Mistakes, where a change that should go on stable goes on unstable
> first, do happen, but they're usually easy to fix. That can be done by
> copying the change over to the stable branch, then trivial-merging
> with unstable (meaning the merge in fact ignores the parent from the
> stable branch).
>
> This strategy means a little more work for regular committers, because
> they have to think about whether their change should go on stable or
> unstable; they may even have to ask someone else (the RM) before
> committing. But it also relieves a dedicated group of committers of
> regular backporting duty, in addition to making it easier to work with
> the tool.
>
> Now would be a good time to consider changing strategies in this
> regard, although it would be relatively easy to switch to such a model
> later on.
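
The "strict subset" relationship described above can be illustrated with a toy changeset graph. This is only a sketch: the node names are invented, and the dict-of-parents layout is an assumption made for illustration, not how Mercurial stores history.

```python
def ancestors(dag, node):
    """Return node plus all of its ancestors in a {node: [parents]} graph."""
    seen = set()
    stack = [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(dag[n])
    return seen


# Toy history: 'base' is shared, 'fix' is committed on stable,
# 'feature' on unstable, and 'merge' merges stable into unstable.
dag = {
    "base": [],
    "fix": ["base"],
    "feature": ["base"],
    "merge": ["feature", "fix"],
}

# Before the merge, stable is not a subset of unstable.
before = ancestors(dag, "fix") < ancestors(dag, "feature")
# After the merge, everything on stable is also on unstable.
after = ancestors(dag, "fix") < ancestors(dag, "merge")
```

Here before is False and after is True: once the fix is merged forward, the stable history is strictly contained in the unstable history, which is the property the proposed workflow tries to maintain.
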


As I have said, we should change our workflow habits only after the switch,
once people are comfortable with hg.


>
> The future of Subversion
>
> What happens to the Subversion repositories after the migration? Since
> the svn server contains a bunch of repositories, not just the CPython
> one, it will probably live on for a bit, since not every project may
> want to migrate and others may take longer to do so. To
> prevent people from staying behind, we may want to remove migrated
> projects from the repository.
>
> Build identification
>
> Python currently provides the sys.subversion tuple to allow Python
> code to find out exactly what version of Python it's running against.
> The current version looks something like this:
>
> * ('CPython', 'tags/r262', '71600')
> * ('CPython', 'trunk', '73128M')
>
> Another value is returned from Py_GetBuildInfo() in the C API, and
> available to Python code as part of sys.version:
>
> * 'r262:71600, Jun 2 2009, 09:58:33'
> * 'trunk:73128M, Jun 2 2009, 01:24:14'
>
> I propose that the revision identifier will be the short version of
> hg's revision hash, for example 'dd3ebf81af43', augmented with '+'
> (instead of 'M') if the working directory from which it was built was
> modified. This mirrors the output of the hg id command, which is
> intended for this kind of usage.
>
> For the tag/branch identifier, I propose that hg will check for tags
> on the currently checked out revision, use the tag if there is one
> ('tip' doesn't count), and use the branch name otherwise.
> sys.subversion becomes
>
> * ('CPython', '2.6.2', 'dd3ebf81af43')
> * ('CPython', 'default', 'af694c6a888c+')
>
> and the build info string becomes
>
> * '2.6.2:dd3ebf81af43, Jun 2 2009, 09:58:33'
> * 'default:af694c6a888c+, Jun 2 2009, 01:24:14'
>
> This reflects that the default branch in hg is called 'default'
> instead of Subversion's 'trunk', and reflects the proposed new tag
> format.
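
The naming rule proposed above (prefer a tag other than 'tip', else fall back to the branch name, and append '+' for a modified working directory) could be sketched as follows. The function name and its parameters are hypothetical; the inputs are assumed to have been obtained from hg already, for example from hg id.

```python
def build_identifier(node, tags, branch, modified):
    """Build the proposed ('CPython', name, revision) tuple.

    node     -- short changeset hash, e.g. 'dd3ebf81af43'
    tags     -- tags on the checked-out revision (may include 'tip')
    branch   -- hg branch name, e.g. 'default'
    modified -- True if the working directory has local changes
    """
    # 'tip' doesn't count as a tag for identification purposes.
    real_tags = [t for t in tags if t != "tip"]
    name = real_tags[0] if real_tags else branch
    revision = node + ("+" if modified else "")
    return ("CPython", name, revision)
```

Under these assumptions, build_identifier('dd3ebf81af43', ['2.6.2'], 'default', False) yields the first tuple from the proposal, and a modified checkout of 'af694c6a888c' carrying only the 'tip' tag yields the second.
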


Should we consider adding a sys.revision attribute and begin the deprecation
of sys.subversion?


>
> References
> [1] http://www.python.org/dev/peps/pep-0347/
> [2] http://bitbucket.org/durin42/hgsubversion/
> [3] http://hg.xavamedia.nl/cpython/pymigr/
> [4] http://www.selenic.com/mercurial/wiki/
> [5] http://www.python.org/dev/faq/#version-control
>
> =====================================================
>
> Cheers,
>
> Dirkjan
> _______________________________________________
> Python-Dev mailing list
> Python-Dev [at] python
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/brett%40python.org
>


tjreedy at udel

Jul 3, 2009, 11:22 AM

Post #12 of 83 (1810 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

Dirkjan Ochtman wrote:

> It needs to be decided where the hg repositories will live. I'd like
> to propose to keep the hgwebdir instance at hg.python.org. This is an
> accepted standard for many organizations, and an easy parallel to
> svn.python.org. The 2.7 (trunk) repo might live at
> http://hg.python.org/main/, for example, with py3k at
> http://hg.python.org/py3k/.

I would very much like the 'k' dropped from the py3 name. It was a funny
joke when py3 was vaporware, now it is excess baggage which only puzzles
non-insiders and newcomers.

I think the two repos should be symmetrically named:

hg.python.org/py2
hg.python.org/py3

If one must be designated 'main', it should be py3.

Continuing to call py2 'main' will continue to discourage use of py3.

Terry Jan Reedy

_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


python at mrabarnett

Jul 3, 2009, 11:41 AM

Post #13 of 83 (1812 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

Terry Reedy wrote:
> Dirkjan Ochtman wrote:
>
>> It needs to be decided where the hg repositories will live. I'd like
>> to propose to keep the hgwebdir instance at hg.python.org. This is an
>> accepted standard for many organizations, and an easy parallel to
>> svn.python.org. The 2.7 (trunk) repo might live at
>> http://hg.python.org/main/, for example, with py3k at
>> http://hg.python.org/py3k/.
>
> I would very much like the 'k' dropped from the py3 name. It was a funny
> joke when py3 was vaporware, now it is excess baggage which only puzzles
> non-insiders and newcomers.
>
> I think the two repos should be symmetrically named:
>
> hg.python.org/py2
> hg.python.org/py3
>
> If one must be designated 'main', it should be py3.
>
> Continuing to call py2 'main' will continue to discourage use of py3.
>
We could regard py3k as the phase from the original concept of Python 3
to its 'prototype', Python 3.0. Python 3.1 would be the first
'real/usable' version.


aahz at pythoncraft

Jul 3, 2009, 12:06 PM

Post #14 of 83 (1805 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

On Fri, Jul 03, 2009, Brett Cannon wrote:
>
> Should we consider adding a sys.revision attribute and begin the deprecation
> of sys.subversion?

+1
--
Aahz (aahz [at] pythoncraft) <*> http://www.pythoncraft.com/

"as long as we like the same operating system, things are cool." --piranha


martin at v

Jul 3, 2009, 1:36 PM

Post #15 of 83 (1804 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

> So we must work without effective EOL support?

If that's the case (i.e. no effective EOL support, the way svn
supported it), then I think the PEP should make that clear (e.g.
in a discussion section).

For the server-side hooks: it would be good to know exactly
what they check, wrt. line endings.
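
As an illustration of the kind of check such a hook might perform, here is a minimal sketch that classifies the line endings in a file's contents. It is an assumption about what could be checked, not a description of the actual server-side hooks or of win32text, and the function name is hypothetical.

```python
def classify_line_endings(data):
    """Return 'none', 'lf', 'crlf', or 'mixed' for a bytes object.

    Bare CR characters (old Mac style) are deliberately ignored here.
    """
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract to get bare LFs only.
    lf = data.count(b"\n") - crlf
    if crlf and lf:
        return "mixed"
    if crlf:
        return "crlf"
    if lf:
        return "lf"
    return "none"
```

A hook built on something like this could reject changesets that introduce 'mixed' files, or files whose endings differ from what the repository expects for that path.
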

To find out what files should not be stored with LF line endings,
do "svn pg -R svn:eol-style .|grep CRLF".

For win32text, it would probably be good if the FAQ would provide
the relevant configuration instructions; it would be really helpful
if somebody familiar with Windows and hg could provide detailed
instructions well in advance of August 1.

If we don't have anybody familiar with Windows and hg, we have a
really serious problem.

Regards,
Martin


martin at v

Jul 3, 2009, 1:38 PM

Post #16 of 83 (1804 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

>> This is exactly why I was asking for your advice - I can't work out how to
>> work effectively with win32text as it stands myself, so remain stuck
>> accidentally checking in files with inappropriate line endings and stuck
>> working out how to move pywin32's CVS repo without abandoning the very
>> primitive EOL safety it offers...
>
> As long as the difference between \r\n- and \n-based files is clear
> and can be reasoned about, I don't see why having some of both (I'm
> assuming an overwhelming majority will have one, and only a few the
> other) is a big problem. But feel free to enlighten me!

If "both" means "both the server side test, and win32text", then Mark
already gave the answer: he cannot use win32text because he does not
know how to operate it (and doesn't bother studying its source code
to understand it in detail).

Regards,
Martin


tjreedy at udel

Jul 3, 2009, 1:45 PM

Post #17 of 83 (1804 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

MRAB wrote:
> Terry Reedy wrote:
>> Dirkjan Ochtman wrote:
>>
>>> It needs to be decided where the hg repositories will live. I'd like
>>> to propose to keep the hgwebdir instance at hg.python.org. This is an
>>> accepted standard for many organizations, and an easy parallel to
>>> svn.python.org. The 2.7 (trunk) repo might live at
>>> http://hg.python.org/main/, for example, with py3k at
>>> http://hg.python.org/py3k/.
>>
>> I would very much like the 'k' dropped from the py3 name. It was a
>> funny joke when py3 was vaporware, now it is excess baggage which only
>> puzzles non-insiders and newcomers.
>>
>> I think the two repos should be symmetrically named:
>>
>> hg.python.org/py2
>> hg.python.org/py3
>>
>> If one must be designated 'main', it should be py3.
>>
>> Continuing to call py2 'main' will continue to discourage use of py3.
>>
> We could regard py3k as the phase from the original concept of Python 3
> to its 'prototype', Python 3.0.

Right. And that phase is over, especially with Barry posting today on
python-list that there will be no more 3.0.x releases ever.

> Python 3.1 would be the first 'real/usable' version.

Right: 'is', as Barry also posted.

tjr



martin at v

Jul 3, 2009, 1:49 PM

Post #18 of 83 (1802 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

> It will handle it, for sure, but I think it would all go easier if we
> could work with stricter subset branches (and leave the effective
> cherrypicking for the occasional problem).

So I think the PEP should propose a workflow (or: merge flow) if you
think we would be better off with a different one.

In proposing such a workflow, consider these requirements:
- we currently have four active "maintenance" branches (i.e. where
the entire code base evolves): trunk, 3k, 2.6, and 3.1 (3.0
also until this morning).
- in addition, we have two security branches currently: 2.4 and
2.5, although 2.4 will be closed soon.
- our committers consistently refuse to merge changes across
branches themselves, and likely continue to do so unless there
is some feature of hg that I missed (e.g. one where merging
would happen without any user specifically asking for it)

Regards,
Martin


martin at v

Jul 3, 2009, 2:00 PM

Post #19 of 83 (1806 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

> Should we consider adding a sys.revision attribute and begin the
> deprecation of sys.subversion?

I wouldn't mind killing sys.subversion "right away" (i.e. in trunk
and 3k - obviously it has to stay in 2.6 and 3.1, and all the older
branches).

I'm -1 on calling it "sys.revision", as this makes it difficult to
tell what the actual versioning system was, and hence how the
data should be interpreted. It will already be a problem for 2.6,
when 2.6.3 will currently have a sys.subversion[2] of 'dd3ebf81af43',
which will surely crash existing applications.
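
To make the failure mode concrete, here is a sketch of the kind of code that would break. It assumes an application that treats sys.subversion[2] as a decimal number with an optional 'M' suffix; both function names are hypothetical.

```python
def svn_revision_number(subversion_tuple):
    """Assumed pre-migration idiom: the revision is digits plus optional 'M'."""
    return int(subversion_tuple[2].rstrip("M"))


def safe_revision_number(subversion_tuple):
    """Defensive variant: return None when the field is not an svn number."""
    try:
        return svn_revision_number(subversion_tuple)
    except ValueError:
        return None
```

With ('CPython', 'trunk', '73128M') the first function returns 73128, but with a hash such as 'dd3ebf81af43' in the third field, int() raises ValueError, which is the crash being described.
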

I'm not sure what the motivation for a sys.revision is; it's
probably similar to the desire of calling the machine code.python.org
(instead of hg.python.org). It gives the illusion of being agnostic
of the actual RCS being used. However, this is a complete illusion:
anybody using it (either code.python.org, or sys.revision), *cannot*
be agnostic of the specific technology.

Regards,
Martin


brett at python

Jul 3, 2009, 2:39 PM

Post #20 of 83 (1797 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

On Fri, Jul 3, 2009 at 14:00, "Martin v. Löwis" <martin [at] v> wrote:

> > Should we consider adding a sys.revision attribute and begin the
> > deprecation of sys.subversion?
>
> I wouldn't mind killing sys.subversion "right away" (i.e. in trunk
> and 3k - obviously it has to stay in 2.6 and 3.1, and all the older
> branches).
>
> I'm -1 on calling it "sys.revision", as this makes it difficult to
> tell what the actual versioning system was, and hence how the
> data should be interpreted. It will already be a problem for 2.6,
> when 2.6.3 will currently have a sys.subversion[2] of 'dd3ebf81af43',
> which will surely crash existing applications.
>
> I'm not sure what the motivation for a sys.revision is; it's
> probably similar to the desire of calling the machine code.python.org
> (instead of hg.python.org). It gives the illusion of being agnostic
> of the actual RCS being used. However, this is a complete illusion:
> anybody using it (either code.python.org, or sys.revision), *cannot*
> be agnostic of the specific technology.


We could add another value in the tuple that specifies the VCS: ('CPython',
'branches/release25-maint', '61464', 'svn'). I agree that VCSs are not
universally the same, but the concept of a revision is universal.
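
A rough sketch of how consuming code might use such a fourth field, assuming the 4-tuple layout suggested here. The function name is hypothetical, and the 'hg' value and the returned dict layout are assumptions made for illustration; nothing like this exists in the sys module.

```python
def describe_revision(info):
    """Interpret a (project, branch, revision, vcs) tuple per VCS."""
    project, branch, revision, vcs = info
    if vcs == "svn":
        # svn revisions are integers, with 'M' marking local modifications.
        return {
            "vcs": vcs,
            "revision": int(revision.rstrip("M")),
            "modified": revision.endswith("M"),
        }
    if vcs == "hg":
        # hg revisions are hex hashes, with '+' marking local modifications.
        return {
            "vcs": vcs,
            "revision": revision.rstrip("+"),
            "modified": revision.endswith("+"),
        }
    raise ValueError("unknown VCS: %r" % (vcs,))
```

Even in this sketch the caller still has to know each system's conventions, so the extra field labels the data rather than making it VCS-neutral.
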

-Brett


brett at python

Jul 3, 2009, 2:41 PM

Post #21 of 83 (1799 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

On Fri, Jul 3, 2009 at 11:22, Terry Reedy <tjreedy [at] udel> wrote:

> Dirkjan Ochtman wrote:
>
>> It needs to be decided where the hg repositories will live. I'd like
>> to propose to keep the hgwebdir instance at hg.python.org. This is an
>> accepted standard for many organizations, and an easy parallel to
>> svn.python.org. The 2.7 (trunk) repo might live at
>> http://hg.python.org/main/, for example, with py3k at
>> http://hg.python.org/py3k/.
>>
>
> I would very much like the 'k' dropped from the py3 name. It was a funny
> joke when py3 was vaporware, now it is excess baggage which only puzzles
> non-insiders and newcomers.
>

Is it really that confusing? I have never heard of anyone asking "what is
py3k?" Plus I like keeping that bit of Python history around. I know I still
use py3k as shorthand for Python 3.x. And we are not that serious of a
bunch. =)


>
> I think the two repos should be symmetrically named:
>
> hg.python.org/py2
> hg.python.org/py3
>

If we make it universal I say it should be '2.x' and '3.x'. The whole 'py'
prefix is redundant.


>
> If one must be designated 'main', it should be py3.
>
> Continuing to call py2 'main' will continue to discourage use of py3.


Yeah, 2.x shouldn't be special anymore.


martin at v

Jul 3, 2009, 2:52 PM

Post #22 of 83 (1800 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

> We could add another value in the tuple that specifies the VCS:
> ('CPython', 'branches/release25-maint', '61464', 'svn'). I agree that
> VCSs are not universally the same, but the concept of a revision is
> universal.

Actually, I think that's not the case. For bzr, the usual way of
identifying a revision is by revision number, which, however, is not
unique within a project, as each branch will use contiguous integers
for numbers. There are also unique identifications - so a bzr revision
has actually two numbers.

More generally, in a DVCS, it is not possible to access the revision being
referred to by such a tuple. For sys.subversion, if [0]=='CPython', then
you could go to svn.python.org. For a DVCS, the revision being
identified may not be publicly available, or may not live on a host
that you can infer from your proposed sys.revision.

For cloned branches, I wonder how sys.revision[1] would be computed.

Regards,
Martin


brett at python

Jul 3, 2009, 2:53 PM

Post #23 of 83 (1795 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

On Fri, Jul 3, 2009 at 14:52, "Martin v. Löwis" <martin [at] v> wrote:

> > We could add another value in the tuple that specifies the VCS:
> > ('CPython', 'branches/release25-maint', '61464', 'svn'). I agree that
> > VCSs are not universally the same, but the concept of a revision is
> > universal.
>
> Actually, I think that's not the case. For bzr, the usual way of
> identifying a revision is by revision number, which, however, is not
> unique within a project, as each branch will use contiguous integers
> for numbers. There are also unique identifications - so a bzr revision
> has actually two numbers.
>
> More generally, in a DVCS, it is not possible to access the revision being
> referred to by such a tuple. For sys.subversion, if [0]=='CPython', then
> you could go to svn.python.org. For a DVCS, the revision being
> identified may not be publicly available, or may not live on a host
> that you can infer from your proposed sys.revision.
>
> For cloned branches, I wonder how sys.revision[1] would be computed.


So are you saying we should drop the idea of a revision value altogether, or
just embrace the differences and add a sys.mercurial attribute?

-Brett


martin at v

Jul 3, 2009, 2:59 PM

Post #24 of 83 (1796 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

> So are you saying we should drop the idea of a revision value
> altogether, or just embrace the differences and add a sys.mercurial
> attribute?

That's what I would propose. It should be a best-effort(*) approach at
providing all information that is needed to really find the source
used for the specific version.

Regards,
Martin

(*) even for svn it was best-effort only in case there were local
modifications.


dirkjan at ochtman

Jul 3, 2009, 3:00 PM

Post #25 of 83 (1797 views)
Permalink
Re: Mercurial migration: progress report (PEP 385) [In reply to]

On Fri, Jul 3, 2009 at 23:41, Brett Cannon<brett [at] python> wrote:
> If we make it universal I say it should be '2.x' and '3.x'. The whole 'py'
> prefix is redundant.

Right, I was aiming for /python/2.x and /python/3.x as well.

Actually, I currently have /cpython to also make CPython less special
among its peers, but that idea was met with some resistance on
#python-dev.

Cheers,

Dirkjan
