
Mailing List Archive: Gentoo: OSX

Package testing -- Automated initiative

 

 



grobian at gentoo

Aug 14, 2005, 6:21 AM

Post #1 of 4
Package testing -- Automated initiative

Introduction
============

Recently, once again, we were confronted with a package marked as
ppc-macos stable that didn't compile at all, let alone run. It is
believed that more of these packages are in portage, and they need to be
found and fixed. Leaving the question of why they were marked stable to
another discussion, I will focus here on how to track these packages
down and report them to us.

Secondly, all 'unstable' packages, marked ~ppc-macos, should be tested
too, since they can be just as faulty. Because much of what exists for
OSX is in ~ppc-macos, many users consider it normal procedure to switch
to the unstable side of portage, hence the extra need for careful
testing of ~ppc-macos.


Proposed Global Structure
=========================

Testing should be done on a regular basis, both push and pull based.
This means that the testing machine starts testing packages by itself
when it is out of work (pull), and on the other hand starts testing
packages as soon as they are added or changed in CVS (push). It takes
no great imagination to see that the 'push-based' activity has priority
over the 'pull-based' work.

Starting over means that the test machine begins by cleaning out its
world file. Cleaning this file out to a bare minimum is an important
aspect of getting a test environment that reflects the situation on new
users' machines. If an ebuild uses a package without having it in its
DEPEND, this may only get noticed when starting on a clean machine.
This, however, will add a big delay in testing, as many packages will
need to be built before the right package can be installed.

The testing machine will have a queue file, from which it reads the
packages to emerge. If the queue file is empty, i.e. when there is no
push-based work, the machine will generate work by starting to compile
uncompiled packages, or emptying the tree.

Because ~ppc-macos and ppc-macos packages interfere with each other -- a
~ppc-macos package overwrites a ppc-macos package -- stable and unstable
have to be dealt with separately, i.e. they should each have their own
environment, either via two separate machines or through the use of a
chroot jail.


Queues
------

In order not to drag in a full DBMS (in the end Portage already is one),
queues are just simple flat files consisting of absolute package names,
one per line. Locking granularity is table-wise, i.e. the whole file.
Note that on Unix-like systems merely opening a file in write mode does
not lock it, so each process has to take an explicit (advisory) lock,
e.g. with flock, while it reads or modifies the queue. Consumers -- the
testing box in this case -- read the first line and delete it, while
producers simply add one line (or more) to the end of the file.

The queue itself is more a set than a list: packages in the queue
should be unique. If a package is added that is already in the queue,
the new entry is dropped, so that the package keeps its original queue
position.
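
A minimal sketch of such a queue file, assuming a Unix-like test box
where advisory locks via flock are available. The function names
enqueue and dequeue are made up for this illustration and are not part
of Portage.

```python
import fcntl


def _read_entries(queue):
    """Return the non-empty lines of an already opened queue file."""
    queue.seek(0)
    return [line.strip() for line in queue if line.strip()]


def enqueue(queue_path, package):
    """Producer side: append package unless it is already queued."""
    # "a+" creates the file if needed and always writes at the end.
    with open(queue_path, "a+") as queue:
        # Exclusive advisory lock, released when the file is closed.
        fcntl.flock(queue, fcntl.LOCK_EX)
        if package in _read_entries(queue):
            return False  # duplicate dropped, original position kept
        queue.write(package + "\n")
        return True


def dequeue(queue_path):
    """Consumer side: remove and return the first package, or None."""
    with open(queue_path, "a+") as queue:
        fcntl.flock(queue, fcntl.LOCK_EX)
        entries = _read_entries(queue)
        if not entries:
            return None
        # Rewrite the file without its first line.
        queue.seek(0)
        queue.truncate()
        queue.writelines(entry + "\n" for entry in entries[1:])
        return entries[0]
```

Both sides hold the lock for the whole read-modify-write cycle, which is
what makes the "whole file" locking granularity safe with several
producers.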


CVS Producer
------------

To catch up automatically with changes made to the tree, it is necessary
to act upon any commit to the tree that touches an ebuild file. One way
to do this would be to process the CVS commit messages sent out as
email by the CVS server. It is the task of the producer to find out
whether the ebuild applies to the testing machine (ppc-macos) and, if
so, to add the package/ebuild to the queue.
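
The two decisions the producer has to make could look roughly like
this. It is only a sketch: the layout of the commit mails is not
specified here, so the code assumes the changed file paths have already
been extracted from the mail, and the helper names package_from_path
and applies_to are hypothetical. It relies only on the usual tree layout
(category/package/package-version.ebuild) and on the KEYWORDS variable
of an ebuild.

```python
import re

# Matches a line such as: KEYWORDS="x86 ppc ~ppc-macos"
KEYWORDS_RE = re.compile(r'^\s*KEYWORDS="([^"]*)"', re.MULTILINE)


def package_from_path(path):
    """Map '.../category/package/package-1.0.ebuild' to 'category/package'.

    Returns None for anything that is not an ebuild file.
    """
    parts = path.strip("/").split("/")
    if len(parts) < 3 or not parts[-1].endswith(".ebuild"):
        return None
    return "/".join(parts[-3:-1])


def applies_to(ebuild_text, arch="ppc-macos"):
    """True if KEYWORDS lists arch as stable (arch) or unstable (~arch)."""
    match = KEYWORDS_RE.search(ebuild_text)
    if match is None:
        return False
    keywords = match.group(1).split()
    return arch in keywords or "~" + arch in keywords
```

A stable-only test environment would check for the plain keyword only,
and an unstable one would accept both.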


Consumer (testing process)
--------------------------

The test machine reads a line from the queue, and basically executes
'emerge ${PACKAGE}'. Before doing this, however, it first figures out
which USE flags can be used (emerge -pv) and which dependencies will be
pulled in (emerge -pt). If portage returns the message "all ebuilds that
could satisfy X have been masked", the emerge is cancelled, the line is
removed from the queue and an email message is sent out.

All dependencies are put in the right order and emerged as normal
packages, that is: all dependencies are pushed to the front of the
queue, keeping the queue unique by removing duplicates that appear
later on in the queue. After this, the consumer is restarted and reads
again from the queue. This should usually result in merging only one
package at a time, and thus in quite isolated cases, which should
improve the error email notification service.
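
Pushing dependencies to the front while keeping the queue a set can be
sketched as follows. The queue is shown as a plain list of package
names, and push_front is a name invented for this example. Obtaining
the ordered dependency list from 'emerge -pt' is left out.

```python
def push_front(queue, dependencies):
    """Return a new queue with the dependencies first, in the given order.

    Entries further down the queue that duplicate a dependency are
    removed, so every package still appears exactly once.
    """
    front = []
    for package in dependencies:
        if package not in front:
            front.append(package)
    rest = [package for package in queue if package not in front]
    return front + rest


# The package under test stays in the queue, behind its dependencies.
queue = ["app-editors/vim", "sys-apps/sed", "sys-libs/ncurses"]
queue = push_front(queue, ["sys-libs/ncurses", "dev-lang/perl"])
# queue is now:
# ["sys-libs/ncurses", "dev-lang/perl", "app-editors/vim", "sys-apps/sed"]
```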

Compile testing a package is supposed to be a thorough test that tries
all possible combinations of the package's USE flags. As the number of
combinations doubles with every flag, and some packages are rather big
and have zillions of USE flags, this might become somewhat endless, so
it may be necessary to have a special "don't do it" file.
Since all dependencies were put at the front of the queue, there should
normally be no dependencies left for the package to pull in.
If compilation fails for a certain USE-flag combination, this is reported
by sending out an email, and compilation of the next USE-flag
combination is attempted.
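
To give an idea of how quickly this grows, here is a small sketch that
enumerates every on/off combination for a list of flags. The generator
name use_combinations is made up for this example, and each yielded
string is meant as a value for the USE variable.

```python
from itertools import product


def use_combinations(flags):
    """Yield one USE string per on/off combination of the given flags."""
    for states in product((False, True), repeat=len(flags)):
        yield " ".join(
            flag if enabled else "-" + flag
            for flag, enabled in zip(flags, states)
        )


combinations = list(use_combinations(["ssl", "X"]))
# combinations == ["-ssl -X", "-ssl X", "ssl -X", "ssl X"]
```

Two flags give 4 builds, ten flags already give 1024, which is why some
kind of exclusion list seems unavoidable.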

When everything goes fine, no email notification is sent out. A
convenient log structure would, however, make it possible to see which
packages and USE-flag combinations passed successfully.
Providing this log via a web page would be useful. Again, backing this
with a DBMS to allow easy searching, versioning and the like is
considered overhead, though writing logs in SQL's "INSERT INTO" format
might enable another machine to display the output data.
Perhaps the communication methods need a section of their own.
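
As an illustration of the "INSERT INTO" idea, a log line could be
produced like this. The table name build_results and its columns are
pure assumptions, since no schema has been defined anywhere.

```python
def sql_quote(value):
    """Quote a string as an SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def log_line(package, use_flags, passed, table="build_results"):
    """Format one test result as a single SQL INSERT statement."""
    return "INSERT INTO {} (package, use_flags, passed) VALUES ({}, {}, {});".format(
        table, sql_quote(package), sql_quote(use_flags), 1 if passed else 0
    )


line = log_line("app-editors/vim", "-ssl X", True)
# line == "INSERT INTO build_results (package, use_flags, passed) "
#         "VALUES ('app-editors/vim', '-ssl X', 1);"
```

The test box only appends such lines to a text file, so it needs no
database itself. Another machine can replay the file into whatever DBMS
it likes.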


Recap and Conclusion
====================

By setting up a testing system, it is possible to greatly improve the
Quality of Service of the portage tree for an architecture through
exhaustive testing of both the packages already in there and packages
that are added or modified. Automated testing should not release
developers from testing themselves, but should help in pointing out
problems that may arise on moving ground such as portage, where packages
are constantly updated and dependencies might get broken.


ToDo
====

- Not only check the dependencies of the respective package, but also
consider packages that depend on it, i.e. rebuild all packages that
depend on the package to check whether anything is broken by the
update.
- Is there a gleptomaniac in the room? This would be useful for x86
also, of course. In that case it may be necessary to make sure the
packages are split over multiple machines.
- The message system needs more customisation options; in particular,
backing things with a DBMS would allow for many nice bugzilla-like
preferences for email generation, as well as web-based versioned
info/report pages.
- To make the system even bigger, a central DBMS powered server might
take a leading role and ... {editor note: wait, stop it right now,
you're going too fast right now}


By The Way
==========

- Kito offers his lil' chico as machine for this automated testing
initiative.
- Comments are welcome, as well as expressions of worry on my mental state.
- Implementation of the described system will need a better specified
design and some coding (the dirty work) in some language...


--
Fabian Groffen
eBuild && Porting
Gentoo for Mac OS X

--
gentoo-osx [at] gentoo mailing list


kito at gentoo

Aug 14, 2005, 7:00 AM

Post #2 of 4
Re: Package testing -- Automated initiative

Adding -dev to CC: in case someone has any meaningful input or has
already tackled this problem...

On Aug 14, 2005, at 8:21 AM, Grobian wrote:

> Introduction
> ============
>
> Recently, once again we were confronted with a package marked as
> ppc-macos stable, while it didn't compile at all, let alone run. It is
> believed more of these packages are in portage, and need to be found and
> fixed. Keeping the cause of why they are marked stable up to another
> discussion, and out of the scope of this discussion, I will focus on how
> to track these packages down and report them to us.
>
> In the secondary line, all 'unstable' packages, marked ~ppc-macos should
> be tested as well, since they can be faulty as well. Since for OSX much
> is in ~ppc-macos, many users consider it a normal procedure to switch to
> the unstable side of portage, hence some extra need for careful testing
> of ~ppc-macos also.
>
>
> Proposed Global Structure
> =========================
>
> Testing should be done on a regular basis, both push and pull based.
> This means that the testing machine would start testing packages itself
> if it is out of work, and on the other hand starts testing packages as
> soon as they are being added/changed in CVS. It may need no great
> imagination to see that the latter 'push-based' activity has priority
> over the 'pull-based' work.

I'm not sure direct interaction with CVS would be needed; it usually
only takes a short time for CVS commits to hit the rsync mirrors
(hence the volatile nature of the tree).

>
> Starting over, will for the test machine mean that it starts cleaning
> out its world file. Cleaning this file out to a bare minimum is an
> important aspect of getting a test environment that reflects the
> situation on new users' machines. If an ebuild uses a package without
> having it in its DEPEND, this may get noticed only when starting on a
> clean machine. This, however, will add a big delay in testing as many
> packages will need to be built before the right package can be installed.
>
> The testing machine will have a queue file, which it reads packages to
> emerge from. If the queue file is empty, i.e. when there is no push
> based work, the machine will generate work by starting to compile
> uncompiled packages, or emptying the tree.
>
> Because ~ppc-macos and ppc-macos packages interfere with each other
> -- a ~ppc-macos package overwrites a ppc-macos package -- both
> stable and unstable have to be dealt with separately, i.e. they
> should both have their own environment either via two separate
> machines, or through the use of a chroot jail.

I think separate chroots are definitely the way to go. We can just
store a 'pristine' chroot in an iso or dmg or whatever on the build
server and copy it when needed.

>
>
> Queues
> ------
>
> In order not to drag in a full DBMS (in the end Portage already is
> one) queues are just simple flat files consisting of absolute
> package names, one per line. Table wise locking granularity is
> handled by the OS as one process opens the file in write mode.
> Consumers -- the testing box in this case -- read the first line
> and delete it, while producers simply add one line (or more) to the
> end of the file.
>
> The queue itself, is more a set than a list. This means that
> packages that are in the queue, should be unique. If a package is
> added that is already in the queue, it is dropped such that the
> original queue position of the package is maintained.

Maybe a 'proper' DBMS wouldn't be such a bad idea; it could also store
build logs, timestamps, etc., and make it easier for multiple
build hosts to push/pull from a centralized server.



fthain at telegraphics

Aug 14, 2005, 9:09 AM

Post #3 of 4
Re: Package testing -- Automated initiative

>
> When everything goes fine, no email notification is being sent out. A
> convenient log structure would, however, make it possible to see which
> packages and USE-flag combinations successfully passed through.
> Providing this log via a web-page would be a useful thing.

Would tinderbox help?

> - Comments are welcome, as well as expressions of worry on my mental state.

Good thinking!

The chroot idea is a good one because the process lends itself to
parallelism. That is, you might have one test box/chroot for each of
the following (maybe in order of importance):

- unstable empty tree (all deps every time)
- stable empty tree builds (same)
- unstable cumulative tree builds
- stable cumulative tree builds

I see the last ones as being fairly important, because the cumulative
(emerge -Du) trees will have the best throughput, for quickly finding any
glaring, slap-forehead kind of bugs/bad keywords (i.e. low-hanging fruit).

The cumulative tree machines would also be an efficient choice for your
reverse-dependency idea (perhaps to only one level of indirection).

-f


grobian at gentoo

Aug 14, 2005, 12:16 PM

Post #4 of 4
Re: Package testing -- Automated initiative

Finn Thain wrote:
>> When everything goes fine, no email notification is being sent out. A
>> convenient log structure would, however, make it possible to see which
>> packages and USE-flag combinations successfully passed through.
>> Providing this log via a web-page would be a useful thing.
>
> Would tinderbox help?

As far as I know, tinderbox is more than just a build system. It is a
complete procedure in which the tree is closed during compilation and
only reopened when everything compiles.

>
>> - Comments are welcome, as well as expressions of worry on my mental state.
>
> Good thinking!
>
> The chroot idea is a good one because the process lends itself to
> parallelism. That is, you might have one test box/chroot for, (maybe in
> order of importance)
>
> - unstable empty tree (all deps every time)
> - stable empty tree builds (same)
> - unstable cumulative tree builds
> - stable cumulative tree builds

This is indeed a good plan, as it allows more responsive and more
thorough testing to take place side by side.

> I see the last ones as being fairly important, because the cumulative
> (emerge -Du) trees will have the best throughput, for quickly finding any
> glaring, slap-forehead kind of bugs/bad keywords (i.e. low-hanging fruit).
>
> The cumulative tree machines would also be an efficient choice for your
> reverse-dependency idea (perhaps to only one level of indirection).

Good points, thanks!

--
Fabian Groffen
eBuild && Porting
Gentoo for Mac OS X