
Mailing List Archive: Gentoo: Dev

Questions about SystemD and OpenRC

 

 



rich0 at gentoo

Aug 16, 2012, 6:26 PM

Post #101 of 103 (412 views)
Re: Re: Questions about SystemD and OpenRC [In reply to]

On Thu, Aug 16, 2012 at 4:05 PM, Michael Mol <mikemol [at] gmail> wrote:
> The limited-visibility build feature discussed a week or so ago would
> go a long way in detecting unexpressed build dependencies.

I can't say that is a coincidence, but my intent would be to include
@system as implicit dependencies, at least until we change that policy
(though the morbidly curious could use that as a test in a tinderbox
to find packages in @system that are good candidates for removal).

I haven't gotten to test it, but after studying sandbox it shouldn't
be hard to just hack together a manual test by removing read access to
root from the config files and adding in a bazillion files. That
should at least let me profile performance/etc. I'm not convinced
that there isn't room for improvement, but if it works well as-is then
automating this shouldn't be hard at all. If portage has the
dependency tree in RAM then you just need to dump all the edb listings
for those packages plus @system and feed those into sandbox. That
just requires reading a bunch of text files and no searching, so it
should be pretty quick. As far as I can tell the relevant calls to
check for read access are already being made in sandbox, and
obviously they aren't taking forever. We just have to see if the
search gets slow if the access list has tens of thousands of entries
(if it does, that is just a simple matter of optimization, but being
in-RAM I can't see how tens of thousands of entries is going to slow
down a modern CPU even if it is just an unsorted list).
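
To put a rough number on the unsorted-list question, here is a minimal sketch that times an exact-match lookup against a synthetic allow-list. It is only an illustration in Python: sandbox itself is C and its real matching logic is not reproduced here, and the path names, the helper names, and the list size of 50,000 are all made up for the test.

```python
import timeit


def make_paths(n):
    # Synthetic paths standing in for package file listings (hypothetical data).
    return [f"/usr/lib/pkg{i // 100}/file{i}.so" for i in range(n)]


def allowed_linear(path, allow_list):
    # Unsorted list: a lookup may have to walk every entry.
    for entry in allow_list:
        if entry == path:
            return True
    return False


def allowed_hashed(path, allow_set):
    # Hash set: average-case constant-time lookup.
    return path in allow_set


if __name__ == "__main__":
    paths = make_paths(50_000)
    allow_set = frozenset(paths)
    probe = paths[-1]  # last entry, the worst case for the linear scan
    runs = 200
    for name, fn, table in (
        ("linear", allowed_linear, paths),
        ("hashed", allowed_hashed, allow_set),
    ):
        secs = timeit.timeit(lambda: fn(probe, table), number=runs)
        print(f"{name}: {secs / runs * 1e6:.1f} us per lookup")
```

If the linear scan turns out to be too slow for the number of file accesses a build makes, switching to a hashed or sorted structure is the "simple matter of optimization" mentioned above.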

Rich


mikemol at gmail

Aug 16, 2012, 7:02 PM

Post #102 of 103 (409 views)
Re: Re: Questions about SystemD and OpenRC [In reply to]

On Thu, Aug 16, 2012 at 9:26 PM, Rich Freeman <rich0 [at] gentoo> wrote:
> On Thu, Aug 16, 2012 at 4:05 PM, Michael Mol <mikemol [at] gmail> wrote:
>> The limited-visibility build feature discussed a week or so ago would
>> go a long way in detecting unexpressed build dependencies.
>
> I can't say that is a coincidence, but my intent would be to include
> @system as implicit dependencies, at least until we change that policy
> (though the morbidly curious could use that as a test in a tinderbox
> to find packages in @system that are good candidates for removal).
>
> I haven't gotten to test it, but after studying sandbox it shouldn't
> be hard to just hack together a manual test by removing read access to
> root from the config files and adding in a bazillion files. That
> should at least let me profile performance/etc. I'm not convinced
> that there isn't room for improvement, but if it works well as-is then
> automating this shouldn't be hard at all. If portage has the
> dependency tree in RAM then you just need to dump all the edb listings
> for those packages plus @system and feed those into sandbox. That
> just requires reading a bunch of text files and no searching, so it
> should be pretty quick. As far as I can tell the relevant calls to
> check for read access are already being made in sandbox, and
> obviously they aren't taking forever. We just have to see if the
> search gets slow if the access list has tens of thousands of entries
> (if it does, that is just a simple matter of optimization, but being
> in-RAM I can't see how tens of thousands of entries is going to slow
> down a modern CPU even if it is just an unsorted list).

Yeah, I presumed you'd have @system as a set of implicit dependencies.
The obvious approaches would be to either temporarily remove a package
from @system, tell portage to ignore a package while doing limited
visibility, or copy @system to a different, temporary set and remove
things piecemeal from there.

That last might make the most sense. "--implicit-dependencies ---
defaults to @system. Additional instances append to the set of
implicit dependencies. Use, e.g. -${ATOM} or -@system to override
default include."
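
A minimal sketch of how those semantics could resolve, assuming the option values arrive as a plain list of strings. Nothing here is existing portage code: the function name and the DEFAULT_IMPLICIT constant are hypothetical, and real set or atom expansion is left out.

```python
DEFAULT_IMPLICIT = ["@system"]  # hypothetical default, per the proposal above


def resolve_implicit(values, default=DEFAULT_IMPLICIT):
    """Apply each value in order: "X" appends X, "-X" removes X."""
    result = list(default)
    for value in values:
        if value.startswith("-"):
            target = value[1:]
            if target in result:
                result.remove(target)
        elif value not in result:
            result.append(value)
    return result


if __name__ == "__main__":
    # No option given: only the default set is implicit.
    print(resolve_implicit([]))
    # Drop @system and name one atom explicitly instead.
    print(resolve_implicit(["-@system", "sys-devel/gcc"]))
```

Processing the values in order keeps the override explicit: a later "-@system" removes the default, while plain values only ever add to the set.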

--
:wq


gmt at malth

Aug 18, 2012, 1:50 PM

Post #103 of 103 (411 views)
Re: Re: Questions about SystemD and OpenRC [In reply to]

On 8/16/2012 6:26 PM, Rich Freeman wrote:
> On Thu, Aug 16, 2012 at 4:05 PM, Michael Mol <mikemol [at] gmail> wrote:
>> The limited-visibility build feature discussed a week or so ago would
>> go a long way in detecting unexpressed build dependencies.

[snip]

> If portage has the
> dependency tree in RAM then you just need to dump all the edb listings
> for those packages plus @system and feed those into sandbox.

> That just requires reading a bunch of text files and no searching, so it
> should be pretty quick.

Portage could hypothetically compile such a list while it crawls the
package dependency tree, but I suspect the cost will not be as small as
you predict.

> As far as I can tell the relevant calls to
> check for read access are already being made in sandbox, and
> obviously they aren't taking forever. We just have to see if the
> search gets slow if the access list has tens of thousands of entries
> (if it does, that is just a simple matter of optimization, but being
> in-RAM I can't see how tens of thousands of entries is going to slow
> down a modern CPU even if it is just an unsorted list).

I appreciate your optimism but I think you're underestimating the cost.
Can't speak for others, but my portage db's churn too much for comfort
as is. Once we start multiplying per-package-dependency iteration by
the files-per-package iteration, that's going to be O(a-shit-load).

Of course, where there's a will there's a way. I'd be surprised if some
kind of delayed-evaluation + caching scheme wouldn't suffice, or,
barring that, perhaps it's time to create an indexed-database-based
drop-in replacement for the current portage db code.
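
As a rough idea of what delayed evaluation plus caching could look like, here is a minimal sketch. It assumes each package has a CONTENTS-style listing with "dir", "obj" and "sym" lines, as in the installed-package database; the function names are hypothetical and this is not portage's own API. A listing is parsed only when first asked for and reused afterwards.

```python
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def package_files(contents_path):
    """Parse one CONTENTS-style listing; the result is cached per path."""
    files = []
    text = Path(contents_path).read_text(encoding="utf-8", errors="replace")
    for line in text.splitlines():
        kind, _, rest = line.partition(" ")
        if kind == "obj":
            # "obj <path> <md5> <mtime>": strip the two trailing fields,
            # so paths containing spaces survive.
            files.append(rest.rsplit(" ", 2)[0])
        elif kind == "sym":
            # "sym <path> -> <target> <mtime>": keep the link path only.
            files.append(rest.split(" -> ", 1)[0])
        # "dir" entries are skipped in this sketch.
    return tuple(files)


def visible_files(contents_paths):
    """Union of the files of every package in a dependency closure."""
    seen = set()
    for path in contents_paths:
        seen.update(package_files(str(path)))
    return seen
```

The cache here lives only for one process and is never invalidated, so a package re-merged mid-run would be served stale; a persistent cache would need something like an mtime check on each listing.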

I've enclosed some scripts you may find helpful in looking at the
numbers. They are kind-of kludgey (originally intended for
in-house-only use and modified for present purposes) but may help shed
some light, if they aren't too buggy, that is...

"dumpworld" slices and dices "emerge -ep" output to provide a list of
atoms in the complete dependency tree of a given list of atoms (add
'@system' to get the complete tree, dumpworld won't do so).
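
For readers without the attachment, the kind of slicing described can be sketched in Python as below. This is not the enclosed script: the regular expression, the function name, and the sample text are illustrative assumptions about what "emerge -ep" prints, and the output format can differ between portage versions.

```python
import re

# Matches merge-list lines such as "[ebuild   R    ] sys-apps/sed-4.2.1-r1 ..."
MERGE_LINE = re.compile(r"^\[(?:ebuild|binary)[^\]]*\]\s+(\S+)")

# Made-up sample in the general shape of "emerge -ep" output.
SAMPLE = """\
These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild   R    ] sys-libs/zlib-1.2.7  USE="static-libs"
[ebuild   R    ] sys-apps/sed-4.2.1-r1  USE="nls acl"
[blocks B      ] <sys-apps/foo-1 ("<sys-apps/foo-1" is blocking sys-apps/bar-2)

Total: 2 packages (2 reinstalls), Size of downloads: 0 kB
"""


def packages_from_pretend(output):
    """Return the sorted, de-duplicated category/package-version strings."""
    found = set()
    for line in output.splitlines():
        match = MERGE_LINE.match(line)
        if match:
            # Drop a trailing "::repo" suffix if this emerge prints one.
            found.add(match.group(1).split("::", 1)[0])
    return sorted(found)


if __name__ == "__main__":
    for package in packages_from_pretend(SAMPLE):
        print(package)
```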

"dumpfiles" operates only on packages installed in the local system
(non-installed atoms are silently dropped), and requires/assumes that
'emerge -ep world' would not change anything if it is to give accurate
information. It takes a list of atoms, transforms them into the
complete lists of atoms in their dependency tree via dumpworld, merges
the lists together, and finds the number of files associated with each
atom in portage. Any collisions will be counted twice, since it doesn't
keep track. It also doesn't add '@system' unless you do. By default it
emits:

o A list of package atoms and the files owned by each atom (stderr)
o total atoms and files
o average filename length

What is, perhaps, more discouraging than the numbers it reports is how
long it takes to run (note: although I suspect an optimized python
implementation could be made to do this faster by a moderate constant
factor, I'm not sure if the big-oh performance characteristics can be
significantly improved without database structure changes like the ones
mentioned above).
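
The summary step itself is cheap once the per-package file lists are in hand. A minimal Python sketch of it, assuming a mapping from atom to file paths has already been built (the function name and the sample data are hypothetical, and this is not the enclosed script):

```python
def summarize(files_by_atom):
    """Build a TOTAL line from a mapping of atom -> list of file paths.

    Files owned by more than one atom are counted once per owner,
    mirroring the double-counting of collisions described above.
    """
    total_files = sum(len(paths) for paths in files_by_atom.values())
    total_chars = sum(len(p) for paths in files_by_atom.values() for p in paths)
    average = round(total_chars / total_files) if total_files else 0
    return (
        f"TOTAL: {total_files} files "
        f"(in {len(files_by_atom)} ebuilds, average path length: {average})"
    )


if __name__ == "__main__":
    sample = {
        "app-misc/b-1": ["/usr/bin/b", "/etc/b.conf"],
        "dev-libs/d-2": ["/usr/lib/libd.so"],
    }
    print(summarize(sample))
```

The expensive part is producing that mapping in the first place, which is where the per-package iteration discussed above comes in.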

My disturbingly bloated and slow workstation gives these answers (note:
here it's even slower because it's running in an emulator):

greg@fedora64vm ~ $ time bash -c 'dumpfiles @system 2>/dev/null'
TOTAL: 402967 files (in 816 ebuilds, average path length: 66)


real 15m33.719s
user 13m18.909s
sys 2m8.436s
greg@fedora64vm ~ $ time bash -c 'dumpfiles chromium 2>/dev/null'
TOTAL: 401300 files (in 807 ebuilds, average path length: 66)


real 15m28.900s
user 13m15.126s
sys 2m8.088s
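
A back-of-envelope reading of the first run, as a small Python calculation. It uses only the figures printed above and assumes one byte per path character with no per-entry overhead, so treat the results as rough estimates (the reported average path length is itself rounded).

```python
# Figures copied from the "dumpfiles @system" run above.
files = 402_967
avg_path_len = 66
wall_seconds = 15 * 60 + 33.719

# Raw size of the path text if the whole list were held in RAM.
path_bytes = files * avg_path_len
print(f"path text: about {path_bytes / 2**20:.1f} MiB")

# Wall-clock cost of producing the list, spread over every file.
print(f"time per file: about {wall_seconds / files * 1e3:.2f} ms")
```

On these numbers the list itself is small by RAM standards; the time goes into generating it.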

My workstation is surely an "outlier" as I have a lot of dependencies
and files due to multilib, split-debug, and USE+=$( a lot ). It's also
got slow hardware RAID6 and the emulator only gives it 2G of RAM to work
with. But I'm a real portage user; I'm sure there are others out
there, if not many, with similar constraints.

-gmt
Attachments: dumpfiles (2.06 KB)
  dumpworld (0.57 KB)
