
Mailing List Archive: Linux: Kernel

rc6 keeps hanging and blanking displays where rc4-mm1 works fine.

 

 



helge.hafting at aitel

Aug 12, 2005, 3:01 AM

Post #1 of 32
rc6 keeps hanging and blanking displays where rc4-mm1 works fine.

Helge Hafting wrote:

> Dave Airlie wrote:
>
>>
>> I switched back to 2.6.13-rc4-mm1 at this point for another reason:
>> my X display acquired a nasty tendency to go blank for no reason
>> during work, something I could fix by changing resolution back and
>> forth. X also tended to get stuck for a minute now and then - a
>> problem I haven't seen since early 2.6.
>>
>>
>>
>> which head the radeon or MGA or both?
>
>
> The radeon 9200SE-pci gets stuck. The MGA-agp seems to be fine. I
> have compiled dri support for both, but I can't use it at the moment.
> I think that is caused by having Ubuntu's xorg installed on Debian. I
> needed xorg in order to run an xserver that doesn't use any tty - this
> way I can use two keyboards and have two simultaneous users. Debian's
> xorg wasn't ready at the moment. The setup is fine with 2.6.13-rc4-mm1
> x86-64, no problems there.

The problem still exists in 2.6.13-rc6. Usually, all I get is a
suddenly black display, solvable by resizing. But the machine will
occasionally hang, forcing me to use the reset button. I lost my mbox
file to this (from an ext3 fs, on raid-1 on scsi.)

It is hard to say whether the fs problem is merely an effect of hanging
with rc6. With rc5, there definitely was some sort of io/scsi problem, as
one disk was "lost" until I booted a working kernel.

Currently, it seems like I won't be able to use 2.6.13.

Helge Hafting
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


alan at lxorguk

Aug 12, 2005, 3:32 AM

Post #2 of 32
Re: rc6 keeps hanging and blanking displays where rc4-mm1 works fine.

On Fri, 2005-08-12 at 12:01 +0200, Helge Hafting wrote:
> solvable by resizing. But the machine will occasionally hang, forcing
> me to use the reset button. I lost my mbox file to this (from an ext3
> fs, on raid-1 on scsi.)

Unless you are using data=journal and have turned write cache off on
your IDE drives that is expected. Metadata journalling protects your
file system integrity. Data journalling is more expensive but will
protect your file integrity if the disk layer is also correctly set up.
Unfortunately the IDE layer defaults the wrong way and despite many
complaints has not been changed. In later 2.6 with modern drives you can
also enable barrier mode on the IDE layer which gives better results
than turning off the write cache.

Alan



torvalds at osdl

Aug 12, 2005, 9:51 AM

Post #3 of 32
Re: rc6 keeps hanging and blanking displays where rc4-mm1 works fine.

On Fri, 12 Aug 2005, Helge Hafting wrote:
>
> > at the moment. The setup is fine with 2.6.13-rc4-mm1 x86-64, no
> > problems there.
>
> The problem still exists in 2.6.13-rc6. Usually, all I get is a
> suddenly black display, solvable by resizing.

Is there any chance you could try bisecting the problem? Either just
binary-searching the patches or by using the git bisect helper scripts?

Obviously the git approach needs a "good" kernel in git, but if
2.6.13-rc4-mm1 is ok, then I assume that 2.6.13-rc4 is ok too? That's a
fair number of changes:

git-rev-list v2.6.13-rc4..v2.6.13-rc6 | wc
340 340 13940

but if you can tighten it up a bit (you already had trouble at rc5, I
think), it shouldn't require testing more than a few kernels.

Git has had bisection support for a while, but the helper scripts to use
it sanely are fairly new, so I think you'd need the git-0.99.4 release for
those. But then you'd just do

git bisect start
git bisect bad v2.6.13-rc5
git bisect good v2.6.13-rc4

and start bisecting (that will check out a mid-way point automatically,
you build it, and then do "git bisect bad" or "git bisect good" depending
on whether the result is bad or good - it will continue to try to find
half-way points until it has found the point that turns from good to
bad..)

Linus
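
As a rough illustration of why only a few kernels need testing, the halving can be simulated in plain shell. This is a sketch under a simplifying assumption: the history is treated as a strictly linear run of numbered commits, whereas real kernel history contains merges and git picks its midpoints differently. The function name bisect_steps and the choice of commit 200 as the culprit are made up for the example; 340 is the commit count quoted above for rc4..rc6.

```shell
#!/bin/sh
# Simulate bisection over a linear history of commits numbered 1..n.
# Commit 0 stands for the known-good release, commit n for the known-bad one.
# $1 = n, $2 = number of the (hypothetical) commit that introduced the bug.
bisect_steps() {
    good=0
    bad=$1
    steps=0
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        steps=$((steps + 1))
        if [ "$mid" -ge "$2" ]; then
            bad=$mid      # the bug shows up: the culprit is at or before mid
        else
            good=$mid     # still fine: the culprit is after mid
        fi
    done
    echo "first bad: $bad, kernels tested: $steps"
}

bisect_steps 340 200
```

For this input the sketch reports nine test builds. Each round halves the remaining range, so the number of kernels to build grows with the logarithm of the commit count rather than with the count itself.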


helge.hafting at aitel

Aug 15, 2005, 5:37 AM

Post #4 of 32
Re: rc6 keeps hanging and blanking displays where rc4-mm1 works fine.

Linus Torvalds wrote:

>On Fri, 12 Aug 2005, Helge Hafting wrote:
>
>
>>>at the moment. The setup is fine with 2.6.13-rc4-mm1 x86-64, no
>>>problems there.
>>>
>>>
>>The problem still exists in 2.6.13-rc6. Usually, all I get is a
>>suddenly black display, solvable by resizing.
>>
>>
>
>Is there any chance you could try bisecting the problem? Either just
>binary-searching the patches or by using the git bisect helper scripts?
>
>Obviously the git approach needs a "good" kernel in git, but if
>2.6.13-rc4-mm1 is ok, then I assume that 2.6.13-rc4 is ok too? That's a
>fair number of changes:
>
> git-rev-list v2.6.13-rc4..v2.6.13-rc6 | wc
> 340 340 13940
>
>but if you can tighten it up a bit (you already had trouble at rc5, I
>think), it shouldn't require testing more than a few kernels.
>
>Git has had bisection support for a while, but the helper scripts to use
>it sanely are fairly new, so I think you'd need the git-0.99.4 release for
>those. But then you'd just do
>
> git bisect start
> git bisect bad v2.6.13-rc5
> git bisect good v2.6.13-rc4
>
>and start bisecting (that will check out a mid-way point automatically,
>you build it, and then do "git bisect bad" or "git bisect good" depending
>on whether the result is bad or good - it will continue to try to find
>half-way points until it has found the point that turns from good to
>bad..)
>
> Linus
>
>
Ok, I have downloaded git and started the first compile.
Git will tell when the correct point is found (assuming I
do the "git bisect bad/good" right), by itself?

Is there any way to make git tell exactly where between rc4 and rc5
each kernel is, so I can name the bzimages accordingly?

It takes some time to trigger the bug, so I could possibly end up with
a falsely ok kernel. Is there a simple way to restart the search from
that point, or will I have to start over with rc4 and rc5 and say
git bisect good/bad until I reach the point of mistake?

Helge Hafting




bzolnier at gmail

Aug 15, 2005, 5:53 AM

Post #5 of 32
Re: rc6 keeps hanging and blanking displays where rc4-mm1 works fine.

On 8/12/05, Alan Cox <alan [at] lxorguk> wrote:
> On Fri, 2005-08-12 at 12:01 +0200, Helge Hafting wrote:
> > solvable by resizing. But the machine will occasionally hang, forcing
> > me to use the reset button. I lost my mbox file to this (from an ext3
> > fs, on raid-1 on scsi.)
>
> Unless you are using data=journal and have turned write cache off on
> your IDE drives that is expected. Metadata journalling protects your
> file system integrity. Data journalling is more expensive but will
> protect your file integrity if the disk layer is also correctly set up.
> Unfortunately the IDE layer defaults the wrong way and despite many
> complaints has not been changed. In later 2.6 with modern drives you can

Changing the defaults is not that easy: disabling the write cache shortens
HDD life considerably (discussed on LKML).

The recommended solution is to disable the write cache with hdparm or to
use barrier mode.

> also enable barrier mode on the IDE layer which gives better results
> than turning off the write cache.


bzolnier at gmail

Aug 15, 2005, 6:00 AM

Post #6 of 32
Re: rc6 keeps hanging and blanking displays where rc4-mm1 works fine.

On 8/15/05, Bartlomiej Zolnierkiewicz <bzolnier [at] gmail> wrote:
> On 8/12/05, Alan Cox <alan [at] lxorguk> wrote:
> > On Fri, 2005-08-12 at 12:01 +0200, Helge Hafting wrote:
> > > solvable by resizing. But the machine will occasionally hang, forcing
> > > me to use the reset button. I lost my mbox file to this (from an ext3
> > > fs, on raid-1 on scsi.)
> >
> > Unless you are using data=journal and have turned write cache off on
> > your IDE drives that is expected. Metadata journalling protects your
> > file system integrity. Data journalling is more expensive but will
> > protect your file integrity if the disk layer is also correctly set up.
> > Unfortunately the IDE layer defaults the wrong way and despite many
> > complaints has not been changed. In later 2.6 with modern drives you can
>
> Changing the defaults is not that easy: disabling the write cache shortens
> HDD life considerably (discussed on LKML).
>
> The recommended solution is to disable the write cache with hdparm or to
> use barrier mode.
>
> > also enable barrier mode on the IDE layer which gives better results
> > than turning off the write cache.

Moreover, Helge is using RAID-1 on SCSI, so IDE is out of the picture here.


torvalds at osdl

Aug 15, 2005, 8:50 AM

Post #7 of 32
Re: rc6 keeps hanging and blanking displays where rc4-mm1 works fine.

On Mon, 15 Aug 2005, Helge Hafting wrote:
>
> Ok, I have downloaded git and started the first compile.
> Git will tell when the correct point is found (assuming I
> do the "git bisect bad/good" right), by itself?

Yes. You should see

Bisecting: xxx revisions left to test after this

and the "xxx" should hopefully decrease by half during each round. And at
the end of it, you should get

<sha1> is first bad commit

followed by the actual patch that caused the problem.

> Is there any way to make git tell exactly where between rc4 and rc5
> each kernel is, so I can name the bzimages accordingly?

You'd have to use the raw commit names, since these things don't have any
symbolic names. You can get that by just doing

cat .git/HEAD

which will give you a 40-character hex string (representing the 160-bit
SHA1 of the top commit). Not very readable, but it's unique, and if you
report that hex string to other git users, they can trivially recreate the
tree you have.

> It takes some time to trigger the bug, so I could possibly end up with
> a falsely ok kernel. Is there a simple way to restart the search from
> that point, or will I have to start over with rc4 and rc5 and say
> git bisect good/bad until I reach the point of mistake?

If you remember/save the good/bad commit ID's, you can restart the whole
process and just feed the correct state for the ID's:

git bisect start
git bisect bad v2.6.13-rc5
git bisect good v2.6.13-rc4
.. here bisect will start narrowing things down ..
git bisect bad <sha1 of known bad>
git bisect good <sha1 of known good>
..

ie you can always feed an arbitrary number of known good/bad points by
naming them by their SHA1 ID (or their symbolic name, as in the
v2.6.13-rcX releases).

Linus


ryan at michonline

Aug 15, 2005, 10:00 AM

Post #8 of 32
Re: rc6 keeps hanging and blanking displays where rc4-mm1 works fine.

On Mon, Aug 15, 2005 at 08:50:12AM -0700, Linus Torvalds wrote:
> > Is there any way to make git tell exactly where between rc4 and rc5
> > each kernel is, so I can name the bzimages accordingly?
>
> You'd have to use the raw commit names, since these things don't have any
> symbolic names. You can get that by just doing
>
> cat .git/HEAD
>
> which will give you a 40-character hex string (representing the 160-bit
> SHA1 of the top commit). Not very readable, but it's unique, and if you
> report that hex string to other git users, they can trivially recreate the
> tree you have.

The following patch (which Sam has in the kbuild tree for 2.6.14, IIRC)
will make that automatic, or you can just do:

ln -s .git/HEAD localversion-git

(My patch will notice when you are at a tag and not append anything
special in that case.)

Index: linux-git/Makefile
===================================================================
--- linux-git.orig/Makefile 2005-07-31 04:30:00.000000000 -0400
+++ linux-git/Makefile 2005-07-31 04:32:16.000000000 -0400
@@ -551,6 +551,26 @@ export KBUILD_IMAGE ?= vmlinux
# images. Default is /boot, but you can set it to other values
export INSTALL_PATH ?= /boot

+# If CONFIG_LOCALVERSION_AUTO is set, we automatically perform some tests
+# and try to determine if the current source tree is a release tree, of any sort,
+# or if is a pure development tree.
+#
+# A 'release tree' is any tree with a git TAG associated
+# with it. The primary goal of this is to make it safe for a native
+# git/CVS/SVN user to build a release tree (i.e, 2.6.9) and also to
+# continue developing against the current Linus tree, without having the Linus
+# tree overwrite the 2.6.9 tree when installed.
+#
+# Currently, only git is supported.
+# Other SCMs can edit scripts/setlocalversion and add the appropriate
+# checks as needed.
+
+
+ifdef CONFIG_LOCALVERSION_AUTO
+ localversion-auto := $(shell $(PERL) $(srctree)/scripts/setlocalversion $(srctree))
+ LOCALVERSION := $(LOCALVERSION)$(localversion-auto)
+endif
+
#
# INSTALL_MOD_PATH specifies a prefix to MODLIB for module directory
# relocations required by build roots. This is not defined in the
Index: linux-git/init/Kconfig
===================================================================
--- linux-git.orig/init/Kconfig 2005-07-31 04:30:00.000000000 -0400
+++ linux-git/init/Kconfig 2005-07-31 04:32:16.000000000 -0400
@@ -77,6 +77,22 @@ config LOCALVERSION
object and source tree, in that order. Your total string can
be a maximum of 64 characters.

+config LOCALVERSION_AUTO
+ bool "Automatically append version information to the version string"
+ default y
+ help
+ This will try to automatically determine if the current tree is a
+ release tree by looking for git tags that
+ belong to the current top of tree revision.
+
+ A string of the format -gxxxxxxxx will be added to the localversion
+ if a git based tree is found. The string generated by this will be
+ appended after any matching localversion* files, and after the value
+ set in CONFIG_LOCALVERSION
+
+ Note: This requires Perl, and a git repository, but not necessarily
+ the git or cogito tools to be installed.
+
config SWAP
bool "Support for paging of anonymous memory (swap)"
depends on MMU
Index: linux-git/scripts/setlocalversion
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-git/scripts/setlocalversion 2005-07-31 04:32:16.000000000 -0400
@@ -0,0 +1,56 @@
+#!/usr/bin/perl
+# Copyright 2004 - Ryan Anderson <ryan [at] michonline> GPL v2
+
+use strict;
+use warnings;
+use Digest::MD5;
+require 5.006;
+
+if (@ARGV != 1) {
+ print <<EOT;
+Usage: setlocalversion <srctree>
+EOT
+ exit(1);
+}
+
+my ($srctree) = @ARGV;
+chdir($srctree);
+
+my @LOCALVERSIONS = ();
+
+# We are going to use the following commands to try and determine if this
+# repository is at a Version boundary (i.e, 2.6.10 vs 2.6.10 + some patches) We
+# currently assume that all meaningful version boundaries are marked by a tag.
+# We don't care what the tag is, just that something exists.
+
+# Git/Cogito store the top-of-tree "commit" in .git/HEAD
+# A list of known tags sits in .git/refs/tags/
+#
+# The simple trick here is to just compare the two of these, and if we get a
+# match, return nothing, otherwise, return a subset of the SHA-1 hash in
+# .git/HEAD
+
+sub do_git_checks {
+ open(H,"<.git/HEAD") or return;
+ my $head = <H>;
+ chomp $head;
+ close(H);
+
+ opendir(D,".git/refs/tags") or return;
+ foreach my $tagfile (grep !/^\.{1,2}$/, readdir(D)) {
+ open(F,"<.git/refs/tags/" . $tagfile) or return;
+ my $tag = <F>;
+ chomp $tag;
+ close(F);
+ return if ($tag eq $head);
+ }
+ closedir(D);
+
+ push @LOCALVERSIONS, "g" . substr($head,0,8);
+}
+
+if ( -d ".git") {
+ do_git_checks();
+}
+
+printf "-%s\n", join("-",@LOCALVERSIONS) if (scalar @LOCALVERSIONS > 0);




--

Ryan Anderson
sometimes Pug Majere


helgehaf at aitel

Aug 15, 2005, 10:45 AM

Post #9 of 32
Re: rc6 keeps hanging and blanking displays where rc4-mm1 works fine.

On Mon, Aug 15, 2005 at 08:50:12AM -0700, Linus Torvalds wrote:
>
>
> On Mon, 15 Aug 2005, Helge Hafting wrote:
> >
> > Ok, I have downloaded git and started the first compile.
> > Git will tell when the correct point is found (assuming I
> > do the "git bisect bad/good" right), by itself?
>
> Yes. You should see
>
> Bisecting: xxx revisions left to test after this
>
> and the "xxx" should hopefully decrease by half during each round. And at
> the end of it, you should get
>
> <sha1> is first bad commit
>
> followed by the actual patch that caused the problem.
>
> > Is there any way to make git tell exactly where between rc4 and rc5
> > each kernel is, so I can name the bzimages accordingly?
>
> You'd have to use the raw commit names, since these things don't have any
> symbolic names. You can get that by just doing
>
> cat .git/HEAD
>
> which will give you a 40-character hex string (representing the 160-bit
> SHA1 of the top commit). Not very readable, but it's unique, and if you
> report that hex string to other git users, they can trivially recreate the
> tree you have.
>
Good. I save those .git/HEAD strings to a separate file.
The first iteration
a46e812620bd7db457ce002544a1a6572c313d8a
seemed to turn out "good". I test further during the compile of
the next one.

Thanks for all the instructions on using git.

Helge Hafting


sanjoy at mrao

Aug 15, 2005, 2:48 PM

Post #10 of 32
Re: rc6 keeps hanging and blanking displays where rc4-mm1 works fine.

>> Is there any way to make git tell exactly where between rc4 and rc5
>> each kernel is, so I can name the bzimages accordingly?
>
> You'd have to use the raw commit names, since these things don't have any
> symbolic names. You can get that by just doing
>
> cat .git/HEAD

Also, don't name the local version something like
2.6.13-rc6:e63b6d5ac1e17d0d9e5112bd9c0e5f17199b23da otherwise LILO
complains. For example, this bit in lilo.conf

image=/boot/vmlinuz-2.6.12:b5e43913cfe95a18ad8929585a0bb58e46cf3390
label=bisect1

produces when you run lilo:

:BIOS syntax is no longer supported. Please use a DISK section
Fatal: Not a number: "b5e43913cfe95a18ad8929585a0bb58e46cf3390"

So in my kernel tree used for bisections, 'localversion' contains

-b5e43913cfe95a18ad8929585a0bb58e46cf3390

I don't fully understand when git (doing the checkout that is implicit
in git bisect) will overwrite or not overwrite local files, or when it
will create files not in a previous version, or delete files not in a
current version. So, to be sure I'm getting a clean compile from
exactly the source files I want (probably overkill), I use 'git
bisect' to get the SHA1 id's, and then do:

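#!/bin/bash
# Export the commit that git bisect has checked out into its own
# directory and build it there with a known .config.
sha1=$(cat .git/HEAD)
dest="/usr/src/bisect/$sha1"
cg-export "$dest" "$sha1"
cp dot-config-to-test "$dest/.config"
cd "$dest" || exit 1
echo "-$sha1" > localversion
# accept defaults for all new config options:
yes "" | make oldconfig
# build in the background, logging both stdout and stderr:
make -j 4 >& compile.log &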
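
A naming scheme along the lines described above can be sketched as a small shell function. It is only an illustration: the function name image_name, the vmlinuz- prefix and the choice to keep just the first eight hex digits (mirroring the -gxxxxxxxx format from the localversion patch earlier in the thread) are assumptions, and a shortened ID is not guaranteed to be unique. Joining the parts with "-" avoids the ":" that LILO appears to trip over.

```shell
#!/bin/sh
# Build an image name from a base version and a commit ID.
# $1 = base kernel version, $2 = full 40-character commit ID.
# %.8s keeps only the first eight characters of the commit ID.
image_name() {
    printf 'vmlinuz-%s-g%.8s\n' "$1" "$2"
}

image_name 2.6.12 b5e43913cfe95a18ad8929585a0bb58e46cf3390
```

The full commit ID is still worth saving in a separate file, since that is what other git users need in order to recreate the tree.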


helgehaf at aitel

Aug 15, 2005, 3:11 PM

Post #11 of 32
Re: rc6 keeps hanging and blanking displays - bisection complete

On Mon, Aug 15, 2005 at 08:50:12AM -0700, Linus Torvalds wrote:
>
>
> On Mon, 15 Aug 2005, Helge Hafting wrote:
> >
> > Ok, I have downloaded git and started the first compile.
> > Git will tell when the correct point is found (assuming I
> > do the "git bisect bad/good" right), by itself?
>
> Yes. You should see
>
> Bisecting: xxx revisions left to test after this
>
> and the "xxx" should hopefully decrease by half during each round. And at
> the end of it, you should get
>
> <sha1> is first bad commit
>
> followed by the actual patch that caused the problem.
>
This was interesting. At first, lots of kernels just kept working,
I almost suspected I was doing something wrong. Then the second last kernel
recompiled a lot of DRM stuff - and the crash came back!
The kernel after that worked again, and so the final message was:

561fb765b97f287211a2c73a844c5edb12f44f1d is first bad commit
diff-tree 561fb765b97f287211a2c73a844c5edb12f44f1d (from
6ade43fbbcc3c12f0ddba112351d14d6c82ae476)
Author: Anton Blanchard <anton [at] samba>
Date: Mon Aug 1 21:11:46 2005 -0700

[PATCH] ppc64: topology API fix

Dont include asm-generic/topology.h unconditionally, we end up overriding
all the ppc64 specific functions when NUMA is on.

Signed-off-by: Anton Blanchard <anton [at] samba>
Acked-by: Paul Mackerras <paulus [at] samba>
Signed-off-by: Andrew Morton <akpm [at] osdl>
Signed-off-by: Linus Torvalds <torvalds [at] osdl>

:040000 040000 a760521110f862aecbee74cffa674993b6dca4a3
66b9cb2db119ab029ca7b8f71bd06507fca63921 M include

I'm a little surprised, as a ppc64 fix theoretically shouldn't matter for
x86_64? But perhaps they share something?

I hope this is of help,
Helge Hafting


torvalds at osdl

Aug 15, 2005, 3:59 PM

Post #12 of 32
Re: rc6 keeps hanging and blanking displays - bisection complete

On Tue, 16 Aug 2005, Helge Hafting wrote:
>
> This was interesting. At first, lots of kernels just kept working,
> I almost suspected I was doing something wrong. Then the second last kernel
> recompiled a lot of DRM stuff - and the crash came back!
> The kernel after that worked again, and so the final message was:
>
> 561fb765b97f287211a2c73a844c5edb12f44f1d is first bad commit

Ok, that definitely looks bogus.

That commit should not matter at _all_, it only changes ppc64 specific
things.

If the bug is sometimes hard to trigger, maybe one of the "good" kernels
wasn't good after all. That would definitely throw a wrench in the
bisection.

Anyway, with something like this, where there may be false positives
(false "good" kernels), the only thing you can _really_ trust is a kernel
that got marked bad, because that one definitely has the problem. So make
sure that you remember all known-bad kernels.

Btw, we haven't had a lot of testing of the termination condition for "git
bisect", so it's possible it's off by a commit or two. However, the commit
you actually ended up on is literally just two commits before 2.6.13-rc5,
which makes me suspect that it's not the termination condition, as much as
the fact that it really was an earlier kernel that had the problem, but
you bisected it as "good" because the problem just didn't trigger quickly
enough..

Linus


airlied at gmail

Aug 15, 2005, 4:18 PM

Post #13 of 32
Re: rc6 keeps hanging and blanking displays - bisection complete

>
> I'm a little surprised, as a ppc64 fix theoretically shouldn't matter for
> x86_64? But perhaps they share something?

My guess is that it is maybe the DRM changes that have done it... the
32/64-bit code in 2.6.13-rc6 may have issues, but they've been tested
on a number of configurations (none of them by me... I can't test what
I don't have...)

Can you do me a favour and check 2.6.13-rc6 with the git-drm.patch from -mm?

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc5/2.6.13-rc5-mm1/broken-out/git-drm.patch

If this is a 32/64-bit issue I think that patch might help. I'm not
convinced, though: I can't see how the DRM would ever start blanking the
screen, it doesn't have any code in that area at all.. but stranger
things have surprised me...

Is there any difference in your Xorg.0.log files before/after this?

There is also an issue at:
http://bugme.osdl.org/show_bug.cgi?id=4965

which was caused by the pci assign resources patch on x86... I'm not
sure if this is similar..

Dave.


airlied at gmail

Aug 15, 2005, 4:24 PM

Post #14 of 32
Re: rc6 keeps hanging and blanking displays - bisection complete

> > I'm a little surprised, as a ppc64 fix theoretically shouldn't matter for
> > x86_64? But perhaps they share something?
>
> My guess is that it is maybe the DRM changes that have done it... the
> 32/64-bit code in 2.6.13-rc6 may have issues, but they've been tested
> on a number of configurations (none of them by me... I can't test what
> I don't have...)
>

Actually after looking back 2.6.13-rc4-mm1 which you say works doesn't
contain any of the later 32/64-bit changes.. so maybe you can try just
applying the git-drm.patch from that tree to see if it makes a
difference...

I'm getting less and less sure this is caused by the drm, (have you
built with DRM disabled completely??)

Do you have any fb support in-kernel (I know you might have answered
this already but I'm getting a bit lost on this thread...)

Dave.


helgehaf at aitel

Aug 16, 2005, 12:34 AM

Post #15 of 32
Re: rc6 keeps hanging and blanking displays - bisection complete

On Tue, Aug 16, 2005 at 09:24:25AM +1000, Dave Airlie wrote:
> > > I'm a little surprised, as a ppc64 fix theoretically shouldn't matter for
> > > x86_64? But perhaps they share something?
> >
> > My guess is that it is maybe the DRM changes that have done it... the
> > 32/64-bit code in 2.6.13-rc6 may have issues, but they've been tested
> > on a number of configurations (none of them by me... I can't test what
> > I don't have...)
> >
>
> Actually after looking back 2.6.13-rc4-mm1 which you say works doesn't
> contain any of the later 32/64-bit changes.. so maybe you can try just
> applying the git-drm.patch from that tree to see if it makes a
> difference...
>
> I'm getting less and less sure this is caused by the drm, (have you
> built with DRM disabled completely??)
>
No, but I can try that after work today.

> Do you have any fb support in-kernel (I know you might have answered
> this already but I'm getting a bit lost on this thread...)

There is no fb support at all. I have the vga console,
agp support (which obviously only applies to the agp g550)
drm/dri support for g550 and for the pci radeon.
Could the new patches possibly have issues with the case
where AGP support is compiled into the kernel, but
the card is pci so it isn't supposed to _use_ it?
Also, the two cards aren't used by the same user, it
is two desktops, not one big one.

The X freeze comes fast if I play "cuyo", a nice 2D game
somewhat similar to tetris. I don't think it
uses DRM, unless x.org 6.8.2 somehow uses it to
speed up 2D operations.

The bisection search:
a46e812620bd7db457ce002544a1a6572c313d8a good
e0b98c79e605f64f263ede53344f283f5e0548be good
fd3113e84e188781aa2935fbc4351d64ccdd171b good
2757a71c3122c7653e3dd8077ad6ca71efb1d450 good
ba17101b41977f124948e0a7797fdcbb59e19f3e good
saw lots of drm recompile for the next one:
561fb765b97f287211a2c73a844c5edb12f44f1d bad

6ade43fbbcc3c12f0ddba112351d14d6c82ae476 good
And then the final one also seemed good.
If the stop condition could be off by one,
I wonder what the next patch is?

Helge Hafting
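
The saved list above, together with the earlier advice that only a kernel marked bad can really be trusted, suggests one way to redo the search. The following is a minimal sketch, not a git feature: replay_commands is a hypothetical helper that reads lines of the form "<sha1> good" or "<sha1> bad" and prints the git bisect commands that feed them back in, optionally leaving out the good marks. The two commit IDs are taken from the list above; lines that do not end in good or bad are ignored.

```shell
#!/bin/sh
# Print the git bisect commands that replay a saved list of results.
# $1 = known-bad release tag, $2 = known-good release tag,
# $3 = "no" to drop the "good" marks (a slow-to-trigger bug can make a
#      bad kernel look good); anything else, or nothing, keeps them.
# The saved list is read from standard input.
replay_commands() {
    trust_good=${3:-yes}
    echo "git bisect start"
    echo "git bisect bad $1"
    echo "git bisect good $2"
    while read -r sha1 verdict rest; do
        case $verdict in
            bad)
                echo "git bisect bad $sha1"
                ;;
            good)
                if [ "$trust_good" = yes ]; then
                    echo "git bisect good $sha1"
                fi
                ;;
        esac
    done
}

replay_commands v2.6.13-rc5 v2.6.13-rc4 no <<'EOF'
a46e812620bd7db457ce002544a1a6572c313d8a good
561fb765b97f287211a2c73a844c5edb12f44f1d bad
EOF
```

The function only prints commands, so they can be checked by eye before being run in the kernel tree.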


helge.hafting at aitel

Aug 16, 2005, 1:46 AM

Post #16 of 32
Re: rc6 keeps hanging and blanking displays - bisection complete

Linus Torvalds wrote:

>On Tue, 16 Aug 2005, Helge Hafting wrote:
>
>
>>This was interesting. At first, lots of kernels just kept working,
>>I almost suspected I was doing something wrong. Then the second last kernel
>>recompiled a lot of DRM stuff - and the crash came back!
>>The kernel after that worked again, and so the final message was:
>>
>>561fb765b97f287211a2c73a844c5edb12f44f1d is first bad commit
>>
>>
>
>Ok, that definitely looks bogus.
>
>That commit should not matter at _all_, it only changes ppc64 specific
>things.
>
>If the bug is sometimes hard to trigger, maybe one of the "good" kernels
>wasn't good after all. That would definitely throw a wrench in the
>bisection.
>
>

The bisection search:
a46e812620bd7db457ce002544a1a6572c313d8a good
e0b98c79e605f64f263ede53344f283f5e0548be good
fd3113e84e188781aa2935fbc4351d64ccdd171b good
2757a71c3122c7653e3dd8077ad6ca71efb1d450 good
ba17101b41977f124948e0a7797fdcbb59e19f3e good, this one has got more testing,
as my default kernel to boot for the moment.

saw lots of drm recompile for the next one:
561fb765b97f287211a2c73a844c5edb12f44f1d bad

6ade43fbbcc3c12f0ddba112351d14d6c82ae476 good
I'll test this one more to see if it is a false positive, and I'll
also test a known bad kernel without DRM.

Helge Hafting




helgehaf at aitel

Aug 16, 2005, 9:52 AM

Post #17 of 32 (1283 views)
Permalink
Re: rc6 keeps hanging and blanking displays [In reply to]

On Tue, Aug 16, 2005 at 09:24:25AM +1000, Dave Airlie wrote:
> > > I'm a little surprised, as a ppc64 fix theoretically shouldn't matter for
> > > x86_64? But perhaps they share something?
> >
> > My guess is that it is maybe the DRM changes that have done it... the
> > 32/64-bit code in 2.6.13-rc6 may have issues, but they've been tested
> > on a number of configurations (none of them by me... I can't test what
> > I don't have...)
> >
>
> Actually after looking back 2.6.13-rc4-mm1 which you say works doesn't
> contain any of the later 32/64-bit changes.. so maybe you can try just
> applying the git-drm.patch from that tree to see if it makes a
> difference...
>
> I'm getting less and less sure this is caused by the drm, (have you
> built with DRM disabled completely??)
>
I tried rc6 with DRM turned off. That kernel consistently _died_ when
trying to start xdm. Xorg logs for both cards ended like this:

(II) LoadModule: "pcidata"
(II) Loading /usr/X11R6/lib/modules/libpcidata.a

Of course the last block of the log may be lost, as this crash
blocked even sysrq so it is reasonable to assume that the disk drivers
and filesystems froze up too.

I can retry this with a synchronously mounted /var, if the last lines
of the Xorg logs might be interesting.


Helge Hafting


torvalds at osdl

Aug 16, 2005, 10:00 AM

Post #18 of 32 (1263 views)
Permalink
Re: rc6 keeps hanging and blanking displays [In reply to]

On Tue, 16 Aug 2005, Helge Hafting wrote:
>
> I tried rc6 with DRM turned off. That kernel consistently _died_ when
> trying to start xdm. Xorg logs for both cards ended like this:
>
> (II) LoadModule: "pcidata"
> (II) Loading /usr/X11R6/lib/modules/libpcidata.a

Ok, it does sound like your X server is doing something nasty on the PCI
bus.

> I can retry this with a syncronously mounted /var, if the last lines
> of the Xorg logs might be interesting.

It would be even more interesting if you have a serial console, but if
this is the X server stomping on the PCI bus, you might just have a total
lockup - no oops, no nothing.

One thing that might be interesting is to see if the old working kernel
has a different IO-map than the broken ones. A simple

cat /proc/ioports /proc/iomem > iomaps.kernel-version

and diffing the two might be an interesting thing to try. X has been known
to sometimes just try to re-configure things on its own without telling
(or asking) the kernel.

Linus


helgehaf at aitel

Aug 16, 2005, 12:29 PM

Post #19 of 32 (1274 views)
Permalink
Re: rc6 keeps hanging and blanking displays - bisection complete [In reply to]

On Mon, Aug 15, 2005 at 03:59:07PM -0700, Linus Torvalds wrote:
>
>
> On Tue, 16 Aug 2005, Helge Hafting wrote:
> >
> > This was interesting. At first, lots of kernels just kept working,
> > I almost suspected I was doing something wrong. Then the second last kernel
> > recompiled a lot of DRM stuff - and the crash came back!
> > The kernel after that worked again, and so the final message was:
> >
> > 561fb765b97f287211a2c73a844c5edb12f44f1d is first bad commit
>
> Ok, that definitely looks bogus.
>
> That commit should not matter at _all_, it only changes ppc64 specific
> things.
>
> If the bug is sometimes hard to trigger, maybe one of the "good" kernels
> wasn't good after all. That would definitely throw a wrench in the
> bisection.
>
The hang, or at least an X "pause", tends to happen within 5-10 minutes of
playing cuyo (a 2D game). I have now had the last good kernel
(6ade43fbbcc3c12f0ddba112351d14d6c82ae476) running for almost 24
hours, only interrupted by the brief test of drm-less rc6.

Normal use hasn't provoked anything. Since DRM sort of works with this
kernel, I tried tuxracer on the radeon. (Trouble is always with
the radeon, never the mga xserver). I played several games, ok
except for the usual lousy 5-9 fps. One time I had a "pause", the 3D-game
just froze for about half a minute. The other xserver kept
displaying firefox (and updating the page too) but I could not
start any processes there. I tried starting an xterm - it did not
appear until tuxracer "unfroze" and continued as if nothing happened.
Perhaps the frozen process held a lock?

Disk io seemed sluggish after that incident, and the load meter in
icewm seemed to indicate more waiting than usual. The logs tell
me of SCSI aborts and a bus reset. I booted into drm-less rc6 after
that.

Some interrupts are shared on this machine:
$ cat /proc/interrupts
CPU0
0: 10113154 IO-APIC-edge timer
1: 371 IO-APIC-edge i8042
2: 0 XT-PIC cascade
4: 5735 IO-APIC-edge serial
8: 0 IO-APIC-edge rtc
12: 11024 IO-APIC-edge i8042
14: 21 IO-APIC-edge ide0
16: 803248 IO-APIC-level sym53c8xx, eth0, mga@pci:0000:01:00.0
17: 0 IO-APIC-level Trident Audio
19: 755535 IO-APIC-level radeon@pci:0000:00:08.0
20: 5946 IO-APIC-level libata
21: 9448 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb3,
uhci_hcd:usb4
NMI: 234
LOC: 10111810
ERR: 0
MIS: 0

The troublesome radeon has an irq of its own. The scsi controller
shares an irq with the matrox g550, but that card never seems to
cause any trouble, other than saturating the cpu during games. :-)
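
Picking the shared lines out of such a listing by eye is error-prone, so here is a minimal sketch that does it mechanically. It assumes the single-CPU layout shown above (IRQ number, count, controller type, then a comma-separated handler list). The function name `shared_irqs` is made up for this example, and the sample restores `@pci` in the device names, which the archive appears to have mangled.

```python
# A few rows taken from the listing above, with the wrapped usb line rejoined.
SAMPLE = """\
CPU0
 16: 803248 IO-APIC-level sym53c8xx, eth0, mga@pci:0000:01:00.0
 17: 0 IO-APIC-level Trident Audio
 19: 755535 IO-APIC-level radeon@pci:0000:00:08.0
 21: 9448 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb3, uhci_hcd:usb4
NMI: 234
"""


def shared_irqs(text):
    """Map IRQ number -> handler names, for IRQs with more than one handler.

    Assumes one CPU column, i.e. rows of the form
    "<irq>: <count> <controller type> <handler>[, <handler>...]".
    """
    shared = {}
    for line in text.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # header row or summary rows such as "NMI: 234"
        irq = fields[0].rstrip(":")
        if not irq.isdigit():
            continue  # non-numeric rows (NMI, LOC, ERR, MIS)
        handlers = [h.strip() for h in fields[3].split(",")]
        if len(handlers) > 1:
            shared[int(irq)] = handlers
    return shared


for irq, handlers in sorted(shared_irqs(SAMPLE).items()):
    print(irq, handlers)
```

On this sample only IRQ 16 and IRQ 21 are reported, which matches the observation that the radeon on IRQ 19 does not share its interrupt. On a multi-CPU machine there is one count column per CPU, so the field splitting would need adjusting.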

On to look at iomem and that rc6 crash.

Helge Hafting


helgehaf at aitel

Aug 16, 2005, 2:14 PM

Post #20 of 32 (1284 views)
Permalink
Re: rc6 keeps hanging and blanking displays [In reply to]

On Tue, Aug 16, 2005 at 10:00:50AM -0700, Linus Torvalds wrote:
>
>
> On Tue, 16 Aug 2005, Helge Hafting wrote:
> >
> > I tried rc6 with DRM turned off. That kernel consistently _died_ when
> > trying to start xdm. Xorg logs for both cards ended like this:
> >
> > (II) LoadModule: "pcidata"
> > (II) Loading /usr/X11R6/lib/modules/libpcidata.a
>
> Ok, it does sound like your X server is doing something nasty on the PCI
> bus.
>
> > I can retry this with a syncronously mounted /var, if the last lines
> > of the Xorg logs might be interesting.
>
> It would be even more interesting if you have a serial console, but if
> this is the X server stomping on the PCI bus, you might just have a total
> lockup - no oops, no nothing.
>
Tricky - I have nothing to connect to the serial port.

> One thing that might be interesting is to see if the old working kernel
> has a different IO-map than the broken ones. A simple
>
> cat /proc/ioports /proc/iomem > iomaps.kernel-version
>
> and diffing the two might be an interesting thing to try. X has been known
> to sometimes just try to re-configure things on its own without telling
> (or asking) the kernel.

Diffing the iomaps thus obtained for
2.6.13-rc4-6ade43fbbcc3c12f0ddba112351d14d6c82ae476
and 2.6.13-rc6 produces this:
ba112351d14d6c82ae476 iomaps.2.6.13-rc6
17a18
> 5000-5007 : viapro-smbus
52,53c53,54
< 00100000-0041a94c : Kernel code
< 0041a94d-00695337 : Kernel data
---
> 00100000-003fed39 : Kernel code
> 003fed3a-00662f77 : Kernel data

rc6 has a somewhat smaller kernel, and a viapro-smbus.
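
For anyone repeating this comparison on more than two kernels, the same check can be scripted. The following is a rough sketch using Python's standard `difflib`; the helper name `diff_iomaps` is invented here, and the two sample strings contain only the lines that differ in the diff above, not full snapshots.

```python
import difflib


def diff_iomaps(old_text, new_text, old_name="good", new_name="bad"):
    """Return unified-diff lines between two saved iomap snapshots."""
    return list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile=old_name, tofile=new_name, lineterm="", n=0))


# Excerpts only: the entries that differ between the two kernels above.
old = """\
00100000-0041a94c : Kernel code
0041a94d-00695337 : Kernel data
"""
new = """\
5000-5007 : viapro-smbus
00100000-003fed39 : Kernel code
003fed3a-00662f77 : Kernel data
"""

changes = diff_iomaps(old, new, "iomaps.good", "iomaps.2.6.13-rc6")
# Entries that exist only in the newer snapshot (skip the "+++" file header).
added = [l[1:] for l in changes if l.startswith("+") and not l.startswith("+++")]
removed = [l[1:] for l in changes if l.startswith("-") and not l.startswith("---")]
print("\n".join(changes))
```

In practice the two strings would be read from the files written by the `cat /proc/ioports /proc/iomem` command, one per kernel. Changes in the "Kernel code" and "Kernel data" ranges only reflect a different kernel image size; new or moved device ranges are the interesting part.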

The X.org logs also got further, with the synchronous mount:

The radeon log ended like this:
[31] -1 0 0x00009000 - 0x000090ff (0x100) IX[B]
[32] -1 0 0x00009800 - 0x000098ff (0x100) IX[B](B)
[33] 0 0 0x000003b0 - 0x000003bb (0xc) IS[B]
[34] 0 0 0x000003c0 - 0x000003df (0x20) IS[B]
(II) Setting vga for screen 0.
(II) RADEON(0): MMIO registers at 0xf6000000
(II) RADEON(0): PCI bus 0 card 8 func 0
(**) RADEON(0): Depth 24, (--) framebuffer bpp 32
(II) RADEON(0): Pixel depth = 24 bits stored in 4 bytes (32 bpp pixmaps)
(==) RADEON(0): Default visual is TrueColor
(**) RADEON(0): Option "EnablePageFlip" "off"
(**) RADEON(0): Option "DynamicClocks" "off"
(II) Loading sub module "vgahw"
(II) LoadModule: "vgahw"
(II) Loading /usr/X11R6/lib/modules/libvgahw.a
(II) Module vgahw: vendor="X.Org Foundation"
compiled for 6.8.2, module version = 0.1.0
ABI class: X.Org Video Driver, version 0.7
(II) RADEON(0): vgaHWGetIOBase: hwp->IOBase is 0x03b0, hwp->PIOOffset is 0x0000
(==) RADEON(0): RGB weight 888
(II) RADEON(0): Using 8 bits per RGB (8 bit DAC)
(II) Loading sub module "int10"
(II) LoadModule: "int10"
(II) Reloading /usr/X11R6/lib/modules/libint10.a
(II) RADEON(0): initializing int10
(**) RADEON(0): Option "InitPrimary" "on"

It stopped here, while it normally goes on with:
(II) Truncating PCI BIOS Length to 53248
(--) RADEON(0): Chipset: "ATI Radeon 9200SE 5964 (AGP)" (ChipID = 0x5964)
(--) RADEON(0): Linear framebuffer at 0xe0000000
(--) RADEON(0): BIOS at 0x1ff00000
(--) RADEON(0): VideoRAM: 131072 kByte (64 bit DDR SDRAM)
(II) RADEON(0): PCI card detected
(II) Loading sub module "ddc"
...

Seems like it died trying to perform int10 initialization?

The matrox log stopped inside a listing of resource ranges after preInit:
[29] -1 0 0x0000ac00 - 0x0000ac0f (0x10) IX[B]
[30] -1 0 0x0000a800 - 0x0000a803 (0x4) IX[B]
[31] -1 0 0x0000a400 - 0x0000a407 (0x8) IX[B]
[32] -1 0 0x0000a000 - 0x0000a003 (0x4) IX[B]
[33] -1 0 0x00009c00 - 0x00009c07 (0x8) IX[B]
[34] -1 0 0x00009400 - 0x000094ff (0x100) IX[B]
[35] -1 0 0x00009000 - 0x000090ff (0x100) IX[B]
[36] 0 0 0x000003b0 - 0x000003bb (0xc) IS[B]

Normally, this continues with:
[37] 0 0 0x000003c0 - 0x000003df (0x20) IS[B](OprU)
(==) MGA(0): Write-combining range (0xf0000000,0x2000000)
(II) MGA(0): vgaHWGetIOBase: hwp->IOBase is 0x03d0, hwp->PIOOffset is 0x0000
(--) MGA(0): 16 DWORD fifo
(==) MGA(0): Default visual is TrueColor
(II) MGA(0): [drm] bpp: 16 depth: 16
(II) MGA(0): [drm] Sarea 2200+664: 2864
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 7, (OK)

I guess the radeon hung the machine, and the matrox xserver simply wasn't
scheduled after that.

The lockup wasn't total - the numlock LED responded to the numlock key
(and similar for capslock) until I did the sysrq+B. There seemed to be
no reaction, other than no more LED responses.
This kernel doesn't have ACPI so it can't turn the machine off
when doing a normal shutdown, but it is usually capable of rebooting.
The console was black of course, no dumps of any kind.

I can try running the radeon xserver only, as the vga console is on the matrox
card.

Helge Hafting


airlied at gmail

Aug 16, 2005, 4:50 PM

Post #21 of 32 (1284 views)
Permalink
Re: rc6 keeps hanging and blanking displays [In reply to]

> ...
>
> Seems like it died trying to perform int10 initialization?

I'm still pointing towards that assign pci resources patch from Greg's
tree that I mentioned earlier..

the fact that disabling the DRM stops things from working is really
bad, maybe the pci_enable_device in the DRM is setting up the devices,
whereas without it X tries and fails...

>
> I can try running the radeon xserver only, as the vga console is on the matrox
> card.
>

I'm running low on ideas, I'm also having a hard time tracking what is
actually happening, the MGA bugs I've tracked are related to that
assign pci resources patch, and I really can't see what is happening
if the DRM isn't in the mix..

If you build a working kernel (i.e. like 2.6.13 without DRM) does it
hang similarly?

Dave.


airlied at gmail

Aug 17, 2005, 4:05 AM

Post #22 of 32 (1261 views)
Permalink
Re: rc6 keeps hanging and blanking displays [In reply to]

> >
> >I'm still pointing towards that assign pci resources patch from Gregs
> >tree that I mentioned earlier..
> >
> >
> git is completely new to me - is there a git-specific way to get this
> patch, or should I download it the usual way from somewhere?

Just grab it from the link to comment #16 on
http://bugzilla.kernel.org/show_bug.cgi?id=4965

and revert it if you could, thanks...

> That was strange, sure. Could be a different bug too.

oh it more than likely is a different bug...
>
> >I'm running low on ideas, I'm also having a hard time tracking what is
> >actually happening, the MGA bugs I've tracked are related to that
> >assign pci resources patch, and I really can't see what is happening
> >if the DRM isn't in the mix..
> >
> >If you build a working kernel (i.e. like 2.6.13 without DRM) does it
> >hang similarly?
> >
> >
> >
> 2.6.13 isn't released, so I assume you meant some earlier kernel?
> I'll see if I can get a drm-less kernel running.

Oh yeah sorry, I meant 2.6.12 or some kernel you know works...

Dave.


helge.hafting at aitel

Aug 17, 2005, 4:05 AM

Post #23 of 32 (1270 views)
Permalink
Re: rc6 keeps hanging and blanking displays [In reply to]

Dave Airlie wrote:

>>...
>>
>>Seems like it died trying to perform int10 initialization?
>>
>>
>
>I'm still pointing towards that assign pci resources patch from Gregs
>tree that I mentioned earlier..
>
>
git is completely new to me - is there a git-specific way to get this
patch, or should I download it the usual way from somewhere?

>the fact that disabling the DRM stops things from working is really
>bad, maybe the pci_enable_device in the DRM is setting up the devices,
>whereas without it X tries and fails...
>
>
>
That was strange, sure. Could be a different bug too.

>>I can try running the radeon xserver only, as the vga console is on the matrox
>>card.
>>
>>
>>
>
>I'm running low on ideas, I'm also having a hard time tracking what is
>actually happening, the MGA bugs I've tracked are related to that
>assign pci resources patch, and I really can't see what is happening
>if the DRM isn't in the mix..
>
>If you build a working kernel (i.e. like 2.6.13 without DRM) does it
>hang similarly?
>
>
>
2.6.13 isn't released, so I assume you meant some earlier kernel?
I'll see if I can get a drm-less kernel running.

Helge Hafting


torvalds at osdl

Aug 17, 2005, 8:19 AM

Post #24 of 32 (1273 views)
Permalink
Re: rc6 keeps hanging and blanking displays [In reply to]

On Wed, 17 Aug 2005, Dave Airlie wrote:
>
> > git is completely new to me - is there a git-specific way to get this
> > patch, or should I download it the usual way from somewhere?
>
> Just grab it from the link to comment #16 on
> http://bugzilla.kernel.org/show_bug.cgi?id=4965

That's a good one to try (and if it matters, can you please do a full
"lspci -vvx" for before-and-after? In fact, it would probably be good to
do that _regardless_ - do it with an old known-good kernel, and with one
recent kernel).

At the same time, something struck me. Does it happen to be much warmer in
your room lately? As in due to a heatwave? I'm just wondering if it might
be something as silly as a thermal shutdown.

Linus


helgehaf at aitel

Aug 22, 2005, 2:44 PM

Post #25 of 32 (1275 views)
Permalink
Re: rc6 keeps hanging and blanking displays [In reply to]

On Wed, Aug 17, 2005 at 08:19:36AM -0700, Linus Torvalds wrote:
>
>
> On Wed, 17 Aug 2005, Dave Airlie wrote:
> > Just grab it from the link to comment #16 on
> > http://bugzilla.kernel.org/show_bug.cgi?id=4965
>
> That's a good one to try (and if it matters, can you please do a full
> "lspci -vvx" for before-and-after? In fact, it would probably be good to
> do that _regardless_ - do it with an old known-good kernel, and with one
> recent kernel).
>
> At the same time, something struck me. Does it happen to be much warmer in
> your room lately? As in due to a heatwave? I'm just wondering if it might
> be something as silly as a thermal shutdown.


Not warmer than usual, but the machine is always hot to the touch;
it is sitting in a small closet where I have taken the door off. Air
circulation still isn't perfect, but there is a strong fan on the cpu,
almost as noisy as a vacuum cleaner. :-(

Cpu loads never killed it before, so I don't suspect that unless the
radeon 9200SE has a thermal shutdown of its own.



I have found that the crash and the blanking may be different problems.
It seems that any kernel with a _working_ drm sooner or later will cause
a hang on the radeon display, possibly but not necessarily freezing the
machine for a while or forever. This happens more often if I actually
stress drm, such as playing tuxracer. But it can happen with
plain firefox/xterm/thunderbird work too. (no opengl screensavers
or animated window managers here.)

My rock solid 2.6.13-rc4-mm1 has drm compiled in, but drm fails when X
starts, and therefore drm isn't used. And therefore, a stable kernel.
From Xorg.2.log:
drmOpenDevice: open result is 6, (OK)
drmOpenByBusid: drmOpenMinor returns 6
drmOpenByBusid: drmGetBusid reports pci:0000:00:08.0
(II) RADEON(0): [drm] DRM interface version 1.2
(II) RADEON(0): [drm] created "radeon" driver at busid "pci:0000:00:08.0"
(II) RADEON(0): [drm] added 8192 byte SAREA at 0xffffc20000147000
(II) RADEON(0): [drm] drmMap failed
(EE) RADEON(0): [dri] DRIScreenInit failed. Disabling DRI.
(II) RADEON(0): Memory manager initialized to (0,0) (1280,8191)
(II) RADEON(0): Reserved area from (0,1024) to (1280,1026)
(II) RADEON(0): Largest offscreen area available: 1280 x 7165
(II) RADEON(0): Render acceleration enabled
(II) RADEON(0): Using XFree86 Acceleration Architecture (XAA)
drmMap failed for this kernel.


Seems like replacing the radeon is a good idea, it will probably never
do stable 3D as even old kernels have this particular problem. The
performance is appalling too; the old g550 gets a 3x-5x better framerate...

The blank display problem is different. That problem follows the
bisect search, i.e. the "good" kernels never blank the
display for me, and the "bad" kernels always do so after a little while.
Even if all I use is 2D stuff. (All with drm configured)

As for the patch to revert - it did fix things so an rc6 without drm
came up. I'm using that kernel now. I guess it'll be fine,
with no drm. I'll keep it running tomorrow, for stability testing.

What is the next logical step? rc6 with both drm and this patch reverted?
Or is there any new development?


There are three lspci-vvx files attached.
One for plain 2.6.13rc6, which can't run X.
One for 2.6.13rc6 with the patch reverted, and one for
the same kernel after I rebooted the machine and also
started X. The lspci -vvx output was slightly different then.


Helge Hafting
Attachments: lspci-vvx-2.6.13rc6 (16.5 KB)
  lspci-vvx-2.6.13rc6p (16.3 KB)
  lspci-vvx-2.6.13rc6p-afterX (16.4 KB)
