LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG

All was going well on Links SQL 2.2.0 until I changed Indexing to INTERNAL, as suggested on the forum, to get search working properly.

Now Verify Links shows "N" links as Bad, though I know those links aren't bad: they are all on a subdomain hosting account that we provide to all link owners.

Oh God. These bugs are really driving us crazy.
It's a cPanel box and there is nothing funky about the installation. Verify Links was working fine until the indexing scheme was changed to Internal. Nothing gets it back to OK: not Repair Tables, not rechecking all links, nothing.

Anyone from GT listening?

HyTC

PS: This version of Links SQL 2.2.0 was installed a week after it was released.

Last edited by:

HyperTherm: Oct 12, 2004, 5:54 AM
Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
All the links marked as bad have their status updated to 403 in the Links table (Status column). I know what 403 means, so no need to delve into that. All those links are accessible at the address stored in the URL column of the same table.

So it's really crazy that Links SQL updates the Status column to 403 for plain URLs of the type

http://username.linkssqldomain.com

And, surprisingly enough, not for all of them.
When I went to verify a newly added link, I saw 17 bad links.
I verified the new link, which went OK.
Then I went about verifying all links.
The number jumped from 17 to 24.

I ran Repair Tables.
I ran Verify All Links again.
Same result: it's stuck at 24 bad links.
All those 24 bad links are accessible by typing the address into the address bar.
All those 24 bad links are also accessible when the link to <%URL%> on the profile (i.e. the detailed page) is clicked.

I have just the following plugins installed, in case it makes it easier for GT to answer:
SearchLogger 1.1 Alex Krohn
XMLResults 1.0 Gossamer Threads Inc.
YahooSubcats 1.3 Mel
Recommend_It 3.0.1 Andy Newby (UltraNerds)
Days_Old 1.0 Robert Blackstone and lparry
ContactPage 1 Andy Newby (UltraNerds)

System Information
======================================
Perl Version: 5.008001
Links SQL Version: 2.2.0
DBI.pm Version: 1.45
Running under mod_perl: Yes (version 1.29)

Running under SpeedyCGI: No

Mysql Version : 4.0.20

HyTC

Last edited by:

HyperTherm: Oct 12, 2004, 6:36 AM
Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
Posting Again:

All the links that were marked with a 403 status were verified individually, and that went OK.

Then, upon running the verify with the All option selected, 13 links were again marked with a 403 status. My earlier attempt at re-running Verify All over and over had taken the server down, so this time I did not retry it repeatedly.

After manually rechecking single links one by one with "Recheck" (two links from Admin), I kept going until Admin still showed 11 bad links with that 403 status.

So I ran nph-verify.cgi --check all, and this is the output from the command line:

Wed Oct 13 00:56:13 2004 New child started
Wed Oct 13 00:56:13 2004 New child started
Wed Oct 13 00:56:13 2004 New child started
Wed Oct 13 00:56:14 2004 New child started
Wed Oct 13 00:56:14 2004 New child started
Wed Oct 13 00:56:14 2004 New child started
Wed Oct 13 00:56:14 2004 New child started
Wed Oct 13 00:56:14 2004 New child started
Wed Oct 13 00:56:14 2004 New child started
Wed Oct 13 00:56:14 2004 New child started
Finished launching children

43 Missing URL Success (200). Message: OK 200
11 Missing URL Success (200). Message: OK 200
12 Missing URL Success (200). Message: OK 200
13 Missing URL Success (200). Message: OK 200
14 Missing URL Success (200). Message: OK 200
21 Missing URL Success (200). Message: OK 200
31 Missing URL Success (200). Message: OK 200
53 Missing URL Success (200). Message: OK 200
1 Missing URL Success (200). Message: OK 200

With the above run, the Admin section still shows 11 bad links with a 403 status. What is all this crap? I read somewhere that another user was having a problem with Link Verify too ... it had taken the server down earlier for me. One really feels pissed off at this.

It's real crap that's happening. If there is a bug which has been solved for Vishal, then why not make the fix available to others? What the hell. Asking for access to the server every time is not the solution. The last time access was provided, for that sendmail issue, it never got fixed, and the only answer I got was that some non-mod_perl stuff was responsible for not letting the sendmail wrapper be used by LSQL. If the software is not stable enough to run on non-GT servers, then please say so in your minimum requirements.

This is absolutely crazy stuff.

HyTC.

Last edited by:

HyperTherm: Oct 12, 2004, 12:40 PM
Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
After having rechecked each 403 link individually and reconciled them, I ran

nph-verify.cgi --check-all from the command line, and this is the output:

Wed Oct 13 01:18:36 2004 New child started
Wed Oct 13 01:18:36 2004 New child started
Wed Oct 13 01:18:36 2004 New child started
Wed Oct 13 01:18:36 2004 New child started
Wed Oct 13 01:18:36 2004 New child started
Wed Oct 13 01:18:36 2004 New child started
Wed Oct 13 01:18:37 2004 New child started
Wed Oct 13 01:18:37 2004 New child started
Wed Oct 13 01:18:37 2004 New child started
Wed Oct 13 01:18:37 2004 New child started
Finished launching children
11 Missing URL Success (200). Message: OK 200
31 Missing URL Success (200). Message: OK 200
43 Missing URL Success (200). Message: OK 200
32 Missing URL Success (200). Message: OK 200
53 Missing URL Request Failed (403) Message: Forbidden
1 Missing URL Success (200). Message: OK 200


Total Run Time: 1 second(s)
Total Links checked: 6
Total Links Bad: 1
Total Links Good: 5

Huh

HyTC
Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
OK, here again:

All links are verified (at least that's what Admin shows).
I cannot run a complete link verification from Admin for fear of bringing the server down yet again, as it did about 10 hours ago. So I ran ./nph-verify.cgi --check_all from the command prompt twice. The results as scrolled are as follows:

First Run:

Checking 51 links ...

Wed Oct 13 06:59:43 2004 New child started
Wed Oct 13 06:59:43 2004 New child started
Wed Oct 13 06:59:43 2004 New child started
Wed Oct 13 06:59:43 2004 New child started
Wed Oct 13 06:59:43 2004 New child started
Wed Oct 13 06:59:43 2004 New child started
Wed Oct 13 06:59:43 2004 New child started
Wed Oct 13 06:59:43 2004 New child started
Wed Oct 13 06:59:44 2004 New child started
Wed Oct 13 06:59:44 2004 New child started
Finished launching children
31 Missing URL Success (200). Message: OK 200
32 Missing URL Success (200). Message: OK 200
33 Missing URL Success (200). Message: OK 200
34 Missing URL Success (200). Message: OK 200
11 Missing URL Success (200). Message: OK 200
12 Missing URL Success (200). Message: OK 200
23 Missing URL Success (200). Message: OK 200
21 Missing URL Success (200). Message: OK 200
22 Missing URL Success (200). Message: OK 200
53 Missing URL Success (200). Message: OK 200
43 Missing URL Success (200). Message: OK 200
1 Missing URL Success (200). Message: OK 200


Total Run Time: 2 second(s)
Total Links checked: 12
Total Links Bad: 0
Total Links Good: 12

Average time to check one link: 0.17s
Average links checked in a second: 6.00

Second Run:

Checking 51 links ...

Wed Oct 13 07:01:21 2004 New child started
Wed Oct 13 07:01:21 2004 New child started
Wed Oct 13 07:01:21 2004 New child started
Wed Oct 13 07:01:21 2004 New child started
Wed Oct 13 07:01:21 2004 New child started
Wed Oct 13 07:01:21 2004 New child started
Wed Oct 13 07:01:21 2004 New child started
Wed Oct 13 07:01:21 2004 New child started
Wed Oct 13 07:01:21 2004 New child started
Wed Oct 13 07:01:21 2004 New child started
Finished launching children
11 Missing URL Success (200). Message: OK 200
12 Missing URL Success (200). Message: OK 200
13 Missing URL Success (200). Message: OK 200
21 Missing URL Success (200). Message: OK 200
31 Missing URL Success (200). Message: OK 200
43 Missing URL Success (200). Message: OK 200
14 Missing URL Success (200). Message: OK 200
1 Missing URL Success (200). Message: OK 200


Total Run Time: 1 second(s)
Total Links checked: 8
Total Links Bad: 0
Total Links Good: 8

Average time to check one link: 0.12s
Average links checked in a second: 8.00

So the complete list of links is really not being checked? The header says 51 links, yet the totals show only 12 and 8 checked. And what is this "Missing URL"? All the URLs in the <%URL%> column are hosted locally on the server.
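The gap between "Checking 51 links" and the number of result lines can be tallied mechanically. Below is a minimal Python sketch that parses output in the format pasted above. It assumes the leading number on each result line is the link ID (the IDs in the debug queries suggest so, but that is a guess), and `summarize` is a hypothetical helper name, not part of Links SQL.

```python
import re

# "11 Missing URL Success (200). Message: OK 200"
# "53 Missing URL Request Failed (403) Message: Forbidden"
RESULT = re.compile(r"^(\d+)\s+(.*?)\s+(Success|Request Failed) \((-?\d+)\)")
# "Checking 51 links ..."
HEADER = re.compile(r"^Checking (\d+) links")


def summarize(output):
    """Compare the announced link count with the results actually printed."""
    expected = None
    results = {}  # assumed link ID -> last status code reported
    for line in output.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            expected = int(header.group(1))
            continue
        result = RESULT.match(line)
        if result:
            results[int(result.group(1))] = int(result.group(4))
    bad = sorted(i for i, status in results.items() if not 200 <= status < 300)
    return {"expected": expected, "checked": len(results), "bad_ids": bad}


if __name__ == "__main__":
    import sys
    # Usage: ./nph-verify.cgi --check_all | python summarize.py
    print(summarize(sys.stdin.read()))
```

If "checked" is smaller than "expected", some children stopped before reporting, which matches what the runs above show.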

If I recheck all links from Admin, it marks 13 with a 403 status to start with. That status only goes away if each 403 link is rechecked individually, and retrying the full check repeatedly (because the status stays at 403) brings the server down.

HyTC
Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
Expanding further:

Before running ./nph-verify.cgi --check_all from the command line:

Repaired the database: everything reported OK (all tables OK).

This MySQL server has been running for 0 days, 2 hours, 23 minutes and 7 seconds. It started up on Oct 13, 2004 at 07:52 AM.

Turned debugging on.

Executed the command from the command line.

This is what the log shows (a few examples):

GT::SQL::Driver::MYSQL::sth (8416): Executing query: SELECT COUNT(*) FROM lsql_Links from main::check_links at ./nph-verify.cgi line 165

GT::SQL::Driver::MYSQL::sth (8416): Executing query: SELECT ID FROM lsql_Links from main::check_links at ./nph-verify.cgi line 192

GT::SQL::Driver::MYSQL::sth (8418): Executing query: SELECT * FROM lsql_Links WHERE (ID = 1) from Links::Tools::check_links at /admin/Links/Tools.pm line 298

GT::SQL::Driver::MYSQL::sth (8419): Executing query: SELECT * FROM lsql_Links WHERE (ID = 11) from Links::Tools::check_links at /admin/Links/Tools.pm line 298

GT::SQL::Driver::MYSQL::sth (8420): Executing query: SELECT * FROM lsql_Links WHERE (ID = 21) from Links::Tools::check_links at /admin/Links/Tools.pm line 298

GT::SQL::Driver::MYSQL::sth (8421): Executing query: SELECT * FROM lsql_Links WHERE (ID = 31) from Links::Tools::check_links at /admin/Links/Tools.pm line 298

GT::SQL::Driver::MYSQL::sth (8422): Executing query: SELECT * FROM lsql_Links WHERE (ID = 43) from Links::Tools::check_links at /admin/Links/Tools.pm line 298

GT::SQL::Driver::MYSQL::sth (8423): Executing query: SELECT * FROM lsql_Links WHERE (ID = 53) from Links::Tools::check_links at /admin/Links/Tools.pm line 298

GT::SQL::Table (8423): Status cannot contain the value '[undefined]' at ./nph-verify.cgi line 440.

GT::SQL::Driver::MYSQL::sth (8419): Executing query: SELECT * FROM lsql_Links WHERE (ID = 12) from Links::Tools::check_links at /admin/Links/Tools.pm line 298

GT::SQL::Table (8419): Failed to execute query: 'SELECT * FROM lsql_Links WHERE (ID = ?)' Reason: MySQL server has gone away at /admin/GT/SQL/Driver/MYSQL.pm line 121.

GT::SQL::Driver::MYSQL::sth (8419): Executing query: SELECT * FROM lsql_Links WHERE (ID = 13) from Links::Tools::check_links at /admin/Links/Tools.pm line 298

GT::SQL::Table (8419): Failed to execute query: 'SELECT * FROM lsql_Links WHERE (ID = ?)' Reason: MySQL server has gone away at /admin/GT/SQL/Driver/MYSQL.pm line 121.

GT::SQL::Driver::MYSQL::sth (8419): Executing query: SELECT * FROM lsql_Links WHERE (ID = 14) from Links::Tools::check_links at /admin/Links/Tools.pm line 298

GT::SQL::Table (8419): Failed to execute query: 'SELECT * FROM lsql_Links WHERE (ID = ?)' Reason: MySQL server has gone away at /admin/GT/SQL/Driver/MYSQL.pm line 121.

GT::SQL::Driver::MYSQL::sth (8419): Executing query: SELECT * FROM lsql_Links WHERE (ID = 15) from Links::Tools::check_links at /admin/Links/Tools.pm line 298

GT::SQL::Table (8419): Failed to execute query: 'SELECT * FROM lsql_Links WHERE (ID = ?)' Reason: MySQL server has gone away at /admin/GT/SQL/Driver/MYSQL.pm line 121.

and so on.

There is nothing logged in mysql log file. No CR_SERVER_GONE_ERROR or CR_SERVER_LOST

mysql log file just has this:

041013 07:52:26 mysqld started
041013 7:52:26 InnoDB: Started
/usr/sbin/mysqld-max: ready for connections.
Version: '4.0.20-Max' socket: '/var/lib/mysql/mysql.sock' port: 3306

The Runtime information of Mysql after the run also does not show any abnormality and is as follows:

This MySQL server has been running for 0 days, 2 hours, 34 minutes and 30 seconds. It started up on Oct 13, 2004 at 07:52 AM.


No column stores values larger than 1 MB, which is one thing that can give this error. No images are stored in the database.

There are no errors about the server timing out and closing the connection.

So what exactly is happening is not clear to me.

The only other possibility of this type of error (Server Gone Away) within an application program is that you tried to run a query after closing the connection to the server. This indicates a logic error in the application that should be corrected.
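One defensive pattern for the "query after the connection is gone" case is for each worker to drop a dead handle and reconnect once before giving up, instead of issuing query after query on it as PID 8419 does in the log. Here is a minimal Python sketch of that idea. It uses sqlite3 only so the example is self-contained (it stands in for MySQL), `run_query` and the `state` dict are hypothetical names, and nothing here is claimed to be how GT::SQL or nph-verify.cgi actually behaves.

```python
import sqlite3


def run_query(connect, state, sql, params=()):
    """Run sql on state['conn'], reconnecting once if the handle is dead.

    connect: zero-argument callable returning a fresh connection.
    state:   per-worker dict holding that worker's own connection.
    """
    for _attempt in range(2):
        if state.get("conn") is None:
            state["conn"] = connect()
        try:
            return state["conn"].execute(sql, params).fetchall()
        except sqlite3.ProgrammingError:
            # sqlite3 raises ProgrammingError when the connection is closed.
            # Other misuse raises it too; in that case the retry fails as
            # well and we fall through to the RuntimeError below.
            state["conn"] = None
    raise RuntimeError("query failed even after reconnecting")
```

Each worker would keep its own `state`, so no two processes ever share one handle.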

The Admin interface shows all links as OK, but if I try to reverify all links, 13 links are again marked 403 on the first run from the Admin interface.

Any thoughts on this?

HyTC

Thanks
HyTC
==================================
Mail Me If Contacting Privately Is That Necessary.
==================================

Last edited by:

HyperTherm: Oct 12, 2004, 10:36 PM
Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
Anyone?
Or is it an alien type of problem?
It's really surprising that it should be so evidently ignored.

HyTC
Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
“It was working all well with respect to Verify Links till the Indexing Scheme was changed to Internal”

Why don’t you change it back to the original indexing scheme and wait for GT to have a look at the problem?

Regards

minesite
Re: [minesite] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
Change the indexing back to what it was?
Well, search goes crazy for sure in that case.

HyTC

Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
nph-verify.cgi, wherever it is run from (Admin or shell), always causes the following, which I believe is causing all these errors:

Aborted Connection

When the number of spawned children is reduced to 1, the process at least displays what is happening and does go through each link, though the links are still marked with a 403 status. With 10 children (the default), the complete list of links is never scanned.

No other CGI in the GT applications causes this Aborted Connection reproducibly. With nph-verify, the Aborted Connection error is reproducible: it happens every time the script is executed (from Admin or shell).

HyTC

Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
I've attached some replacement files (nph-verify, Links::Tools, and GT::WWW) that hopefully will fix your problems. You will have to update the use lib and init() lines in nph-verify.cgi. Let me know how it works out.

Adrian
Re: [brewt] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
Replaced the files. The GT files were replaced at the following locations:

LSQL
GCommunity (which is the latest, so it comes first in startup.pl).
Restarted httpd-perl

(1) Improvements:

The process completes in a single pass without any DB abort errors (i.e. all the links are checked, unlike earlier).
The recheck of links marked with 403 proceeds to completion in a single pass, which was not happening earlier (due to the aborted DB connections).

(2) Status quo:
"N" links are still marked with a 403 status on a fresh Verify All.

(3) New problem:

There is one link whose URL is accessible over HTTP and goes through a CNAME record in the domain's zone file. No matter how many times the verification is run, this URL always reports that it is unable to connect, and the status is marked as -4.

The exact Message is:

Checked 1 - http://domains.somedomain.com/ - Request Failed (-4) Message: Could not connect

The zone file of somedomain.com has the following:

domains 14400 IN CNAME www.someotherdomain.com.

The above link is accessible from a browser and from lynx (though lynx asks for confirmation to allow a cookie).

This is the first-check feedback. I am not sure how the replaced GT files will work out in Gossamer Mail, which is symlinked to the Community GT libs.


Thanks
HyTC


Re: [brewt] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
On points (1) and (2) in the previous post: the 403 status is marked irrespective of whether the check is done from Admin or from the shell. I'm not too sure, but could the following

http://www.nuclearelephant.com/projects/dosevasive/

be in any way responsible for causing the 403s?
The front-end httpd has mod_dosevasive in place.

Thanks
HyTC

Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
Can you give me a list of the URLs which are causing problems (email them to me if you don't want to post them)?

Adrian
Re: [brewt] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
Sure. I have just emailed the result of a fresh verification.
For link ID 1, there is no way I can get it verified other than by updating the table manually.

mod_dosevasive is on the front-end httpd, and I shall test after removing it during tonight's httpd and SSL upgrades (our timezone).

Thanks
HyTC
Re: [brewt] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
Hi.

After the upgrades, I tried disabling mod_dosevasive and found that the 403s no longer occurred. However, the -4 status link (ID 1 in the mail sent) still shows no change.

So the problem now narrows down to:

(1) How to get a cnamed link verified?
(2) Would it not be possible to have mod_dosevasive in place and also avoid 403 errors?

For (2), I know the alternatives could be:
Disable mod_dosevasive for every bulk verification, or
Recheck all 403 links again and again until all are verified OK.
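The second alternative (recheck the 403 links until they pass) can be bounded so that it does not hammer the server the way the repeated Verify All runs did. Here is a minimal Python sketch, assuming a hypothetical `check(link_id)` callable that returns the HTTP status for one link. The round limit and pause values are arbitrary illustrations, not settings from Links SQL.

```python
import time


def recheck_forbidden(link_ids, check, max_rounds=3, pause=2.0, sleep=time.sleep):
    """Recheck links that returned 403, pausing longer between rounds.

    Returns the IDs that are still 403 after max_rounds attempts.
    """
    pending = list(link_ids)
    for round_no in range(max_rounds):
        # Keep only the links that are still being refused.
        pending = [link_id for link_id in pending if check(link_id) == 403]
        if not pending:
            break
        if round_no < max_rounds - 1:
            # Back off (2s, 4s, ...) so any blocking period can expire.
            sleep(pause * (2 ** round_no))
    return pending
```

Anything still returned after the last round is worth looking at by hand rather than retrying indefinitely.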

Suggestions on this?

HyTC

Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
Your unverifiable link is caused by your web server incorrectly sending body data on a HEAD request. Attached to this post is a workaround we have added in GT::WWW.
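For readers without the attachment, one common client-side way to cope with servers that mishandle HEAD is to fall back to a GET when the HEAD attempt fails at the transport or parsing level. The Python sketch below shows only that general idea. It is not the GT::WWW workaround (whose details are not given here), and `verify_url` and `http_status` are hypothetical names.

```python
import http.client
import urllib.error
import urllib.request


def http_status(method, url, timeout=10):
    """Return the HTTP status code for one request, including 4xx/5xx."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success code (e.g. 403).
        return err.code


def verify_url(url, fetch=http_status):
    """Try HEAD first; if the exchange itself breaks, retry with GET."""
    try:
        return fetch("HEAD", url)
    except (OSError, http.client.HTTPException):
        # Connection or protocol trouble on HEAD; a GET may still succeed.
        return fetch("GET", url)
```

A real status such as 403 is returned as-is and does not trigger the GET retry, so the fallback only costs an extra request when HEAD could not be completed at all.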

Adrian
Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
I think you have your dosevasive sensitivity set way too high. During our debugging on your unverifiable link problem, it denied access after 5 attempts over several minutes of testing.

Adrian
Re: [brewt] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
OK, this solves the problem with the CNAME'd link.
Will this be a standard part of the GT libs, or do I have to keep separate track of the replacements that you have given?

Thanks
HyTC

Re: [HyperTherm] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
This will be included in the next product release.

Adrian
Re: [brewt] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
OK. Based on the number of children spawned, I have tweaked the dosevasive directives on the front-end httpd, and the 403 problem is now gone. The earlier settings were the defaults noted in the documentation.
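To see why the number of children matters, it helps to model a per-client request counter with a threshold and a time window. The Python sketch below is a deliberately simplified sliding-window model, not mod_dosevasive's actual algorithm, and the threshold and interval values are made up for illustration. `HitCounter` and `count_blocked` are hypothetical names.

```python
from collections import defaultdict, deque


class HitCounter:
    """Refuse a client once it exceeds max_hits within interval seconds."""

    def __init__(self, max_hits, interval):
        self.max_hits = max_hits
        self.interval = interval
        self.hits = defaultdict(deque)  # key -> timestamps inside the window

    def allow(self, key, now):
        window = self.hits[key]
        # Forget requests that have aged out of the window.
        while window and now - window[0] >= self.interval:
            window.popleft()
        window.append(now)  # refused requests are counted as well
        return len(window) <= self.max_hits


def count_blocked(request_times, max_hits, interval, key="verifier"):
    """Return how many of the given requests the model would refuse."""
    counter = HitCounter(max_hits, interval)
    return sum(not counter.allow(key, t) for t in sorted(request_times))


# Ten children firing at the same instant against a limit of 2 per second.
print(count_blocked([0.0] * 10, max_hits=2, interval=1.0))
# The same ten requests spaced half a second apart.
print(count_blocked([i * 0.5 for i in range(10)], max_hits=2, interval=1.0))
```

In this model the burst is mostly refused while the spaced-out run is not, which is consistent with the 403s disappearing once the thresholds were raised to match the number of children.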

Just a request, which I believe I posted earlier in a different forum as well: could the GT libs be made a separate download? That would help avoid surprises with symlinked GT libs. There are mismatches when Community's libs are compared with LSQL's. What I mean is a common set of GT libs that works with all GT products, so that product upgrades and GT libs upgrades could be two separate exercises. Perhaps, as Alex mentioned elsewhere, this could be an option in the installer too.

Thanks
HyTC

Re: [brewt] LINKS2.2.0 , INTERNAL Indexing, VERIFY LINKS BUG In reply to
Hi Adrian.

The problem with link ID 1 has reappeared.
The actual URL is not on my server. There is a CNAME entry in the domain's zone file, which is on my server (just for the sake of having a "branded URL").
The only change has been that MySQL was upgraded to 4.0.22-Max, which I am sure cannot be causing this.

Thanks
HyTC
