Gossamer Forum

Category stats calculation problems

I completely wiped my links database and uploaded the latest version. The import worked and everything else worked until I ran Build All. After trying for a few minutes, the "Calculating category stats" step just times out, as shown below:

Quote:
Building Pages ...

Backing up database ...
Backup disabled ... Skipping
Done

Updating vote and hit counters ...
Done (0 s)

Updating new and popular records ...
Marking records added after: '1999-11-04' as new, rest back to old ...
Updated 0 entries to new, 0 entries back to old ...
Updated 0 entries to modified, 0 entries back to unmodified ...
Calculating popular cutoff ...
Marking records popular with hits > 2 ...
Updated 0 entries to pop, 0 entries back to unpopular ...
Done (45 s)

Calculating category stats ...

Re: Category stats calculation problems
How many of your links were imported into category '0' ??

Check and see if you have any 'broken' categories.

Re: Category stats calculation problems
    
Quote:
How many of your links were imported into category '0' ??

I searched for category 0. No results were found at all.

243 is the lowest category number I seem to have, but they bounce around up into the tens of thousands after that (Open Directory).

Any other ideas? Thanks!

Ted

[This message has been edited by infinity (edited November 11, 1999).]

Re: Category stats calculation problems
Hello!

Exactly how many categories do you have, both to import AND in the database?

You said 243? Is that the number in the database, or are there more? The sentence that followed confused me.

I have had severe problems in many areas (foolishly enough, people dismissed them as "only my custom requirements!"), and if your problems are similar in nature to mine, maybe I can help you. What you are seeing, I have seen a number of times ;)
Re: Category stats calculation problems
According to the List All Categories link, I have 3641 categories (although I had almost 5000 last time I built this.. odd).

All of those except for 3 test categories were imported through the parse file.
Re: Category stats calculation problems
Hello there!

You have two options.

1.
Instead of wasting time watching what happens on the screen, you need to get into a conversation with ALEX. Do not keep trying, as Links SQL does not work with a large number of categories.


Links SQL does not work with more than a thousand categories, especially on a shared server!


A simple rule everyone needs to know: because of the way the internals are written, Links SQL will always give problems, and the more categories, the more problems.

I have described all of them and, for myself, have found a wonderful solution that is convenient only to me. However, this situation will always cause problems in the following areas:

-Builds getting stuck or timing out
-An extremely long pop-up listing of every link in add.cgi
-An extraordinarily long double listing in the admin area, with the alternate categories multiplied by each link
-Stats timeouts
-Validation not being possible at all this way

All these problems are known and I do not want to repeat them here. You can read my threads, where all the nightmares (Categories + Imports) are clearly described and well documented.

2.
The best thing you can do is reduce the number of categories. THERE IS NO OTHER WAY. The BAD part of the story is this:

Links SQL is designed for a large number of sites. So let's say 100,000 links, at 6-10 links per page, distributed over 5,000 categories, resulting in 20,000 per main category. This is a rough calculation. Unfortunately, this is not how it works out: such a distribution cannot work with Links SQL. You can have 100,000 links in 200 categories and it will work wonderfully, but the moment you have more categories it cannot take the burden anymore. It becomes too weak to carry the weight.

All the internals are designed around the basic principle of "going through everything"! For large sites this is wrong. Alex mentioned on the main page that a staggered build will, or may, be done some day.

According to my benchmarks, it works very well with 500 - 1,000 categories. If you have more, it starts to act up.

Further, looking at the internals, I was surprised by how nph-build.cgi, THE MAIN ENGINE THAT DRIVES THE SCRIPT, is designed. It could be very easy to focus on a per-category or staggered build and revise the internals.

Of course, I am not at all interested in entering into a conversation with "other" Links SQL users on this topic. I am not interested in repeating the same things.



[This message has been edited by rajani (edited November 13, 1999).]
Re: Category stats calculation problems
While I do indeed share a similar problem, my views greatly conflict with yours. Links SQL 1.02 worked magnificently with 68,000 links across 5,012 categories.

The problems with Build All started when I updated to 1.1b2. Since I had no problems with 1.02, I can safely say that the script does work with thousands of categories. If there is a bug in 1.1b2, then there is a bug, and hopefully someone will help me fix it.

However, the price of beta is just that -- beta. If a beta script has a few bugs and doesn't run perfectly, so be it. Life goes on.


I happen to provide tech support and such at another Perl company. Perl is something I know very well, and in looking through the script I have not only learned a good deal but also noticed a very nice coding job. Sure, some things could have been better, but Links SQL is well done in most areas.

Bashing a script because your custom changes do not work properly is rude to the creators, supporters and owners. Links SQL 1.02 worked with over 5,000 categories, with the RAM usage never getting past 1 swap per second or the CPU usage past .23.

In the future, please keep your hate to yourself, or at least away from me. I'd rather remain naive than spend my time reading how horrible Links SQL has been to you.

------------------
Ted Sindzinski
www.infinityinternet.com
Re: Category stats calculation problems
infinity..

i'd look through the category database with an administration tool like "phpMyAdmin" and see if the structure is all messed up.

i am having no problems at all (of course i haven't imported or anything).. so i think it has something to do with the way it imported..
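
A check for "broken" categories like the one suggested above could be sketched as follows. This is only an illustration under stated assumptions: the table and column names (Category.ID, Links.CategoryID) are hypothetical stand-ins rather than the real Links SQL schema, and an in-memory SQLite database is used in place of the actual MySQL one. It lists links whose category does not exist, which would include anything imported into category 0.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database using a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Category (ID INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE Links (ID INTEGER PRIMARY KEY, Title TEXT, CategoryID INTEGER);
        INSERT INTO Category VALUES (243, 'Arts'), (244, 'Business');
        INSERT INTO Links VALUES
            (1, 'Good link', 243),
            (2, 'Imported into category 0', 0),
            (3, 'Points at a missing category', 999),
            (4, 'Another good link', 244);
        """
    )
    return conn


def find_orphan_links(conn):
    """Return (link_id, category_id) pairs whose category row is missing."""
    return conn.execute(
        """
        SELECT l.ID, l.CategoryID
        FROM Links AS l
        LEFT JOIN Category AS c ON c.ID = l.CategoryID
        WHERE c.ID IS NULL
        ORDER BY l.ID
        """
    ).fetchall()


if __name__ == "__main__":
    for link_id, category_id in find_orphan_links(build_demo_db()):
        print(f"link {link_id} points at missing category {category_id}")
```

The same LEFT JOIN could be pasted into an administration tool once the real table and column names are substituted.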

rajani.. how many links do you have?

jerry
Re: Category stats calculation problems
Hello Ted!

I tried to help you by telling you how to save your time, and I am not at all interested in pouring hatred on Links SQL. In fact, I am a great lover of its design and of the quality it delivers. All that makes me upset is people's attitude in terms of how "something is understood", and their reaction afterwards!

I started to post messages, but it becomes ridiculous when people say that it is custom rather than applying their minds to understanding what the cause could eventually be!

The only reason I informed you was so that you could save your time, as I have investigated a lot in this direction and have given up. The problem is in the internals, where the work starts in nph-build.cgi. The problems are subdued on a dedicated server, but they are magnified on a shared server, as in my case. Whether you like my message or not, the fact remains.

Hello Widgetz!

I have 4600 categories and 10 links!

The moment I try to build, it gets blocked and has timeout problems! On a dedicated machine it may go up to many thousands of categories, maybe even up to 10,000 categories or above!!!

With this small number of links, it neither built successfully nor calculated the stats. Looking at the internals, I had a feeling that a staggered build is a better option. A per-category build could get the information about what it should build from the CategoryHierarchy table.

I am on a shared server + Pentium III. So of course that's a reason why it could be slow; however, the problem remains. Even if there is only one new link, one has to build the entire database and use bandwidth of at least 50-100 MB per build daily for a hundred thousand links, in theory.

For testing, one can start with 500 categories having 10 links. Keep importing categories via Editor.cgi and find the cut-off point where it stops. One can do this on any machine to figure out how many categories + links the script can handle, and so have one's own benchmark. That's the way to trigger the problem.
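
The benchmarking procedure described above can be sketched as a simple loop. This is a hedged illustration, not part of Links SQL: find_cutoff, timed and the measure_build callback are hypothetical names, and the step of 500 categories and the 60-second limit are assumptions chosen to mirror the description. In practice measure_build would import another block of categories, run a build, and report how long it took.

```python
import time


def timed(build):
    """Wrap a build(n) callable so it returns its wall-clock time in seconds."""
    def measure(n):
        start = time.perf_counter()
        build(n)
        return time.perf_counter() - start
    return measure


def find_cutoff(measure_build, step=500, time_limit=60, max_categories=20000):
    """Grow the category count by `step` until a build exceeds `time_limit`.

    measure_build(n) must return the build time in seconds for n categories.
    Returns the largest tested count that stayed within the limit (0 if none).
    """
    last_ok = 0
    n = step
    while n <= max_categories:
        if measure_build(n) > time_limit:
            break  # first count that is too slow: stop here
        last_ok = n
        n += step
    return last_ok


if __name__ == "__main__":
    # Simulated machine where build time grows linearly with category count.
    print(find_cutoff(lambda n: n / 50))
```

Running the same loop on a shared and on a dedicated machine would give the two ends of the range discussed in this thread.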

It may well be that there is a bug in the latest version regarding stats (if it worked earlier with 5,000 categories for infinity); however, this was a trouble I found in earlier versions too: I had no possibility of building above 2000-3000 categories and 10 links!!
Re: Category stats calculation problems
Like I said before: on Links SQL 1.02 I had 5012 categories running really well. Now I'm having problems with a beta version.

Links SQL works with over 1000 categories. The new beta has some problems, but it is a beta.
Re: Category stats calculation problems
Hello infinity!

True, it's a beta and things like this may happen. Perfectly understandable.

Since you know Perl very well, may I ask your opinion on a suggestion I have regarding this problem:

In the &build_category_pages subroutine, nph-build.cgi would query the CategoryHierarchy table. It would focus on the main categories and get information on all their SubCategoryIDs. After that, it would build only those subcategories, plus the main category.

This would be the ONE way to "Build new links only!" If there are new links, one has to bring all the stats, cool, hits and new listings up to date. Therefore one has to build those categories & sub_categories. In this system, only the new, cool and hits pages will be built repeatedly, as many times as there are SubCategory builds. However, that will still be far fewer than the unnecessary builds of every category page.

If this is implemented, all the sites using Links SQL would benefit in terms of time, bandwidth and CGI calls.
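
The selection step of this proposal could be sketched as below. It is only an illustration: the real layout of the CategoryHierarchy table is not shown in this thread, so a plain dictionary mapping each main category ID to its subcategory IDs stands in for it, and categories_to_rebuild is a hypothetical helper, not a Links SQL routine. Given the categories that received new links, it returns each affected main category together with all of its subcategories, and leaves every other branch alone.

```python
def categories_to_rebuild(hierarchy, changed):
    """Pick the branches that need rebuilding.

    hierarchy: dict mapping a main category ID to a list of its subcategory IDs
               (a stand-in for what CategoryHierarchy would provide).
    changed:   iterable of category IDs that received new links.
    Returns the set of category IDs whose pages should be rebuilt.
    """
    changed = set(changed)
    rebuild = set()
    for main_id, sub_ids in hierarchy.items():
        branch = {main_id, *sub_ids}
        if branch & changed:   # any new link anywhere in this branch
            rebuild |= branch  # rebuild the main category and all its subs
    return rebuild


if __name__ == "__main__":
    hierarchy = {1: [10, 11], 2: [20, 21, 22], 3: [30]}
    print(sorted(categories_to_rebuild(hierarchy, [21])))
```

With one new link in category 21, only the four pages in branch 2 are rebuilt instead of all seven category pages.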

Looking at your problems, which are not too different from mine, I am trying to have a fruitful conversation here rather than waste time complaining, which seems to be the misunderstanding above. I hope to see your positive and constructive response, if you would like to take the time to share.

I have tested in the manner above by importing categories (the same block of 500 categories imported many times) and found the cut-off point. This was on a shared machine. I would be very interested to hear from anyone doing such testing on a dedicated machine. There one can very quickly find the other end of the range, which will be much different from my findings. If what I found is true, it would of course be very constructive feedback for Alex, I believe, who would then know the behaviour of the script in different conditions and environments.
Re: Category stats calculation problems
I'm running a build all via telnet. After about the 30th meg of logs for the telnet session, I can see why the browser doesn't do too well (not much is printing but telnet shows all).

I'll let you know the results, when I wake up.