
Debate : Multiple categories ./. Categories Alternates :

Hello Mr. Pataki!

www.gossamer-threads.com/scripts/forum/resources/Forum9/HTML/000080.html
You wrote there earlier, and I am taking the debate further here...
Quote:
I'm not sure what you mean by "multi-categories"...

There are two ways to describe it:

1 - Category Alternates: one possibility is to submit a link into one main category and have the option of assigning it to a second, alternate category.

2 - Multiple Categories: the second is to insert the link into two or more categories at once.

Both seem confusing and eventually do the same thing. Funny. Or perhaps it is the wrong way to describe it. I only tried this in order to discriminate between the two, to describe the difference between what I think Links SQL has to offer and its inherent limitation.

Category Alternates
Studying the present design of Links SQL, it offers the possibility of inserting a link into one more category as an alternate. In doing so, only one category number is inserted into the main Links table, and the alternate category number is inserted into another table.

Multiple Categories
In such a concept the categories are inserted into the same Links table at the same time.
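The two layouts described above can be sketched side by side. This is only an illustration under stated assumptions: the table and column names (`CategoryAlternates`, `GeoCategory`, `SubjectCategory`, `LinksMulti`) are hypothetical and are not taken from the actual Links SQL schema, and SQLite stands in for MySQL so that the sketch is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Layout 1, "Category Alternates": one category table, the main
    -- category on the link row, extra categories in a second table.
    CREATE TABLE Category (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Links (ID INTEGER PRIMARY KEY, Title TEXT, CategoryID INTEGER);
    CREATE TABLE CategoryAlternates (LinkID INTEGER, CategoryID INTEGER);

    -- Layout 2, "Multiple Categories": one table per kind of category,
    -- and both category values stored on the link row itself.
    CREATE TABLE GeoCategory (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE SubjectCategory (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE LinksMulti (ID INTEGER PRIMARY KEY, Title TEXT,
                             GeoCategoryID INTEGER, SubjectCategoryID INTEGER);
""")

# The same link stored in both layouts (sample data is invented).
conn.executemany("INSERT INTO Category VALUES (?, ?)",
                 [(1, "Berlin"), (2, "Restaurants")])
conn.execute("INSERT INTO Links VALUES (1, 'Example Cafe', 1)")
conn.execute("INSERT INTO CategoryAlternates VALUES (1, 2)")
conn.execute("INSERT INTO GeoCategory VALUES (1, 'Berlin')")
conn.execute("INSERT INTO SubjectCategory VALUES (1, 'Restaurants')")
conn.execute("INSERT INTO LinksMulti VALUES (1, 'Example Cafe', 1, 1)")


def alternates_categories(link_id):
    """Layout 1: main category plus alternates, all from one Category table."""
    main = conn.execute(
        "SELECT CategoryID FROM Links WHERE ID = ?", (link_id,)).fetchall()
    extra = conn.execute(
        "SELECT CategoryID FROM CategoryAlternates WHERE LinkID = ?",
        (link_id,)).fetchall()
    return sorted(row[0] for row in main + extra)


def multi_categories(link_id):
    """Layout 2: one ID per category table, both read from the link row."""
    return conn.execute(
        "SELECT GeoCategoryID, SubjectCategoryID FROM LinksMulti WHERE ID = ?",
        (link_id,)).fetchone()
```

In the first layout both form fields would be filled from the single `Category` table; in the second, each field has its own, shorter source table.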

Let us take an example: Subject categories and Geographic categories.

In the system of Category Alternates one has to decide which has priority; let's say Geographic categories have priority. So these go into the main Links table. Then Subject categories go into the Category Alternates table.

In the system of Multiple Categories both categories go into the Links table at the same time. Multiple Categories to me means both are main categories. In both systems the result will eventually be the same, but the basic difference lies in the following points.

A - The manner in which they are submitted.
B - The manner in which they are used, for add, modify, admin, etc.
C - The way they could help in import and export as the directory grows.

To explain further, let's look at how they are added. With Alternates, the main category is simply chosen from a list of all categories in one table, and then one more category can be chosen from the same table and put into another table of alternates. So eventually one link is inserted into two categories. When we look at how this is, or could be, done in reality, the surfer is offered two fields in the add form, and both fields have the same list of categories! This list cannot be sorted out, because it is produced from the same category table. That is why I define it as Category Alternates: the category table remains the same, and the surfer is only offered an alternative possibility of inserting the link once more. So what happens in the other system...

Multiple Categories means to me that a choice among multiple sets of categories is offered to the surfer in the add form. So how can this be done? One needs two tables, so that add.cgi can produce two sets of listings from two different category tables. There could also be more category tables! So the surfer submits directly into two truly separate categories in the database.

What does the term medium to large directory mean? To me it means a directory having more than 100,000 links, as a kind of minimum starting point for the definition. Let's assume one has a project in mind with 200,000 links. This number of links has to be distributed somehow into different categories so that they are reasonably sorted. If I have 1,000 categories including sub-categories, then I will have on average 200 links in every category. (This will never be the case, of course; it is just an assumption.) Therefore I will have 20 pages of 10 links per page. The visitors will have to go through all of these if they do not want to use the search system. Yahoo avoids this very carefully, as it said in a news item. Yahoo wants to categorize every link in its right place and keep sub-categorizing further until enough value is reached to assign a category to that particular link. They said it on www.internet.com some time ago. Makes sense.

What if I have 2,000 categories? I will see all of them in the add form! Unless one goes to that particular area of submission and captures the category from that page, which Links does extremely well. But what will happen to the admin? When I have two category fields of 2,000 categories each, 4,000 category entries will be loaded per link! Alex says to use the category ID instead. This is not practical.

Therefore what is required is a table selection routine. When the script goes through the cat_list routine, a subroutine that selects a table from a global array seems perfect. This means that the subject category field produces a list of the relevant subject categories, and the geographic category field produces the relevant geographic listing, in the popup fields of the add/modify/admin area. The categories are sorted out by table. This reduces load and unnecessary transfer and takes less time for everyone, surfer or admin, in every situation. For example, to repeat...

If I have 2,000 categories, then...
The Category Alternates system that is now in Links SQL will produce 4,000 category entries across the two fields (category + alternate category) in the listing of every link. If the admin page has ten links to work on or validate per page, then 40,000 category entries will be downloaded for this one admin page!

If I receive 200 link requests per day, then to validate 200 links per day I will have to sit in front of the computer and be forced to download the following:

200 links to validate per day
2,000 categories in the one category table
2 category fields (category + alternate)
4,000 category entries to be downloaded per link for those two fields

4,000 category entries x 200 links = 800,000 category entries per day to download for the admin.

So in a nutshell, 800,000 category entries are downloaded by the admin and another 800,000 by all those who submit links per day = 1,600,000 = 1.6 million entries transferred per day, only because the categories are in one table. With this traffic of 200 link submissions per day, that is 1,600,000 x 30 days = 48,000,000 = 48 million entries transferred per month, only because of this category listing problem. If I calculate the telephone bills of the shitty German Telekom, it would amount to at least $50-$100 per month for the admin to handle those, just for the transfer time. What about the energy and the living costs of the admin? This is related to the design of how Links SQL selects and produces the list of categories! Here, the category alternate is shown with font size -2 and has to be activated to show a proper field, of course. Otherwise one has to check and do this twice, which is just as bad: every time, one has to go into the category alternate to check what has been filled in there and download the rest of the HTML data for all those links, which could only be worse!
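The arithmetic above can be checked with a short sketch. The function and parameter names are hypothetical, and the factor of two follows the post's own assumption that submitters download as many category entries as the admin does.

```python
def category_listings(categories, category_fields, links_per_day, days=30):
    """Return the transfer counts used in the post, in category entries."""
    per_link = categories * category_fields      # entries sent with each link form
    admin_per_day = per_link * links_per_day     # admin validating every link
    total_per_day = admin_per_day * 2            # admin side plus submitter side
    total_per_month = total_per_day * days
    return per_link, admin_per_day, total_per_day, total_per_month


per_link, admin_day, total_day, total_month = category_listings(2000, 2, 200)
print(per_link, admin_day, total_day, total_month)
# prints: 4000 800000 1600000 48000000
```

These are counts of list entries, not bytes; the actual transfer size would depend on how long each category name is.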

This has been argued in earlier threads. Alex says that this is a custom demand by me and he wants money for it. I think it is related to the basic programming logic, to what the program is supposed to do, and I seriously believe that Alex should come up with an upgrade immediately. This upgrade will of course help me immediately, but also every other licensed user who is trying to set up their database in MySQL tables. Later on it will be difficult to change, considering the consequences for the numbers in the categories table or in the links table. Once a person takes a direction and starts, it takes a great deal of energy to change and re-categorize everything later.

One of the most important points: does Links SQL consider the possibility of growth in terms of categories? The more categories you have, the more difficult it is going to be! Therefore, instead of considering further development in terms of producing web pages on the fly, as it says on the main page like a good salesman, I would suggest that treating this point as a priority would be best. This will help all Links SQL users and not just me. I have to come forward and say this because it affects me the most. But there may be many who want this possibility in the future, and also some who are possibly already using it.

Let us look at it another way. Tell me, Alex, that you do not agree with this point at all. Tell me that you will never consider developing this area. Then I will make my decision. I do not think this is a custom requirement and will never have any intention of paying you for it, nor do I have enough for myself. As you said, you are willing to develop a special module for the project; this may be helpful when there is some money coming in and there is a real requirement for a separate module specific to new concepts and the design of specific services. But regarding this area I do not have the slightest doubt that this is not a custom requirement, and it is undebatably a thing that needs urgent upgrading. It is not a modification but an upgrade in the interest of everyone, yours as seller and designer and ours as users, as it reduces the load on the servers and the costs. Also, I do not have the energy to come here and discuss this matter over and over again. Had it been a modification or a new feature, you have done and are doing such things in the interest of everyone, especially of the Links SQL program in general.

So what is possible? A simple table-select subroutine that prevents the duplicate display of categories everywhere, driven by a globally defined array.

I have debated this question several times and there has been only negative or neutral response. There are no users here who have the capacity to develop mods for Links SQL. The entire further development is somehow blocked, or is not going the way one would wish. It is only you who is active in bringing more mods and features. This again depends on the sales of the script. Moreover, I do not think I can develop a modification like this, nor is there anybody here who can. So one is dependent on what is offered and what is there. It is all nice and good, but one needs something more. Everyone does, to run a website. I know, Alex, you support this forum more than anywhere else. This is great. You supported me earlier with the installation problems, which was very helpful. But now this upgrade is really important. Earlier I posted this problem asking for your support so that at least I could help myself. But it does not work.

Also, for anyone debating further on this topic, please try to see this point in a broad perspective and not just in terms of whether it affects you or not. Not just a simple war of words for the sake of debate, as has happened earlier.

There are not many programs on the internet similar to Links SQL, and therefore it could be one of the best. It should never be the case, as it is with me, that one buys a product with a lot of expectations, joy, respect for the programming and love for its features, but ends up in great disappointment and dissatisfaction due to inherent limitations that make it hard for novices like me to build a project out of it. Also, if such an upgrade is not done, then people who have similar demands will have to think twice before buying this product. Had I known this earlier, I MYSELF WOULD HAVE THOUGHT TWICE ABOUT WHETHER I SHOULD BUY THIS AT ALL. I was one of the first to buy it, even before the demo. As a result, I cannot even use it, and the basic installation is hanging on the internet doing nothing. I cannot go forward. No support from you or anyone else.











[This message has been edited by rajani (edited August 29, 1999).]
Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Hello Mr. Pataki!

Quote:
There is _NO_ way of using CGI/HTML to do this

THE ONLY WAY IS CGI PERL OR EMBEDDED PERL, and not even php3, for your information. This is of course so long as the directory remains medium size. When it grows, then one has to migrate to a more mature language like C++, etc. Our intention is to make a small to medium size directory. So there is no way but to think about this.

Quote:
I can see it being implemented via Java

Java is the worst language one can think of for such matters. If it is disabled, then the submission area is gone. I have consulted a well-known professor and discussed this project and also Links SQL with him.

He was of the clear opinion that there is no other way but to talk to the programmer about it. Otherwise one can forget about this script, because the more it grows, the more difficult it becomes; it requires hundreds of modifications and is very, very expensive to use.


------------------
rajani

Re: Debate : Multiple categories ./. Categories Alternates : In reply to
I got about halfway through your post... I have to admit.

I see the problem as simply that the large number of categories makes the directory difficult to manage, since download times increase significantly.

There is _NO_ way of using CGI/HTML to do this, since the data is on the server, and CGI is stateless.

I can see it being implemented via Java, where the categories are sent once - and the program can request the 'update level' of the file on the client, and if it's older than the last database update, it's re-downloaded, if not, it's just used. Since categories are added more rarely than links, the same category download could persist indefinitely.

This Java applet allows you to search/sort/select the categories you want.

If they are put into a pick-list, it could even allow you to order them, and select which you want as the 'main' category.

There are some technical problems with this, since Java is _not_ perfect, but it certainly offers answers.

As for your main vs alternate categories, perhaps Alex would consider a modification such that category '0' is special: it flags a link as belonging to multiple categories that are listed in the category table, with NO category being 'main'. As long as more than one category exists, the link is considered a '0' link (similar to links on Unix, all links need to go before a file goes bye-bye...). If a link is deleted [from a category] and its category field is '0', Links looks at the category table and removes that category number from the link record. If that removal drops the category count to '1', then that number is inserted into the main links database rather than the '0', and the link records are removed from the category tables.
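A minimal sketch of that '0' flag rule, using plain dictionaries in place of database tables. The names `links`, `link_categories` and `remove_from_category` are hypothetical, and the behaviour for a single-category link (the link disappears when its only category goes) is an assumption based on the Unix analogy in the paragraph above.

```python
def remove_from_category(links, link_categories, link_id, category_id):
    """Remove a link from one category under the '0' flag idea.

    links           maps link ID -> category field (0 means "multiple")
    link_categories maps link ID -> set of category IDs, only for '0' links
    """
    if links[link_id] != 0:
        # Single-category link: losing its only category deletes the link.
        if links[link_id] == category_id:
            del links[link_id]
        return

    remaining = link_categories[link_id]
    remaining.discard(category_id)
    if len(remaining) == 1:
        # Back to exactly one category: store it on the link itself
        # and drop the relation records.
        links[link_id] = remaining.pop()
        del link_categories[link_id]


links = {7: 0}
link_categories = {7: {3, 5, 9}}
remove_from_category(links, link_categories, 7, 3)   # still a '0' link
remove_from_category(links, link_categories, 7, 9)   # collapses to category 5
```

After the two calls the link holds category 5 directly and has no relation records left.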

Alex -- is this possible?

Rajani -- is this what you are asking for?

I'm very confused over this, since I'm familiar with the large directories, and am planning for one, but don't see the issue beyond the excess-download-time of a giant list box -- which is a _real_ problem, I agree.
Re: Debate : Multiple categories ./. Categories Alternates : In reply to
First, in database design, 'key' fields must be unique and real. This is the "ID" number of the category. If you substantially change the category, you need to generate a new ID.

If you do it in a haphazard fashion, then you create orphan links. If you do it logically, you capture those links and re-locate them.

This is why the "LinkID" never changes. Ever, ever ever. If you re-number the database, you have essentially created a new database, and _all_ relations have to be rebuilt.

The category ID will _never_ change.

There is one and only one category ID for each category.

There is one and only one linkID for each link.

There is a category relations table to allow multiple categories to contain a single link, and a single link to be a member of several categories.

I really don't see a problem with the design of LinkSQL, except in some minor details that are easily addressable at this time, and with some technical problems such as huge drop down lists.

==> The way large databases handle the category selection is they DON'T. You have to know the category you want, or you have to add the link _from_ that category. The drop down box is a great shortcut while it works. Rather than criticising it, be glad it's there at all, and perhaps find someone who can write a Java front end for it. _THAT_ is the solution. Really. That would be a worthwhile mod, but it's not really part of LinkSQL, but an interface issue for sites with huge numbers of categories.

==> As I said before, I must be missing all the issues, since the only one I see is the size of the category drop down, which could be addressed through a trick of HTML (Letting an admin just download the list once, into a different browser window and select the numbers from that) or by a Java applet.

==> If you delve into database design, and SQL, you will see that the SQL database is capable of doing many things -- including multi-admin. All that is needed is a log-on function for the admin with the password/ID not hard coded. MySQL can give column-level access to tables, such that an admin can only update the "editor_pick" column in the database with their log-on.

==> Last night I was thinking about how to add the ratings mod in, and make the votes more descriptive. Just add a ratings table:
(realize this is still rough, and I'm typing from memory, not notes):

Code:
ID , integer
Link_ID , integer
Vote01 , integer
Vote02 , integer
Vote03 , integer
Vote04 , integer
Vote05 , integer
Vote06 , integer
Vote07 , integer
Vote08 , integer
Vote09 , integer
Vote10 , integer
Legacy_Votes , integer
Legacy_Rate , integer
Each "vote" would be added to the appropriate table entry with an increment command to the appropriate column. This could be done in _real_ time, since database access for this is quite quick.

The Legacy_ fields import the current vote and ratings, so that when new ratings are counted, the counts and ratings come out ok.

The ratings could be updated with each _build_ as usual, or even dynamically generated, depending on the activity and presentation of your site.
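A rough sketch of how such a ratings table might be used, with SQLite standing in for MySQL. The table name `Ratings`, the helper names, and the way the legacy values are folded into the average (Legacy_Rate treated as the old average over Legacy_Votes votes) are all assumptions for illustration, not part of Links SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
vote_columns = ", ".join(f"Vote{n:02d} INTEGER DEFAULT 0" for n in range(1, 11))
conn.execute(
    f"CREATE TABLE Ratings (ID INTEGER PRIMARY KEY, Link_ID INTEGER, "
    f"{vote_columns}, Legacy_Votes INTEGER DEFAULT 0, "
    f"Legacy_Rate INTEGER DEFAULT 0)"
)
# A link imported with 4 old votes averaging 8.
conn.execute(
    "INSERT INTO Ratings (Link_ID, Legacy_Votes, Legacy_Rate) VALUES (42, 4, 8)")


def add_vote(link_id, score):
    """Increment the counter column that matches the score (1..10)."""
    if not 1 <= score <= 10:
        raise ValueError("score must be between 1 and 10")
    column = f"Vote{score:02d}"  # built from a validated integer only
    conn.execute(
        f"UPDATE Ratings SET {column} = {column} + 1 WHERE Link_ID = ?",
        (link_id,))


def rating(link_id):
    """Average over legacy votes plus the per-score counters."""
    row = conn.execute(
        "SELECT * FROM Ratings WHERE Link_ID = ?", (link_id,)).fetchone()
    votes = row[2:12]                      # Vote01 .. Vote10
    legacy_votes, legacy_rate = row[12], row[13]
    total_votes = legacy_votes + sum(votes)
    if total_votes == 0:
        return None
    total_score = legacy_votes * legacy_rate + sum(
        score * count for score, count in enumerate(votes, start=1))
    return total_score / total_votes


add_vote(42, 10)
add_vote(42, 6)
```

Keeping one counter per score means the distribution of votes is available as well as the average.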

==> Hit counts. I was thinking also that the hits.db could be used to provide the "Most popular titles of the past 24 hours" or "Since last build". Like the top10 pages, this could show the most popular in the past x-hours.


The gist of this is that SQL is very powerful, and mods that took a lot of effort and energy in Links 2.0 are going to be relatively easy from a database/input/output point of view. Most time is going to be in the interface.

Alex is hopefully working on the DBSQL.pm interface, which will encapsulate _all_ calls to the Links database, and make all calls the same.

This is what I mean by leaving Links 2.0 behind. Once you enter RDBMS and SQL, the concepts of flat-file management are over the horizon.

Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Alex...

The main category idea.

I think I like my '0' ID suggestion, since it takes into account a lot of eventual features, without stigmatizing a link into one category -- which could cause problems down the line.

As for returning searches -- since searches are returned by category, not by link, returning a link in multiple categories is actually preferred, since a person might scan the categories for the place to start.

If links are returned in 'priority order' then the same link would be next to itself, since its priority would be the same.

Duplicate links could then be screened out perhaps by inserting the found link into a list, and if the value already exists, next link.
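The screening step described above might look like this minimal sketch. It assumes, hypothetically, that search results arrive as `(link_id, category)` pairs already in priority order.

```python
def unique_links(results):
    """Keep the first occurrence of each link and skip later repeats."""
    seen = set()
    unique = []
    for link_id, category in results:
        if link_id in seen:
            continue  # value already exists in the list: next link
        seen.add(link_id)
        unique.append((link_id, category))
    return unique


hits = [(1, "Germany"), (1, "Restaurants"), (2, "Germany")]
print(unique_links(hits))
# prints: [(1, 'Germany'), (2, 'Germany')]
```

Filtering after the query changes the number of rows shown, so any total count would have to be taken from the filtered list rather than from the raw result.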

Alex, I really do prefer the '0' category idea, since as the directory grows, what is the 'primary' category can blur. It would make adding/deleting categories easier since all categories are equal, and the only distinction is 1 or more than 1.

From a user point of view, a link should be _primarily_ in any category they are in, and _additionally_ listed in other categories. If they switch categories, the point of view of the database should change with them.

This keeps LinkSQL very general, and non-specific, with no hard-coded or non-standard logic.

Does this make sense?

A link exists in the _DATABASE_ not a category. The link is the leaf-data. The categories are means of categorizing the link, not defining it. Categories can and do change. The link doesn't.

Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Hello Mr. Pataki!

Quote:
but don't see the issue beyond the excess-download-time of a giant list box -- which is a _real_ problem, I agree.
I missed this point here. The issue is not just the mass production of pop-down lists, which Links SQL does playfully with no regard for how much it upsets its master, the admin! The issue is a bit more complex.

If one thinks about a directory that grows, then one also has to think about administering it at a later date. What happens to the IDs if you insert a new category? Once you start using only one Category table, it gives out an ID from that table and puts it in the CategoryID field of the Links table. If you want to place that link in a further category, then one more CategoryID comes out of the same table and goes into the Category Alternates table. So slowly everything gets nested in relation to everything else.
So once you accept this direction, the further you walk in it, the more difficult it becomes to change, because the links become nested within their tables through the CategoryIDs.

That is why now is the time to take a direction and find out whether one wants to live with what Links SQL has to offer. Any later upgrade or modification will be difficult to make. So it is not just the admin pop-up that is a problem. That is the reason why I needed to meet a professor at Berlin University to discuss the matter and show and explain the difficulties. He concluded the same. He said that converting the category table into many tables later is no problem, but that it will not be possible to change the category IDs in their fields. This point had not occurred to me, and he grasped all this even without looking at the code of the program! He severely criticized this area and said it could not be true that someone had programmed the script like this.

As you said, there is another directory to your knowledge that has 140,000 categories. I fail to imagine what this directory would look like in add.cgi and admin.cgi, especially in the add category field. The pop-up would be half a kilometer long if Links SQL had to handle this!

It was this professor who came up with the suggestion of a table selector routine. One has to define this somewhere and then use it to generate the data from the selected table. The data is then neatly sorted out and unnecessary data is not loaded everywhere. Moreover, the data in the related table could be updated and revised. For example:

subroutine table_sort
$tab_select = xxx,yyy,zzz

MYSQL SELECT FROM TABLE.$tab_select

This routine should be able to generate three lists, one per table. The professor did of course give an answer, but this is how I think it could be. In such a system the category field will not produce an unnecessary listing of categories. This applies to the user and also the admin.

I have tried to suggest this to Alex earlier, but he was not at all in the mood to help me on this point, only saying that this is a custom requirement and I need to pay him! So he claims to help everyone, but not me on this point.
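One way the table selector idea could be sketched, with SQLite in place of MySQL. The mapping `CATEGORY_TABLES`, the table names and the function `category_list` are all hypothetical; this is not how Links SQL builds its category lists. Because SQL placeholders cannot stand in for table names, the sketch only accepts names from a fixed mapping.

```python
import sqlite3

# Hypothetical "global array": form field name -> category table name.
CATEGORY_TABLES = {
    "subject": "SubjectCategory",
    "geographic": "GeoCategory",
}

conn = sqlite3.connect(":memory:")
for table in CATEGORY_TABLES.values():
    conn.execute(f"CREATE TABLE {table} (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO SubjectCategory VALUES (?, ?)",
                 [(1, "Restaurants"), (2, "Hotels")])
conn.executemany("INSERT INTO GeoCategory VALUES (?, ?)",
                 [(100, "Germany"), (300, "Canada")])


def category_list(field):
    """Return (ID, Name) rows for one form field, from its own table only."""
    if field not in CATEGORY_TABLES:
        raise ValueError(f"unknown category field: {field}")
    table = CATEGORY_TABLES[field]  # chosen from the fixed mapping, never user text
    return conn.execute(
        f"SELECT ID, Name FROM {table} ORDER BY Name").fetchall()


print(category_list("geographic"))
# prints: [(300, 'Canada'), (100, 'Germany')]
```

Each field then carries only its own list, so a form with two category fields no longer sends the full category set twice.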

------------------
rajani

Re: Debate : Multiple categories ./. Categories Alternates : In reply to
I've posted an alternate idea for handling large drop down lists at:

http://www.gossamer-threads.com/...um9/HTML/000080.html

Please follow up on that idea there.

As for Alternate Categories vs Multiple Categories, the only difference you seem to mention is in how the category lists are displayed, and on finding an easy way to enter the information.

I prefer a link belonging to a main category, as it makes it easy to determine what category a link should come up in when searching. I don't think you should get the same link twice do you?

Cheers,

Alex
Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Hello Alex,

Quote:
There are a lot of issues here, but the main one seems to be an efficient way of dealing with large categories. What do you think of the following:
1. Let people enter in a category name or a category id. So if you turn db_gen_category_list off, Links SQL will only display a text box rather than a select list. However, you don't need to enter in an ID, but can rather enter in a category name. The script will check that you entered it properly, and return an error if there was a mistake.

2. Create a "Category View" pop up window which would use javascript to display root categories. Clicking on a plus beside any category would expand or contract that tree. This would allow the admin to keep a window open to view the category hierarchy.

Number 1 is very easy, only a couple lines of code. Number 2 would take some more time.
----Point-1------------
Nice to see that you have given thought to it. Number one is a good suggestion. So people could try to enter their own category, and if Links SQL finds it, then it is done. What if not? Then this pop-up story comes back with thousands of categories listed.
Secondly, the nightmare I have is to see the kilometer-long category listings in my admin. (I am on dial-up networking and do not have a fixed DSL line like Mr. Pataki!) Actually I do not mind this kilometer-long pop-up, but I do mind the kilometer-long bills from the shitty German Telekom.

-----Point-2-------
Javascript is not the best solution. Let's say that the admin uses it. What about the users? They will have to download the entire list. So even this does not help.

Actually, both features are interesting. So it is nice to know those ideas, and I hope that they come out somewhere as a mod to help and make things comfortable.

Hello Mr. Pataki!

Quote:
The category ID will _never_ change. There is one and only one category ID for each category...

To all these comments I fully agree, and of course this is the only logical thing to do. CategoryIDs cannot be different.

What I am suggesting is a solution that can help everyone in every instance related to categories. It can only make the Links SQL script smarter. This of course means that existing users of Links v2.0 will have to close their eyes for a while, and everybody gets the option to use either multiple tables for categories or one table, via a switch in the cfg. What I suggest is the following...

There are 1000 categories... They need to be divided. Let's say into ten countries, for example.

Then
Country = Category ID Range
USA = 001 - 099
Germany = 100 - 199
Canada = 300 - 399
.....
So ten tables can be created for ten countries. Each table's numbering starts with an auto_increment at the beginning of its ID range. Therefore if New York is 001 and I add another category to this table, it is 002.

In the Canada table the Category IDs start from 300 and go up to 399.

In this system, one has sorted all the categories beforehand. Then everything becomes fine: all the handling in the users' section and also the admin. You can divide the categories into as many tables as you want, since their Category IDs WILL REMAIN the same. Makes sense!

This is what I proposed in the above message.

For this one simply needs to revise the routine that generates the Cat_gen_list. There, one small routine needs to be developed which selects the category tables. This can help tremendously, if it is figured out how. This is where only Alex can tell us whether this makes sense or is possible.

What I also considered is changing the Category index to a non-incrementing one. Then, if category numbers in a table are reserved for potential future growth, the IDs assigned before still remain the same. One can then have one's own cataloging system, which may also conform to some international standard of library cataloging.

------------------
rajani

Re: Debate : Multiple categories ./. Categories Alternates : In reply to
I don't understand why you need to break up the category tables... Just SELECT the category items you want from the table and if you don't find what you want, add.

There is no reason to have all those tables.

Basically, if you wanted to add an item to Canada, go to the add form, type 'canada' and the form will be reloaded with all the categories under Canada -- and only those categories -- in the select box.

If you want to add to Canada/Restaurants - same thing.

If you just want to 'add' you can enter the category ID or name directly.
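A sketch of that filtered select box, again with SQLite standing in for MySQL. It assumes, as an illustration only, that category names are stored as full paths separated by "/" (for example `Canada/Restaurants`); the table layout and the function `categories_under` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Category (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Category VALUES (?, ?)", [
    (1, "Canada"),
    (2, "Canada/Restaurants"),
    (3, "Canada/Hotels"),
    (4, "Germany/Restaurants"),
])


def categories_under(typed):
    """Return the typed category and everything below it, and nothing else."""
    prefix = typed.strip().rstrip("/")
    # LIKE is case-insensitive for ASCII in SQLite, so 'canada' finds 'Canada'.
    # Note: '%' or '_' in the typed text would act as wildcards here.
    return conn.execute(
        "SELECT ID, Name FROM Category "
        "WHERE Name LIKE ? OR Name LIKE ? ORDER BY Name",
        (prefix, prefix + "/%")).fetchall()


print(categories_under("canada"))
# prints: [(1, 'Canada'), (3, 'Canada/Hotels'), (2, 'Canada/Restaurants')]
```

Only the matching rows leave the database, so the reloaded form carries a short list instead of the whole category table.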

I'm sure Alex can officially implement this in a few lines of code, since all the internals and data are already there.

Is that what you are talking about?

This is probably easier than his javascript offering.
Re: Debate : Multiple categories ./. Categories Alternates : In reply to
When designing Links SQL, I did look at separating into three tables: Links, Categories and LinksCategories. The Links table would not have any mention of the category, and the relations would all be stored in LinksCategories table as in:

LinkID CategoryID

I dropped this as it added a lot of extra problems and work, for what I thought (and still do think) was little benefit. Some of the problems:

- Multiple results in searches (I don't think it should be shown, especially if sorted by priority. Removing it is not trivial, as you can't just skip it in perl as you lose the total count of results).
- It's much more difficult to handle the admin and adding/removing Links.
- A lot more prone to getting orphaned Links and orphaned Categories.
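For illustration, here is one possible way the three-table layout could return each link once while still reporting a correct total, by grouping in the query instead of skipping rows afterwards. This is only a sketch in SQLite with invented sample data; it is not how Links SQL works, and whether the same approach would have suited the MySQL version in use is not something the thread establishes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Links (ID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Categories (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE LinksCategories (
        LinkID INTEGER, CategoryID INTEGER,
        PRIMARY KEY (LinkID, CategoryID)
    );
    INSERT INTO Links VALUES (1, 'Berlin Cafe'), (2, 'Berlin Hotel');
    INSERT INTO Categories VALUES (10, 'Germany'), (20, 'Restaurants');
    INSERT INTO LinksCategories VALUES (1, 10), (1, 20), (2, 10);
""")


def search(term):
    """Return (total matching links, one row per link with its category count)."""
    pattern = f"%{term}%"
    total = conn.execute(
        "SELECT COUNT(*) FROM Links WHERE Title LIKE ?",
        (pattern,)).fetchone()[0]
    rows = conn.execute("""
        SELECT l.ID, l.Title, COUNT(lc.CategoryID)
        FROM Links l
        JOIN LinksCategories lc ON lc.LinkID = l.ID
        WHERE l.Title LIKE ?
        GROUP BY l.ID, l.Title
        ORDER BY l.ID
    """, (pattern,)).fetchall()
    return total, rows
```

A plain join would return 'Berlin Cafe' twice, once per category; grouping by link collapses it to one row and the total is counted on the Links table alone.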

I don't think any of the hurdles are insurmountable, I'm just not convinced that approach is that much better?

Quote:
There are 1000 categories... They need to be divided. Lets say ten countries for e.g.

Then
Country = Category ID Range
USA = 001 - 099
Germany = 100 - 199
Canada = 300 - 399
.....
So ten tables can be created for ten countries. Each table number starts with an auto_increment of their ID Range. Therefore If Newyork is 001 then I add anathor category to this table and its 002

To be honest, this would be a nightmare to implement. To break it up into ranges isn't worthwhile. The only hurdle you are trying to solve is the long select lists, which I think 1 and 2 would help solve. I wouldn't recommend breaking up the tables.

Cheers,

Alex
Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Hello Mr. Pataki!

Quote:
I don't understand why you need to break up the category tables...... Just SELECT the category items you want from the table and if you don't find what you want, add.
This is to break up the pop-up list of categories. Once this area is broken up, every damn thing becomes easier: all the handling by the users, searches within categories, everything. Working through 200 - 300 links is no joke, Mr. Pataki! Try starting your new project with 400,000 links, and I challenge you: if you have four thousand categories or more (roughly 1,000 categories for every 100,000 links) and you are able to run it efficiently without multiple admins, then I shall send you a bottle of Champagne, an expensive one from Berlin, as a present. Efficiency with regard to publishing, handling, ease of work for the users and the admin, etc. Let Alex be the judge! Accept it, my dear. It should however be without any extra revision, modification or change in the code. Use the neat script of the distribution.

Hello Alex!
I prefer to have this discussion here and not in another thread, because when there is a new message I get an email, and that notification is important to me not only now but also at a later date, as this acute problem is important for me to solve.
Quote:
The Links table would not have any mention of the category, and the relations would all be stored in LinksCategories table as in:....
Yes, Alex, I also agree. This would not help at all in any area of the function. Theoretically, in MySQL one can also have everything in just one table. Everything gets dumped into one. But one needs to divide the data to be able to handle it better with regard to use, exports and imports, etc. Performance is also a consideration.
This can also easily be incorporated in the main Links table anyway. Moreover, a lonely LinksCategories table on its own does not make sense for exports and imports. If there are ever problems in the indexing of the categories, with auto_increment etc., the whole database may have problems. So the value assigned to a link MUST go into the same place; the moment it is broken up, the chance of problems is automatically programmed in.

Quote:
To be honest, this would be a nightmare to implement. To break it up into ranges isn't worthwhile. The only hurdle you are trying to solve is the long select lists, which I think 1 and 2 would help solve, I wouldn't recommend breaking up the tables.
It would not be a nightmare to revise the program, nor would it be a nightmare to implement. I have practically MEDITATED on it for more than two months now, since I bought it from you. I also know what I am talking about. It is not just about solving the pop-ups. Had it been only the pop-ups, I would have taken one of several other approaches. For example, rather than spending my time lecturing here, I would simply produce a mass of add.cgi copies under different names and include all the pop-ups in them, so that they do not have to go through db_cat_gen_list at all. The time I spend here writing all this, which I have never done to this extent before, is much more than it would take to solve it through CGI scripts that insert the correct value directly into the database instead of using the table of categories.

But what then happens to the admin? What happens to the build routine, the performance, etc.?

My suggestion here is to develop a table selector routine. This means that if the data is divided, a routine determines which table to get it from. The name of the table is then a variable that is defined beforehand.

This is the only first-class solution. There is a clear advantage in all areas as follows :

1 - Admin and user pop-ups automatically become tiny, and not a byte more is downloaded than necessary. For every link, only the category lists belonging to its themes are downloaded, i.e. if a link belongs to theme 1 and theme 2, then only the sub-categories of themes 1 and 2 are downloaded.

2 - The mod of Alternate categories does not simply remain in the background but is brought to the foreground. One accepts from the start that alternate categories are not simply an extra option but the system on which a project is going to be built. For example, every link is to be inserted into two categories, so one designs the admin and add.cgi accordingly.

3 - There may be a reduction in the build time of up to 10-30%, I guess; however, Alex may strongly disagree (and will, I am sure). When there are hundreds of thousands of categories that the tiny little nph-build.cgi has to control with the help of MySQL, then offering this baby less data for each category list will give some speed to the build routine.

4 - One has then used the full power and features of MySQL instead of sticking with the limitations of the Links v2.0 categories.db listings. So buyers of Links SQL who already have a running database set up in ASCII format will only have to make minor modifications to the category listings and Links.db to be able to use the full power of the MySQL build and search routines, especially when the lists are big.

5 - Links SQL then becomes an independently designed program that accepts all Links v2.0 users but does not carry the inherent design limitations it has at the moment. As a design by itself, it then has greater value.

6 - I have suggested a table selector routine. This needs to be incorporated in DBSQL.pm! Once that is done, experts may have many more ideas for developing further mods on this table selector routine, in which the table name is a variable. This gives a wonderful possibility to have, from the beginning, a platform for future development.

7 - This idea has come up from a concrete problem. However, this table selector routine can lead to further development and use. For example, an alternative indexing method could be based on it.

8 - Keyword searches could be oriented on this table selector routine. This means that if I have many tables for keyword searches, the routine is smart enough to select from a variable table: if it finds a matching table it goes there, and if not it goes to the general table as a default. For example, if all the data on computers is stored in a specific table, then a keyword search goes much faster in that table! Instead of loading a whole search table of one million rows, categorizing it beforehand reduces the loading and the use of memory, with all the benefits that brings. The keyword is captured from the user input and passed on as a variable to the table selector routine.

9 - I would go further and think about dividing even the Links table itself. If one can divide the table into five, for example, then while building, nph-build.cgi goes through the table selector routine and builds the links from one table after another. When one has a huge amount of data stored in the table, one has to think of all this. With just one Links table, the whole table is loaded into memory. What if the server has 128 MB of memory and my database is 250 MB, of which the Links table itself takes 150 MB!!! When talking about large directories, this size of database is not a joke but a real-life situation.

So one is questioning the very fundamentals of a concept. One targets the basic design concept. A change in the approach. The concept is to develop a general routine that is able to fragment database tables into many and select among them as and where required. Only a fundamental revision of the very basic idea can help Links SQL run speedily and effectively and fly in the SQL database, not a cosmetic solution of JavaScript and other things. Those are nice and good solutions, but they only help in that particular area. I am looking for a larger benefit out of a small change. This idea of a table selector routine is deadly to me, because it gives an interesting power to Links SQL!!!

My question, Alex: is this table selector routine so difficult to develop? It is such ideas that can really help take Links SQL 1.02 forward to Links SQL v2.0! I really want to see Links SQL v2.0 be one of the deadliest, a killer on the internet. With such a strong development environment and the backing of the programmer himself for the development, Links SQL is and would be the only one that can stand in the market.

------------------
rajani

Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Rajani --

The logic behind creating extra tables and sub-ranges goes against RDBMS theory.

Alex is correct in this, but he and I disagree on the use of a relations table for Links/Categories.

I think as Links SQL grows to fill the midrange, the way MySQL fills the midrange between the low end and Oracle, it will have to use link-category relations tables.

I also disagree about orphan links, since orphan collection could be done very easily: any 'Category'=0 link that has no relations in the 'Category' table is an 'orphan' and could be moved to the 'validate' table, or another 'intermediate' table.
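
A minimal sketch of that orphan check, in Python with an in-memory SQLite database standing in for MySQL. The table and column names (Links, Category, LinksCategories, LinkID, CategoryID) are assumptions for illustration and not the actual Links SQL schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Links (ID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Category (ID INTEGER PRIMARY KEY, Name TEXT);
    -- Hypothetical relations table: one row per (link, category) pair.
    CREATE TABLE LinksCategories (
        LinkID INTEGER NOT NULL,
        CategoryID INTEGER NOT NULL,
        PRIMARY KEY (LinkID, CategoryID)
    );
    INSERT INTO Links VALUES (1, 'Board shop'), (2, 'Lost site');
    INSERT INTO Category VALUES (10, 'Computers/Hardware');
    INSERT INTO LinksCategories VALUES (1, 10);
""")

def find_orphan_links(conn):
    """Return the IDs of links that have no row in the relations table."""
    rows = conn.execute("""
        SELECT l.ID
        FROM Links l
        LEFT JOIN LinksCategories lc ON lc.LinkID = l.ID
        WHERE lc.LinkID IS NULL
        ORDER BY l.ID
    """).fetchall()
    return [link_id for (link_id,) in rows]

orphans = find_orphan_links(conn)  # link 2 has no category relation
```

The links found this way could then be moved to a holding table in a second step; that part is left out here.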

I also believe Links SQL needs a "pending" table so editors can move links from validate to an 'on-hold' status, with all record data preserved.

Anyway, I believe Links SQL will evolve, and Alex will see that how he envisions the program isn't exactly how the program gets used in the real world. Some assumptions turn out to be a bit different from reality.

This is a process.



[This message has been edited by pugdog (edited August 31, 1999).]
Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Hello Mr. Pataki!

We both agree on RDBMS, and the above proposal of a table selector routine seems to have been misunderstood.

Please read my message again and give it a thought, if interested.

------------------
rajani

Re: Debate : Multiple categories ./. Categories Alternates : In reply to
The difference is that you want multiple tables, broken up arbitrarily, in your case by country. For someone else it would have to be by another criterion, and so on.

If you keep all category information in a single table, but only SELECT and send to the browser the relevant categories, you do not increase the complexity of the logic, and have only a single extra step in the current process.
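
As a rough illustration of selecting only the relevant categories from a single table, here is a Python sketch with SQLite standing in for MySQL. The Category table layout and the helper name subtree_categories are assumptions, not Links SQL code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Category (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO Category (ID, Name) VALUES (?, ?)",
    [
        (1, "Canada"),
        (2, "Canada/Ontario"),
        (3, "Germany"),
        (4, "Germany/East Germany"),
        (5, "Germany/East Germany/Berlin"),
    ],
)

def subtree_categories(conn, root):
    """Return (ID, Name) for `root` and every category below it."""
    # Escape LIKE wildcards so a name containing % or _ is matched literally.
    escaped = root.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return conn.execute(
        "SELECT ID, Name FROM Category "
        "WHERE Name = ? OR Name LIKE ? ESCAPE '\\' "
        "ORDER BY Name",
        (root, escaped + "/%"),
    ).fetchall()

# Only the Germany branch would be sent to the browser for the add form.
germany_only = subtree_categories(conn, "Germany")
```

The full category list never leaves the server; only the three Germany rows would be used to fill the select list.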

Alex's 2nd (JavaScript) option was similar to this.

An option to _not_ send any category information until a user selects a category tree, or search term is another. This would add one extra step to the add process, but overcome the problems of a large list.

Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Hello Mr. Pataki!

It is always so that the person who has the pains understands it the most. Funny.

That's the story with me.

The idea is to generate different sets of listings in two different pop-up menus and also from two different tables. It can also be three tables or more. Conceptually, the built-in and inherent limitations of Links SQL need to be addressed and solved. The requirement of any medium or large directory is and will be the ability to classify information by categorizing it, and to administer it. This will never change.

Mr. Pataki and Alex!

Before debating and misunderstanding further please try to understand the problem again, which I do so for the last time.

Links SQL needs an urgent revision in the area of categories and how it builds the listings. Without this the program will be extremely problematic for those buyers who need to generate listings on two different themes of categories and administer them.

Currently Links SQL is programmed in a way that offers no possibility of separating the categories into different tables. This is only because of how DBSQL.pm is designed.

I need to present users, when they submit, with two different fields having two different sets of categories. One field has Geographic categories and the second has Subject categories. When they submit, both respective Category IDs should go into the Links table. That's called multiple submission of a link in two different categories.

At the moment if I use the design as it is, then both the fields will have the same listings! That looks ridiculous. How do I divide them? Which Javascript will help me?

Further, the argument for going into the category itself and submitting from there is also not that convincing. If the category sits at the third level, all users are forced to go through three steps while paying their dial-up bills. If multiple submissions are allowed, i.e. a user can submit a link in two categories, then he will be forced to take three more steps, six steps in all. Imagine 1000 links being submitted per day, including modifications (maybe not all of them get into the database): that means overload on the web server plus everyone paying extra telephone bills.

And Mr. Pataki!
When you start your directory, you may well need some day to separate your categories into two different themes in your submission area and also present them to your clients or users. How will you do it? Many users of Links SQL may have such a need. The need would be "Multiple submissions in one click (Submit button) in Multiple categories"! So having the same category listings in two different fields is stupid. This becomes ridiculous especially when one has a very long list of categories. It is nonsense to produce a list of 2000 categories twice and have the users take ten minutes to search the pop-ups.

Hello Alex!
Quote:
To be honest, this would be a nightmare to implement. To break it up into ranges isn't worthwhile. The only hurdle you are trying to solve is the long select lists, which I think 1 and 2 would help solve, I wouldn't recommend breaking up the tables.

For me, I am trying to calculate which nightmare is the lesser one. Also, breaking into ranges is only meant to keep space for future categories to come in between.

------------------
rajani

[This message has been edited by rajani (edited September 01, 1999).]
Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Multiple Submits --

There is no reason to break the database up.

If you allow for a 'search' or an add to only a sub-tree, multiple insertions could be accomplished just as in links 2.0 with a multi-select box.

I _STILL_ don't know what the problem is, and I guess I never will.

I agree the length of the list is a problem, but that can be handled more quickly, and more logically by pre-selecting what is sent from the server, _NOT_ by carving up your database.

At some point it might make sense to have several database tables for separate areas of the site, but that is not something built into Links yet. Then, you have the problem of cross database searches, and multiple connect and open commands. I don't see my site growing into millions of links -- much less categories -- if it does, I'll have probably sold out, or have a team of programmers working on a custom solution.

But should a database get that large, it would seem breaking it up into sub-tables based on the hierarchy would be a logical idea, so people are actually 'in' a sub-tree when connected to the database, and their position is controlled by a top-level program and table that has the locations of the other tables and databases.

_BUT_ this does _NOT_ address the issue you are presenting.

I still don't understand why if you want to add a link to "canada" you can't enter "canada" into the add-link field, and when it sends you the add-link form, it lists _only_ the categories under canada?

This is far simpler and more logical than carving up the database.



Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Hello Mr. Pataki!

I see more and more that a clear understanding has not yet come across. Therefore what I see as an extremely good revision of the routines is perhaps also not understood and valued for that reason. I give you a concrete example below. This may also help you, Alex and other professional eyes that come across this thread. I will perhaps also think of designing add and admin pages regarding this issue tomorrow.

---------------User side-------------------

Submit Field in Add.cgi >>> pop-up

Question :
What is your website related to? (Submit into two categories: Subject / Geographic)

Subject Categories :

Computers (Category ID = 10000)
Computers/Hardware (Category Sub_ID = 11000)
Computers/Hardware/Motherboard (Category Sub_ID = 11100)

Geographic Categories :

Where are you based?

Germany (Category ID = 1000)
Germany/East Germany (Category Sub_ID = 1100)
Germany/East Germany/Berlin (Category Sub_ID = 1110)

Therefore a click on the submit button will result in submission into the following :

The Validate table will then contain >>>
(The bracketed info will not be stored; it is shown only for explanation. Only the numbers outside the brackets will be inserted.)

In the Table ID column the result will be as follows:
(Category ID =) 10000 (Computers)
(Category ID =) 1000 (Germany)

In the Category ID column the result will be as follows:

(Category Sub_ID =) 11100 (Computers/Hardware/Motherboard)
(Category Sub_ID =) 1110 (Germany/East Germany/Berlin)

So what happens then? Each main category has its own table, here e.g. Computers and Germany. All the sub-categories will be listed in there, each having its own unique category ID. There will be no conflict. All those categories are also categorized / organised!!!

Every link has a table ID and a category ID. This becomes the primary revision in the build routine: at every step it goes to the table ID and does everything from there. It therefore no longer goes through all those long category listings, and the build routine is then much faster as well. Searches could also be handled in the same way, if a table-sensitive routine is developed, etc., etc.

The problem is how to develop the table selector routine.

@table_ids = all the rows in the table of Table IDs
(e.g. Computers, Germany, etc.)
for each $table_id in @table_ids: $XXX = $table_id

and then pass the variable into

MySQL: SELECT ... FROM $XXX

Therefore all the table IDs are passed to the MySQL command as a variable.

For e.g.
(Category ID =) 10000 (Computers)
(Category ID =) 1000 (Germany) Then...

The add.cgi will produce one pop-up for Computers and a second for Germany. The categories are classified beforehand by their respective tables. Therefore, whether one has 2,000 categories or even 20,000 does not matter, as they are always organized, separated and divided. The table selector routine could thus be a star feature of Links SQL. It helps even if you have 200 categories and want to use the Multi_Categories mod. That to me is real multiple submission in multiple categories with one click.

Through this one has solved the problem of kilometer-long pop-ups and gained faster build routines, faster searches, etc., etc.

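To make the proposal concrete, here is a rough sketch of what such a table selector routine might look like, in Python with SQLite standing in for MySQL. The per-theme table names (cat_computers, cat_germany) and the mapping CATEGORY_TABLES are hypothetical; nothing like this exists in DBSQL.pm. Because a table name cannot be passed as a bound SQL parameter, the sketch only accepts names from a fixed mapping:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cat_computers (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE cat_germany   (ID INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO cat_computers VALUES
        (11000, 'Computers/Hardware'),
        (11100, 'Computers/Hardware/Motherboard');
    INSERT INTO cat_germany VALUES
        (1100, 'Germany/East Germany'),
        (1110, 'Germany/East Germany/Berlin');
""")

# Hypothetical mapping from a top-level Table ID to its sub-category table.
CATEGORY_TABLES = {10000: "cat_computers", 1000: "cat_germany"}

def select_categories(conn, table_id):
    """Return the sub-categories stored in the table chosen by table_id."""
    # The table name comes only from the fixed mapping above, never from
    # user input, so interpolating it into the SQL string is safe here.
    table = CATEGORY_TABLES[table_id]  # raises KeyError for unknown IDs
    return conn.execute(f"SELECT ID, Name FROM {table} ORDER BY ID").fetchall()

subject_popup = select_categories(conn, 10000)  # Computers pop-up
geo_popup = select_categories(conn, 1000)       # Germany pop-up
```

Every routine that touches categories would have to go through a lookup like this, which is the extra hard coding the replies in this thread warn about.
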
------------------
rajani

Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Hello Alex!

I have seen you answering other threads, but not this one of mine. So I assume that either you need more time to think it over or there is no possible answer from your side.

If there is no answer from you I will assume that it is not possible. In that case I will need to decide, as I need to move forward: it has now been many months that Links SQL has been sitting on the web and I am not able to use it at all. So I will have to decide what to do next. I wish you a nice weekend, and thanks for all your attention and support.

------------------
rajani

Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Hi Rajani,

Sorry about the late reply! First off, let me start by saying that you shouldn't be splitting tables for performance reasons. If properly indexed, MySQL can handle tables with a million+ rows without problems. By splitting up into multiple tables, you introduce a lot of new problems. The design you describe goes against conventional database design principles, and would really introduce a lot of headaches and make the program less flexible. I would need to hard code in a lot of things to make that work.

I believe you can accomplish what you want with the current multiple category feature and some modifications (without making any changes to the fundamental design of Links SQL).

Make the visitor go to the category they want to submit to and then click add. There, they can cut and paste the name of any other categories they want, or you could provide a drop down list of a subset of categories (i.e. only regional cats) and let them pick. Yahoo goes with the first approach and lets people cut and paste the info in.

You would then just need to modify add.cgi so that the information is passed through properly to the admin.

This keeps everything in one table, and makes it much easier to manage.

Hope this helps,

Alex
Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Hello Alex!

Quote:
This keeps everything in one table, and makes it much easier to manage.
In this respect you are right. The main data remains in one table. This means the Category and Links tables remain the same. I did not propose a change here at all!!!

The change I proposed is different. What I proposed is something that I see clearly, and I now also believe that you do not and will not understand it. The proposed change keeps and administers the data storage within one table, but to ease add and admin, different tables are used. Alex, tell me whatever you want, but there is NO OTHER WAY but to make a revision of some kind. I am in contact with the people who programmed www.Fireball.de!!! Regarding this area, they understood me very fast, and even without looking at your code in detail they said the same after looking at your demo.

Alex, I would like to make a stepwise revision of the design of the internals. Pasting that code here does not make sense. Can I email you regarding this?

The entire revision is so clear in my mind. I see it so clearly that it can be an important overall change. But it is hard to explain it to others.

------------------
rajani

Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Well, you are right, I don't understand. :) Perhaps you can outline in SQL what you think the underlying MySQL table should look like?

As always, please feel free to email me as well.

Cheers,

Alex
Re: Debate : Multiple categories ./. Categories Alternates : In reply to
You can always email Alex... he's very approachable.

If those others have seen what you are saying, then perhaps they have a solution?

But if any of them include 'assigning' a range of numbers to a category and/or putting the categories into separate tables, then they are against database design principles.

Why? There are many, many solutions to the problem of a large number of categories, all of which keep the categories in a single table and allow the RDBMS to operate as it was designed to, with no arbitrary ranges.

If you break the categories up into separate tables, it makes searches much more difficult, and assigning ranges as you said -- 400-500 to this country, and 501-600 to that country -- makes _NO_ sense at all in database theory.

If you assign 1000 numbers to Germany, what happens when you hit 1001 entries? Also, what about all the 'holes' in the categories? Wasted space that other topics might have been able to use.
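
The overflow and the 'holes' can be shown with a few lines of Python. The ranges are the ones from the proposal quoted earlier in the thread; the function names are made up for this sketch:

```python
# ID ranges copied from the quoted proposal; Germany owns 100-199.
RANGES = {"USA": (1, 99), "Germany": (100, 199), "Canada": (300, 399)}

def next_category_id(used_ids, country):
    """Return the lowest free ID in the country's range, or fail if it is full."""
    low, high = RANGES[country]
    for candidate in range(low, high + 1):
        if candidate not in used_ids:
            return candidate
    raise OverflowError(f"{country} has used up its range {low}-{high}")

def reserved_but_unused(used_ids):
    """Count IDs that are reserved for some country but not assigned."""
    return sum(
        1
        for low, high in RANGES.values()
        for category_id in range(low, high + 1)
        if category_id not in used_ids
    )

# Germany has filled all 100 of its slots; USA and Canada are still empty.
used = set(range(100, 200))
try:
    next_category_id(used, "Germany")
    overflowed = False
except OverflowError:
    overflowed = True

holes = reserved_but_unused(used)  # 99 for USA + 100 for Canada
```

Germany cannot take another category even though 199 reserved IDs sit unused in the other ranges.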

The database already can index:
Code:
Germany (Category ID = 1000)
Germany/East Germany (Category Sub_ID = 1100)
Germany/East Germany/Berlin (Category Sub_ID = 1110)

Just the way it is, all in one table, and if you wanted "Germany/East Germany" all you'd have to do is ask the database to return matching entries -- no code numbers needed.

If you wanted 'Berlin' it would find _ALL_ berlins --- even those not in Germany. You don't have to deal with a code number at all.

IF what you want is some _CONTROL_ over the layout of the database, why not add a field to the categories "Sub_Cat_Num" and put in whatever information you want? Don't worry about what Links is doing to keep track of the categories -- let the database system do what it does best. Just add a field, and do with it what you want.

Part of database design is to NEVER change the unique identifier. That means if you had to re-number your categories because you ran out of room, or had to make a layout change, you'd violate several "rules" of database design. If you add a field that you manipulate via certain rules, so that when you re-number, all attached sub-categories and links are also renumbered, then you are following the design theory.

I think you've gotten caught up in the trees, and forgot all about the forest growing up around you.



Re: Debate : Multiple categories ./. Categories Alternates : In reply to
Hello

Quote:
You can always email Alex... he's very approachable.
Till now I have preferred posting on this forum because:
I like the way the discussion goes on further here, especially with your (Mr. Pataki's) reactions. It also gives others a chance to participate, if not now, maybe in the future. Thirdly, I hate to go through thousands of emails to keep the continuity of the theme of a discussion. Here this is very, very convenient. I just have to glance at my messages and then read the answer. That's just SOOOOO practical. When it comes to the Perl code, I may send him an email.

Moreover, in the past I wrote to him and the answer was two lines, and I do not like such a discouraging answer. I hate to see myself in a situation where I believe strongly in something and spend time on it by writing, thinking, etc., and Alex does not react or answer in any way that could be satisfactory. I do, however, understand his little, absent or cold reaction.

Quote:
If those others have seen what you are saying, then perhaps they have a solution?
I did not show them the Perl script design. I explained the logic to them, and programmers understand very, very fast. When I told one of them that all the categories are stored in one table, he reacted, even before I described my problem: "But then how the hell will the users submit? You will have a mile-long pop-up list!!! Or you go for capturing the path from the location, which is a very uncomfortable way. ....Well, you can't make a big project out of such a script!"

As you and Alex said, MySQL can handle a million+ rows, but that's hard for one person to handle. One has to have a user-friendly interface in between.

One of the striking ideas he gave was this:
""Use an Absolute ID instead of a relative ID!""

Currently Links uses a relative ID generated by MySQL, and that's the second problem. If a system of Absolute IDs were also built in, all future work would be very easy. It was the biggest disappointment to learn of this inherently lacking feature. The suggestion is that Links SQL should use the following:

The columns are then -
MySQL_ID Links_ID Main_Geographic_Cat_ID Main_Subject_Cat_ID

Where

Links_ID = Main_Geographic_Cat_ID.MySQL_ID (or in worst case Main_Geographic_Cat_IDMySQL_ID)

Such a system helps tremendously for all further programming subroutines. Here Main_Geographic_Cat_ID is chosen as the priority simply to have a unique handle. A classic example of a geographic ID is the telephone country and city code, giving country&citycode.MySQLID
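
A small Python sketch of the composite ID format described here. The helper names make_links_id and split_links_id are invented for illustration, and the values 1110 and 42 are just sample numbers:

```python
def make_links_id(geo_cat_id, mysql_id):
    """Build Links_ID as Main_Geographic_Cat_ID.MySQL_ID."""
    return f"{geo_cat_id}.{mysql_id}"

def split_links_id(links_id):
    """Recover (Main_Geographic_Cat_ID, MySQL_ID) from a dotted Links_ID."""
    geo_part, _, row_part = links_id.partition(".")
    return int(geo_part), int(row_part)

links_id = make_links_id(1110, 42)   # category 1110 (Berlin), row 42
parts = split_links_id(links_id)

# The "worst case" without a separator is ambiguous:
# category 1110 + row 42 and category 11104 + row 2 give the same string.
ambiguous = f"{1110}{42}" == f"{11104}{2}"
```

With the dot the two parts can always be recovered; without it, two different links can end up with the same Links_ID.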

So today I may have ten categories. Next month I may increase to one hundred and still I can work with the category IDs manually. Later I may have thousands. This should be possible to handle, and Links SQL currently offers no way to do this.

Here we are not giving a range to MySQL to define the IDs, but simply to the admin for administration. The intelligent part of the story is that the Category_IDs are no longer relative IDs but are controlled and assigned by the administrator for better handling in the future. Hence one has to decide upon a system that leaves room for new Category_IDs in between existing ones.

It is this basic understanding that makes it possible to develop a table selector routine in Perl. At the end of the day all the final categories and the Links data will be entered into the two main tables. But these additional subroutine revisions help at all levels of handling it. One simply uses it in a way that lets one work comfortably.

Quote:
Part of database design is to NEVER change the unique identifier.
And where does this unique identifier come from? Is it based on MySQL auto_increment, or is it part of the data itself, e.g. a category value assigned by the administrator as described above? With the current design, I shall have geographic and subject category IDs mixed up, following no system that I can handle with ease. This will always be the case for everyone who intends to use Links SQL for a large project. Alex claims on the main page that it is for use with large directories; mine is going to be one, and I am thinking about how to use it. There is NO WAY that I can dare to use it, at least not in the way it has been claimed. So I am not caught up in the trees, but I am scared of being lost in the forest that the category area of the programming is going to create ;) Moreover, once the trigger is pressed and the start is given, the database fills up every day, and changing the values within the database later on is again a nightmare.

------------------
rajani

Re: Debate : Multiple categories ./. Categories Alternates : In reply to
This thread is getting a bit long, but.. ;)

From the last message, I understood:

1. You don't like the fact that categories are assigned an increasing integer as their unique identifier.

2. You think that managing a list of categories that are identified by just a number will be unmanageable.

3. You agree that in the end, the categories should be stored in a Category table and the links in a Links table. You just think there should be some support for managing how the categories are displayed.

My proposed solution:

Leave things as is, but work off the category name instead. It would be very easy to display a list of a subset of categories in the add form, i.e. "Select a region to add your link to:", and then you just search for categories whose names start with 'Regional/'.
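
A sketch of how two separate select lists could be filled from the one Category table by name prefix, in Python with SQLite standing in for MySQL. The 'Regional/' prefix is from the post; the 'Computers/' prefix, the field names geo_cat and subject_cat, and the select_box helper are assumptions, and the prefixes are assumed to contain no % or _ characters:

```python
import sqlite3
from html import escape

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Category (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO Category (ID, Name) VALUES (?, ?)",
    [
        (1, "Regional/Germany"),
        (2, "Regional/Germany/Berlin"),
        (3, "Computers/Hardware"),
    ],
)

def select_box(conn, field, prefix):
    """Render a <select> holding only categories whose names start with prefix."""
    rows = conn.execute(
        "SELECT ID, Name FROM Category WHERE Name LIKE ? ORDER BY Name",
        (prefix + "%",),
    ).fetchall()
    options = "".join(
        f'<option value="{cat_id}">{escape(name)}</option>' for cat_id, name in rows
    )
    return f'<select name="{field}">{options}</select>'

# Two different pop-ups on one add form, both fed by the same table.
geo_html = select_box(conn, "geo_cat", "Regional/")
subject_html = select_box(conn, "subject_cat", "Computers/")
```

Each field shows a different subset, so the two pop-ups no longer carry identical listings, and the categories stay in a single table.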

The add form won't have a mile long pop up list as you put it, as your subscribers will go to the category first, then click add (this is exactly like Yahoo does it, a very large project). The admin screen can just enter in the name of the category, instead of trying to remember some citycodephonecode.mysqlid ?!

Cheers,

Alex
Re: Debate : Multiple categories ./. Categories Alternates : In reply to
rajani -- I think it is now _you_ who does not understand :)

The auto-increment "category ID" is the unique links identifier. Forget about it. It's what links uses to tie the category names to the links. That's _all_ it does.

Feel free to add a field to the category database to handle your "Country_Code" or "Geographic_Code" and you can assign them, and use them as you see fit. You can decide that Germany is 4000-4999 and it does not affect links. If you then want to find all the Germany links, rather than searching on "Germany" in the category name, you could use the identifier field and search for "Country_Code >= 4000 && Country_Code <=4999"
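
As a sketch of that extra field, again in Python with SQLite standing in for MySQL: the Country_Code column name and the 4000-4999 range come from the paragraph above, while the sample rows and the helper name are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category (ID INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT);
    INSERT INTO Category (Name) VALUES
        ('Germany'), ('Germany/Berlin'), ('Canada');
    -- Admin-controlled code, kept separate from the auto-increment ID.
    ALTER TABLE Category ADD COLUMN Country_Code INTEGER;
    UPDATE Category SET Country_Code = 4000 WHERE Name = 'Germany';
    UPDATE Category SET Country_Code = 4010 WHERE Name = 'Germany/Berlin';
    UPDATE Category SET Country_Code = 5000 WHERE Name = 'Canada';
""")

def categories_in_range(conn, low, high):
    """Return category names whose Country_Code lies in [low, high]."""
    rows = conn.execute(
        "SELECT Name FROM Category "
        "WHERE Country_Code BETWEEN ? AND ? ORDER BY Country_Code",
        (low, high),
    ).fetchall()
    return [name for (name,) in rows]

german = categories_in_range(conn, 4000, 4999)

# Renumbering the admin code leaves the unique ID untouched.
conn.execute("UPDATE Category SET Country_Code = 4100 WHERE Name = 'Germany/Berlin'")
(berlin_id,) = conn.execute(
    "SELECT ID FROM Category WHERE Name = 'Germany/Berlin'"
).fetchone()
```

The auto-increment ID that ties links to categories never changes, however often the Country_Code values are rearranged.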

But, this is great if you need to do data input.

It's _MUCH_ better just to leave it as it is, and add in a routine -- as Alex has described -- to list the categories that fit a certain pattern.

SO, you want to add "Germany" links. You don't want to download all the categories. Just go to the "Germany" area of your site, and enter the links.

In the Admin area, it would be easy to have a two-step insert. When prompted to add a link, you are prompted to first select a category -- you would enter "Germany" and the next page would be your regular entry screen with _ONLY_ the subcategories containing "Germany" in them.

You could arbitrarily say that you do not want more than 500 "hits" returned. So, the program would prompt you "Germany returned 1200 hits, please narrow your request" and you would enter "Germany Berlin" and the program would then show all categories for "Germany/Berlin" and hopefully that would be less than your limit of 500.
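
The two-step search with a hit limit could look roughly like this in Python, with SQLite standing in for MySQL. The function name find_categories and the sample categories are assumptions; the limit is lowered to 2 in the usage lines only to keep the sample data small:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Category (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO Category (Name) VALUES (?)",
    [
        ("Germany",),
        ("Germany/Berlin",),
        ("Germany/Berlin/Mitte",),
        ("Germany/Bavaria/Munich",),
        ("Canada/Ontario",),
    ],
)

def find_categories(conn, query, limit=500):
    """Return (names, None), or (None, message) when there are too many hits."""
    terms = query.split()
    if not terms:
        raise ValueError("enter at least one search term")
    # Every term must appear somewhere in the category name.
    where = " AND ".join("Name LIKE ?" for _ in terms)
    params = [f"%{term}%" for term in terms]
    (hits,) = conn.execute(
        f"SELECT COUNT(*) FROM Category WHERE {where}", params
    ).fetchone()
    if hits > limit:
        return None, f"{query} returned {hits} hits, please narrow your request"
    rows = conn.execute(
        f"SELECT Name FROM Category WHERE {where} ORDER BY Name", params
    ).fetchall()
    return [name for (name,) in rows], None

too_many, message = find_categories(conn, "Germany", limit=2)
narrowed, _ = find_categories(conn, "Germany Berlin", limit=2)
```

Only the narrowed list would be used to build the select box on the regular entry screen.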

Does this make sense? It does what you need/want in a way that still keeps the database "according to the rules"

Alex is addressing the large category issue in a way that will address all the concerns of a large database.

I think that when you see his solutions, you will understand what he is saying. Hopefully the next release is due out soon :)