Gossamer Forum

Search Engine Needed.

Here is the search engine I need:

Please note that I have modeled this Search Engine on the Windows NT OS as far as INTERNAL functionality goes. In other words, each CGI module needs to be SOMEWHAT independent of the others, but they need to work together at times.

1 The main page for the Search Engine needs to be Default.htm
2 On the main page the following need to be possible:
A Anything I wish to place on the main page should be possible.
B The search interface with Optional City, State, Country
C The Default Country, USA
D A list of main Categories in 3 columns/2 with a link to a RANDOM search output for that category.
More on this later.

The Add URL Interface:

When someone wants to add a URL the following must be on the form;

1 Website/Company Name (NOT optional)
2 Webmaster's Name/Owner's name (NOT optional)
3 Business address (PHONE NUMBER AS AN OPTION)
4 City, State, and Country (NOT optional)
5 A REAL email address which will be added to a database/email program that uses Sockets, so I can announce new features etc. to all users. (NOT optional)
6 A password will be emailed automatically to the user's email address for future reference/Optional ability to change KEY WORDS ONLY.
7 URL of Website (NOT optional)
8 Key word entry field. NOTE: ONLY 3 KEY WORDS ALLOWED AND THEY MUST BE SEPARATED BY A SPACE. The number of characters per word is limited to the length of the longest word in our dictionary. (NOT optional)
9 SHORT description no more than two lines (20 words MAX). (NOT optional)
10 Category (NOT OPTIONAL) The user MUST type in a Category where he/she wants the URL to be placed. The Category MUST match ONE of the 3 KEY WORDS submitted above.
11 IMPORTANT: THE SEARCH ENGINE NEEDS TO PERMIT DOMAIN NAMES ONLY! Example www.webster.com or org, or whatever. Anything after the .com or .net or whatever text follows the last period must not be permitted.
THIS SEARCH ENGINE MUST ONLY ACCEPT WEBSITES WITH THEIR OWN DOMAIN NAME AND SUBDOMAINS. If someone wants to submit something like www.test.webster.com that is fine. BUT, if someone wants to submit something like www.webster.com/test it must not be permitted.

12 Domains such as http://www.jpost.co.il/ MUST be permitted. So the search engine needs to filter out the trailing /, and that is why the http:// has to be inserted automatically, after the / has been filtered out of the URL.

Also, the http:// must be automatically inserted in the form so the user does not have to do it.
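
The domain-only rule in items 11 and 12 could be sketched roughly as follows. This is only an illustration in Python, not part of the requested scripts: the function name normalize_submission and the exact error messages are made up here, and the sketch assumes that a port or a user name in the URL should also be rejected, which the post does not say.

```python
import re
from urllib.parse import urlsplit

# One DNS label: letters, digits and inner hyphens (assumed rule, not from the post).
LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?")


def normalize_submission(raw):
    """Return 'http://host' for a bare domain name, or raise ValueError."""
    text = raw.strip()
    # Insert http:// automatically so the user does not have to type it.
    if "://" not in text:
        text = "http://" + text
    parts = urlsplit(text)
    if parts.scheme != "http":
        raise ValueError("only http:// addresses are accepted")
    # A lone trailing slash is filtered out; any real path is refused.
    if parts.path not in ("", "/") or parts.query or parts.fragment:
        raise ValueError("only domain names are accepted, not pages or folders")
    host = (parts.hostname or "").lower()
    # Assumption: ports and user names are not wanted either.
    if parts.netloc.lower() != host:
        raise ValueError("ports and user names are not accepted")
    labels = host.split(".")
    if len(labels) < 2 or not all(LABEL.fullmatch(label) for label in labels):
        raise ValueError("not a valid domain name")
    return "http://" + host
```

With this sketch, www.test.webster.com and http://www.jpost.co.il/ are accepted (the latter stored without the trailing slash), while www.webster.com/test raises an error.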

The password must work with cookies just as it does with UBB (www.ultimatebb.com). The user name must be the full name of the submitter.

The Engine Itself:

The search engine must work as follows (internally):

1 A The Search Engine needs to create a directory for each letter of the Alphabet (A, B, C, etc.) and then create a Directory for EACH Country Code. Under each country code, create a directory for each letter of the Alphabet, then create the State (if it does not exist). Under the state, create a directory for each letter of the Alphabet, then the city under its proper alphabet letter (for example, if a user submits Nashville, TN, USA). Create a directory for each letter of the Alphabet under the city as well, and then go on to step 2.

2 When someone submits a website, the search engine needs to do the following:
A Take the Category suggested by the submitter, and create the directory for that category under the Directory that begins with the first letter of the Suggested category.

B Create the Suggested Category ONLY if it does not EXIST. If it exists, then The SEARCH ENGINE HAS to create an additional SUBDIRECTORY that is named with the first letter of each key word submitted.
For example: let's say that I submit a website with the following suggested category: Computers
Key Words: Windows NT Training
The submitted page would be under d: (or whatever drive the engine is on; we will use D: for this example)
d:\index\C\US\T\TN\N\Nashville\C\Computers\W\Windows\N\NT\T\Training\1.htm
If \Windows\NT\Training already exists, because it might have been created when someone else submitted the same thing, then create the 1.htm (or whatever the next available number is). If 1.htm already exists, then create 2.htm; if 2.htm exists, then go to the next number that is available, etc.
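
To make the directory scheme concrete, here is a rough Python sketch that rebuilds the example path and picks the next free page number. The names listing_dir and next_page_name are invented for this illustration. It assumes the top-level letter is the first letter of the category, which is inferred from the example path only, and it takes "next available number" to mean the lowest unused one.

```python
from pathlib import PureWindowsPath


def listing_dir(root, country, state, city, category, keywords):
    """Build the directory for a submission, following the example path."""
    parts = [
        category[0].upper(),  # assumed: top-level letter comes from the category
        country,
        state[0].upper(), state,
        city[0].upper(), city,
        category[0].upper(), category,
    ]
    # Each key word gets a letter directory followed by the word itself.
    for word in keywords:
        parts += [word[0].upper(), word]
    return PureWindowsPath(root, *parts)


def next_page_name(existing_names):
    """Return the lowest unused 'N.htm' given the file names already there."""
    used = set()
    for name in existing_names:
        stem, _, ext = name.rpartition(".")
        if ext.lower() == "htm" and stem.isdigit():
            used.add(int(stem))
    number = 1
    while number in used:
        number += 1
    return f"{number}.htm"
```

In a real script the existing names would come from something like os.listdir on the target directory, and the directories would be created with os.makedirs(path, exist_ok=True) so that an already existing category is simply reused.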

NOTE: The Alphabet directories and the Country Codes subdirectories and their respective Alphabet subdirectories should be created by default by the search engine by using an INITIALIZE script, and this script should create all necessary files etc.

C The HTML Pages
When the 1.htm is created, it MUST be as follows:
1 The Engine needs to automatically create the <meta name="keywords" content="windows NT Training"> tag. (Whatever key words the submitter entered need to be inserted between the "". I used Windows NT Training as an example.)
2 The <meta NAME="Author" CONTENT="Solomon Y. Tulbure"> Needs to be inserted. I used my name as an example, but the submitter's name would be here.
3 The <title>Karaite Net</title> Needs to be the COMPANY NAME/WEBSITE NAME separated by a , (EACH WORD IN THE "" MUST BE
4 The <body> Needs to contain the Website/Company Name, Description entered above by the Submitter and the URL with its appropriate LINK to the website/The Name of the Website/Company LINKED to its URL.
5 IMPORTANT! On the RIGHT side of the page, the search engine MUST ALSO insert the "banner exchange code" for that directory; more on banner code later.
6 The search results pages must be within the main page and have to use include-bots created dynamically.

7 The search engine must also keep a records database of the key words MOST used by visitors, which can be viewed by administrators.

8 The search engine MUST only search for complete words NOT partial.

9 The search engine must keep a searchable database, with its own search engine, of all domain names. The database must NOT be in htm or text format. It has to be accessible only via the domain search, so people can search and see if the domain is already registered.
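
Items 1 to 4 of section C (the generated N.htm page) might look something like this in Python. It is a sketch only: render_listing is a made-up name, the banner and rating code from later sections is left out, and the comma rule for the title is cut off in the post, so the title here is simply the site name.

```python
from html import escape


def render_listing(site_name, author, keywords, description, url):
    """Return the HTML for one submitted site as a single string."""
    name = escape(site_name)
    keyword_text = escape(" ".join(keywords))
    link = escape(url)
    lines = [
        "<html>",
        "<head>",
        # Item 1: the key words exactly as the submitter entered them.
        f'<meta name="keywords" content="{keyword_text}">',
        # Item 2: the submitter's name.
        f'<meta name="Author" content="{escape(author)}">',
        # Item 3: company/website name (comma rule not implemented, see above).
        f"<title>{name}</title>",
        "</head>",
        "<body>",
        # Item 4: name linked to its URL, followed by the description.
        f'<p><a href="{link}">{name}</a></p>',
        f"<p>{escape(description)}</p>",
        "</body>",
        "</html>",
    ]
    return "\n".join(lines)
```

Escaping every submitted field before it goes into the page keeps a stray quote or angle bracket in a description from breaking the generated HTML.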

D 1 The search engine MUST check and see if the description and key words contain any forbidden words. This Search engine will NOT allow Adult/pornographic websites. (There needs to be an ON/OFF switch so that they may be allowed later if I change my mind.) So the filter should be optional in the admin page.

2 The search engine MUST go to each URL and search for the Default.htm and/or Index.htm and any file that begins with Index or Default, and search the files to see if they contain any of the words forbidden by this search engine. If yes, then place the NEW submission in a special database where I have to approve the site or not.

If it does not contain any of the forbidden words then automatically approve the website. Something fancy and cute must be displayed on the screen letting the SUBMITTER KNOW that it is IN PROCESS and he/she should wait a little bit.
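
The filter in section D could be sketched like this in Python. The names find_forbidden and triage and the two status strings are invented for the example. Fetching Default.htm/Index.htm is left out (the page text is passed in), and the sketch assumes that a hit in the description or key words is treated the same way as a hit in the fetched page, which the post does not spell out. Matching is on complete words only.

```python
import re


def find_forbidden(text, forbidden, enabled=True):
    """Return the forbidden words that occur in text as complete words."""
    if not enabled:  # the admin ON/OFF switch
        return set()
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return words & {word.lower() for word in forbidden}


def triage(description, keywords, page_text, forbidden, enabled=True):
    """Decide whether a submission is approved or held for manual review."""
    combined = " ".join([description, *keywords, page_text])
    if find_forbidden(combined, forbidden, enabled):
        return "pending_review"  # goes to the special approval database
    return "approved"
```

Because the text is split into words before comparing, a forbidden word only triggers when it appears on its own, not as part of a longer word.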

Once the search is done a yes/no type message needs to appear just like INFOSEEK. On EVERYTHING the search engine displays in visitors' browsers, the Main banner that is on the front page of the website needs to be displayed, BUT this banner will be part of an internal Banner Rotation system, so the code for the banner needs to be inserted in EVERYTHING the search engine displays, with the exception of the search results, where a different code is displayed next to every search result. This banner will be small, so don't worry.

E 1 The search engine MUST automatically RATE every page by the amount of visitors it gets.
2 The rating must be done by a 5 star system. For example, the pages most visited need to be rated with 5 little cool looking stars.
3 The rating must have a level of ratio that uses % for rating. For example, the page that has 125% more visitors than the one before it should get the 5 stars. The page that has 110% gets 4 1/2 stars, the pages that have 100% 4 stars, the pages that have 80% 3 1/2 stars, the ones with 50% 3 stars, the ones with 30% 2 stars, the ones with 10% 1 star and the ones with less than 10% no stars.
4 The ratings code must be inserted into every page when submitted and the rating stars should appear next to the Title of the Page/name of Page/Company Name.
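
The star table in item 3 can be written as a simple lookup. This Python sketch only maps a percentage to a star count; how the percentage itself is computed ("more visitors than the one before it") is not fully defined in the post, so it is taken as an input. STAR_TABLE and stars_for are made-up names, and values between two thresholds are assumed to fall to the lower star count.

```python
# (minimum percent, stars), highest threshold first, as listed in the post.
STAR_TABLE = [
    (125, 5.0),
    (110, 4.5),
    (100, 4.0),
    (80, 3.5),
    (50, 3.0),
    (30, 2.0),
    (10, 1.0),
]


def stars_for(percent):
    """Return the star rating for a visitor percentage."""
    for threshold, stars in STAR_TABLE:
        if percent >= threshold:
            return stars
    return 0.0  # less than 10%: no stars
```

Keeping the thresholds in one table means they can be changed later without touching the lookup code.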


The search options are as follows.
1 Search by Key-Words as default, and by key words with City and/or State and/or Country, or all.
2 IMPORTANT! The search engine MUST search for pages that match the search and display them at random.
3 In other words, ALL submitted websites MUST have an equal EXPOSURE ratio.
4 Example: let's say I have 20,000 pages under d:\index\C\US\T\TN\N\Nashville\C\Computers\W\Windows\N\NT\T\Training\
If someone uses key words that match any or all of the pages in this category the engine needs to pick the ones to be displayed in the first screen AT RANDOM! If the visitor wants more, the "NEXT 15" button will be pressed or a new search might be done. When/IF the "NEXT 15" button is pressed, another 15 search results will be displayed, BUT all of the next 15 MUST be chosen at random. In other words, a new search will be done once the "NEXT 15" button is pressed, with the SAME key words as before, BUT the visitor will not know that he/she/it is actually doing another RANDOM search and does not need to know.
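
The random "NEXT 15" behaviour in items 2 to 4 might be sketched like this in Python. The function name random_page is made up, and the list of matching pages is assumed to have been found already. As the post describes, each call is an independent draw, so a page shown on the first screen can show up again after "NEXT 15".

```python
import random


def random_page(matches, page_size=15, rng=None):
    """Pick up to page_size matching pages uniformly at random."""
    rng = rng or random
    # sample() never repeats an item within one screen of results.
    return rng.sample(matches, min(page_size, len(matches)))
```

Passing a seeded random.Random(...) as rng makes the draw repeatable for testing; in normal use the default module-level generator is enough.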

5 On ALL search results there needs to be a COOL button that says "File Complaint". This is where another database needs to be created in a separate directory. When the visitor presses this button they will go to another screen that MUST submit ALL of the following:
A Registration info with REAL Name
B Real Mailing Address
C Real Email
D Credit Card INFO and Social Security #
E Phone Number (optional)
F Website and Company against who the Complaint is made which MUST include URL.
G Evidence to validate complaint. Here he/she should be able to insert text, comments and
upload a file/document as proof of the allegations made.
H Agree to an agreement that tells her/him that she is responsible for the complaint and Nobody else in the world can be
accused of the complaint. Also that she assumes FULL responsibility for this complaint and does not and will never hold/claim that the owners of this website and company are responsible for any and all of the allegations made. This agreement will be designed by me and I must have access to create it and change it.
I Agree to the fact that he/she will not claim a refund under any circumstances ever, and is not entitled to a refund ever for any reason and will never try to obtain a refund and renounces all legal rights to a refund.
J The Price (which I should be able to change later, the amount I will charge); default is $10.00 US
K Understand that the complaint will be filed immediately and recorded and the fee will be charged immediately, but the complaint will NOT be available until a signed written letter with a copy of a Driver's License is mailed to us, and with this letter, the user name and password he/she received in the EMAIL MUST be included. Any other evidence is welcome. This will be kept in a Safe Deposit Box and becomes OUR property, and we Reserve the right to Never release any of these documents unless the Court systems need them. The Documents and letter MUST be mailed via Registered mail, and certified by a Notary-Public Official.

IMPORTANT: Once the submit/Agree-Submit button is pressed, The SCRIPT/Program needs to check and see if the URL exists in our database. If it does, then submit the info to the database, and email a copy to the webmaster with the email subject URL COMPLAIN. Place the record in the database and have the button for MAKE PUBLIC in the Admin page, so that once the Letter and Documents are received by us in the mail I can approve the document for public access, which will be displayed to a visitor by a separate search feature which we will call Website Business Credit Bureau.

Website Business Credit Bureau

1 Under this button, once it's pressed, visitors will see a search option which will search the complaints database and bring up a list of matches for the key words, which must be a URL, a company name or part of a company name, each linked to its respective complaint file.
2 There will also be a list of the top 100 websites with the most complaints, each linked to the complaint file where all the information will be displayed, all complaints. If more than one complaint exists, they should be listed and linked to each complaint file.
3 A visitor will have a button on this page called REMOVE COMPLAINT, and here the visitor must enter all the info the complainer submitted, all that apply that is, but this time a REAL Phone number MUST be supplied, as well as a written certified letter.
4 The visitor must include any info and reason why the complaint is not true/there was an agreement between the two parties.
5 Otherwise the record will remain in the database for 7 Years, when it will be automatically deleted by the engine. The engine/Program-CGI Script needs to check the date once a month, and if it finds any complaints that were filed more than 7 Years ago, it should delete them and all records of them.
6 If the complaint has been settled, a written letter certified by a Notary Public must be mailed with a Money Order for the amount of $10.00 US
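
The 7 year cleanup from item 5 could be sketched as follows in Python. The names expiry_date and purge_expired and the record layout (a dict with a "filed" date) are assumptions for the example; the post does not say how complaints are stored. A complaint is dropped only once more than 7 years have passed, so it is still kept on the exact anniversary.

```python
from datetime import date


def expiry_date(filed):
    """Return the date exactly 7 years after the filing date."""
    try:
        return filed.replace(year=filed.year + 7)
    except ValueError:
        # Filed on 29 February: the year 7 later is never a leap year.
        return filed.replace(year=filed.year + 7, day=28)


def purge_expired(complaints, today):
    """Return only the complaints that are not yet older than 7 years."""
    return [c for c in complaints if today <= expiry_date(c["filed"])]
```

The monthly check would then just call purge_expired with the current date (for example date.today()) and write the surviving records back.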

The Banner Rotator

The banner Rotator needs to function as follows;

A Main Banner
1 For the main banners to be placed at the top of the main page it needs to ONLY allow 1000 banners in the pool.
2 It MUST be FAST and NOT use up much CPU TIME.
3 It needs to keep track of each banner as each banner will have its own account, meaning;
AA--A visitor needs to fill out a form and all Information is mandatory: Name and/or Company/Website Name, address, Credit card info, Real Email (more than one if he/she has and wants, but ONE is mandatory).
BB--Choose one of the account options
1 Prepay for 3 Months Unlimited Displays $2,000.00-PerMonth NON-REFUNDABLE
2 Prepay for 6 Months Unlimited Displays $1,500.00-PerMonth NON-REFUNDABLE
3 Prepay for 1 Year Unlimited Displays $1,000.00-PerMonth NON-REFUNDABLE
4--It needs to do all calculations as far as Keeping track of expiration dates.
5--Once an account has expired the banner rotator needs to email the webmaster and Customer with the subject BANNER EXPIRED and, in the body of the email, all account info submitted by the Customer.
6 The account must NOT be deleted however, and should have an Activate/RE-Activate option in my Admin Control Panel.
7--It needs to store the banner submitted by the customer and place the account in the Banner Accounts Pending and email me with the subject Banner Accounts Pending and, in the body of the email, a link to that Control Panel where I must always enter a password before I have access.
8--Multiple Admin accounts must be possible.
9--The banner must be 500x40, 256 Colors-MAX and LESS than 7K, in Gif format ONLY. ANIMATION is allowed but ONLY ONE LOOP, AND the engine needs to verify this before it submits it for approval. If it does not meet requirements, the user must be told immediately.
10 The Customer should be emailed as soon as his/her account has been approved
11 The Search Engine needs to have a Banner Code directory, with a file created by the Banner CGI so it can add and remove banners as they become available/unavailable.
12 Banner Engine needs to display the number of banner spaces still available, but ONLY to the Administrator
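
Part of the banner check in item 9 can be done by reading the GIF header, roughly as in this Python sketch. check_banner is a made-up name. It checks only the GIF signature, the width and height stored in the header, and the file size; the one-loop animation rule is not checked here, and the 256 color limit is not tested separately because a GIF color table cannot hold more than 256 entries.

```python
import struct


def check_banner(data, width=500, height=40, max_bytes=7 * 1024):
    """Return a list of problems with an uploaded banner (empty if none)."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        return ["not a GIF file"]
    if len(data) < 10:
        return ["GIF header is cut short"]
    # Width and height follow the signature as little-endian 16-bit values.
    actual_width, actual_height = struct.unpack("<HH", data[6:10])
    problems = []
    if (actual_width, actual_height) != (width, height):
        problems.append(
            f"banner is {actual_width}x{actual_height}, must be {width}x{height}"
        )
    if len(data) >= max_bytes:  # the post says LESS than the limit
        problems.append(f"file is {len(data)} bytes, must be under {max_bytes}")
    return problems
```

The same function would cover the small search results banner by calling it with width=103, height=62 and max_bytes=3 * 1024. Returning the list of problems lets the form tell the user immediately what is wrong.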

B Small Search Results Banner
1 Must function exactly as the one above with the following exceptions.
2 A---Prepay for 3 Months Unlimited Displays $100.00-PerMonth NON-REFUNDABLE
B---Prepay for 6 Months Unlimited Displays $75.00-PerMonth NON-REFUNDABLE
C---Prepay for 1 Year Unlimited Displays $50.00-PerMonth NON-REFUNDABLE

3 The banner must be 103 Wide x 62 High, 256 Colors-MAX and LESS than 3K, in Gif format ONLY. ANIMATION is allowed but ONLY ONE LOOP


Registration User Name Policy

All the registrations, i.e. User names and passwords, must work EXACTLY like UBB (www.ultimatebb.com) AND must share the same database file for users, so that when a User wants to register with the site/UBB/Search engine or Singles Network, he/she need not get a new account/new-registration.

The cookies must work as they do with UBB.
Also, if a user forgets the password, an option for the engine to send the password to his/her email MUST be enabled and available throughout the site/script.


The next thing I will want to work as one with the Search engine AND Banner engine is a CGI/Software for a Singles Club, which may be on a different SERVER.

Also, a multi-room Chat room system Java Based with as many rooms as I wish to add, and it needs to use the same User Name and Password and again automatically log people in if cookie matches...and ISP ID displayed at all times and who is on line at ANY given time updated every 3 minutes or upon REFRESH.

Anyway, that is it for now, although I have a lot more to add, because this will be an "EVERYTHING someone wants" website, so more to come.

I would like to work with you on this, and possibly as a Joint Venture.
I can provide the Server and Space and Bandwidth, you provide all the Programming and tweaking.

Let me know what you think.

Solomon Tulbure
7758 Verona Lane
Powell, TN 37849
Phone 423-947-1236

Re: Search Engine Needed. In reply to
I suppose you want all this done for free?
Are these programs you currently have and need modification to, or do you require new programs to be installed/written?

This could be a great deal of easy yet time-consuming modifications, or a long-term programming project which would require some input on your part to carry out. If you want such changes, I suggest you hire a professional programmer who can get these changes done for you within a time constraint and also provide you with some support in the long term.

In any case, a person who has the skill to write these programs from scratch for you is not usually interested in what you have to offer as far as web space goes. Most likely, they already have that.

There are also a number of other issues you'd have to deal with in this project concerning security and server setup, and quite frankly, NT would not be the system I would use to develop a web solution of this magnitude.

Sorry if this sounds judgmental; these are just some thoughts/opinions and are not meant to be rude.
Re: Search Engine Needed. In reply to
Have a look at what I am currently working on:

Re: Search Engine Needed. In reply to

I loved your website.
Did you design it?

It looks absolutely perfect!!!

Re: Search Engine Needed. In reply to

You could check:

http://www.xav.com and look at Xavatoria II.
Re: Search Engine Needed. In reply to
Hello Sol,

If you like Xav Search II get it:


I modified the script and fixed two bugs.


Lucas Saud - #19815087

Re: Search Engine Needed. In reply to

The search engine is great but I don't see any "add" url options like Links has, nor any of the features I mentioned above.

I don't just need a great search "ENGINE", I need a Search Engine that has all/at-least-most of the features mentioned above.