Gossamer Forum

Site CleanUp during Build....possible?

Quote Reply
Site CleanUp during Build....possible?
Hi All

One of the bugaboos with dealing with a static site is that sometimes categories get dropped, or some links simply get dropped. For example, if you have Category A with 100 links and you drop, say, 50 of those links, then (assuming 10 links per page) you suddenly negate 5 pages on your site. However, visitors might still have those pages in their Favorites, and search engines might still list those old pages, too. Yes, I know that we can use Fileman to manually make the adjustments, but that still leaves open the possibility that current users would wonder where the heck their pages went. Plus, it could take a big chunk of time to go through everything.

So, I'm looking for a way to have the Build routine determine how many pages are required for a particular category, examine the contents of that directory, rebuild the new content and then take the remaining existing pages and either delete them or convert them to redirects that go to either the top of the site or the top of their respective categories.
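Just to illustrate the kind of thing I mean, here's a rough sketch (the paths, the index2.html-style page naming, and the per-page count are all made up for the example; this isn't actual Links SQL code):

Code:
#!/usr/bin/perl
# Sketch only: work out how many pages a category still needs, then turn the
# surplus page files into redirect stubs pointing at the category index
# (or simply delete them).
use strict;
use warnings;
use POSIX qw(ceil);

my $category_dir = '/path/to/pages/Category_A';   # hypothetical build path
my $category_url = '/Category_A/';                # where surplus pages should send visitors
my $links_now    = 50;                            # links left in the category
my $per_page     = 10;                            # links shown per page

my $pages_needed = ceil($links_now / $per_page) || 1;

opendir my $dh, $category_dir or die "Can't open $category_dir: $!";
for my $file (readdir $dh) {
    next unless $file =~ /^index(\d+)\.html$/;    # index2.html, index3.html, ...
    next if $1 <= $pages_needed;                  # still a live page

    # Surplus page: overwrite it with a tiny redirect so old bookmarks and
    # search engine listings still land somewhere useful.
    my $path = "$category_dir/$file";
    open my $fh, '>', $path or die "Can't rewrite $path: $!";
    print $fh qq{<html><head><meta http-equiv="refresh" content="0; url=$category_url"></head>\n},
              qq{<body><a href="$category_url">This page has moved.</a></body></html>\n};
    close $fh;
    # ...or just: unlink $path;  if deletion is preferred over redirects.
}
closedir $dh;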

With this kind of site cleanup in the system, it could make site maintenance easy for a non-tech person.

Think this is possible?

Many thanks. Smile

------------------------------------------
Quote Reply
Re: [DogTags] Site CleanUp during Build....possible? In reply to
Yes, it is certainly possible, and there could be several ways to do it. It was discussed on the forum already; there's no solution at the moment, though.
I will think about it.

Best regards,
Webmaster33


Paid Support
from Webmaster33. Expert in Perl programming & Gossamer Threads applications. (click here for prices)
Webmaster33's products (upd.2004.09.26) | Private message | Contact me | Was my post helpful? Donate my help...
Quote Reply
Re: [DogTags] Site CleanUp during Build....possible? In reply to
Yes, this is possible. It should be fairly simple - although I'm saying that without having looked at the available plugin hooks :)

Having said that, instead of managing the deletions during a build, would it not be best to delete the category directories when you delete a category?

If you want to delete individual pages that become redundant then that is a whole lot trickier and it is going to be tough to locate which links are on which pages and then to figure out which pages are redundant.

Edit:

How about this - a separate script that you would execute after a full build, which would analyze the last-modified time of the category pages and delete the ones that haven't been updated - meaning that they are redundant?
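Something like this, maybe (a minimal sketch of the idea, not a finished script; the pages path, the build-start time, and the skipped images directory are assumptions, and it only prints by default):

Code:
#!/usr/bin/perl
# Run after a full build: anything whose mtime predates the build start was not
# rewritten by the build, so it is presumably redundant. Dry-run by default.
use strict;
use warnings;
use File::Find;

my $pages_root  = '/path/to/pages';        # hypothetical build output path
my $build_start = time() - 2 * 60 * 60;    # stand-in for when nph-build.cgi started
my $skip_dirs   = qr{/images(/|\z)};       # leave the images tree alone

find(sub {
    return unless -f $_;
    return if $File::Find::name =~ $skip_dirs;
    if ((stat $_)[9] < $build_start) {     # mtime older than the build start
        print "stale: $File::Find::name\n";
        # unlink $_ or warn "could not delete $File::Find::name: $!";
    }
}, $pages_root);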

Last edited by:

Paul: Jan 11, 2003, 8:17 AM
Quote Reply
Re: [Paul] Site CleanUp during Build....possible? In reply to
Quote:
Having said that, instead of managing the deletions during a build, would it not be best to delete the category directories when you delete a category?
If nothing else is placed into the dir, then the answer is yes.
But this cannot be guaranteed (too many people use LSQL, and so many custom mods are done), so I think deleting the category directories would not be a good idea. Just my 2 cents.

Best regards,
Webmaster33


Paid Support
from Webmaster33. Expert in Perl programming & Gossamer Threads applications. (click here for prices)
Webmaster33's products (upd.2004.09.26) | Private message | Contact me | Was my post helpful? Donate my help...
Quote Reply
Re: [Paul] Site CleanUp during Build....possible? In reply to
In Reply To:
How about this - a separate script that you would execute after a full build, which would analyze the last-modified time of the category pages and delete the ones that haven't been updated - meaning that they are redundant?

Hmmm.......that would work.....Do a Build, and then do a Clean. Sounds very interesting! Cool

------------------------------------------
Quote Reply
Re: [DogTags] Site CleanUp during Build....possible? In reply to
BTW:
I also used the same outdated-date solution that Paul suggested, in Links 2.0, to delete those outdated files. The solution works fine. However, I think this is just a workaround and not the best solution.

It should be possible to extend the original build function to check for and delete those files which match the page-name pattern but whose index number is higher than the highest page number. That could be the best solution.

So why click Build and then click Clean, if we could have the same result by clicking just Build?
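For illustration, a rough sketch of how that check might be hooked in at the end of building one category (the directory layout, page naming, and call site are assumptions, not the actual nph-build.cgi internals):

Code:
use strict;
use warnings;

# Delete any leftover page whose index number is higher than the highest page
# the build just wrote for this category.
sub prune_extra_pages {
    my ($dir, $highest_page) = @_;         # e.g. ('/path/to/pages/Category_A', 5)
    opendir my $dh, $dir or return;
    for my $file (readdir $dh) {
        if ($file =~ /^index(\d+)\.html$/ && $1 > $highest_page) {
            unlink "$dir/$file" or warn "could not delete $dir/$file: $!";
        }
    }
    closedir $dh;
}

# After the build writes index.html .. index5.html for a category:
# prune_extra_pages('/path/to/pages/Category_A', 5);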

Best regards,
Webmaster33


Paid Support
from Webmaster33. Expert in Perl programming & Gossamer Threads applications. (click here for prices)
Webmaster33's products (upd.2004.09.26) | Private message | Contact me | Was my post helpful? Donate my help...
Quote Reply
Re: [webmaster33] Site CleanUp during Build....possible? In reply to
Oh, I absolutely agree! Do it all in Build....

"One click does the trick!" Wink

------------------------------------------
Quote Reply
Re: [DogTags] Site CleanUp during Build....possible? In reply to
I'm getting there with my Delete_Old_Pages plugin. So far, I have;

+ Delete all pages in the /pages/ (or whatever is defined) path...excluding the /images/ directory, which is installed there by default...
+ Delete pages that do not have the correct extension defined in 'build_extension' and 'build_index'.
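For the second point, a rough sketch of the check (the real plugin would read these values from the Links SQL config; here they are just hard-coded assumptions, and it only reports rather than deletes):

Code:
use strict;
use warnings;
use File::Find;

my $pages_root      = '/path/to/pages';   # hypothetical build path
my $build_extension = '.html';            # would come from 'build_extension'
my $build_index     = 'index.html';       # would come from 'build_index'

find(sub {
    return unless -f $_;
    return if $File::Find::name =~ m{/images/};               # default images dir stays
    my $ok = $_ eq $build_index || $_ =~ /\Q$build_extension\E$/;
    print "unexpected file: $File::Find::name\n" unless $ok;  # the plugin would delete these
}, $pages_root);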

Things I'm working on;

+ Actually removing the folders of unneeded categories that used to exist.
+ The removal of pages that no longer need to exist (i.e. a category has been deleted, or links have been removed from a category so fewer pages are needed...the outdated ones will be removed).
+ Ability for the above services to be run in nph-build.cgi automatically...

I'm hoping to have a beta-release by the middle of next week Smile

Cheers

Andy (mod)
andy@ultranerds.co.uk
Want to give me something back for my help? Please see my Amazon Wish List
GLinks ULTRA Package | GLinks ULTRA Package PRO
Links SQL Plugins | Website Design and SEO | UltraNerds | ULTRAGLobals Plugin | Pre-Made Template Sets | FREE GLinks Plugins!
Quote Reply
Re: [Andy] Site CleanUp during Build....possible? In reply to
This would be great, Andy! Smile

------------------------------------------
Quote Reply
Re: [Andy] Site CleanUp during Build....possible? In reply to
Quote:
I'm getting there with my Delete_Old_Pages plugin. So far, I have;

Hi Andy,

Checking your plugin list on LinksSQL.net, it looks as though this plugin never got off the ground?

Are you still interested in building a Site cleanup plugin for obsolete pages and directories?



Comedy Quotes - Glinks 3.3.0, PageBuilder, StaticURLtr, CAPTCHA, User_Edit_Profile

Quote Reply
Re: [Chas-a] Site CleanUp during Build....possible? In reply to
Yeah, it never really took off. I may take a look at it again (UNIX only though, as NT stuff is a PITA for this sort of thing).

What kind of features would you be looking for?

Cheers

Andy (mod)
andy@ultranerds.co.uk
Want to give me something back for my help? Please see my Amazon Wish List
GLinks ULTRA Package | GLinks ULTRA Package PRO
Links SQL Plugins | Website Design and SEO | UltraNerds | ULTRAGLobals Plugin | Pre-Made Template Sets | FREE GLinks Plugins!
Quote Reply
Re: [Andy] Site CleanUp during Build....possible? In reply to
1. Be able to flag categories (and, optionally, all sub-categories) as "safe", which means don't touch/erase them.

2. Be able to put in top-level redirects to a special "category" page. Then we can target ads to that category when the page is missing. Maybe use rewrite rules here!

The truth of the matter is, I'd make this ONE component of a "SUPER" plugin.

The plugin would work by category (typically including all subs under it, selectable):
- Update all entries automatically, manually, or semi-manually: spider the link, pull in the description, title, etc., and let the user edit ALL of these as a GROUP (all on screen at once, change any fields). Automatic means replace everything with new info, with no user intervention (typically for new-links categories).
On Auto, you could designate "safe" fields that will not be overwritten/updated. For example, keep the Description, update the title, etc.
On manual, it will go one entry at a time, and you approve or edit each link and ALL its fields in turn (maybe 1, 2, or 5 on screen at once).
On semi-auto, a whole set (10-25) will come up at once with the main fields, and you edit inline. You can check "approve all" and it approves that batch of 10-25.

You would click a link to open the target site in a new window.
Add Review and Rating info while the window is open.
Set a flag for an "award" or "special symbol" to appear (we like this site a LOT, or it has our link, or ??).
The reciprocal link checker (yes, you already did this, so incorporate it!) would set a flag and indicate whether a reciprocal link is there. It should recognize a link AND a banner.

The plugin would also do all the CLEANUP tasks, with auto and manual selectable.

Sorry if this is overboard, just thinking...
Quote Reply
Re: [webslicer] Site CleanUp during Build....possible? In reply to
A few things came up in this discussion that are "interesting", at least.

1) Putting "extra" things in the defined Links/pages directory, so that simply deleting the categories would be a problem. _That_ should be a no-no :) Why? If you look at how Unix itself is set up, mixing static and variable data in the same directory -- or even tree or partition -- is to be avoided. Why? WRITING to a disk has risks, no matter where/how it's done. So, you should NOT put any "static" or otherwise associated data in the /pages directory. It should be put into another directory [tree]. Associated images and such should be in the images tree, etc.

I know there has been talk in the past of packaging "data" with the HTML (mostly from M$ and their sponsored authors and books), but it's a *bad* idea all around. It's a little more complex to set up at first, but it's a LOT easier to maintain in the end.

Links "static" pages are *not*. They are technically dynamic/variable data from the OS's point of view. The associated files do not change (nearly as often), so those are "static". Separate those forms of data.

2) "Safe" categories, etc. In order to do this, the plugin would need to map out the site and keep a (regeneratable) topology of what the site looks like. It may not seem obvious at first, but if you are trying to deal with "renegade" sites that are not following the good layout rules outlined superficially in #1, problems are going to occur.

3) You might be tempted to add fields to the Category or Links records to track this data, but that is a bad idea as well. This "safe" or other data does *NOT* belong to the category, or to the core of Links; it belongs only to this plugin. The plugin needs to "image" the existing layout and map it onto the existing data structure. Also, if the plugin does that, "changes" to the Links site are much more obvious when the topology of the map doesn't match the data source.

4) Detail & New pages are also an area that needs attention, and while some sites may want to keep old New pages around, Detail pages that don't match a current LinkID are sort of pointless.

There are other things, but these relate mostly to the logic in the development of the plugin, while the others can be added in later as features ;)

Actually, this seems to be the antithesis or mirror image of Yogi's page BUILDer plugin <G>


PUGDOG® Enterprises, Inc.

The best way to contact me is to NOT use Email.
Please leave a PM here.
Quote Reply
Re: [pugdog] Site CleanUp during Build....possible? In reply to
Pugdog,

Programming philosophy... whew... don't want to get into that. I'm not looking at that, that's your job for sure.
Rather, I'm looking at (and living) the webmaster side of running sites that need to make money every week...

My LinksSQL site is Static. It's VERY STATIC.

LOL.

It's static because my server is loaded down enough with running CGI/Perl/PHP, JAVA STATS, backups, etc...

and Google likes good old static pages and even static remnants of old builds, which are still getting hits and making $$.

Static = More $$.

It's only dynamic when I do a Build or Add a listing ... LOL.
Quote Reply
Re: [webslicer] Site CleanUp during Build....possible? In reply to
>> and Google likes good old static pages and even static remnants of old builds

Yes, but if you look at the rewrite rules, I know for a fact that Google likes the static-looking URLs plenty well :). And they like the fact that the content changes when they come back. It means the page is active.

If you tune the /directory/ you are using to more accurately target your site, then you'll get even higher up, since the main URL is rewritten as /domain.com/targeted_directory_word/index.html

You will get AT LEAST the same ranking using the rewrite rules on a dynamic site. I watched it happen on my sites. The increase is pretty quick, and pretty impressive.

I'm currently watching this across over a dozen sites, in completely different targeted markets -- hobbies, fandom, postcards, localities, etc.

And it does make a difference :)

I used to do static pages, but my sites are all medium to small, and dynamic works fine. Static pages on a shared server can make a difference though.


PUGDOG® Enterprises, Inc.

The best way to contact me is to NOT use Email.
Please leave a PM here.
Quote Reply
Re: [pugdog] Site CleanUp during Build....possible? In reply to
"And, they like the fact the content changes when they come back. It means the page is active. "

That's one theory, Pugdog... My experience so far says otherwise, very loudly.
What tells you that? If you have any stats to show it, I'd be interested to see what you are doing.

The way PageRank is calculated means penalties for changes.
It's a feedback loop, and when you pull one page, a whole tree is recalculated the NEXT or NEXT NEXT round. We'd be in the toilet if we used dynamic pages. Google prefers tiny changes; it means you aren't messing with the system, especially if we're talking over 1,000 pages on a site.


Does each site you put up have a static or dynamic IP, for instance?

Other small factors: when CGI freezes, the static pages will still be there.
Google is getting smarter all the time, also.

The rewrite rules make the URL VERY long. TOO LONG. Google now has filters in place that will cut credit on extra-long URLs.

Peace, out.
Quote Reply
Re: [webslicer] Site CleanUp during Build....possible? In reply to
  
Pugdog, from my experience SEs (Google etc.) do like fresh content, but whether your site's pages get a ranking boost from this is debatable.

Using mod_rewrite isn't an all-encompassing solution for every site. Yes, it's great at getting pages that use variables in the query string indexed, and for sites like yours where (as far as I can tell) you only use the dynamic pages, I've no doubt it's made a massive difference to your traffic. But for LinksSQL sites only using the static built pages (whose content usually changes after a Build-all), using Rewrite is an unnecessary approach.

Quote:
What kind of features would you be looking for?

From my own point of view, the pages on one of my sites are constantly changing (for example, page 1 may change from isValidated=Yes to No a couple of times a month due to Amazon.com changing their Availability info on each item my site lists; we only list items that are shipping). I'm using a custom 404 page to display an error message for SE traffic arriving on pages that have been deleted. At the moment I need to log in via FTP to manually delete pages not updated after the last build-all.

It would be great if your Cleanup plugin could simply delete pages and directories that didn't update after the last Starting > Complete build date.
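Something along these lines, perhaps (just a sketch to show what I mean; the pages path and cutoff timestamp are placeholders, and it also removes folders that end up empty):

Code:
use strict;
use warnings;
use File::Find;

my $pages_root = '/path/to/pages';
my $cutoff     = time() - 3600;   # stand-in for the build's "Starting" timestamp

# Pass 1: delete files whose mtime is older than the cutoff
# (i.e. files the last build did not rewrite).
finddepth({ no_chdir => 1, wanted => sub {
    unlink $_ if -f $_ && (stat $_)[9] < $cutoff;
} }, $pages_root);

# Pass 2: remove directories that are now empty (deleted categories).
# finddepth works bottom-up, so nested empty trees collapse in one pass.
finddepth({ no_chdir => 1, wanted => sub {
    rmdir $_ if -d $_ && $_ ne $pages_root;   # rmdir fails harmlessly if not empty
} }, $pages_root);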
Quote Reply
Re: In reply to
Quote:
delete pages and directories that didn't update after the last Starting > Complete build date.

Andy, is this the sort of functionality you'd be looking to add to a Cleanup plugin if it occurred?



Comedy Quotes - Glinks 3.3.0, PageBuilder, StaticURLtr, CAPTCHA, User_Edit_Profile

Quote Reply
Re: [webslicer] Site CleanUp during Build....possible? In reply to
There seems to be some scoring bonus if a site has changed pages, but the content stays relatively the same. It means the page is "active" and "fresh" yet still targeted, not very, very dusty and old.

The most in-flux item in this whole thing is the search engines themselves.

Google flat-out tells you the Title counts *far* more than keywords, or any other meta tags that are not visible. *VISIBLE* content is what they are looking for, and they are not looking for clutter.

They *strongly* suggest less than 100 links per page, and *not* getting listed with link farms.

Building up your page rank depends on a lot of things, including EXTERNAL links to your page and how highly the pages that link to you are ranked.

If they gave away all the secrets, then you could spoof the engines again, and they don't want that. But I have about a dozen niche sites, and with the new 20 channels of targeting, I'm starting to watch things very closely.

There are a few ideas I'm working on, and trying out.

I used to run static sites. Postcards.com was static for a long time. Only the hits/counts changed on the pages.

I've found dynamic sites have quite a few advantages, especially with user log on, and the pages laying around between builds started to cause problems. I'd rather have a search engine get a redirection page, and lead the user to the new page. The search engine will catch up eventually.


PUGDOG® Enterprises, Inc.

The best way to contact me is to NOT use Email.
Please leave a PM here.
Quote Reply
Re: [pugdog] Site CleanUp during Build....possible? In reply to
  
Sounds like you're going about it the right way. I wish you luck :)

Quote:
and the pages laying around between builds started to cause problems. I'd rather have a search engine get a redirection page, and lead the user to the new page. The search engine will catch up eventually.

This is why I'm interested in the idea behind a Cleanup plugin used in conjunction with a static site.

Quote:
I've found dynamic sites have quite a few advantages, especially with user log on,

I think I've figured out a way around the login issue (showing Login or Logout as the link text, and turning on any login-required features on the site) by building 'static' PHP pages that check for the login cookie on the client's machine.

The main issue I have with using dynamic pages instead of static (in conjunction with mod_rewrite) is the server resources required to query MySQL and serve up a new page for each impression - for a site receiving a large number of uniques a day (even on a dedicated server), this can cause the server to spit out its dummy.



Comedy Quotes - Glinks 3.3.0, PageBuilder, StaticURLtr, CAPTCHA, User_Edit_Profile

Last edited by:

Chas-a: Mar 31, 2004, 10:40 AM