Gossamer Forum
Regex Help
I have something like:
"Hey, this is a title! Come & read me =)))"
It should be converted to something like:
hey-this-is-a-title-come-read-me

Here is the code in PHP; could someone help me do the same in Perl, please?

public static function getTitleForUrl($title)
{
    $title = strval($title); // makes a string
    ...
    $title = strtr(
        $title,
        '`!"$%^&*()-+={}[]<>;:@#~,./?|' . "\r\n\t\\",
        '                             ' . '    '
    ); // substitutes the signs; I don't understand the long empty values.
    $title = strtr($title, array('"' => '', "'" => '')); // replaces all " and ' with nothing

    $title = preg_replace('/[ ]+/', '-', trim($title)); // trims, then turns runs of spaces into "-"?
    $title = strtr($title, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'); // why not strtolower()?
    ...
    return $title;
}
Re: [Robert] Regex Help
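$title =~ tr/`!"$%^&*()\-+={}[]<>;:@#~,.\/?|\r\n\t\\/ /; # turn these into spaces first, as the PHP does
$title =~ tr/"'//d; # delete quotes (tr only deletes with the /d flag)
$title =~ s/^\s+|\s+$//g; # trim
$title =~ s/ +/-/g; # each run of spaces becomes one hyphen
$title =~ tr/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/;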

Ok?
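
For comparison, here is a minimal sketch of a direct Perl port that follows the order of the PHP steps shown in the question (punctuation to spaces, quotes removed, trim, spaces to hyphens, lowercase). The sub name `title_for_url` is made up for this example, and whatever sits behind the `...` lines in the PHP is not covered.

```perl
use strict;
use warnings;

# Hypothetical helper name; mirrors only the PHP steps visible in the question.
sub title_for_url {
    my ($title) = @_;
    $title = '' unless defined $title;

    # Punctuation and control characters become spaces (same set as the PHP strtr call).
    $title =~ tr/`!"$%^&*()\-+={}[]<>;:@#~,.\/?|\r\n\t\\/ /;
    # Remaining quote characters are removed outright.
    $title =~ tr/"'//d;
    # Trim, then turn each run of spaces into one hyphen.
    $title =~ s/^\s+|\s+$//g;
    $title =~ s/ +/-/g;
    # ASCII-only lowercase, like the PHP strtr A-Z call.
    $title =~ tr/A-Z/a-z/;
    return $title;
}

print title_for_url('Hey, this is a title! Come & read me =)))'), "\n";
# prints hey-this-is-a-title-come-read-me
```

Because this is a blacklist, any character not named in the first `tr` passes through unchanged, which is what the PHP does as well.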
Re: [Robert] Regex Help
Looks like a good approach. I dealt with something similar while I was working on URL escaping for LinksSQL.
I ended up with:

Code:
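sub superlc {
    my $return = shift;
    # If this file is saved as UTF-8, tr needs "use utf8" and decoded input to see characters, not bytes.
    $return =~ tr/ÄÖÜÏËÈÒÙÌÀÉÓÍÁÝÊÔÛÎÂÇÑÕÃÅÆ/äöüïëèòùìàéóíáýêôûîâçñõãåæ/;
    $return = lc($return);
    return $return;
}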

Code:
sub URL_escape {
    my $return = shift;
    my %umlaute = (
        "ä" => "ae",
        "Ä" => "Ae",
        "ü" => "ue",
        "Ü" => "Ue",
        "ö" => "oe",
        "Ö" => "Oe",
        "ß" => "ss",
        "À" => "a",
        "à" => "a",
        "Á" => "a",
        "á" => "a",
        "Â" => "a",
        "â" => "a",
        "Æ" => "a",
        "æ" => "a",
        "Ç" => "c",
        "ç" => "c",
        "È" => "e",
        "è" => "e",
        "É" => "e",
        "é" => "e",
        "Ê" => "e",
        "ê" => "e",
        "Ë" => "e",
        "ë" => "e",
        "Î" => "i",
        "î" => "i",
        "Ï" => "i",
        "ï" => "i",
        "Ô" => "o",
        "ô" => "o",
        "Œ" => "oe",
        "œ" => "oe",
        "Ù" => "u",
        "ù" => "u",
        "Û" => "u",
        "û" => "u",
        "Ÿ" => "y",
        "ÿ" => "y",
    );
    my $umlautkeys = join ("|", keys(%umlaute));
    $return =~ s/($umlautkeys)/$umlaute{$1}/g;   # transliterate via the table
    $return =~ s/&#47;/\//g;                     # decode the HTML entity for "/"
    $return =~ s/['#|\s\/;"&]/-/g;               # separators and whitespace become hyphens
    $return =~ s/^-+|-+$//g;                     # trim leading and trailing hyphens
    $return = lc($return);
    $return =~ s/([^\w\-.:,!~*'()])/sprintf("%%%02X",ord($1))/eg;   # percent-encode what is left
    $return =~ s/-{2,}/-/g;                      # collapse runs of hyphens
    return $return;
}

Maybe some inspiration for you.

Regards

n||i||k||o
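
One detail of the percent-encoding step above may be worth a sketch: `ord` returns a code point, so a decoded character above 255 would come out as something like `%20AC` rather than a valid escape sequence. Assuming the input is a decoded character string, encoding it to UTF-8 bytes first avoids that. The sub name `percent_encode` is hypothetical, and the sketch keeps only the RFC 3986 unreserved characters, which is a narrower set than the one in the code above. `Encode` ships with Perl.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: percent-encode everything outside A-Z a-z 0-9 - . _ ~
sub percent_encode {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);   # characters -> UTF-8 bytes
    # Each remaining byte outside the unreserved set becomes %XX.
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/eg;
    return $bytes;
}

# \x{E9} is e-acute, \x{20AC} is the euro sign.
print percent_encode("caf\x{E9} \x{20AC}5"), "\n";   # prints caf%C3%A9%20%E2%82%AC5
```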
Re: [Robert] Regex Help
Hi,

I would do it like this:

Code:
my $test = "andy íóé foo --- Hey, this is a title! Come & read me =)))";
$test = lc($test); # lowercase plain A-Z
$test =~ tr|ÀÂÄàâäÇçÉÊÈËéêèëÏÌÎïìîíÖÔÒöôòóÜÛÙüûù|aaaaaacceeeeeeeeiiiiiiiooooooouuuuuu|; # map the listed accented characters to a-z
$test =~ s/[^0-9a-z]+/-/g; # any run of characters other than 0-9 or a-z becomes a single -
$test =~ s/^-+|-+$//g; # drop a leading or trailing -, such as the one left by "=)))"



Cheers

Andy (mod)
andy@ultranerds.co.uk
Re: [Andy] Regex Help
Thank you, Andi.
But your version also removes "§", which the original leaves in place.
I am doing this so that Google sees the same links everywhere on a page, so I have to match the given PHP function exactly.
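
The difference described here comes down to a blacklist versus a whitelist: the PHP function only touches the characters it lists, while `[^0-9a-z]` replaces everything it does not list. A small sketch, with made-up sub names and a made-up sample title, shows how the two treat the section sign:

```perl
use strict;
use warnings;

# Blacklist (PHP-style): only the listed characters are turned into separators.
sub slug_blacklist {
    my ($s) = @_;
    $s =~ tr/`!"$%^&*()\-+={}[]<>;:@#~,.\/?|\r\n\t\\/ /;
    $s =~ s/^\s+|\s+$//g;
    $s =~ s/ +/-/g;
    $s =~ tr/A-Z/a-z/;
    return $s;
}

# Whitelist: everything except 0-9 and a-z becomes a separator.
sub slug_whitelist {
    my ($s) = @_;
    $s = lc $s;
    $s =~ s/[^0-9a-z]+/-/g;
    $s =~ s/^-+|-+$//g;
    return $s;
}

my $title = "Law \x{A7}5 explained";   # \x{A7} is the section sign

print "blacklist keeps the section sign\n" if slug_blacklist($title) =~ /\x{A7}/;
print slug_whitelist($title), "\n";    # prints law-5-explained
```

If the generated links have to be identical to the PHP ones, the blacklist form is the one to port.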
Re: [Robert] Regex Help
I had copied something like:

$title =~ tr/`!"'$%^&*()\-+={}[]<>;:@#~,.\/?|//d; # delete them all

OK. In the end I used:
$title =~ s/\?//g; # delete ?

Last edited by:

Robert: Dec 9, 2017, 3:03 PM