Hello All!
Here is some code that can help you split a big RDF file into smaller files containing only the categories you need. With a small file, the import is faster and easier. Maybe you will need it...
Enjoy :)!
Anton
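#!/usr/bin/perl -w
use strict;

# Category prefix => output file that receives that category's records.
my %tops = (
    "Top/Shopping/Antiques_and_Collectibles" => "/home/anton/download/A&C.rdf",
    "Top/Recreation/Collecting"              => "/home/anton/download/Collecting.rdf",
    "Top/Recreation/Antiques"                => "/home/anton/download/Antiques.rdf",
);

my $flag = 0;    # true while we are inside a wanted category
my $topic;       # category prefix currently being copied

# Read the compressed dump through gunzip so it never has to be unpacked on disk.
open(INPUT, "gunzip -c /home/anton/download/content.rdf.u8.gz |")
    or die "Cannot run gunzip: $!";
while (<INPUT>) {
    # Progress report every 50000 lines.
    print "\nString number $." if ($. % 50000 == 0);
    my $str = $_;
    unless ($flag) {
        foreach my $key (keys %tops) {
            # \Q...\E matches the key as literal text, not as a regex.
            if ($str =~ m(\Q$key\E)) {
                $flag  = 1;
                $topic = $key;
                open(FILE, ">$tops{$key}") or die "Cannot open >$tops{$key}: $!";
                print "\n$tops{$key} opened";
                last;
            }
        }
    }
    if ($flag) {
        # A <Topic> line for some other category ends the current section.
        if ($str =~ /<Topic/ && $str !~ m(\Q$topic\E)) {
            $flag = 0;
            close(FILE);
            print "\nFile closed";
            next;
        }
        print FILE $str;
    }
}
close(FILE) if $flag;    # the input may end while a section is still open
close(INPUT);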