Gossamer Forum

XML::Parser

I use Perl to take data and create XML files that I can upload. Now I need to go the other direction and convert an XML file into variables.

I've got XML::Parser installed and have played around with a couple of demos. However, the demos seem more interested in showing you various counts, comments, and other useless displays.

Can anybody point me toward a simple way to just get name=value pairs out of an XML file? In other words, is there one script that will work on multiple XML files, or do you have to define every attribute and tag you expect?

Any help is greatly appreciated...
Re: [Watts] XML::Parser
This isn't using XML::Parser, but it works and I can hack it.

http://www.webreference.com/perl/tutorial/1/index.html

You give it the URL of an XML file and then tell it which tags you want to retrieve.
Re: [Watts] XML::Parser
I prefer XML::Simple.

http://search.cpan.org/~grantm/XML-Simple-2.14/lib/XML/Simple.pm

Example (from my PingServices plugin):

Code:
<?xml version="1.0"?>
<methodCall>
  <methodName>pingback.ping</methodName>
  <params>
    <param>
      <value><string>http://www.mysite.com/blog/article.html</string></value>
    </param>
    <param>
      <value><string>http://www.yoursite.com/blog/article.html</string></value>
    </param>
  </params>
</methodCall>

Converted to Perl data structure with:

Code:
sub pingback_req_parse
{
    my $xml  = $IN->post_data();
    my $data = xml_to_hash($xml);

    return (ref $data eq 'HASH')
        ? {
            method => $data->{methodName},
            from   => $data->{params}->{param}->[0]->{value}->{string},
            to     => $data->{params}->{param}->[1]->{value}->{string},
          }
        : undef;
}

sub xml_to_hash
{
    my $xml = shift;
    $xml =~ s/^([^<]*)</</; # leading spaces before the first tag break the XML parse; remove them

    my $data = eval { XML::Simple::XMLin($xml) }; # eval as a failsafe on bad input

    return (ref $data eq 'HASH') ? $data : undef;
}

Note that there is an XML::RPC module I could have used for the above example, but XML::Simple is more portable: it lets me parse XML-RPC calls as well as any other XML-like structure, such as Trackback RDF data from blogs, into a more-or-less literal nested hash, where each XML tag or attribute is a hash key.
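
Because the result is just nested hashes and arrays, the "one script for multiple XML files" part of the original question could probably be handled by walking whatever comes back instead of naming every tag. A rough sketch only: flatten() is a made-up helper name, not part of XML::Simple, and the hash below is a hand-written stand-in for the parsed pingback request so the snippet runs without any input.

```perl
use strict;
use warnings;

# Hypothetical helper: recursively turn a nested hash/array structure
# (the shape XMLin returns) into a list of "path=value" strings.
sub flatten {
    my ($node, $path) = @_;
    $path = '' unless defined $path;

    if (ref $node eq 'HASH') {
        # Sort the keys so the output order is stable between runs.
        return map { flatten($node->{$_}, length($path) ? $path . '.' . $_ : $_) }
               sort keys %$node;
    }
    if (ref $node eq 'ARRAY') {
        # Repeated elements get an index appended to the path.
        return map { flatten($node->[$_], $path . '[' . $_ . ']') } 0 .. $#$node;
    }

    # Leaf: a plain scalar (undef is shown as an empty value).
    return $path . '=' . (defined $node ? $node : '');
}

# Hand-written stand-in for the structure the pingback example parses into.
my $data = {
    methodName => 'pingback.ping',
    params     => {
        param => [
            { value => { string => 'http://www.mysite.com/blog/article.html' } },
            { value => { string => 'http://www.yoursite.com/blog/article.html' } },
        ],
    },
};

my @pairs = flatten($data);
print "$_\n" for @pairs;
```

Each leaf comes out as one line, for example `methodName=pingback.ping`. An empty hash produces no line at all, so empty elements simply disappear from the output.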

Philip
------------------
Limecat is not pleased.

Last edited by fuzzy logic: Sep 19, 2006, 10:33 AM
Re: [fuzzy logic] XML::Parser
Thanks a million... that was what I needed. You pointed me in the right direction. Here is my simplified version:

Code:
#! perl
use XML::Simple;
use strict;

# Parse the file into a nested hash (the root element's name is dropped).
my $ref = XMLin("./testorder.xml");

# First borrower.
print $ref->{BorrowerData}->{Borrower}->[0]->{BorrowerName};

print "\n\n";

# Second borrower, if there is one.
if ($ref->{BorrowerData}->{Borrower}->[1]) {
    print $ref->{BorrowerData}->{Borrower}->[1]->{BorrowerName};
    print "\n\n";
}

# Attributes such as PropertyType are read like any other hash key.
if ($ref->{Property}->{PropertyType} eq "PUD") {
    print "This Property is in a P.U.D. called: \n";
    print $ref->{Property}->{ProjectName};
    print "\n\n";
}

Here is a snippet of the XML file:
Code:


<BorrowerData>
  <Borrower TitleOnly="Individual">
    <BorrowerName>John Smith</BorrowerName>
    <SocialSecurityNumber></SocialSecurityNumber>
  </Borrower>

  <Borrower TitleOnly="Individual">
    <BorrowerName>Jane Doe</BorrowerName>
    <SocialSecurityNumber></SocialSecurityNumber>
  </Borrower>
</BorrowerData>

<Property OwnerOccupied="Yes" SecondHome="No" PropertyType="PUD">
  <ProjectName>Timberwood Park</ProjectName>
</Property>

I'll play around with attributes and loops later. (I spent the last 4 hours trying to make sense of the crap on cpan.org and only got frustrated.)
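
For the attributes and loops, something along these lines might be a starting point. It is a sketch and has not been run against the real testorder.xml: the inline XML and its <Order> root element are made-up stand-ins, while ForceArray and KeyAttr are documented XML::Simple options. ForceArray keeps Borrower an array reference even when a file contains only one borrower, so the same loop works either way.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Inline stand-in for testorder.xml; the <Order> root name is an assumption.
my $xml = <<'XML';
<Order>
  <BorrowerData>
    <Borrower TitleOnly="Individual">
      <BorrowerName>John Smith</BorrowerName>
    </Borrower>
    <Borrower TitleOnly="Individual">
      <BorrowerName>Jane Doe</BorrowerName>
    </Borrower>
  </BorrowerData>
  <Property OwnerOccupied="Yes" SecondHome="No" PropertyType="PUD">
    <ProjectName>Timberwood Park</ProjectName>
  </Property>
</Order>
XML

# ForceArray: always make Borrower an array ref, even with a single borrower.
# KeyAttr => []: turn off folding of arrays into hashes keyed on name/key/id.
my $ref = XMLin($xml, ForceArray => ['Borrower'], KeyAttr => []);

my @names;
for my $borrower (@{ $ref->{BorrowerData}->{Borrower} }) {
    # Attributes (TitleOnly) and child elements (BorrowerName) are both hash keys.
    push @names, $borrower->{BorrowerName};
    print "$borrower->{BorrowerName} ($borrower->{TitleOnly})\n";
}

# Attributes of <Property> are read the same way.
my $property = $ref->{Property};
print "PropertyType=$property->{PropertyType}\n";
```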

Here is a much better FAQ (for XML::Simple):
http://www.cs.biu.ac.il/~kurzbed/perl_xml_course/faq%20about%20xml-simple.html

(the link to the FAQ on the page at CPAN doesn't work).

Re: [Watts] XML::Parser
FYI - I just changed this line:

my $ref = XMLin("./testorder.xml");

to:

my $ref = XMLin("./testorder.xml", suppressempty=>['']);

This keeps you from getting empty hashes where an element is empty. In the above example, if you tried to

print $ref->{BorrowerData}->{Borrower}->[0]->{SocialSecurityNumber};

you would have gotten something like: HASH(0x19ae638).

Turning on SuppressEmpty keeps that from happening.
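
To see the difference side by side, here is a small comparison. It is only a sketch: the one-borrower XML string is a made-up stand-in for the real file. The XML::Simple documentation lists three values for SuppressEmpty: a true value such as 1 (skip empty elements altogether), the empty string, and undef (represent them as that value instead of an empty hash).

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Made-up minimal input with one empty element.
my $xml = '<Borrower><BorrowerName>John Smith</BorrowerName>'
        . '<SocialSecurityNumber></SocialSecurityNumber></Borrower>';

# Default: the empty element comes back as an empty hash ref.
my $default = XMLin($xml);

# SuppressEmpty => '': the empty element becomes an empty string.
my $as_string = XMLin($xml, SuppressEmpty => '');

# SuppressEmpty => 1: the empty element is left out of the hash entirely.
my $skipped = XMLin($xml, SuppressEmpty => 1);

print "default:   ", ref($default->{SocialSecurityNumber}), "\n";
print "as string: '", $as_string->{SocialSecurityNumber}, "'\n";
print "skipped:   ",
    (exists $skipped->{SocialSecurityNumber} ? 'present' : 'absent'), "\n";
```

With the empty-string form the key still exists, so printing it is safe. With a true value you need an exists or defined check before using it.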