Gossamer Forum
Home : Products : DBMan : Installation :

Splitting at delimiter problems

Quote Reply
Splitting at delimiter problems
Hi. I am wondering if anyone can be of assistance with this problem, which is almost certainly an error in the way I'm parsing the data file, but I can't find it. I am having search results show up in the wrong "field" of the generated HTML tables, but only for some of the records. Others work fine.

Here is my database definition in the configuration file:

%db_def = (

Distributor => [0, 'alpha', 40, 60, 0, '', ''],

Address1 => [1, 'alpha', 40, 60, 0, '', ''],

Address2 => [2, 'alpha', 40, 60, 0, '', ''],

City => [3, 'alpha', 30, 60, 0, '', ''],

State => [4, 'alpha', 10, 30, 0, '', ''],

Zipcode => [5, 'alpha', 8, 20, 0, '', ''],

Areacode => [6, 'alpha', 6, 10, 0, '', ''],

Phone => [7, 'alpha', 10, 20, 0, '', ''],

Fax => [8, 'alpha', 10, 20, 0, '', ''],

URL => [9, 'alpha', 60, 255, 0, '', '']

);


Here are 5 lines from the database, using a pipe delimiter:

CED|D/B/A/ ALSTON ELECTRIC SUPPLY|331 NORTH ST. JOSEPH AVE|BREWTON|AL|36426|334|867-6084|NULL|NULL

CED|1302 4TH ST S W|NULL|CULLMAN|AL|35055|256|734-9624|NULL|NULL

CED|1207 CENTRAL PKY SW|NULL|DECATUR|AL|35601|256|355-1933|NULL|NULL

FORBES DISTRIBUTING CO.|1522 5TH AVENUE SE|NULL|DECATUR|AL|35601|NULL|NULL|NULL|NULL

MAYER ELECTRIC SUPPLY CO|3133 HIGHWAY 20 WEST|NULL|DECATUR|AL|35601|256|350-2616|NULL|NULL

The problem that I am having is that although the results are returned when I do a search, they frequently show up on the wrong line.

I am wondering, does the issue lie in the fact that I am using "Distributor" as the primary key and some of them are the same (aren't primary keys supposed to be unique?), or is it possibly that there are characters that have special meaning to Perl (like ampersand, comma)?

I inherited this database project and am not sure how well the previous organization planned the design. Nevertheless, I need to figure out why things are parsing (or at least displaying) incorrectly.

If you want to test the database and see the error (will that help?), you can go to http://www.kwpowell.com/cgi-bin/greenlee/db.cgi and check it out. The userid and password are both "user". Wildcard searches are disabled.

I would also be glad to share the html.pl or any other necessary files if it will help answer this question.

Thanks in advance for any advice you can offer!



------------------
Scot Robnett
srobnett@earthlink.net
Quote Reply
Re: Splitting at delimiter problems In reply to
I would suggest putting a copy of your database file into excel and checking th alignment of the fields. It's easier to see the actual field contents when some are blank when you can separate them in this manner.

The problem you are having is usually caused by a corrupted database file.

Usually the first record that shows the problem is a clue as to the line that could be incorrect, or it could be the previous record.


Quote Reply
Re: Splitting at delimiter problems In reply to
Thanks, Lois. Actually, I did all the cleanup work in Excel just as you suggested, then saved it out to a tab-delimited text file. Then I did a regexp search and replace, replacing \t with | in TextPad. I verified within Excel that any empty fields were replaced with the "NULL" string. I verified that \t no longer existed anywhere within the file.

I had already gone through your next suggestion as well, which was to go to the line where the problem occured to see if it was a problem with that line or the previous line. In every case that I tested this, the integrity of the lines was good.

That's why I was wondering if it had something to do with special characters or duplicate primary keys.

Is there anything else I can do to check the integrity of the data file? I've done samplings the way you suggested, and also have gone through and picked out random lines for testing - they all look okay.



------------------
Scot Robnett
srobnett@earthlink.net
Quote Reply
Re: Splitting at delimiter problems In reply to
The first thing I notice is that you seem to be using an older version of DBMan. The current version no longer inserts "NULL" into empty fields.

Quote:
I am wondering, does the issue lie in the fact that I am using "Distributor" as the primary key and some of them are the same (aren't primary keys supposed to be unique?)

I'm not sure that is the cause of your current problem, but it will definitely cause other problems in your database. Your key field must be unique for each record.

Quote:
is it possibly that there are characters that have special meaning to Perl (like ampersand, comma)?

No, that won't cause a problem.

The only thing I can think of is that using TextPad may have caused a problem. I'm not familiar with the program, but you need to be sure that every line has just a newline character at the end. Some text editors add both a newline and a linefeed, which can cause problems.

It may be possible that the problem is due to duplicate keys. At any rate, you need to change that before you go on.


------------------
JPD





Quote Reply
Re: Splitting at delimiter problems In reply to
Thank you, 'Just Plain Denise'! Smile
(or Carol)

I have already inserted a unique identifier for each record (it's just simple numbering) because I thought that this was probably causing some of the problems. That didn't do the trick. I will leave it this way though, because the unique key is necessary anyway.

Possibly TextPad is inserting a linefeed character, but it has a pretty powerful search and replace utility in which you can do regexp searches for \t, \n, etc. I did that. I'm not sure what the difference is in the characters between newline and linefeed - is there any difference? Maybe this is my problem.

I also made sure to save in UNIX format instead of PC format because I'm uploading to a FreeBSD w/Apache server. So that takes care of all the little ^M nasties too.

The problem is definitely in my db file somewhere. I tried inserting a value into both the $rec{'Fax'} and $rec{'URL'} fields, which are position 8 and 9 respectively in the array, but when I print $rec{'URL'} I get the value for $rec{'Fax'}. Sometimes. Aaaargh.

In reference to how my day has gone, speaking of my own capabilities here..."Stupid is as stupid does, sir." (whoops - I guess that would be "ma'am" to you, huh?)

I'm stumped. I am going to bong this project for the evening and get a fresh start tomorrow.

Thanks for the reply.
Quote Reply
Re: Splitting at delimiter problems In reply to
 Smile (For a number of things in your last post.)

You seem to have taken care of the linefeed issue -- that's the "little ^M nasties." (Or maybe ^M isn't linefeed, but that's what I was thinking of.)

Tell you what.

Save your .db file someplace safe and then upload an empty .db file. Start putting in some records and see what happens. Put in as many as you can stand before you go stark raving mad. Then check it with a "list all." If the same problem arises, it's a problem with your .cfg file or coding elsewhere. If the same problem does not arise, then it's probably somewhere in your original .db file.


------------------
JPD





Quote Reply
Re: Splitting at delimiter problems In reply to
Thanks. That's exactly what I planned to do this morning, because it certainly would be difficult to check every line thoroughly just by plain eyesight - there are 4552 records! I was going to start with 10-20 lines and see what kind of results I got.

I really think the issue is with the db file, since everything else about the config and html generation seems to work fine.

Regarding the NULL insertion, I tried the db both with and without NULL between pipes where there is no data in the field. The database did react differently, but the fields came out wrong in both cases.

On another note, do you know of a way to make the primary key auto-generate a number which is one higher than the previous record? Do I need to do something like
Code:
$rec{'Distributor_ID'}++;
?

Thank you for all your help. Did anyone ever tell you that you're pretty cool? Smile



------------------
Scot Robnett
srobnett@earthlink.net
Quote Reply
Re: Splitting at delimiter problems In reply to
Okay, I think I'm on to something here. Or maybe it's the fact that I got some sleep. Wink

If I:

* add a column for "ID"
* make it the primary key
* fill it with unique numbers
(each new record increments by one)
* fill anything that's blank with "NULL"

then the records display properly.

But now I am not sure how to make the ID field auto-increment. Suggestions? I tried a separate script like this that worked okay:

Code:
#!/usr/bin/perl5

$db_file = "distribs.db";
$end_of_records = `tail -1 $db_file`;
@endrecords = split(/\|/,$end_of_records);
$current_id = @endrecords[0];
$new_id = $current_id + 1;

But when I try to replicate this inside of distribs.cfg like this:

Code:
# Create an auto-increment for the distributor ID.
$id_end_of_records = `tail -1 $db_file_name`;
@endrecords = split(/\|/,$id_end_of_records);
$current_id = @endrecords[0];
$new_id = $current_id + 1;

and insert $new_id in here:

Code:
%db_def = (
ID => [0, 'numer', 5, 10, 1, '$new_id', ''],
Distributor => [1, 'alpha', 40, 60, 1, '', ''],
Address1 => [2, 'alpha', 40, 60, 1, '', ''],
Address2 => [3, 'alpha', 40, 60, 0, '', ''],
City => [4, 'alpha', 40, 60, 1, '', ''],
State => [5, 'alpha', 3, 20, 1, '', ''],
Zipcode => [6, 'alpha', 10, 20, 0, '', ''],
Areacode => [7, 'alpha', 4, 10, 0, '', ''],
Phone => [8, 'alpha', 10, 20, 0, '', ''],
Fax => [9, 'alpha', 10, 20, 0, '', ''],
URL => [10, 'alpha', 40, 255, 0, '', ''],
);

I get only a value of 1 when I go to add a record.

Also, I don't really want the null fields to display NULL. I'd rather they don't display anything. But if I remove the word NULL from the record in its appropriate space, then the record alignment gets messed up again. Other than displaying "N/A" or something to that effect, how do I remedy this?

------------------
Scot Robnett
srobnett@earthlink.net


[This message has been edited by srobnett (edited March 30, 2000).]

[This message has been edited by srobnett (edited March 30, 2000).]
Quote Reply
Re: Splitting at delimiter problems In reply to
DBMan has its own automatic number generation. All you have to do is take advantage of it.

Since you have a large number of records already in your .db file, you'll first want to open the default.count file and replace whatever is there with a number one larger than the highest number already in your database. (If your highest number is 4782, enter 4783.)

Then, in your .cfg file, set:

$db_key = 'ID';

and

$db_key_track = 1;

Also make sure that you have the correct counter file defined in the upper portion of your .cfg file:

# Full path and file name of the counter file.
$db_id_file_name = $db_script_path . "/default.count";

Upload the edited .cfg file and the .count file. Make sure you set the permissions on the .count file to 666 -- rw-rw-rw-

From there on out, DBMan will automatically increment your counter. Cool, huh?

Be sure to take out

$new_id

from your .cfg file. You won't be needing that.


I'm not sure why you're having the problem with eliminating NULL from the fields, but I always say "Go with what works." Smile

Regarding not printing out NULL fields, this is really easy to do, but not so easy to explain. The best way to explain it, I guess is to give you an example.

Let's say that the following is in your sub html_record:

Code:
print qq|
<table>
<tr><td>ID</td>
<td>$rec{'ID'}</td></tr>
<tr><td>Distributor</td>
<td>$rec{'Distributor'}</td></tr>
<tr><td>Address1</td>
<td>$rec{'Address1'}</td></tr>
<tr><td>Address2</td>
<td>$rec{'Address2'}</td></tr>
<tr><td>City</td>
<td>$rec{'City'}</td></tr>
<tr><td>State</td>
<td>$rec{'State'}</td></tr>
<tr><td>Zipcode</td>
<td>$rec{'Zipcode'}</td></tr>
<tr><td>Areacode</td>
<td>$rec{'Areacode'}</td></tr>
<tr><td>Phone</td>
<td>$rec{'Phone'}</td></tr>
<tr><td>Fax</td>
<td>$rec{'Fax'}</td></tr>
<tr><td>URL</td>
<td>$rec{'URL'}</td></tr>
</table>
|;

Change it to:

Code:
print qq|
<table>
<tr><td>ID</td>
<td>$rec{'ID'}</td></tr>
<tr><td>Distributor</td>
<td>$rec{'Distributor'}</td></tr>
<tr><td>Address1</td>
<td>$rec{'Address1'}</td></tr>
|;
unless ($rec{'Address2'} eq "NULL") {
print qq|

<tr><td>Address2</td>
<td>$rec{'Address2'}</td></tr>
|;
}
print qq|

<tr><td>City</td>
<td>$rec{'City'}</td></tr>
<tr><td>State</td>
<td>$rec{'State'}</td></tr>
|;
unless ($rec{'Zipcode'} eq "NULL") {
print qq|

<tr><td>Zipcode</td>
<td>$rec{'Zipcode'}</td></tr>
|;
}
unless ($rec{'Areacode'} eq "NULL") {
print qq|

<tr><td>Areacode</td>
<td>$rec{'Areacode'}</td></tr>
|;
}
unless ($rec{'Phone'} eq "NULL") {
print qq|

<tr><td>Phone</td>
<td>$rec{'Phone'}</td></tr>
|;
}
unless ($rec{'Fax'} eq "NULL") {
print qq|

<tr><td>Fax</td>
<td>$rec{'Fax'}</td></tr>
|;
}
unless ($rec{'URL'} eq "NULL") {
print qq|

<tr><td>URL</td>
<td>$rec{'URL'}</td></tr>
|;
}
print qq|

</table>
|;

Does this make sense?

I realized as I was doing this that you may run into problems with the NULL in the empty fields for another reason. New records will not have NULL in them. You can add code to write NULL to empty fields, but I'll have to work on how to do that, if you need it. Smile


------------------
JPD





Quote Reply
Re: Splitting at delimiter problems In reply to
Yes, that makes sense! Thanks. And I will definitely take advantage of the built-in autonumbering.

I was thinking that I would try one last time to go ahead and remove NULL from the empty fields and see if I could get things working the same as they do with the string inserted, and since I want all the fields to print even if they're empty, I was thinking of using something more along the lines of:

Code:
if(($rec{'Address2'} eq "") | | ($rec{'Address2'} eq "NULL")) {
print qq|
<tr>
<td>Address Line 2</td>
<td> - </td>
</tr>
|;
}
else {
print qq|
<tr>
<td>Address Line 2</td>
<td>$rec{'Address2'}</td>
</tr>
|;
}

That should work, yes?

Thanks again.



------------------
Scot Robnett
srobnett@earthlink.net


[This message has been edited by srobnett (edited April 03, 2000).]

[This message has been edited by srobnett (edited April 03, 2000).]
Quote Reply
Re: Splitting at delimiter problems In reply to
Yes, that should work. The code looks sound. Smile


------------------
JPD