Mailing List Archive: Python: Python

dbf.py API question

ethan at stoneleaf

Aug 2, 2012, 8:55 AM

Post #1 of 10 (327 views)
dbf.py API question

SQLite has a neat feature where if you give it the file name of
':memory:' the resulting table is in memory and not on disk. I thought
it was a cool feature, but expanded it slightly: any name surrounded by
colons results in an in-memory table.

I'm looking at the same type of situation with indices, but now I'm
wondering if the :name: method is not pythonic and I should use a flag
(in_memory=True) when memory storage instead of disk storage is desired.

Thoughts?

~Ethan~
--
http://mail.python.org/mailman/listinfo/python-list
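A minimal sketch of the two spellings under discussion, for comparison. The `Table` class and the `from_colon_name` helper are hypothetical stand-ins, not the dbf.py API; they only show how each approach would decide between memory and disk.

```python
class Table:
    """Hypothetical stand-in for a table object; not the real dbf.py class."""

    def __init__(self, name, in_memory=False):
        # Explicit flag: the caller states the storage choice directly.
        self.name = name
        self.in_memory = in_memory

    @classmethod
    def from_colon_name(cls, name):
        # SQLite-inspired convention: ':scores:' means "keep it in memory".
        if len(name) > 2 and name.startswith(":") and name.endswith(":"):
            return cls(name[1:-1], in_memory=True)
        return cls(name, in_memory=False)


by_flag = Table("scores", in_memory=True)
by_name = Table.from_colon_name(":scores:")
on_disk = Table.from_colon_name("scores.dbf")
```

With the flag, the storage choice is visible at the call site; with the colon convention, it has to be decoded from the name.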


__peter__ at web

Aug 3, 2012, 2:03 AM

Post #2 of 10 (319 views)
Re: dbf.py API question [In reply to]

Ethan Furman wrote:

> SQLite has a neat feature where if you give it the file name of
> ':memory:' the resulting table is in memory and not on disk. I thought
> it was a cool feature, but expanded it slightly: any name surrounded by
> colons results in an in-memory table.
>
> I'm looking at the same type of situation with indices, but now I'm
> wondering if the :name: method is not pythonic and I should use a flag
> (in_memory=True) when memory storage instead of disk storage is desired.

For SQLite it seems OK because you make the decision once per database. For
dbase it'd be once per table, so I would prefer the flag.

Random

> Thoughts?

- Do you really want your users to work with multiple dbf files? I think I'd
rather convert to SQLite, perform the desired operations using sql, then
convert back.

- Are names required to manipulate the table? If not you could just omit
them to make the table "in-memory".

- How about a connection object that may either correspond to a directory or
RAM:

db = dbf.connect(":memory:")
table = db.Table("foo", ...)
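The connection-object idea could look roughly like the sketch below. Everything here (`connect`, `Connection`, `table_path`) is a hypothetical illustration of the suggestion, not something dbf.py provides.

```python
import os


class Connection:
    """Hypothetical connection: a directory on disk, or RAM for ':memory:'."""

    def __init__(self, location):
        self.in_memory = (location == ":memory:")
        self.directory = None if self.in_memory else location

    def table_path(self, name):
        # In-memory tables have no file; on-disk tables live in the directory.
        if self.in_memory:
            return None
        return os.path.join(self.directory, name + ".dbf")


def connect(location):
    return Connection(location)


ram = connect(":memory:")
disk = connect("data")
```

The memory-versus-disk decision is then made once per connection, as with SQLite, rather than once per table.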



ethan at stoneleaf

Aug 3, 2012, 6:11 AM

Post #3 of 10 (315 views)
Re: dbf.py API question [In reply to]

Peter Otten wrote:
> Ethan Furman wrote:
>
>> SQLite has a neat feature where if you give it the file name of
>> ':memory:' the resulting table is in memory and not on disk. I thought
>> it was a cool feature, but expanded it slightly: any name surrounded by
>> colons results in an in-memory table.
>>
>> I'm looking at the same type of situation with indices, but now I'm
>> wondering if the :name: method is not pythonic and I should use a flag
>> (in_memory=True) when memory storage instead of disk storage is desired.
>
> For SQLite it seems OK because you make the decision once per database. For
> dbase it'd be once per table, so I would prefer the flag.

So far all feedback is for the flag, so that's what I'll do.


> Random
>
>> Thoughts?
>
> - Do you really want your users to work with multiple dbf files? I think I'd
> rather convert to SQLite, perform the desired operations using sql, then
> convert back.

Seems like that would be quite a slow-down (although if a user wants to
do that, s/he certainly could).
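For anyone who does want the SQLite round trip, a minimal sketch using the standard `sqlite3` module follows. The `records` list is made-up data standing in for rows read from a dbf table, and writing the result back to a dbf file is left out.

```python
import sqlite3

# Stand-in for records read from a dbf table (hypothetical data).
records = [("bob", 71), ("alice", 93), ("carol", 85)]

conn = sqlite3.connect(":memory:")  # the whole database lives in RAM
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", records)

# Do the work in SQL, then pull the rows back out as plain tuples.
top = conn.execute(
    "SELECT name, score FROM scores WHERE score >= ? ORDER BY score DESC",
    (80,),
).fetchall()
conn.close()
```

Every record is copied into SQLite and the results are copied back out, which is where the extra cost would come from.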

> - Are names required to manipulate the table? If not you could just omit
> them to make the table "in-memory".

At one point I had thought to make tables singletons (so only one copy
of /user/bob/scores.dbf) but that hasn't happened and is rather low
priority, so at this point the name is not required for anything besides
initial object creation.

> - How about a connection object that may either correspond to a directory or
> RAM:
>
> db = dbf.connect(":memory:")
> table = db.Table("foo", ...)

dbf.py does not support the DB-API interface, so no connection objects.
Tables are opened directly and dealt with directly.

All interesting thoughts that made me think. Thank you.

~Ethan~


python.list at tim

Aug 3, 2012, 8:08 PM

Post #4 of 10 (313 views)
Re: dbf.py API question [In reply to]

On 08/03/12 08:11, Ethan Furman wrote:
> So far all feedback is for the flag, so that's what I'll do.

I agree with the flag, though would also be reasonably content with
using None for the filename to indicate in-memory rather than
on-disk storage.

-tkc
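The `None` variant can be sketched in a few lines. The `Table` class below is hypothetical and is not the dbf.py API.

```python
class Table:
    """Hypothetical table; not the dbf.py API."""

    def __init__(self, filename=None):
        # None means "no file on disk": the table exists only in memory.
        self.filename = filename

    @property
    def in_memory(self):
        return self.filename is None


scratch = Table()               # in memory
scores = Table("scores.dbf")    # backed by a file
```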






ombdalen at gmail

Aug 4, 2012, 8:04 PM

Post #5 of 10 (318 views)
Re: dbf.py API question [In reply to]

On Thu, Aug 2, 2012 at 5:55 PM, Ethan Furman <ethan [at] stoneleaf> wrote:
> SQLite has a neat feature where if you give it the file name of ':memory:'
> the resulting table is in memory and not on disk. I thought it was a cool
> feature, but expanded it slightly: any name surrounded by colons results in
> an in-memory table.
>
> I'm looking at the same type of situation with indices, but now I'm
> wondering if the :name: method is not pythonic and I should use a flag
> (in_memory=True) when memory storage instead of disk storage is desired.
>
> Thoughts?

I agree that the flag would be more pythonic in dbf.py.

I was not aware that you are adding sqlite functionality to your
library. This is very cool!

I have been through the same questions with my own DBF library, and
I've come to some conclusions: First, I decided to make the library
read-only and in-memory. That is all we need in-house anyway. Second,
I decided to make an external tool for converting DBF files to sqlite:

https://github.com/olemb/dbfget/blob/master/extras/dbf2sqlite

(To anyone reading: I have not yet made a public announcement of
dbfget, but I will shortly. Consider this an informal announcement:
https://github.com/olemb/dbfget/ )

I am considering adding a "streaming=True" flag which would make the
table class a record generator, and a "save()" method which would
allow you to save data back to the file, or to a new file if you
provide an optional file name. In fact, I had this functionality in
earlier versions, but decided to chuck it out in order to make the API
as clean as possible.

I hope this can help you somehow in your decision making process.


ethan at stoneleaf

Aug 5, 2012, 7:09 AM

Post #6 of 10 (317 views)
Re: dbf.py API question [In reply to]

Ole Martin Bjørndalen wrote:
> On Thu, Aug 2, 2012 at 5:55 PM, Ethan Furman <ethan [at] stoneleaf> wrote:
>> SQLite has a neat feature where if you give it the file name of ':memory:'
>> the resulting table is in memory and not on disk. I thought it was a cool
>> feature, but expanded it slightly: any name surrounded by colons results in
>> an in-memory table.
>>
>> I'm looking at the same type of situation with indices, but now I'm
>> wondering if the :name: method is not pythonic and I should use a flag
>> (in_memory=True) when memory storage instead of disk storage is desired.
>>
>> Thoughts?
>
> I agree that the flag would be more pythonic in dbf.py.
>
> I was not aware that you are adding sqlite functionality to your
> library. This is very cool!

Actually, I'm not. I had stumbled across that one tidbit and thought it
was cool, but cool is not always pythonic. ;)


> I am considering adding a "streaming=True" flag which would make the
> table class a record generator,

You can do this by implementing either __getitem__ or __iter__, unless
the streaming flag would also make your table not in memory.
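A minimal sketch of both routes, using two hypothetical classes that are not taken from dbf.py or dbfget. Python iterates over an object either through `__iter__` or, if that is missing, by calling `__getitem__` with 0, 1, 2, ... until `IndexError` is raised.

```python
class IterTable:
    """Hypothetical table that supports iteration via __iter__."""

    def __init__(self, records):
        self._records = list(records)

    def __iter__(self):
        # A generator method: yields one record at a time.
        for record in self._records:
            yield record


class IndexTable:
    """Hypothetical table that supports iteration via __getitem__ alone."""

    def __init__(self, records):
        self._records = list(records)

    def __getitem__(self, index):
        # The fallback protocol stops when this raises IndexError,
        # which list indexing does for us past the last record.
        return self._records[index]


rows = ["rec1", "rec2", "rec3"]
from_iter = [rec for rec in IterTable(rows)]
from_index = [rec for rec in IndexTable(rows)]
```

Both classes still hold every record in memory; a table that reads from disk on demand would need `__iter__` to do the reading itself.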


> I hope this can help you somehow in your decision making process.

All comments appreciated. Thanks!

~Ethan~


ethan at stoneleaf

Aug 6, 2012, 8:40 AM

Post #7 of 10 (306 views)
Re: dbf.py API question [In reply to]

[redirecting back to list]

Ole Martin Bjørndalen wrote:
> On Sun, Aug 5, 2012 at 4:09 PM, Ethan Furman <ethan [at] stoneleaf> wrote:
>> Ole Martin Bjørndalen wrote:
>> You can do this by implementing either __getitem__ or __iter__, unless the
>> streaming flag would also make your table not in memory.
>
> Cool!
>
> Wow! I realize now that this could in fact be fairly easy to
> implement. I just have to shuffle around the code a bit to make both
> possible. The API would be:
>
> # Returns a table object which is a subclass of list
> table = dbfget.read('cables.dbf')
> for rec in table:
>     print(rec)
>
> # Returns a table object which behaves like an iterator
> table = dbfget.read('cables.dbf', iter=True)
> for rec in table:
>     print(rec)
>
> I have a lot of questions in my mind about how to get this to work,
> but I feel like it's the right thing to do. I will make an attempt at
> a rewrite and get back to you all later.
>
> One more API question: I am uncomfortable with:
>
>
> dbfget.read()
>
> Should it just be:
>
> dbfget.get()
>
> ?
>
> - Ole

`dbfget` is the package name, and `read()` or `get()` is the
class/function that loads the table into memory and returns it?

Maybe `load()`?

~Ethan~
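A rough sketch of a loader with both behaviours, under stated assumptions: `load` and `streaming` are illustrative names (a parameter called `iter` would shadow the built-in inside the function), and comma-separated text lines stand in for real DBF records. This is not the dbfget implementation.

```python
import io


def _records(stream):
    # Generator: reads and yields one record per line, lazily.
    for line in stream:
        yield line.rstrip("\n").split(",")


def load(stream, streaming=False):
    """Hypothetical loader; `load` and `streaming` are illustrative names."""
    records = _records(stream)
    # Streaming: hand back the lazy iterator. Otherwise read everything now.
    return records if streaming else list(records)


data = "bob,71\nalice,93\n"
table = load(io.StringIO(data))                    # a list: len(), indexing, re-iteration
lazy = load(io.StringIO(data), streaming=True)     # one pass, one record at a time
first = next(lazy)
rest = list(lazy)
```

The list form can be indexed and iterated repeatedly; the streaming form can only be consumed once.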


ed at leafe

Aug 7, 2012, 6:10 PM

Post #8 of 10 (300 views)
Re: dbf.py API question [In reply to]

On Aug 2, 2012, at 10:55 AM, Ethan Furman wrote:

> SQLite has a neat feature where if you give it the file name of ':memory:' the resulting table is in memory and not on disk. I thought it was a cool feature, but expanded it slightly: any name surrounded by colons results in an in-memory table.
>
> I'm looking at the same type of situation with indices, but now I'm wondering if the :name: method is not pythonic and I should use a flag (in_memory=True) when memory storage instead of disk storage is desired.

When converting from paradigms in other languages, I've often been tempted to follow the accepted pattern for that language, and I've almost always regretted it.

When in doubt, make it as Pythonic as possible.


-- Ed Leafe





ethan at stoneleaf

Aug 8, 2012, 8:18 AM

Post #9 of 10 (292 views)
Re: dbf.py API question [In reply to]

Ed Leafe wrote:
> When converting from paradigms in other languages, I've often been tempted to follow the accepted pattern for that language, and I've almost always regretted it.

+1

>
> When in doubt, make it as Pythonic as possible.

+1 QOTW

~Ethan~


ombdalen at gmail

Aug 8, 2012, 9:21 AM

Post #10 of 10 (292 views)
Re: dbf.py API question [In reply to]

On Wed, Aug 8, 2012 at 5:18 PM, Ethan Furman <ethan [at] stoneleaf> wrote:
> Ed Leafe wrote:
>> When converting from paradigms in other languages, I've often been
>> tempted to follow the accepted pattern for that language, and I've almost
>> always regretted it.
> +1
>> When in doubt, make it as Pythonic as possible.
> +1 QOTW
> ~Ethan~

+2 from me as well.

Totally in spirit with the Zen of Python!
