
Mailing List Archive: Python: Python

File Parsing Question

 

 



shankarjee at gmail

Sep 12, 2007, 2:00 PM

Post #1 of 8 (159 views)
Permalink
File Parsing Question

Hi,
I am new to Python. I am trying to do the following:

inp = open(my_file, 'r')

for line in inp:
    # Perform some operations with line
    if condition_something:
        # Start reading from that position again
        for line in inp:
            if some_other_condition:
                break
        # I need to go back one line and use that line value.
        # I need to perform the operations which are listed at the
        # top with this line value. I cannot push that operation here.
        # I cannot do this with seek or tell.

In Perl this is what I have:

while (<inp>) {
    # my_operations
    next if /pattern/;
    while (<inp>) {
        # operations again
        last if /pattern2/;
    }
    seek(inp, (-1-length), 1);
}

This works perfectly in Perl. Can I do the same in Python?

Thanks
Jee
--
http://mail.python.org/mailman/listinfo/python-list


zentraders at gmail

Sep 12, 2007, 3:02 PM

Post #2 of 8 (148 views)
Permalink
Re: File Parsing Question [In reply to]

Save the previous line in a variable if you only want the previous
line.

prev_line = None
for line in inp:
    # Perform some operations with line
    if condition_something:
        print prev_line
        print line
        break
    # I need to go back one line and use that line value
    prev_line = line    # <-- remember this line for the next iteration

If you want to do more than that, then use data = inp.readlines(), or you
can use
data = open(myfile, "r").readlines(). The data will be stored in
a list, so you can access each line individually.
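
A minimal sketch of the save-the-previous-line idea, written in Python 3 syntax rather than the Python 2 used in this thread. The helper name with_previous and the sample data are made up for illustration; any iterable of lines, including an open file, should work the same way.

```python
def with_previous(lines):
    """Yield (prev_line, line) pairs; prev_line is None for the first line."""
    prev_line = None
    for line in lines:
        yield prev_line, line
        prev_line = line  # remember this line for the next iteration


# Hypothetical sample data standing in for an open file.
sample = ["header\n", "value 1\n", "value 2\n"]
pairs = list(with_previous(sample))
```

Because the previous line travels with the current one, nothing has to be re-read from the file.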



zentraders at gmail

Sep 12, 2007, 3:07 PM

Post #3 of 8 (147 views)
Permalink
Re: File Parsing Question [In reply to]

I'm assuming you know that Python has a file.seek(), but you have to
know the number of bytes you want to move from the beginning of the
file or from the current location. You could save the length of the
previous record and use file.seek() to back up and then move forward,
but it is simpler to save the previous record or use readlines() if the
file will fit into a reasonable amount of memory.
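
A minimal sketch of the save-the-length-and-back-up approach, assuming Python 3, where a relative seek with a nonzero offset is only allowed on files opened in binary mode. The helper name push_back and the sample file are hypothetical, and the lines are read with readline() rather than by iterating over the file.

```python
import os
import tempfile


def push_back(stream, line):
    """Step a binary stream back by the length of the line just read."""
    stream.seek(-len(line), os.SEEK_CUR)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.txt")  # hypothetical sample file
    with open(path, "wb") as out:
        out.write(b"first\nsecond\nthird\n")

    with open(path, "rb") as inp:
        inp.readline()             # consume the first record
        line = inp.readline()      # read the second record
        push_back(inp, line)       # back up by its length
        reread = inp.readline()    # the same record is read again
```

In binary mode the lines are bytes objects, so len(line) is the exact number of bytes to move back.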



shankarjee at gmail

Sep 12, 2007, 3:28 PM

Post #4 of 8 (144 views)
Permalink
Re: File Parsing Question [In reply to]

I would prefer to use something with seek. I am not able to use seek()
with "for line in inp". Using tell and seek does not seem to do anything
in the code. When I try to do

for line in inp.readlines():
    # Top of Loop
    if not condition in line:
        do_something
    else:
        for lines in inp.readlines():
            if not condition:
                do_something
            else:
                break
        pos = inp.tell()
        inp.seek(pos)   # ---> This line has no effect in the program

I am not sure if I am missing something very basic. Also, the previous line
needs to be used at the position I call # Top of Loop.

Thanks




zentraders at gmail

Sep 12, 2007, 5:08 PM

Post #5 of 8 (146 views)
Permalink
Re: File Parsing Question [In reply to]

> for line in inp.readlines():

If you are now using readlines() instead of readline(), then
a) it is only used once, to read all the data into a container
b) you can access each element/line by its relative number

data = open(filename, "r").readlines()
for eachline in data:    ## iterate over data, not readlines() again
    ## do something with eachline

so try
print data[0]   ## first rec
print data[9]   ## 10th rec, etc

you can use
ctr = 0
for eachline in data:
    ## do something
    if ctr > 0:
        print "this line is", eachline   ## or data[ctr]
        print "prev_line = ", data[ctr-1]
    ctr += 1

or a slightly different way
stop = len(data)
ctr = 0
while ctr < stop:
    ## do something
    if ctr > 0:
        this_line = data[ctr]
        prev_line = data[ctr-1]
    ctr += 1
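
The counter-based loops above could also be sketched with enumerate(), which supplies the index without a hand-maintained ctr. This is Python 3 syntax, and the data list here is a made-up stand-in for the result of readlines().

```python
# Hypothetical stand-in for data = open(filename, "r").readlines()
data = ["rec 0\n", "rec 1\n", "rec 2\n"]

seen = []
for ctr, eachline in enumerate(data):
    # There is no previous line for the first record.
    prev_line = data[ctr - 1] if ctr > 0 else None
    seen.append((prev_line, eachline))
```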

Sorry, I don't use file.seek(), so I can't help there.



__peter__ at web

Sep 12, 2007, 11:03 PM

Post #6 of 8 (145 views)
Permalink
Re: File Parsing Question [In reply to]

On Wed, 12 Sep 2007 17:28:08 -0500, Shankarjee Krishnamoorthi wrote:

> I would prefer to use something with seek.

Writing Perl in any language?

> I am not able to use seek()
> with "for line in inp". Using tell and seek does not seem to do anything
> in the code. When I try to do
>
> for line in inp.readlines():

readlines() reads the whole file at once, so inp.tell() will give the
position at the end of the file from now on.

>     # Top of Loop
>     if not condition in line:
>         do_something
>     else:
>         for lines in inp.readlines():
>             if not condition:
>                 do_something
>             else:
>                 break
>         pos = inp.tell()
>         inp.seek(pos)   # ---> This line has no effect in the program
>
> I am not sure if I am missing something very basic. Also, the previous line
> needs to be used at the position I call # Top of Loop.

If you want to use seek/tell, you can't iterate over the file directly,
because

for line in inp:
    # ...

reads ahead to make that iteration highly efficient -- so you will often
get a position further ahead than the end of the current line.

But you can use readline() (which doesn't read ahead) in conjunction with
tell/seek; just replace all occurrences of

for line in inp:
    # ...

with

for line in iter(inp.readline, ""):
    # ...
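
Putting the pieces together, here is a rough sketch of how the original nested-loop problem might look with iter(inp.readline, "") plus tell/seek. It is written in Python 3 syntax; the function name parse, the two predicate parameters, and the sample file contents are all assumptions made for illustration, not anything from the original script.

```python
import os
import tempfile


def parse(path, starts_block, ends_block):
    """Return (loop, text) events; the line ending a block is re-read by the outer loop."""
    events = []
    with open(path, "r") as inp:
        # readline() does not read ahead, so tell()/seek() stay usable.
        for line in iter(inp.readline, ""):
            events.append(("outer", line.strip()))
            if not starts_block(line):
                continue
            while True:
                pos = inp.tell()        # start of the line about to be read
                inner = inp.readline()
                if inner == "":         # end of file
                    break
                if ends_block(inner):
                    inp.seek(pos)       # go back one line
                    break
                events.append(("inner", inner.strip()))
    return events


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "input.txt")  # hypothetical sample file
    with open(path, "w") as out:
        out.write("a\nBEGIN\nx\ny\nEND\nb\n")
    events = parse(
        path,
        starts_block=lambda s: s.startswith("BEGIN"),
        ends_block=lambda s: s.startswith("END"),
    )
```

The position is recorded before each inner readline(), so seeking back to it makes the outer loop see the terminating line at the top of the loop, which is what the Perl version achieves with its relative seek.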

Peter


__peter__ at web

Sep 12, 2007, 11:49 PM

Post #7 of 8 (144 views)
Permalink
Re: File Parsing Question [In reply to]

Dennis Lee Bieber wrote:

> for line in inp:
>
> will read one line at a time (I'm fairly sure the iterator doesn't
> attempt to buffer multiple lines behind the scenes)

You are wrong:

>>> open("tmp.txt", "w").writelines("%s\n" % (9*c) for c in "ABCDE")
>>> instream = open("tmp.txt")
>>> for line in instream:
...     print instream.tell(), line.strip()
...
50 AAAAAAAAA
50 BBBBBBBBB
50 CCCCCCCCC
50 DDDDDDDDD
50 EEEEEEEEE
>>>

Here's the workaround:

>>> instream = open("tmp.txt")
>>> for line in iter(instream.readline, ""):
...     print instream.tell(), line.strip()
...
10 AAAAAAAAA
20 BBBBBBBBB
30 CCCCCCCCC
40 DDDDDDDDD
50 EEEEEEEEE
>>>

Peter
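
The session above is Python 2. As a hedged sketch of how the same experiment might look in Python 3: to my knowledge, CPython 3 refuses tell() on a text-mode file while a for loop is iterating over it (it raises OSError) instead of returning the read-ahead position, while the readline() workaround still reports the end of each line. Binary mode is used for the second part so that the offsets are plain byte counts; the variable names tell_failed and offsets are made up for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tmp.txt")
    with open(path, "wb") as out:
        # Five 10-byte lines: AAAAAAAAA\n ... EEEEEEEEE\n
        out.writelines((9 * c + "\n").encode("ascii") for c in "ABCDE")

    # Text mode: tell() is refused while the for loop is iterating.
    tell_failed = False
    with open(path, "r") as instream:
        for line in instream:
            try:
                instream.tell()
            except OSError:
                tell_failed = True
            break

    # readline() via iter(): tell() reports the end of each line read.
    offsets = []
    with open(path, "rb") as instream:
        for line in iter(instream.readline, b""):
            offsets.append(instream.tell())
```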


shankarjee at gmail

Sep 13, 2007, 6:28 AM

Post #8 of 8 (133 views)
Permalink
Re: File Parsing Question [In reply to]

Great. That worked for me. I had some of my routines implemented in
Perl earlier. Now that I have started using Python, I am trying to do all my
automation scripts with Python. Thanks a ton.

Jee

