
Mailing List Archive: Python: Python

Raw strings as input from File?

 

 



utabintarbo at gmail

Nov 24, 2009, 11:52 AM

Post #1 of 13 (496 views)
Permalink
Raw strings as input from File?

I have a log file with full Windows paths on a line. eg:
K:\A\B\C\10xx\somerandomfilename.ext->/a1/b1/c1/10xx
\somerandomfilename.ext ; t9999xx; 11/23/2009 15:00:16 ; 1259006416

As I try to pull in the line and process it, python changes the "\10"
to a "\x08". This is before I can do anything with it. Is there a way
to specify that incoming lines (say, when using .readlines() ) should
be treated as raw strings?

TIA
--
http://mail.python.org/mailman/listinfo/python-list


python at mrabarnett

Nov 24, 2009, 12:27 PM

Post #2 of 13 (467 views)
Permalink
Re: Raw strings as input from File? [In reply to]

utabintarbo wrote:
> I have a log file with full Windows paths on a line. eg:
> K:\A\B\C\10xx\somerandomfilename.ext->/a1/b1/c1/10xx
> \somerandomfilename.ext ; t9999xx; 11/23/2009 15:00:16 ; 1259006416
>
> As I try to pull in the line and process it, python changes the "\10"
> to a "\x08". This is before I can do anything with it. Is there a way
> to specify that incoming lines (say, when using .readlines() ) should
> be treated as raw strings?
>
.readlines() doesn't change the "\10" in a file to "\x08" in the string
it returns.

Could you provide some code which shows your problem?


carsten.haese at gmail

Nov 24, 2009, 12:28 PM

Post #3 of 13 (465 views)
Permalink
Re: Raw strings as input from File? [In reply to]

utabintarbo wrote:
> I have a log file with full Windows paths on a line. eg:
> K:\A\B\C\10xx\somerandomfilename.ext->/a1/b1/c1/10xx
> \somerandomfilename.ext ; t9999xx; 11/23/2009 15:00:16 ; 1259006416
>
> As I try to pull in the line and process it, python changes the "\10"
> to a "\x08".

Python does no such thing. When Python reads bytes from a file, it
doesn't interpret or change those bytes in any way. Either there is
something else going on here that you're not telling us, or the file
doesn't contain what you think it contains. Please show us the exact
code you're using to process this file, and show us the exact contents
of the file you're processing.
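A minimal sketch of that claim, assuming a throwaway temporary file and a made-up sample line (neither comes from the real log). It writes a line containing "\10" to disk, reads it back with readlines(), and checks that the text is unchanged:

```python
import os
import tempfile

# Hypothetical sample line; the r prefix keeps the backslashes as real
# characters in this source file.
sample = r"K:\A\B\C\10xx\somerandomfilename.ext->/a1/b1/c1/10xx"

# Write the line to a temporary file, then read it back with readlines().
fd, path = tempfile.mkstemp(suffix=".txt")
try:
    with os.fdopen(fd, "w") as handle:
        handle.write(sample + "\n")
    with open(path, "r") as handle:
        lines = handle.readlines()
finally:
    os.remove(path)

first = lines[0].rstrip("\n")
print(first == sample)   # the text comes back unchanged
print("\x08" in first)   # checks whether a backspace character crept in
```

If the first check ever printed False for a real log, the next thing to inspect would be the bytes actually stored in the file.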

--
Carsten Haese
http://informixdb.sourceforge.net



utabintarbo at gmail

Nov 24, 2009, 1:20 PM

Post #4 of 13 (472 views)
Permalink
Re: Raw strings as input from File? [In reply to]

On Nov 24, 3:27 pm, MRAB <pyt...@mrabarnett.plus.com> wrote:
>
> .readlines() doesn't change the "\10" in a file to "\x08" in the string
> it returns.
>
> Could you provide some code which shows your problem?

Here is the code block I have so far:
for l in open(CONTENTS, 'r').readlines():
    f = os.path.splitext(os.path.split(l.split('->')[0]))[0]
    if f in os.listdir(DIR1) and os.path.isdir(os.path.join(DIR1,f)):
        shutil.rmtree(os.path.join(DIR1,f))
        if f in os.listdir(DIR2) and os.path.isdir(os.path.join(DIR2,f)):
            shutil.rmtree(os.path.join(DIR2,f))

I am trying to find dirs with the basename of the initial path less
the extension in both DIR1 and DIR2

A minimally obfuscated line from the log file:
K:\sm\SMI\des\RS\Pat\10DJ\121.D5-30\1215B-B-D5-BSHOE-MM.smz->/arch_m1/
smi/des/RS/Pat/10DJ/121.D5-30\1215B-B-D5-BSHOE-MM.smz ; t9480rc ;
11/24/2009 08:16:42 ; 1259068602

What I get from the debugger/python shell:
'K:\\sm\\SMI\\des\\RS\\Pat\x08DJQ.D5-30Q5B-B-D5-BSHOE-MM.smz->/arch_m1/
smi/des/RS/Pat/10DJ/121.D5-30/1215B-B-D5-BSHOE-MM.smz ; t9480rc ;
11/24/2009 08:16:42 ; 1259068602'

TIA



joncle at googlemail

Nov 24, 2009, 1:50 PM

Post #5 of 13 (465 views)
Permalink
Re: Raw strings as input from File? [In reply to]

On Nov 24, 9:20 pm, utabintarbo <utabinta...@gmail.com> wrote:
> On Nov 24, 3:27 pm, MRAB <pyt...@mrabarnett.plus.com> wrote:
>
>
>
> > .readlines() doesn't change the "\10" in a file to "\x08" in the string
> > it returns.
>
> > Could you provide some code which shows your problem?
>
> Here is the code block I have so far:
> for l in open(CONTENTS, 'r').readlines():
>     f = os.path.splitext(os.path.split(l.split('->')[0]))[0]
>     if f in os.listdir(DIR1) and os.path.isdir(os.path.join(DIR1,f)):
>         shutil.rmtree(os.path.join(DIR1,f))
>         if f in os.listdir(DIR2) and os.path.isdir(os.path.join(DIR2,f)):
>                 shutil.rmtree(os.path.join(DIR2,f))
>
> I am trying to find dirs with the basename of the initial path less
> the extension in both DIR1 and DIR2
>
> A minimally obfuscated line from the log file:
> K:\sm\SMI\des\RS\Pat\10DJ\121.D5-30\1215B-B-D5-BSHOE-MM.smz->/arch_m1/
> smi/des/RS/Pat/10DJ/121.D5-30\1215B-B-D5-BSHOE-MM.smz ; t9480rc ;
> 11/24/2009 08:16:42 ; 1259068602
>
> What I get from the debugger/python shell:
> 'K:\\sm\\SMI\\des\\RS\\Pat\x08DJQ.D5-30Q5B-B-D5-BSHOE-MM.smz->/arch_m1/
> smi/des/RS/Pat/10DJ/121.D5-30/1215B-B-D5-BSHOE-MM.smz ; t9480rc ;
> 11/24/2009 08:16:42 ; 1259068602'
>
> TIA

jon [at] jon-deskto:~/pytest$ cat log.txt
K:\sm\SMI\des\RS\Pat\10DJ\121.D5-30\1215B-B-D5-BSHOE-MM.smz->/arch_m1/
smi/des/RS/Pat/10DJ/121.D5-30\1215B-B-D5-BSHOE-MM.smz ; t9480rc ;
11/24/2009 08:16:42 ; 1259068602

>>> log = open('/home/jon/pytest/log.txt', 'r').readlines()
>>> log
['K:\\sm\\SMI\\des\\RS\\Pat\\10DJ\\121.D5-30\\1215B-B-D5-BSHOE-MM.smz-
>/arch_m1/\n', 'smi/des/RS/Pat/10DJ/121.D5-30\\1215B-B-D5-BSHOE-
MM.smz ; t9480rc ;\n', '11/24/2009 08:16:42 ; 1259068602\n']

See -- it's not doing anything :)

Although, "Pat\x08DJQ.D5-30Q5B-B-D5-BSHOE-MM.smz" and "Pat
\x08DJQ.D5-30Q5B-B-D5-BSHOE-MM.smz" seem to be fairly different -- are
you sure you're posting the correct output!?

Jon.


joncle at googlemail

Nov 24, 2009, 1:54 PM

Post #6 of 13 (466 views)
Permalink
Re: Raw strings as input from File? [In reply to]

On Nov 24, 9:50 pm, Jon Clements <jon...@googlemail.com> wrote:
> On Nov 24, 9:20 pm, utabintarbo <utabinta...@gmail.com> wrote:
[snip]
> Although, "Pat\x08DJQ.D5-30Q5B-B-D5-BSHOE-MM.smz" and "Pat
> \x08DJQ.D5-30Q5B-B-D5-BSHOE-MM.smz" seem to be fairly different -- are
> you sure you're posting the correct output!?
>

Ugh... let's try that...

Pat\10DJ\121.D5-30\1215B-B-D5-BSHOE-MM.smz
Pat\x08DJQ.D5-30Q5B-B-D5-BSHOE-MM.smz
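For what it's worth, typing the first fragment into source code as an ordinary (non-raw) string literal produces the second one, which hints at where the conversion happens. A small sketch; the variable names are illustrative only:

```python
# The fragment typed as an ordinary string literal: the compiler reads
# \10 and \121 as octal escapes while parsing the source.
as_literal = 'Pat\10DJ\121.D5-30\1215B-B-D5-BSHOE-MM.smz'

# The same fragment as a raw literal: backslashes stay backslashes.
as_raw = r'Pat\10DJ\121.D5-30\1215B-B-D5-BSHOE-MM.smz'

print(repr(as_literal))  # compare with the output posted from the debugger
print(repr(as_raw))
```

Octal 10 is the backspace character and octal 121 is "Q", so the mangled text looks like the work of the string-literal parser rather than of file reading.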

Jon.


tjreedy at udel

Nov 24, 2009, 3:06 PM

Post #7 of 13 (467 views)
Permalink
Re: Raw strings as input from File? [In reply to]

utabintarbo wrote:
> I have a log file with full Windows paths on a line. eg:
> K:\A\B\C\10xx\somerandomfilename.ext->/a1/b1/c1/10xx
> \somerandomfilename.ext ; t9999xx; 11/23/2009 15:00:16 ; 1259006416
>
> As I try to pull in the line and process it, python changes the "\10"
> to a "\x08".

This should only happen if you paste the test into your .py file as a
string literal.

> This is before I can do anything with it. Is there a way
> to specify that incoming lines (say, when using .readlines() ) should
> be treated as raw strings?

Or if you use execfile or compile and ask Python to interpret the input
as code.

There are no raw strings, only raw string code literals marked with an
'r' prefix for raw processing of the quoted text.
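A rough illustration of both points, assuming a short made-up fragment. Here ast.literal_eval stands in for execfile/compile as a way of asking Python to parse text as code; that substitution is my choice for the sketch:

```python
import ast

# An r-prefixed literal and an escaped literal build the same ordinary str.
raw_literal = r'C\10xx'
escaped_literal = 'C\\10xx'
print(raw_literal == escaped_literal)
print(type(raw_literal) is str)

# Without the prefix, the compiler turns \10 into one character (octal 10).
plain_literal = 'C\10xx'
print(len(raw_literal), len(plain_literal))

# Asking Python to treat data as code triggers the same escape processing.
text_from_file = raw_literal  # stands in for a line read from a file
as_code = ast.literal_eval("'" + text_from_file + "'")
print(as_code == plain_literal)
```

The prefix only changes how the source text is parsed; the resulting object is a plain string either way.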



rhodri at wildebst

Nov 24, 2009, 5:11 PM

Post #8 of 13 (460 views)
Permalink
Re: Raw strings as input from File? [In reply to]

On Tue, 24 Nov 2009 21:20:25 -0000, utabintarbo <utabintarbo [at] gmail>
wrote:

> On Nov 24, 3:27 pm, MRAB <pyt...@mrabarnett.plus.com> wrote:
>>
>> .readlines() doesn't change the "\10" in a file to "\x08" in the string
>> it returns.
>>
>> Could you provide some code which shows your problem?
>
> Here is the code block I have so far:
> for l in open(CONTENTS, 'r').readlines():
> f = os.path.splitext(os.path.split(l.split('->')[0]))[0]
> if f in os.listdir(DIR1) and os.path.isdir(os.path.join(DIR1,f)):
> shutil.rmtree(os.path.join(DIR1,f))
> if f in os.listdir(DIR2) and os.path.isdir(os.path.join(DIR2,f)):
> shutil.rmtree(os.path.join(DIR2,f))

Ahem. This doesn't run. os.path.split() returns a tuple, and calling
os.path.splitext() doesn't work. Given that replacing the entire loop
contents with "print l" readily disproves your assertion, I suggest you
cut and paste actual code if you want an answer. Otherwise we're just
going to keep saying "No, it doesn't", because no, it doesn't.

> A minimally obfuscated line from the log file:
> K:\sm\SMI\des\RS\Pat\10DJ\121.D5-30\1215B-B-D5-BSHOE-MM.smz->/arch_m1/
> smi/des/RS/Pat/10DJ/121.D5-30\1215B-B-D5-BSHOE-MM.smz ; t9480rc ;
> 11/24/2009 08:16:42 ; 1259068602
>
> What I get from the debugger/python shell:
> 'K:\\sm\\SMI\\des\\RS\\Pat\x08DJQ.D5-30Q5B-B-D5-BSHOE-MM.smz->/arch_m1/
> smi/des/RS/Pat/10DJ/121.D5-30/1215B-B-D5-BSHOE-MM.smz ; t9480rc ;
> 11/24/2009 08:16:42 ; 1259068602'

When you do what, exactly?

--
Rhodri James *-* Wildebeest Herder to the Masses


rhodri at wildebst

Nov 24, 2009, 5:16 PM

Post #9 of 13 (460 views)
Permalink
Re: Raw strings as input from File? [In reply to]

On Wed, 25 Nov 2009 01:11:29 -0000, Rhodri James
<rhodri [at] wildebst> wrote:

> On Tue, 24 Nov 2009 21:20:25 -0000, utabintarbo <utabintarbo [at] gmail>
> wrote:
>
>> On Nov 24, 3:27 pm, MRAB <pyt...@mrabarnett.plus.com> wrote:
>>>
>>> .readlines() doesn't change the "\10" in a file to "\x08" in the string
>>> it returns.
>>>
>>> Could you provide some code which shows your problem?
>>
>> Here is the code block I have so far:
>> for l in open(CONTENTS, 'r').readlines():
>> f = os.path.splitext(os.path.split(l.split('->')[0]))[0]
>> if f in os.listdir(DIR1) and os.path.isdir(os.path.join(DIR1,f)):
>> shutil.rmtree(os.path.join(DIR1,f))
>> if f in os.listdir(DIR2) and
>> os.path.isdir(os.path.join(DIR2,f)):
>> shutil.rmtree(os.path.join(DIR2,f))
>
> Ahem. This doesn't run. os.path.split() returns a tuple, and calling
> os.path.splitext() doesn't work.

I meant, "doesn't work on a tuple". Sigh. It's been one of those days.
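A sketch of what the extraction step might look like once the tuple is unpacked. It assumes the sample log line from earlier in the thread, rejoined onto one line, and uses ntpath so that Windows path rules apply on any platform; both choices are assumptions for illustration:

```python
import ntpath  # Windows path rules, available on every platform

# Sample log line from the thread, rejoined onto a single line.
line = (r"K:\sm\SMI\des\RS\Pat\10DJ\121.D5-30\1215B-B-D5-BSHOE-MM.smz"
        r"->/arch_m1/smi/des/RS/Pat/10DJ/121.D5-30\1215B-B-D5-BSHOE-MM.smz"
        " ; t9480rc ; 11/24/2009 08:16:42 ; 1259068602")

source_path = line.split("->")[0]

# split() returns a (head, tail) tuple, so take the tail before splitext().
head, tail = ntpath.split(source_path)
stem = ntpath.splitext(tail)[0]

print(tail)
print(stem)
```

On Windows itself, os.path behaves like ntpath, so os.path.splitext(os.path.split(path)[1])[0] would do the same job there.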

--
Rhodri James *-* Wildebeest Herder to the Masses


invalid at invalid

Nov 24, 2009, 7:31 PM

Post #10 of 13 (458 views)
Permalink
Re: Raw strings as input from File? [In reply to]

On 2009-11-25, Rhodri James <rhodri [at] wildebst> wrote:
> On Tue, 24 Nov 2009 21:20:25 -0000, utabintarbo <utabintarbo [at] gmail>
> wrote:
>
>> On Nov 24, 3:27 pm, MRAB <pyt...@mrabarnett.plus.com> wrote:
>>>
>>> .readlines() doesn't change the "\10" in a file to "\x08" in the string
>>> it returns.
>>>
>>> Could you provide some code which shows your problem?
>>
>> Here is the code block I have so far:
>> for l in open(CONTENTS, 'r').readlines():
>> f = os.path.splitext(os.path.split(l.split('->')[0]))[0]
>> if f in os.listdir(DIR1) and os.path.isdir(os.path.join(DIR1,f)):
>> shutil.rmtree(os.path.join(DIR1,f))
>> if f in os.listdir(DIR2) and os.path.isdir(os.path.join(DIR2,f)):
>> shutil.rmtree(os.path.join(DIR2,f))
>
> Ahem. This doesn't run. os.path.split() returns a tuple, and calling
> os.path.splitext() doesn't work. Given that replacing the entire loop
> contents with "print l" readily disproves your assertion, I suggest you
> cut and paste actual code if you want an answer. Otherwise we're just
> going to keep saying "No, it doesn't", because no, it doesn't.

It's, um, rewarding to see my recent set of instructions being
followed.

>> A minimally obfuscated line from the log file:
>> K:\sm\SMI\des\RS\Pat\10DJ\121.D5-30\1215B-B-D5-BSHOE-MM.smz->/arch_m1/
>> smi/des/RS/Pat/10DJ/121.D5-30\1215B-B-D5-BSHOE-MM.smz ; t9480rc ;
>> 11/24/2009 08:16:42 ; 1259068602
>>
>> What I get from the debugger/python shell:
>> 'K:\\sm\\SMI\\des\\RS\\Pat\x08DJQ.D5-30Q5B-B-D5-BSHOE-MM.smz->/arch_m1/
>> smi/des/RS/Pat/10DJ/121.D5-30/1215B-B-D5-BSHOE-MM.smz ; t9480rc ;
>> 11/24/2009 08:16:42 ; 1259068602'
>
> When you do what, exactly?

;)

--
Grant


joncle at googlemail

Nov 25, 2009, 4:58 AM

Post #11 of 13 (445 views)
Permalink
Re: Raw strings as input from File? [In reply to]

On Nov 25, 3:31 am, Grant Edwards <inva...@invalid.invalid> wrote:
> On 2009-11-25, Rhodri James <rho...@wildebst.demon.co.uk> wrote:
>
>
>
> > On Tue, 24 Nov 2009 21:20:25 -0000, utabintarbo <utabinta...@gmail.com>  
> > wrote:
>
> >> On Nov 24, 3:27 pm, MRAB <pyt...@mrabarnett.plus.com> wrote:
>
> >>> .readlines() doesn't change the "\10" in a file to "\x08" in the string
> >>> it returns.
>
> >>> Could you provide some code which shows your problem?
>
> >> Here is the code block I have so far:
> >> for l in open(CONTENTS, 'r').readlines():
> >>     f = os.path.splitext(os.path.split(l.split('->')[0]))[0]
> >>     if f in os.listdir(DIR1) and os.path.isdir(os.path.join(DIR1,f)):
> >>         shutil.rmtree(os.path.join(DIR1,f))
> >>         if f in os.listdir(DIR2) and os.path.isdir(os.path.join(DIR2,f)):
> >>             shutil.rmtree(os.path.join(DIR2,f))
>
> > Ahem.  This doesn't run.  os.path.split() returns a tuple, and calling  
> > os.path.splitext() doesn't work.  Given that replacing the entire loop  
> > contents with "print l" readily disproves your assertion, I suggest you  
> > cut and paste actual code if you want an answer.  Otherwise we're just  
> > going to keep saying "No, it doesn't", because no, it doesn't.
>
> It's, um, rewarding to see my recent set of instructions being
> followed.
>
> >> A minimally obfuscated line from the log file:
> >> K:\sm\SMI\des\RS\Pat\10DJ\121.D5-30\1215B-B-D5-BSHOE-MM.smz->/arch_m1/
> >> smi/des/RS/Pat/10DJ/121.D5-30\1215B-B-D5-BSHOE-MM.smz ; t9480rc ;
> >> 11/24/2009 08:16:42 ; 1259068602
>
> >> What I get from the debugger/python shell:
> >> 'K:\\sm\\SMI\\des\\RS\\Pat\x08DJQ.D5-30Q5B-B-D5-BSHOE-MM.smz->/arch_m1/
> >> smi/des/RS/Pat/10DJ/121.D5-30/1215B-B-D5-BSHOE-MM.smz ; t9480rc ;
> >> 11/24/2009 08:16:42 ; 1259068602'
>
> > When you do what, exactly?
>
> ;)
>
> --
> Grant

Can't remember if this thread counts as "Edwards' Law 5[b|c]" :)

I'm sure I pinned it up on my wall somewhere, right next to
http://imgs.xkcd.com/comics/tech_support_cheat_sheet.png

Jon.


rzantow at gmail

Dec 1, 2009, 5:22 PM

Post #12 of 13 (363 views)
Permalink
Re: Raw strings as input from File? [In reply to]

utabintarbo <utabintarbo [at] gmail> wrote in
news:adc6c455-5616-471a-8b39-d7fdad2179e4 [at] m33g2000vbi
om:

> I have a log file with full Windows paths on a line. eg:
> K:\A\B\C\10xx\somerandomfilename.ext->/a1/b1/c1/10xx
> \somerandomfilename.ext ; t9999xx; 11/23/2009 15:00:16 ;
> 1259006416
>
> As I try to pull in the line and process it, python changes the
> "\10" to a "\x08". This is before I can do anything with it. Is
> there a way to specify that incoming lines (say, when using
> .readlines() ) should be treated as raw strings?
>
> TIA

Despite all the ragging you're getting, it is a pretty flakey thing
that Python does in this context:
(from a python shell)
>>> x = '\1'
>>> x
'\x01'
>>> x = '\10'
>>> x
'\x08'

If you are pasting your string as a literal, then maybe it does the
same. It still seems weird to me. I can accept that '\1' means x01,
but \10 seems to be expanded to \010 and then translated from octal
to get to x08. That's just strange. I'm sure it's documented
somewhere, but it's not easy to search for.

Oh, and this:
>>> '\7'
'\x07'
>>> '\70'
'8'
... is really odd.

--
rzed


davea at ieee

Dec 1, 2009, 9:39 PM

Post #13 of 13 (366 views)
Permalink
Re: Raw strings as input from File? [In reply to]

rzed wrote:
> utabintarbo <utabintarbo [at] gmail> wrote in
> news:adc6c455-5616-471a-8b39-d7fdad2179e4 [at] m33g2000vbi
> om:
>
>
>> I have a log file with full Windows paths on a line. eg:
>> K:\A\B\C\10xx\somerandomfilename.ext->/a1/b1/c1/10xx
>> \somerandomfilename.ext ; t9999xx; 11/23/2009 15:00:16 ;
>> 1259006416
>>
>> As I try to pull in the line and process it, python changes the
>> "\10" to a "\x08". This is before I can do anything with it. Is
>> there a way to specify that incoming lines (say, when using
>> .readlines() ) should be treated as raw strings?
>>
>> TIA
>>
>
> Despite all the ragging you're getting, it is a pretty flakey thing
>
When the OP specified readlines(), which does *not* behave this way, he
probably deserved what you call "ragging." The backslash escaping is
for string literals, which are in code, not in data files.

In any case, there's a big difference between surprising (to you), and
flakey.
> that Python does in this context:
> (from a python shell)
>
> >>> x = '\1'
> >>> x
> '\x01'
> >>> x = '\10'
> >>> x
> '\x08'
>
> If you are pasting your string as a literal, then maybe it does the
> same. It still seems weird to me. I can accept that '\1' means x01,
> but \10 seems to be expanded to \010 and then translated from octal
> to get to x08. That's just strange. I'm sure it's documented
> somewhere, but it's not easy to search for.
>
>
Check in the help for "escape Strings". It's documented (in vers. 2.6,
anyway) in a nice chart: a backslash followed by one to three octal
digits is interpreted as the character with that octal value. I don't
like it much either, but it's inherited from C, which has worked that
way for 30+ years.

Online, see
http://www.python.org/doc/2.6.4/reference/lexical_analysis.html, and
look in section 2.4.1 for the chart.
> Oh, and this:
>
> >>> '\7'
> '\x07'
> >>> '\70'
> '8'
> ... is really odd.
>
>
Octal 70 is hex 38 (or decimal 56), which is the character '8'.
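The arithmetic can be checked directly. A short sketch, with variable names chosen only for illustration:

```python
# Convert the digits "70" from base 8 and look up the character.
value = int('70', 8)
as_hex = hex(value)
as_char = chr(value)
print(value, as_hex, as_char)

# The same rule explains the other escapes in this thread.
print('\70' == '8')       # octal 70 is the character '8'
print('\10' == '\x08')    # octal 10 is decimal 8, a backspace
print('\7' == '\x07')     # a single octal digit works too
print('\1215' == 'Q5')    # at most three digits are consumed: \121, then '5'
```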

DaveA
