get rid of duplicate elements in list without set

 

 


zasaconsulting at gmail

Mar 20, 2009, 7:16 AM

Post #1 of 14 (3147 views)
get rid of duplicate elements in list without set

Hello there,

I'd like to get the same result as set(), but as an indexable
object.
How can I get this in an efficient way?

Example using set

A = [1, 2, 2 ,2 , 3 ,4]
B= set(A)
B = ([1, 2, 3, 4])

B[2]
TypeError: unindexable object

Many thanks, alex
--
http://mail.python.org/mailman/listinfo/python-list


marduk at letterboxes

Mar 20, 2009, 7:30 AM

Post #2 of 14 (3103 views)
Re: get rid of duplicate elements in list without set [In reply to]

On Fri, 2009-03-20 at 07:16 -0700, Alexzive wrote:
> Hello there,
>
> I'd like to get the same result of set() but getting an indexable
> object.
> How to get this in an efficient way?
>
> Example using set
>
> A = [1, 2, 2 ,2 , 3 ,4]
> B= set(A)
> B = ([1, 2, 3, 4])
>
> B[2]
> TypeError: unindexable object

>>> A = [1, 2, 2 ,2 , 3, 4]
>>> B = list(set(A))
>>> B[2]
3

However, as sets are unordered, there is no guarantee that B will have
the same ordering as A.
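
A small sketch of the point above, written for Python 3 (the variable names are illustrative, not from the original post): the result is indexable and free of duplicates, but only its contents, not its order, should be relied on.

```python
A = [1, 2, 2, 2, 3, 4]

# list(set(...)) removes duplicates and gives back an indexable list.
B = list(set(A))

# Only the contents are guaranteed, not the order, so compare as sets
# (or sort first) before relying on positions such as B[2].
same_elements = set(B) == set(A)
no_duplicates = len(B) == len(set(A))
```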





thomasvangurp at gmail

Mar 20, 2009, 7:54 AM

Post #3 of 14 (3106 views)
Re: get rid of duplicate elements in list without set [In reply to]

You could use:
B=list(set(A)).sort()
Hope that helps.
T


marduk at letterboxes

Mar 20, 2009, 8:14 AM

Post #4 of 14 (3101 views)
Re: get rid of duplicate elements in list without set [In reply to]

On Fri, 2009-03-20 at 07:54 -0700, thomasvangurp [at] gmail wrote:
> You could use:
> B=list(set(A)).sort()
> Hope that helps.

Which will assign None to B.

sorted(list(... or B.sort() is probably what you meant.
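
To make the difference concrete, here is a short sketch in Python 3 (the names C and D are only illustrative): list.sort() sorts in place and returns None, while sorted() returns a new list.

```python
A = [1, 2, 2, 2, 3, 4]

# list.sort() sorts in place and returns None, so B ends up as None.
B = list(set(A)).sort()

# Option 1: sorted() accepts any iterable and returns a new sorted list.
C = sorted(set(A))

# Option 2: build the list first, then sort it in place.
D = list(set(A))
D.sort()
```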



tino at wildenhain

Mar 20, 2009, 8:15 AM

Post #5 of 14 (3100 views)
Re: get rid of duplicate elements in list without set [In reply to]

thomasvangurp [at] gmail wrote:
> You could use:
> B=list(set(A)).sort()
> Hope that helps.

That would leave a B with value None :-)

B=list(sorted(set(A)))

could work.


google at mrabarnett

Mar 20, 2009, 8:27 AM

Post #6 of 14 (3104 views)
Re: get rid of duplicate elements in list without set [In reply to]

Tino Wildenhain wrote:
> thomasvangurp [at] gmail wrote:
>> You could use:
>> B=list(set(A)).sort()
>> Hope that helps.
>
> That would leave a B with value None :-)
>
> B=list(sorted(set(A)))
>
> could work.
>
sorted() accepts an iterable, eg a set, and returns a list:

B = sorted(set(A))


ptmcg at austin

Mar 20, 2009, 8:34 AM

Post #7 of 14 (3103 views)
Re: get rid of duplicate elements in list without set [In reply to]

On Mar 20, 9:54 am, "thomasvang...@gmail.com"
<thomasvang...@gmail.com> wrote:
> You could use:
> B=list(set(A)).sort()
> Hope that helps.
> T

That may hurt more than help: sort() only works in place, and does
*not* return the sorted list. For that you want the global built-in
sorted:

>>> data = map(int,"6 1 3 2 5 2 5 4 2 0".split())
>>> print sorted(list(set(data)))
[0, 1, 2, 3, 4, 5, 6]

To retain the original order, use the key argument, passing it a
function - simplest is to pass the index of the value in the original
list:

>>> print sorted(list(set(data)), key=data.index)
[6, 1, 3, 2, 5, 4, 0]

If data is long, all of those calls to data.index may get expensive.
You may want to build a lookup dict first:

>>> lookup = dict((v,k) for k,v in list(enumerate(data))[::-1])
>>> print sorted(list(set(data)), key=lookup.__getitem__)
[6, 1, 3, 2, 5, 4, 0]
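
The same lookup idea, sketched for Python 3 (the names first_index and unique_in_order are illustrative): instead of reversing the enumeration, setdefault() keeps only the first index seen for each value.

```python
data = [int(s) for s in "6 1 3 2 5 2 5 4 2 0".split()]

# Map each value to the index of its first occurrence.
# setdefault() only stores a position if the value is not already a key.
first_index = {}
for position, value in enumerate(data):
    first_index.setdefault(value, position)

# Sort the unique values by where they first appeared in data.
unique_in_order = sorted(set(data), key=first_index.__getitem__)
```

Each key lookup is a dict access rather than a linear scan, which is the saving over key=data.index for long lists.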

-- Paul


Scott.Daniels at Acm

Mar 20, 2009, 10:06 AM

Post #8 of 14 (3104 views)
Re: get rid of duplicate elements in list without set [In reply to]

Alexzive wrote:
> I'd like to get the same result of set() but getting an indexable
> object. How to get this in an efficient way?

Go look at Raymond Hettinger's recently announced recipe for 'OrderedSet':

http://code.activestate.com/recipes/576694/
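
A different technique from the linked recipe, offered only as a hedged alternative: on Python 3.7 or later, plain dicts keep insertion order, so dict.fromkeys() can stand in for an ordered set as long as the items are hashable. The variable names are illustrative.

```python
A = [1, 2, 2, 2, 3, 4, 1]

# dict keys are unique and (on Python 3.7+) remember insertion order,
# so the first occurrence of each item decides its position.
B = list(dict.fromkeys(A))

third = B[2]  # the result is an ordinary list, so it is indexable
```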


--Scott David Daniels
Scott.Daniels [at] Acm


__peter__ at web

Mar 20, 2009, 10:36 AM

Post #9 of 14 (3112 views)
Re: get rid of duplicate elements in list without set [In reply to]

Alexzive wrote:

> I'd like to get the same result of set() but getting an indexable
> object.
> How to get this in an efficient way?
>
> Example using set
>
> A = [1, 2, 2 ,2 , 3 ,4]
> B= set(A)
> B = ([1, 2, 3, 4])
>
> B[2]
> TypeError: unindexable object

If the initial list is ordered, or at least equal items are neighbours, you
can use groupby():

>>> from itertools import groupby
>>> a = [1,1,1,2,2,3,4,4,4]
>>> [key for key, group in groupby(a)]
[1, 2, 3, 4]

Here's what happens if there are equal items that are not neighbours:

>>> b = [1,1,1,2,2,2,3,3,2,1,1,1,1]
>>> [key for key, group in groupby(b)]
[1, 2, 3, 2, 1]
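
One possible way around this, sketched for Python 3 and assuming the items can be compared with each other: sort first so that equal items become neighbours. The original order is lost in the process, and the name unique_sorted is illustrative.

```python
from itertools import groupby

b = [1, 1, 1, 2, 2, 2, 3, 3, 2, 1, 1, 1, 1]

# Sorting puts equal items next to each other, so groupby() then
# yields each distinct value exactly once.
unique_sorted = [key for key, group in groupby(sorted(b))]
```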

Peter


steve at REMOVE-THIS-cybersource

Mar 20, 2009, 2:37 PM

Post #10 of 14 (3087 views)
Re: get rid of duplicate elements in list without set [In reply to]

On Fri, 20 Mar 2009 07:16:40 -0700, Alexzive wrote:

> Hello there,
>
> I'd like to get the same result of set() but getting an indexable
> object.
> How to get this in an efficient way?

Your question is too open-ended. Do you want to keep the items in the
original order? Are the items hashable? Do they support comparisons?

http://code.activestate.com/recipes/52560/


If all you care about is that the result is indexable, then list(set(items))
will do what you want -- but beware, sets can only contain hashable
items, so if your original data contains dicts, lists or other unhashable
objects, you can't add them to a set.
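
For the unhashable case, a minimal sketch in Python 3 (the function name unique_unhashable and the sample data are hypothetical): it relies only on equality, keeps the original order, and costs quadratic time because every membership test scans the result list.

```python
def unique_unhashable(items):
    """Return the items in first-seen order, compared with == only."""
    result = []
    for item in items:
        # List membership uses ==, so dicts and lists are fine here,
        # at the price of a linear scan for every item.
        if item not in result:
            result.append(item)
    return result


records = [{"a": 1}, [1, 2], {"a": 1}, [1, 2], [3]]
deduped = unique_unhashable(records)
```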



--
Steven


mahs at telcopartners

Mar 20, 2009, 3:07 PM

Post #11 of 14 (3087 views)
Re: get rid of duplicate elements in list without set [In reply to]

Alexzive wrote:
> Hello there,
>
> I'd like to get the same result of set() but getting an indexable
> object.
> How to get this in an efficient way?
>
> Example using set
>
> A = [1, 2, 2 ,2 , 3 ,4]
> B= set(A)
> B = ([1, 2, 3, 4])
>
> B[2]
> TypeError: unindexable object
>
> Many thanks, alex
> --
> http://mail.python.org/mailman/listinfo/python-list
>
Provided your list items are hashable, you could use a set to keep track of what
you've seen:

>>> A = [1, 2, 2 ,2 , 3 ,4]
>>> seen = set()
>>> B = []
>>> for item in A:
...     if item not in seen:
...         B.append(item)
...         seen.add(item)
...
>>> B
[1, 2, 3, 4]

And, if you really want, you can get the body of this into 1-line, noting that
seen.add returns None, so the expression (item in seen or seen.add(item))
evaluates to True if item is in seen, or None (and item is added to seen) if not.

>>> seen = set()
>>> B= [item for item in A if not (item in seen or seen.add(item))]
>>> B
[1, 2, 3, 4]
>>>
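
The same seen-set idea can also be wrapped in a reusable generator. This is only a sketch for Python 3; the function name unique is illustrative, and the items are assumed to be hashable.

```python
def unique(items):
    """Yield each hashable item once, in order of first appearance."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


A = [3, 1, 3, 2, 1, 4]
B = list(unique(A))
```

Because it is a generator, it also works on iterables that are too large to hold in memory at once, and list() is only needed when an indexable result is wanted.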

Michael



cdalten at gmail

Mar 20, 2009, 5:08 PM

Post #12 of 14 (3088 views)
Re: get rid of duplicate elements in list without set [In reply to]

On Mar 20, 8:34 am, Paul McGuire <pt...@austin.rr.com> wrote:
> On Mar 20, 9:54 am, "thomasvang...@gmail.com"
>
> <thomasvang...@gmail.com> wrote:
> > You could use:
> > B=list(set(A)).sort()
> > Hope that helps.
> > T
>
> That may hurt more than help, sort() only works in-place, and does
> *not* return the sorted list.  For that you want the global built-in
> sorted:

Okay, if sort() only works in place, then how come the following seems
to return a sorted list?

>>> f = [9,7,6,8]
>>> g=f
>>> g
[9, 7, 6, 8]
>>> g.sort()
>>> g
[6, 7, 8, 9]
>>> f
[6, 7, 8, 9]
>>>

I.e., when I sort g, f also seems to get sorted.
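
What the session above shows is aliasing rather than a return value, as this Python 3 sketch tries to illustrate (the names h and k are illustrative): g = f binds a second name to the same list object, so an in-place sort is visible through both names, while a copy sorts independently.

```python
f = [9, 7, 6, 8]
g = f                      # g is a second name for the same list object
returned = g.sort()        # sorts that one list in place, returns None
same_object = g is f       # both names still refer to the same list

h = [9, 7, 6, 8]
k = list(h)                # independent (shallow) copy of h
k.sort()                   # only the copy is sorted, h is unchanged
```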


castironpi at gmail

Mar 20, 2009, 5:12 PM

Post #13 of 14 (3085 views)
Re: get rid of duplicate elements in list without set [In reply to]

On Mar 20, 5:07 pm, Michael Spencer <m...@telcopartners.com> wrote:
> Alexzive wrote:
snip
> And, if you really want, you can get the body of this into 1-line, noting that
> seen.add returns None, so the expression (item in seen or seen.add(item))
> evaluates to True if item is in seen, or None (and item is added to seen) if not.
>
>   >>> seen = set()
>   >>> B=  [item for item in A if not (item in seen or seen.add(item))]
>   >>> B
>   [1, 2, 3, 4]

In your opinion, is '... or seen.add(item) is None' more or less
readable?

You might even want '... or ( lambda x: False )( seen.add( item ) )'.

Or: '... or seen.add(item) and False'.

This preserves order.


cdalten at gmail

Mar 20, 2009, 5:15 PM

Post #14 of 14 (3088 views)
Re: get rid of duplicate elements in list without set [In reply to]

On Mar 20, 5:08 pm, grocery_stocker <cdal...@gmail.com> wrote:
> On Mar 20, 8:34 am, Paul McGuire <pt...@austin.rr.com> wrote:
>
> > On Mar 20, 9:54 am, "thomasvang...@gmail.com"
>
> > <thomasvang...@gmail.com> wrote:
> > > You could use:
> > > B=list(set(A)).sort()
> > > Hope that helps.
> > > T
>
> > That may hurt more than help, sort() only works in-place, and does
> > *not* return the sorted list.  For that you want the global built-in
> > sorted:
>
> Okay,if sort() only works in-place, then how come the following seems
> to return a sorted list
>
>
>
> >>> f = [9,7,6,8]
> >>> g=f
> >>> g
> [9, 7, 6, 8]
> >>> g.sort()
> >>> g
> [6, 7, 8, 9]
> >>> f
> [6, 7, 8, 9]
>
> Ie, when I sort g, f also seems to get sorted.

Wait. Never mind.
