Gossamer Forum

Joining sets based on common elements

I have many sets that share elements among them, and I want to join all
sets sharing any given number of common elements. For example, my
input is a file written as:

Set1 A B C
Set2 D E A
Set3 B C F
Set4 G H D
Set5 I J K
Set6 K L M
Set7 N O P

Here Set1, Set2, Set3, Set4, Set5, Set6, Set7 are the 7 sets, and
A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P are the elements distributed across these
sets. You can see common elements across different sets.

The output I am looking for is as follows:

Set1 Set2 Set3 Set4 A B C D E F G H
Set5 Set6 I J K L M
Set7 N O P

How do I do it in Perl? If just one element were common to two
sets, e.g. Set5 and Set6 (with no further matches between the
elements of the combined set and yet another set), it would be simple.
However, in this case we first join Set1 with
Set2 based on the common element 'A', but then element 'B' is
common between Set1 and Set3, and the situation is further
complicated by the presence of element 'D' in Set4, so 'D' is
common between Set4 and the combination of Set1, Set2 and Set3. So it
builds up, and I am finding it difficult to conceptualise.

Could you please help at your earliest convenience ?


The input file can alternatively be written as follows (either way is fine, as it's
simple to transform):

Set1 A
Set1 B
Set1 C
Set2 D
Set2 E
Set2 A
Set3 B
Set3 C
Set3 F
Set4 G
Set4 H
Set4 D
Set5 I
Set5 J
Set5 K
Set6 K
Set6 L
Set6 M
Set7 N
Set7 O
Set7 P

Please let me know.

Thanks and Regards
Re: [le_faquir] Joining sets based on common elements
Quote:
If just one element were common to two
sets, e.g. Set5 and Set6 (with no further matches between the
elements of the combined set and yet another set), it would be simple.

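To rephrase the answer inherent in your question: you find your unions the same way as you would when comparing two sets, only repeatedly. Each time a set shares an element with a group you have already built, merge it into that group, and keep going until no two groups share an element. I.e., recursion/looping.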
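As a rough illustration of the looping idea, here is a minimal sketch. It assumes that a single shared element is enough to join two sets, and it uses a small union-find over the set names, which is one possible way to do the repeated merging and not necessarily what was meant above. The names merge_sets, %parent, %owner and %members are made up for this example. Because a set may span several lines, the same code handles both input layouts from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: merge sets that share an element, directly or
# through a chain of other sets. Each input line is "SetName elem [elem ...]".
sub merge_sets {
    my @lines = @_;
    my (%parent, %owner, %members, @order);

    # Find the representative set of a group (union-find with path halving).
    my $find = sub {
        my ($s) = @_;
        while ($parent{$s} ne $s) {
            $parent{$s} = $parent{ $parent{$s} };
            $s = $parent{$s};
        }
        return $s;
    };

    for my $line (@lines) {
        my ($set, @elems) = split ' ', $line;
        next unless defined $set;          # skip blank lines
        unless (exists $parent{$set}) {
            $parent{$set} = $set;          # a new set starts as its own group
            push @order, $set;
        }
        for my $e (@elems) {
            push @{ $members{$set} }, $e;
            if (exists $owner{$e}) {
                # Element seen before: join this set's group to the earlier one.
                my $r1 = $find->($owner{$e});
                my $r2 = $find->($set);
                $parent{$r2} = $r1 if $r1 ne $r2;
            }
            else {
                $owner{$e} = $set;         # remember the first set holding $e
            }
        }
    }

    # Collect set names and unique elements per group, in input order.
    my (%group, @roots);
    for my $set (@order) {
        my $root = $find->($set);
        push @roots, $root unless exists $group{$root};
        push @{ $group{$root}{sets} }, $set;
        for my $e (@{ $members{$set} || [] }) {
            push @{ $group{$root}{elems} }, $e
                unless $group{$root}{seen}{$e}++;
        }
    }
    return map {
        join ' ', @{ $group{$_}{sets} }, @{ $group{$_}{elems} || [] }
    } @roots;
}

# Sample data taken from the question.
my @input = (
    'Set1 A B C', 'Set2 D E A', 'Set3 B C F', 'Set4 G H D',
    'Set5 I J K', 'Set6 K L M', 'Set7 N O P',
);
print "$_\n" for merge_sets(@input);
```

To run it on a file instead of the inline sample, open the file and pass its lines in, e.g. merge_sets(<$fh>). Each element is listed once per merged group, in the order it was first seen.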