[Toy] 束構造で遊ぶ(グラフを描く)


リハビリ中のトイプログラム等です。
飽和アイテム集合の列挙にはNII宇野先生の公開プログラムLCM ver.3を使っています。 詳しくは 公開プログラムを見てください。

束構造の辺を計算する

  • もっと高速に求めたいのですが、あまり真面目に考えたことがないです。
input_file = "../LCM30/test.dat"
output_file = "./output.dat"
min_support = 1
cmd = "../LCM30/fim_closed {0} {1} output.dat".format(input_file, min_support)
subprocess.call( cmd.strip().split(" ") )

lo_closed_itemsets = []

with open(output_file, "r") as f:
    for line in f:
        line = line.strip()
        llist = line.split(" ")
        if len(llist) > 1:
            int_set = set(map(int, llist[:-1]))
            lo_closed_itemsets.append( int_set )

transaction_database = []
with open(input_file, "r") as f:
    for line in f:
        line = line.strip()
        llist = line.split(",")
        int_set = set( map(int, llist) )
        transaction_database.append(int_set)

lo_index = []
for i, itemset_i in enumerate(lo_closed_itemsets):
    id_list = []
    for j, itemset_j in enumerate(transaction_database):
        if itemset_i <= itemset_j:
            id_list.append(j)
    lo_index.append(set(id_list))

# build edges
from collections import defaultdict
edge_map = {}
NC = len(lo_closed_itemsets)

# print
for i in range(min(10, NC)):
    X, Y = lo_index[i], lo_closed_itemsets[i]
    print(i, X, Y)

# make edge
for i in range(NC):
    X, Y = lo_index[i], lo_closed_itemsets[i]
    for j in range(i+1, NC):
        X2, Y2 = lo_index[j], lo_closed_itemsets[j]
        if Y <= Y2:
            if i not in edge_map:
                edge_map[i] = set({j})
            else:
                edge_map[i].add(j)

for key in edge_map:
    print(key, edge_map[key])

edge_direct_inv_map = {}
for key in edge_direct_map:
    for v in edge_direct_map[key]:
        if v not in edge_direct_inv_map:
            edge_direct_inv_map[v] = set({key})
        else:
            edge_direct_inv_map[v].add(key)

def set2str(S):
    if len(S) == 0:
        return ""
    else:
        s = ",".join(map(str, S))
        return s

# 出力
edge_output_file = "output_edge.dat"
with open(edge_output_file, "w") as f:
    for cid in range(NC):
        str_from = set2str(edge_direct_inv_map[cid])
        str_to = set2str(edge_direct_map[cid])
        f.write("{0}#{1}#{2}\n".format(cid, str_from, str_to))
  • 出力はこんな感じになります。

0#-2#2
1#-2#2,3
2#0,1#4,5
3#1#4
4#2,3#-1
5#2#-1

束構造を描いてみる

PythonのGraphviz用パッケージを利用してみました。

from graphviz import Digraph
G = Digraph(format="png")
G.attr("node", shape="circle")

for i in range(NC):
    G.node(str(i), str(i))
G.node(str(-1), str(-1))
G.node(str(-2), str(-2))
    
for i in range(NC):
    for j in edge_map[i]:
        G.edge(str(j), str(i))
    for j in edge_inv_map[i]:
        if j == -1:
            G.edge(str(i), str(j))
  • かなり手抜きですが、とりあえず描けた(-1と-2を上下逆にするように重みを設定すればいいはずだけど、とりあえず放置)

束(手抜き)

後日に続きます(続かないかもしれない)