nkmrtty/episode_mining

episode_mining

An implementation of WINEPI, a sequential pattern mining method for discovering frequent episodes in event sequences [1].

How to use

  1. Define the event sequence and the candidate episodes

    # Event sequence: (time, event type) pairs, sorted by time.
    sequence = sorted([
        (31, 'E'), (32, 'D'), (33, 'F'), (35, 'A'), (37, 'B'), (38, 'C'), (39, 'E'),
        (40, 'F'), (42, 'C'), (44, 'D'), (46, 'B'), (47, 'A'), (48, 'D'), (50, 'C'),
        (53, 'E'), (54, 'F'), (55, 'C'), (57, 'B'), (58, 'E'), (59, 'A'), (60, 'E'),
        (61, 'C'), (62, 'F'), (65, 'A'), (67, 'D'),
    ], key=lambda x: x[0])

    # Candidate episodes: each string is one episode, each character an event type.
    episodes = sorted(['A', 'B', 'C', 'D', 'E', 'F', 'AA', 'AB', 'EF', 'CD'])
  2. Initialize the WINEPI class

    >>> from episode_mining.winepi import WINEPI
    >>> w = WINEPI(sequence, episodes, 'parallel')
    # to mine serial episodes, pass 'serial' instead of 'parallel'
  3. Discover frequent (parallel) episodes

    # discover_frequent_episodes(t_s, t_e, win, min_fr)
    #    t_s    : start time of the target sequence
    #    t_e    : end time of the target sequence
    #    win    : window size
    #    min_fr : minimum frequency an episode must reach to be reported
    >>> w.discover_frequent_episodes(29, 68, 5, 0.1)
    [<ParallelEpisode: A / 0.46511627907>,
     <ParallelEpisode: B / 0.348837209302>,
     <ParallelEpisode: C / 0.558139534884>,
     <ParallelEpisode: D / 0.441860465116>,
     <ParallelEpisode: E / 0.511627906977>,
     <ParallelEpisode: F / 0.46511627907>,
     <ParallelEpisode: A B / 0.232558139535>,
     <ParallelEpisode: C D / 0.139534883721>,
     <ParallelEpisode: E F / 0.348837209302>]
  4. Generate rules

    # generate_rules(t_s, t_e, win, min_fr, min_conf)
    #    t_s      : start time of the target sequence
    #    t_e      : end time of the target sequence
    #    win      : window size
    #    min_fr   : minimum frequency an episode must reach to be considered
    #    min_conf : minimum confidence a rule must reach to be reported
    >>> w.generate_rules(29, 68, 5, 0.1, 0.1)
    [<Rule: A -> A B / 0.5>,
     <Rule: B -> A B / 0.666666666667>,
     <Rule: C -> C D / 0.25>,
     <Rule: D -> C D / 0.315789473684>,
     <Rule: E -> E F / 0.681818181818>,
     <Rule: F -> E F / 0.75>]
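
The numbers in steps 3 and 4 can be checked by hand. The following is a minimal sketch, not the package's own code. It assumes the window convention of [1]: every window [t, t + win) that overlaps [t_s, t_e) is counted, which gives t_e - t_s + win - 1 windows (43 for the call above). The frequency of a parallel episode is then the fraction of windows that contain all of its event types, and the confidence of a rule is the frequency of its right-hand side divided by the frequency of its left-hand side. The helper names window_starts, parallel_frequency and confidence are hypothetical and are not part of episode_mining.

```python
from collections import Counter
from fractions import Fraction

# Same (time, event type) pairs as in step 1.
sequence = sorted([
    (31, 'E'), (32, 'D'), (33, 'F'), (35, 'A'), (37, 'B'), (38, 'C'), (39, 'E'),
    (40, 'F'), (42, 'C'), (44, 'D'), (46, 'B'), (47, 'A'), (48, 'D'), (50, 'C'),
    (53, 'E'), (54, 'F'), (55, 'C'), (57, 'B'), (58, 'E'), (59, 'A'), (60, 'E'),
    (61, 'C'), (62, 'F'), (65, 'A'), (67, 'D'),
], key=lambda x: x[0])


def window_starts(t_s, t_e, win):
    # Assumed convention: all windows [t, t + win) overlapping [t_s, t_e).
    return range(t_s - win + 1, t_e)


def parallel_frequency(seq, episode, t_s, t_e, win):
    """Fraction of windows containing every event type of `episode`, in any order."""
    needed = Counter(episode)  # e.g. 'AA' needs two A events in one window
    starts = window_starts(t_s, t_e, win)
    hits = 0
    for t in starts:
        seen = Counter(ev for time, ev in seq
                       if t_s <= time < t_e and t <= time < t + win)
        if all(seen[ev] >= n for ev, n in needed.items()):
            hits += 1
    return Fraction(hits, len(starts))


def confidence(seq, lhs, rhs, t_s, t_e, win):
    """Confidence of the rule lhs -> rhs, taken as fr(rhs) / fr(lhs).

    Raises ZeroDivisionError if lhs never occurs in any window.
    """
    return (parallel_frequency(seq, rhs, t_s, t_e, win)
            / parallel_frequency(seq, lhs, t_s, t_e, win))


print(parallel_frequency(sequence, 'A', 29, 68, 5))   # 20/43, about 0.4651
print(parallel_frequency(sequence, 'AB', 29, 68, 5))  # 10/43, about 0.2326
print(confidence(sequence, 'A', 'AB', 29, 68, 5))     # 1/2
```

These values agree with the A, A B and A -> A B entries listed in steps 3 and 4. Exact fractions are used only to make the window counts visible; floats would work equally well.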

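The difference between 'parallel' and 'serial' mode in step 2 can be illustrated the same way. In [1], a serial episode occurs in a window only if its event types appear in the given order, whereas a parallel episode ignores order. The sketch below counts windows under that definition, again assuming windows of the form [t, t + win). It is an illustration only: the helper names are hypothetical, and the output of the package's own 'serial' mode has not been checked against it.

```python
# Same (time, event type) pairs as in step 1.
sequence = sorted([
    (31, 'E'), (32, 'D'), (33, 'F'), (35, 'A'), (37, 'B'), (38, 'C'), (39, 'E'),
    (40, 'F'), (42, 'C'), (44, 'D'), (46, 'B'), (47, 'A'), (48, 'D'), (50, 'C'),
    (53, 'E'), (54, 'F'), (55, 'C'), (57, 'B'), (58, 'E'), (59, 'A'), (60, 'E'),
    (61, 'C'), (62, 'F'), (65, 'A'), (67, 'D'),
], key=lambda x: x[0])


def occurs_serially(window_events, episode):
    """True if the event types of `episode` occur in order at strictly increasing times."""
    last_time = float('-inf')
    for wanted in episode:
        # Greedily take the earliest matching event after the previous match.
        times = [t for t, ev in window_events if ev == wanted and t > last_time]
        if not times:
            return False
        last_time = min(times)
    return True


def serial_window_count(seq, episode, t_s, t_e, win):
    """Return (windows containing the serial episode, total number of windows)."""
    starts = range(t_s - win + 1, t_e)  # assumed window convention, as in [1]
    hits = 0
    for t in starts:
        window_events = [(time, ev) for time, ev in seq
                         if t_s <= time < t_e and t <= time < t + win]
        if occurs_serially(window_events, episode):
            hits += 1
    return hits, len(starts)


print(serial_window_count(sequence, 'EF', 29, 68, 5))  # (14, 43)
print(serial_window_count(sequence, 'AB', 29, 68, 5))  # (3, 43)
```

For comparison, the parallel episodes E F and A B in step 3 occur in 15 and 10 of the 43 windows. The serial counts are lower because windows in which F precedes E, or B precedes A, no longer qualify.
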
TODO

  • Implement MINEPI method

Reference

  1. H. Mannila, H. Toivonen, and A. I. Verkamo, “Discovery of Frequent Episodes in Event Sequences,” Data Min. Knowl. Discov., vol. 1, no. 3, pp. 259–289, 1997.
