Reputation: 317
When running the algorithm on the string 'AEKEAAEKEAAEKEA$
' looking for the longest substring with at least 3 occurences all the nodes in the suffix tree have maximum 2 branches, how can that be?
The correct result should be the substring 'AEKEA
'.
You can easily see the tree in the online suffix tree builder
I just followed the Wikipedi description:
"The problem of finding the longest substring with at least k occurrences can be found by first preprocessing the tree to count the number of leaf descendants for each internal node, and then finding the deepest node with at least k descendants"
What am I missing here?
Thank you.
Upvotes: 1
Views: 2745
Reputation: 4216
I don't think that website is correct. When I run 'AEKEAAEKEAAEKEA' through my suffix tree, I get the following tree.
└── (0)
├── (27) $
├── (6) A
│ ├── (26) $
│ ├── (16) AEKEA
│ │ ├── (17) $
│ │ └── (7) AEKEA$
│ └── (18) EKEA
│ ├── (19) $
│ └── (8) AEKEA
│ ├── (9) $
│ └── (1) AEKEA$
├── (4) E
│ ├── (24) A
│ │ ├── (25) $
│ │ └── (14) AEKEA
│ │ ├── (15) $
│ │ └── (5) AEKEA$
│ └── (20) KEA
│ ├── (21) $
│ └── (10) AEKEA
│ ├── (11) $
│ └── (2) AEKEA$
└── (22) KEA
├── (23) $
└── (12) AEKEA
├── (13) $
└── (3) AEKEA$
As you can see from this branch, you've found your longest substring with 3 occurences.
└── (0)
├── (27) $
├── (6) A
│ ├── (26) $
│ ├── (16) AEKEA
│ │ ├── (17) $
│ │ └── (7) AEKEA$
│ └── (18) EKEA
│ ├── (19) $
│ └── (8) AEKEA
│ ├── (9) $
│ └── (1) AEKEA$
Upvotes: 3