Reputation: 170745
I want a representation for strings with fast concatenation and editing operations. I have read the paper "Ropes: an Alternative to Strings", but have there been any significant improvements in this area since 1995?
EDIT: One possibility I've considered before is using a 2-3 finger tree with strings as leaves, but I have not done a detailed analysis of this; this gives amortized constant-time addition/deletion on ends and logarithmic (in the number of chunks of the smaller string) concatenation, as opposed to vice versa for ropes.
Upvotes: 5
Views: 1200
Reputation:
This is an old question! I wonder if anyone reads this. But still it's intrigueing. In your comments, you say you look for:
Faster asymptotics, or constant factors, or less memory use
Well, ropes have O(1) insertion, and O(n) iteration. You can't do better than that. Substrings and indexing is obviously going to be more costly. But most use cases for large documents don't require editing or random access. If you only concatenate at the end, a 1D vector/list of strings could improve the insertion time constant. I used to use this in JavaScript because it had such slow string concatentation.
It is said that memory representation is less efficient than using strings.
I doubt that: If you work in a language that has garbage collection, the rope allows you to use the same string fragment instance in multiple places. In a rope that represents a HTML document, there will be many DIV
's, SPAN
's and LINK
elements. This might even happen automatically assuming these tags are compile time constants, and you add them to the rope directly. Even for such short phrases, the rope document will reduce in size significantly, to the same order of magnitude as the original string. Longer strings will produce a net gain.
If you also make the tree elemenst read only, you can create subropes (longer phrases expressed as ropes), that occur multiple times or are shared across rope based strings. The downside of this sharing is that such shard rope sections can't be changed: to edit them, or to balance the tree you need to copy the object graph. But that does not matter if you mostly concatenate and iterate. In a web server, you can keep a subrope that repesents the CSS stylesheet declaration that is shared across all HTML documents served by that server.
Upvotes: 1