Reputation: 293
Thanks in advance for the help!
Here's the situation:
(string-capitalize str)
(format nil "~@(~A~)" str)
Thoughts?
Upvotes: 1
Views: 69
Reputation: 9282
As others have pointed out, there is no general solution to this in any language that does not involve some hairy library. CL is no exception.
But the format trick does, in fact, work well enough in simple cases. Although the spec is not completely clear on this, I am pretty sure that the string capitalisation options (variations on ~( ... ~)) use the same definition of 'word' that string-capitalize does:
For the purposes of string-capitalize, a 'word' is defined to be a consecutive subsequence consisting of alphanumeric characters, delimited at each end either by a non-alphanumeric character or by an end of the string.
(From string-capitalize)
This means that, for instance, (format nil "~@(~A~)" "i'm") will treat the string "i'm" as two words, "i" and "m", and capitalize only the first, resulting in "I'm". And indeed it does:
> (format nil "~@(~A~)" "i'm")
"I'm"
Assuming your implementation's Unicode support is competent, this will also work for non-ASCII characters:
(let ((sentence '("štar" "means" "four" "in" "some" "Romani" "dialects")))
  (format nil "~@(~A~)~{ ~A~}" (first sentence) (rest sentence)))
"Štar means four in some Romani dialects"
Upvotes: 3
Reputation: 52579
It's not standard, and it locks you into one implementation, but SBCL's sb-unicode package has a titlecase function that capitalizes each word in its argument, using Unicode rules to figure out the word and character breaks instead of string-capitalize's rules about what words are.
CL-USER> (use-package :sb-unicode)
T
CL-USER> (sb-unicode:titlecase "I'M")
"I'm"
You can also use the sb-unicode:words function to break a sentence up into component words more robustly than just doing things like splitting on whitespace:
(ql:quickload :str :silent t) ; for str:join
(use-package :sb-unicode)
(defun capitalize-sentence (string)
  "Capitalize the first word of `string` and lowercase the rest."
  (let ((words (sb-unicode:words string)))
    (if words
        (str:join "" (cons (sb-unicode:titlecase (car words))
                           (mapcar #'sb-unicode:lowercase (cdr words))))
        string)))
Upvotes: 3