Reputation: 1
I have a character vector that looks like this:
x= c("Clause 1 - AGREEMENT. Buyer agrees to buy, and Seller agrees to sell, the Property described below on the terms and conditions set forth in this contract.",
"Clause 2 - Buyer. Buyer, will take title to the Property described below:",
"Item 2.1 - Seller. Seller, is the current owner of the Property described below assignable by Buyer without Seller’s prior written consent.",
"Clause 3 - Inclusions. The Purchase Price includes the following items: ",
"Item 3.1 - Fixtures. If attached to the Property on the date of this Contract, the following items are included unless")
I am trying to merge each Item into the Clause it belongs to. Basically, I want it to do this
x[grep("Clause . - ", x)]= c(x[1], paste(x[2], x[3]), paste(x[4], x[5]))
and this
x= x[grep("Clause . - ", x)]
but dynamically. How can I do it without specifying the elements I want to combine? Thank you all.
Upvotes: 0
Views: 114
Reputation: 1
I solved my problem adapting the answer provided by Zelazny. With the data:
> x= c("Clause 1 - AGREEMENT. Buyer agrees to buy",
"Item 1.2 - Seller agrees to sell",
"Item 1.2 - the Property described below",
"Item 1.3 - on the terms and conditions set forth in this contract",
"Item 1.4 - If attached to the Property on the date of this Contract",
"Item 1.5 - the following items are included:",
"I - property",
"II - car",
"III - motorcycle",
"Clause 2 - Buyer, will take title to the Property described below:",
"Item 2.1 - Seller. Seller, is the current owner of the Property",
"I - this is binding contract",
"Item 2.2 - by Buyer without Seller’s prior written consent.",
"Clause 3 - The Purchase Price includes the following items",
"Clause 4 - property will be transmited",
"Clause 5 - as discribed in",
"Each party is signing this agreement on the date stated opposite that party’s signature.",
"city, date")
First find the items that are clauses:
> f= grep("Clause . - ", x)
> f
[1] 1 10 14 15 16
Then loop over the clause positions and, for each one, repeat its index once for every non-clause line that follows it:
> nums= f
> for (i in seq_len(length(f) - 1)) {  # 1:length(f)-1 would start at 0
+   a= f[i+1]-f[i]-1  # number of non-clause lines after clause i
+   nums= c(nums, rep(f[i], times= a))
+ }
> sort(nums)
[1] 1 1 1 1 1 1 1 1 1 10 10 10 10 14 15 16
Add all the numbers after the last clause:
> nums= sort(c(nums, (1+f[length(f)]):length(x)))
> nums
[1] 1 1 1 1 1 1 1 1 1 10 10 10 10 14 15 16 17 18
And finally group the items in the clause:
> grouped <- tapply(x, nums, paste, collapse='\n')
> cat(grouped[1])
Clause 1 - AGREEMENT. Buyer agrees to buy
Item 1.2 - Seller agrees to sell
Item 1.2 - the Property described below
Item 1.3 - on the terms and conditions set forth in this contract
Item 1.4 - If attached to the Property on the date of this Contract
Item 1.5 - the following items are included:
I - property
II - car
III - motorcycle
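For comparison, the same kind of grouping can be sketched without a loop by taking a running count of the clause headers. This is only a minimal sketch on a shortened version of the data: the pattern "^Clause \\d+ - " is an assumption about how clause lines look, and unlike the loop above it attaches trailing lines such as "city, date" to the last clause instead of keeping them separate.

```r
# Shortened sample in the same shape as the data above (illustrative only).
x <- c("Clause 1 - AGREEMENT. Buyer agrees to buy",
       "Item 1.2 - Seller agrees to sell",
       "I - property",
       "Clause 2 - Buyer, will take title",
       "Item 2.1 - Seller is the current owner",
       "Clause 3 - The Purchase Price includes the following items",
       "city, date")

# TRUE where a line starts a new clause; \\d+ also covers "Clause 10".
is_clause <- grepl("^Clause \\d+ - ", x)

# Running count of clause headers: each line gets the index of the
# most recent clause at or above it (0 for lines before the first clause).
group <- cumsum(is_clause)

# Paste every group into one string, one original line per row.
grouped <- tapply(x, group, paste, collapse = "\n")
cat(grouped[2])
```

If the trailing lines should stay separate, they would need their own group numbers, as in the step that appends the positions after the last clause.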
Upvotes: 0
Reputation: 40628
First strip out just the numbers:
> nums <- gsub("^..* (\\d+\\.*\\d*) -..*$", "\\1", x, perl = TRUE)
> nums
[1] "1" "2" "2.1" "3" "3.1"
Group them by dropping the decimal place:
> nums <- as.integer(nums)
> nums
[1] 1 2 2 3 3
Loop over these groupings and paste them together:
> grouped <- tapply(x, nums, paste, collapse='\n')
> cat(grouped[2])
Clause 2 - Buyer. Buyer, will take title to the Property described below:
Item 2.1 - Seller. Seller, is the current owner of the Property described below assignable by Buyer without Seller’s prior written consent.
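One caveat: the pattern assumes every line carries a number before " - ". A line such as "I - property" does not match, so gsub returns it unchanged and as.integer turns it into NA. A minimal sketch of one way to cope, assuming unnumbered lines belong to the most recent numbered line (the sample data, the slightly simplified pattern, and the carry-forward loop are illustrative additions, not part of the steps above):

```r
x <- c("Clause 1 - AGREEMENT. Buyer agrees to buy",
       "Item 1.1 - Seller agrees to sell",
       "I - property",
       "Clause 2 - Buyer, will take title",
       "Item 2.1 - Seller is the current owner")

pattern <- "^.+ (\\d+\\.*\\d*) - .*$"

# Extract the number only where the pattern matches; use NA elsewhere.
nums <- ifelse(grepl(pattern, x, perl = TRUE),
               sub(pattern, "\\1", x, perl = TRUE),
               NA_character_)
nums <- as.integer(nums)  # "1.1" becomes 1, NA stays NA

# Carry the last seen group number forward over the NA entries.
for (i in seq_along(nums)) {
  if (is.na(nums[i]) && i > 1) nums[i] <- nums[i - 1]
}

grouped <- tapply(x, nums, paste, collapse = "\n")
cat(grouped[1])
```

Lines that appear before the first numbered line would still be NA and are dropped by tapply, so they need separate handling if they matter.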
Upvotes: 1