Reputation: 1920
I have a text file from which I want to extract a section of text into two arrays: one for the ingredients, the other for the directions.
For the ingredients, I can do something like the code below, but I can't guarantee that it only picks up ingredient lines.
ingredients = []
list.each_line do |l|
ingredients << l if l =~ /\d\s?\w.*/
end
This is the text blob:
635860
581543
2011-03-21T13:50:10Z
Image:black bean soup.jpg|right|Mexican Black Bean Soup
== Ingredients ==
1lb black beans
2 tbsp extra-virgin olive oil
2 onions, large, diced
6 cloves garlic, minced
1 cup tomato, peeled, seeded, and chopped (fresh or canned)
1 sprig epazote, fresh or dried (optional)
1 tbsp chipotle pepper|chipotle chiles, canned, chopped (or ¼ tsp cayenne)
1 tsp cumin, ground
1 tsp coriander seed|coriander, ground
2 tsp salt
== Directions ==
Soak the black beans for 2 hours and drain.
In a deep pot, heat the olive oil over medium heat.
Add the onions and cook about 5 minutes.
Until translucent.
Add the black beans|beans, garlic, and 6 cups cold water.
Bring to a boil, skimming any foam that rises to the surface.
Reduce to a simmer.
In an hour or when the black beans|beans are soft, add the tomato, epazote, chipotle chile peppers|chile, cumin, coriander, and salt.
Continue cooking until the black beans|beans start to break down and the broth begins to thicken.
Taste for seasoning and add salt and pepper if needed.
If you’re serving this soup immediately, you may want to thicken it by puréeing a cup or two of the black beans|beans in a blender or food processor and then recombining them with the rest of the soup.
The soup will thicken on its own if refrigerated overnight.
Category:Black bean Recipes
Category:Chile pepper Recipes
Category:Chipotle pepper Recipes
Category:Epazote Recipes
Category:Mexican Soups
Category:Tomato Recipes
bx0ztz9xbf8qr9z4gwkad26u6q3hly3
Upvotes: 0
Views: 83
Reputation: 11035
What I would do here is, instead of trying to match data that you likely have no control over, match the data that you might have some control over. Specifically, it looks to me like the lines "== Ingredients ==", "== Directions ==", and "Category:Tomato Recipes" could be part of the file format rather than entered by the user. So I'd just split the text up whenever you see a line that looks like that:
sections = list.each_line.slice_before do |line|
line.match?(/\A(==|[a-zA-Z]+:)/)
end.entries
and then you can just assoc the data out of the groups:
puts sections.assoc("== Ingredients ==\n")
puts '---'
puts sections.assoc("== Directions ==\n")
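Since the question asks for two arrays, here is a minimal sketch of how the groups could be turned into them. It assumes the header lines look exactly like the ones in the sample blob; section_body is a hypothetical helper name, and the input is a shortened copy of the blob from the question.

```ruby
# Shortened sample input, assumed to follow the format shown in the question.
list = <<~TEXT
  == Ingredients ==
  1lb black beans
  2 tsp salt
  == Directions ==
  Soak the black beans for 2 hours and drain.
  Reduce to a simmer.
  Category:Tomato Recipes
TEXT

# Start a new group at every line that looks like a header or metadata.
sections = list.each_line.slice_before do |line|
  line.match?(/\A(==|[a-zA-Z]+:)/)
end.entries

# Hypothetical helper: find the group whose first line is the given header,
# then return its remaining lines without the header or trailing newlines.
def section_body(sections, header)
  group = sections.find { |lines| lines.first.strip == header }
  return [] if group.nil?
  group.drop(1).map(&:strip).reject(&:empty?)
end

ingredients = section_body(sections, "== Ingredients ==")
directions  = section_body(sections, "== Directions ==")
```

Comparing the stripped header instead of the raw line means a missing trailing newline (for example on the last line of the file) would not break the lookup.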
This still has some flaws: if the user enters something like "Note: Preheat oven first" as part of the directions, the pattern would treat that line as metadata and split the directions there. Still, it should be a large step forward, and it can be tweaked from here.
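One possible tweak, sketched under the assumption that the only metadata markers in the format are "== Title ==" headings and the Image: and Category: prefixes seen in the sample (the real format may have others), is to match those markers explicitly instead of any word followed by a colon:

```ruby
# Assumption: only "== Title ==" headings and Image:/Category: lines
# mark the start of a new section; everything else is user-entered text.
section_start = /\A(==\s*.+?\s*==\s*\z|(?:Image|Category):)/

text = <<~TEXT
  == Directions ==
  Note: Preheat oven first
  Reduce to a simmer.
  Category:Tomato Recipes
TEXT

sections = text.each_line.slice_before do |line|
  line.match?(section_start)
end.to_a

# The "Note:" line now stays inside the directions group.
directions = sections.assoc("== Directions ==\n")
```

The trade-off is that any new prefix in the file format has to be added to the pattern by hand.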
Upvotes: 1