Reputation: 3
I am trying to extract the block of text that follows each [EventDate] tag in a given file. Each block begins with 1. and ends at a different move number; the two examples below end at 46. and 50. respectively. Each file contains many more than two such blocks.
The goal is to count how many moves each block has: in this case, 46 for the first block and 50 for the second. Once I have extracted the blocks, I can count the periods "." in each one to get its total number of moves.
[Event "9th Masters Final 2016"]
[Site "Bilbao ESP"]
[Date "2016.07.13"]
[Round "1.1"]
[White "Karjakin, Sergey"]
[Black "So, Wesley"]
[Result "1/2-1/2"]
[WhiteTitle "GM"]
[BlackTitle "GM"]
[WhiteElo "2773"]
[BlackElo "2770"]
[ECO "C65"]
[Opening "Ruy Lopez"]
[Variation "Berlin defence"]
[WhiteFideId "14109603"]
[BlackFideId "5202213"]
[EventDate "2016.07.13"]
1. e4 e5 2. Nf3 Nc6 3. Bb5 Nf6 4. d3 Bc5 5. c3 O-O 6. O-O d6 7. h3 Ne7 8. d4 Bb6
9. Bd3 Ng6 10. Re1 Re8 11. Nbd2 c6 12. Nf1 d5 13. Bg5 dxe4 14. Rxe4 h6 15. Bxf6
Qxf6 16. Re3 Bf5 17. Bxf5 Qxf5 18. Ng3 Qd7 19. Nxe5 Nxe5 20. Rxe5 Rxe5 21. dxe5
Qe7 22. Qh5 g6 23. Qe2 Qg5 24. Kf1 Kf8 25. Re1 Re8 26. Qd3 Rxe5 27. Qd6+ Re7 28.
Ne4 Qf5 29. Re2 Bc7 30. Qd4 Qe5 31. Qxa7 Qh2 32. Ng3 Bxg3 33. Rxe7 Qh1+ 34. Ke2
Kxe7 35. Qe3+ Kf6 36. Qxg3 Qb1 37. Qf4+ Kg7 38. Qd4+ Kg8 39. Qb4 Qxa2 40. Qxb7
Qc4+ 41. Ke3 Qc5+ 42. Kf3 Qd5+ 43. Kg3 Qg5+ 44. Kh2 Qf4+ 45. Kg1 Qc1+ 46. Kh2
Qf4+ 1/2-1/2
[Event "9th Masters Final 2016"]
[Site "Bilbao ESP"]
[Date "2016.07.13"]
[Round "1.2"]
[White "Carlsen, Magnus"]
[Black "Nakamura, Hikaru"]
[Result "0-1"]
[WhiteTitle "GM"]
[BlackTitle "GM"]
[WhiteElo "2855"]
[BlackElo "2787"]
[ECO "B20"]
[Opening "Sicilian"]
[Variation "Keres variation (2.Ne2)"]
[WhiteFideId "1503014"]
[BlackFideId "2016192"]
[EventDate "2016.07.13"]
1. e4 c5 2. Ne2 d6 3. Nbc3 a6 4. g3 g6 5. Bg2 Bg7 6. d4 cxd4 7. Nxd4 Nf6 8. O-O
O-O 9. b3 Nc6 10. Nxc6 bxc6 11. Bb2 Qa5 12. Na4 Bg4 13. Qe1 Qh5 14. f3 Bh3 15.
g4 Qh6 16. Rd1 g5 17. Bc1 Bxg2 18. Kxg2 Qg6 19. h4 gxh4 20. Qxh4 d5 21. g5 dxe4
22. f4 e6 23. c4 Rfd8 24. Rde1 Ne8 25. Nc5 Nd6 26. Qf2 f5 27. Bb2 Nf7 28. Bxg7
Kxg7 29. Qg3 Rd6 30. Rd1 Rad8 31. Rxd6 Rxd6 32. Qc3+ Kg8 33. Rf2 Qh5 34. Qh3 Qd1
35. Qe3 e5 36. Qg3 Rg6 37. Kh2 exf4 38. Qxf4 Qh5+ 39. Kg1 Qd1+ 40. Kh2 Qh5+ 41.
Kg1 Nxg5 42. Qb8+ Kg7 43. Qe5+ Kh6 44. Qf4 Qd1+ 45. Kh2 Qd4 46. b4 Kg7 47. Qc7+
Kh8 48. Qc8+ Rg8 49. Qxf5 Nf3+ 50. Kh3 Qd6 0-1
Upvotes: 0
Views: 37
Reputation: 599
The two sample games you have provided are in a standardized format known as PGN (Portable Game Notation). You can read more about it in the Wikipedia PGN article. This matters because a Python parser for PGN already exists in a library called pgnparser, which is listed on PyPI. If you're comfortable installing the pgnparser library, this task becomes quite simple. Installation is as simple as running pip install pgnparser, provided your Python installation is already set up with pip. I'll assume that you've installed the pgnparser library and saved your two example games in a file called example_games.pgn.
import pgn  # The pgnparser library.
with open('example_games.pgn') as f:
    games = pgn.loads(f.read())
print(games)
This prints out the games in the file:
[<PGNGame "Karjakin, Sergey" vs "So, Wesley">,
<PGNGame "Carlsen, Magnus" vs "Nakamura, Hikaru">]
To get the count of the moves in each game, just iterate over them:
for game in games:
    msg = "{} vs {}, {} moves."
    print(msg.format(game.white, game.black, len(game.moves)))
This prints out a nice summary of every game in the file:
Karjakin, Sergey vs So, Wesley, 93 moves.
Carlsen, Magnus vs Nakamura, Hikaru, 101 moves.
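Note that these totals count each player's move separately and appear to include the final result token as well (46 × 2 + 1 = 93 and 50 × 2 + 1 = 101), so len(game.moves) // 2 gives the 46 and 50 you are after. If your goal is to count the moves, this is an efficient, clean, and object-oriented way to do it.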
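If you would rather not install a library, here is a rough sketch of the approach described in the question, using only the standard library. It treats every line that starts with "[" as a tag line, joins the remaining lines of each game into one string, and counts the move-number markers ("1.", "2.", ...). The function name count_full_moves is made up for this example, and the sketch assumes plain movetext with no comments, variations, or annotations, so treat it as a starting point and not as a full PGN parser.

```python
import re

# Matches a move number such as "12." but not the "12..." form that some
# PGN files use to resume after a comment on Black's move.
MOVE_NUMBER = re.compile(r"\b\d+\.(?!\.)")


def count_full_moves(pgn_text):
    """Return a list with the number of full moves in each game.

    Hypothetical helper for illustration; assumes tag lines start with "["
    and that the movetext contains no comments or variations.
    """
    counts = []
    movetext = []  # movetext lines of the game currently being read

    def flush():
        # Join wrapped lines so a move number split from its move
        # (e.g. "28." at the end of a line) is still counted once.
        if movetext:
            counts.append(len(MOVE_NUMBER.findall(" ".join(movetext))))
            movetext.clear()

    for line in pgn_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            flush()  # a tag line after movetext means a new game begins
        elif line:
            movetext.append(line)
    flush()  # the last game has no tag line after it
    return counts


# Openings of the two games from the question, truncated for brevity;
# "*" is the PGN marker for an unfinished game.
sample = """[Event "9th Masters Final 2016"]
[EventDate "2016.07.13"]
1. e4 e5 2. Nf3 Nc6 3.
Bb5 Nf6 *
[Event "9th Masters Final 2016"]
[EventDate "2016.07.13"]
1. e4 c5 2. Ne2 d6 *
"""

print(count_full_moves(sample))  # prints [3, 2]
```

For a real file, pass the result of reading it, for example count_full_moves(open('example_games.pgn').read()). Counting move-number markers instead of every period avoids miscounting if a period ever shows up elsewhere in the movetext.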
Upvotes: 1