Reputation: 123
I want to write a function that goes through the words in a file and finds the position of each word. I want the output to be:
a, position: 0
aah, position: 1
aahed, position: 2
I already tried this to count the words, but I couldn't use it to get the positions of the words:
scanner := bufio.NewScanner(strings.NewReader(input))
// Set the split function for the scanning operation.
scanner.Split(bufio.ScanWords)
// Count the words.
count := 0
for scanner.Scan() {
	count++
}
if err := scanner.Err(); err != nil {
	fmt.Fprintln(os.Stderr, "reading input:", err)
}
fmt.Printf("%d\n", count)
Is it possible to do this with a for loop? I would like to index words by position, for example word[position] == word[position+1], to find out whether the word at a given position is the same as the word at the next position.
Upvotes: 0
Views: 2516
Reputation: 251
Imagine having a testfile.txt:
this is fine
You can use this Go program to loop over each word and print the word with its current position:
package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	// initiate file-handle to read from
	fileHandle, err := os.Open("testfile.txt")
	// check if file-handle was initiated correctly
	if err != nil {
		panic(err)
	}
	// make sure to close file-handle upon return
	defer fileHandle.Close()

	// initiate scanner from file handle
	fileScanner := bufio.NewScanner(fileHandle)
	// tell the scanner to split by words
	fileScanner.Split(bufio.ScanWords)

	// initiate counter
	count := 0
	// for looping through results
	for fileScanner.Scan() {
		fmt.Printf("word: '%s' - position: '%d'\n", fileScanner.Text(), count)
		count++
	}
	// check if there was an error while reading words from file
	if err := fileScanner.Err(); err != nil {
		panic(err)
	}

	// print total word count
	fmt.Printf("total word count: '%d'", count)
}
Output:
$ go run main.go
word: 'this' - position: '0'
word: 'is' - position: '1'
word: 'fine' - position: '2'
total word count: '3'
If you want to compare the words by index you could load them into a slice first.
Imagine testfile.txt now contains:
fine this is fine
Use this code:
package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	// initiate file-handle to read from
	fileHandle, err := os.Open("testfile.txt")
	// check if file-handle was initiated correctly
	if err != nil {
		panic(err)
	}
	// make sure to close file-handle upon return
	defer fileHandle.Close()

	// initiate scanner from file handle
	fileScanner := bufio.NewScanner(fileHandle)
	// tell the scanner to split by words
	fileScanner.Split(bufio.ScanWords)

	// initiate wordSlice
	var wordSlice []string
	// for looping through results
	for fileScanner.Scan() {
		wordSlice = append(wordSlice, fileScanner.Text())
	}
	// check if there was an error while reading words from file
	if err := fileScanner.Err(); err != nil {
		panic(err)
	}

	// loop through word slice and print word with index
	for i, w := range wordSlice {
		fmt.Printf("word: '%s' - position: '%d'\n", w, i)
	}

	// compare words by index
	firstWordPos := 0
	equalsWordPos := 3
	// guard against files with fewer than four words before indexing
	if equalsWordPos < len(wordSlice) && wordSlice[firstWordPos] == wordSlice[equalsWordPos] {
		fmt.Printf("word at position '%d' and '%d' is equal: '%s'\n", firstWordPos, equalsWordPos, wordSlice[firstWordPos])
	}

	// print total word count
	fmt.Printf("total word count: '%d'", len(wordSlice))
}
Output:
$ go run main.go
word: 'fine' - position: '0'
word: 'this' - position: '1'
word: 'is' - position: '2'
word: 'fine' - position: '3'
word at position '0' and '3' is equal: 'fine'
total word count: '4'
Upvotes: 1
Reputation: 10136
You can read the input one character at a time, which gives you full control over the data you output. In Go, characters (Unicode code points) are called runes:
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"strings"
)

func main() {
	b, err := os.ReadFile("test.txt")
	if err != nil {
		panic(err)
	}
	reader := bytes.NewReader(b)
	// word is a temporary buffer that collects the characters of the current word.
	word := strings.Builder{}
	wordPos := 0 // position in the line where the current word starts
	line := 0    // current line number
	pos := 0     // position of the current character in the line
	for {
		// Read next character
		if r, _, err := reader.ReadRune(); err != nil {
			if err == io.EOF {
				// Output last word if this is end of file
				fmt.Println(word.String(), "line:", line, "position:", wordPos)
				break
			} else {
				panic(err)
			}
		} else {
			// If current character is a new line, output the word and reset position counters and word buffer.
			if r == '\n' {
				fmt.Println(word.String(), "line:", line, "position:", wordPos)
				word.Reset()
				pos = 0
				wordPos = 0
				line++
			} else if r == ' ' { // Found word separator: output word, reset word buffer and set next word position.
				fmt.Println(word.String(), "line:", line, "position:", wordPos)
				word.Reset()
				wordPos = pos + 1
				pos++
			} else { // Just a regular character: write it to word buffer.
				word.WriteRune(r)
				pos++
			}
		}
	}
}
I use strings.Builder to avoid unnecessary string copying.
You will also need to adjust this example to handle edge cases such as empty lines, and possibly others.
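One possible way to handle those edge cases is sketched below. It is only a sketch under a few assumptions: the input is already in memory as a string, any Unicode whitespace counts as a separator, and the names wordPos and wordPositions are invented for this example. Empty words are skipped, so blank lines, repeated spaces and a trailing newline produce no output:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// wordPos (hypothetical type) records a word together with its line number
// and the rune offset of its first character within that line, both zero-based.
type wordPos struct {
	Word string
	Line int
	Col  int
}

// wordPositions (hypothetical helper) walks the input one rune at a time
// and only records non-empty words.
func wordPositions(input string) []wordPos {
	var (
		result  []wordPos
		word    strings.Builder
		line    int
		col     int // rune offset of the current rune within the line
		wordCol int // rune offset where the current word started
	)
	// flush stores the buffered word, if there is one, and clears the buffer.
	flush := func() {
		if word.Len() > 0 {
			result = append(result, wordPos{word.String(), line, wordCol})
			word.Reset()
		}
	}
	for _, r := range input {
		switch {
		case r == '\n':
			// End of line: store the word before moving to the next line.
			flush()
			line++
			col = 0
		case unicode.IsSpace(r):
			// Any other whitespace separates words but stays on the line.
			flush()
			col++
		default:
			// Regular character: remember where a new word begins.
			if word.Len() == 0 {
				wordCol = col
			}
			word.WriteRune(r)
			col++
		}
	}
	flush() // last word when the input does not end in whitespace
	return result
}

func main() {
	for _, w := range wordPositions("this  is\n\nfine\n") {
		fmt.Println(w.Word, "line:", w.Line, "position:", w.Col)
	}
}
```

For the sample input (a double space, a blank line and a trailing newline) this prints "this" at line 0 position 0, "is" at line 0 position 6, and "fine" at line 2 position 0, with no empty entries in between.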
Upvotes: 2