I tried the Go Tour exercise #71
If it is run like go run 71_hang.go ok
, it works fine.
However, if you use go run 71_hang.go nogood
, it will run forever.
The only difference is the extra fmt.Print("")
in the default
in the select
statement.
I'm not sure, but I suspect some sort of infinite loop and race-condition? And here is my solution.
Note: It's not deadlock as Go didn't throw: all goroutines are asleep - deadlock!
package main
import (
"fmt"
"os"
)
type Fetcher interface {
// Fetch returns the body of URL and
// a slice of URLs found on that page.
Fetch(url string) (body string, urls []string, err error)
}
func crawl(todo Todo, fetcher Fetcher,
todoList chan Todo, done chan bool) {
body, urls, err := fetcher.Fetch(todo.url)
if err != nil {
fmt.Println(err)
} else {
fmt.Printf("found: %s %q\n", todo.url, body)
for _, u := range urls {
todoList <- Todo{u, todo.depth - 1}
}
}
done <- true
return
}
type Todo struct {
url string
depth int
}
// Crawl uses fetcher to recursively crawl
// pages starting with url, to a maximum of depth.
func Crawl(url string, depth int, fetcher Fetcher) {
visited := make(map[string]bool)
doneCrawling := make(chan bool, 100)
toDoList := make(chan Todo, 100)
toDoList <- Todo{url, depth}
crawling := 0
for {
select {
case todo := <-toDoList:
if todo.depth > 0 && !visited[todo.url] {
crawling++
visited[todo.url] = true
go crawl(todo, fetcher, toDoList, doneCrawling)
}
case <-doneCrawling:
crawling--
default:
if os.Args[1]=="ok" { // *
fmt.Print("")
}
if crawling == 0 {
goto END
}
}
}
END:
return
}
func main() {
Crawl("http://golang.org/", 4, fetcher)
}
// fakeFetcher is Fetcher that returns canned results.
type fakeFetcher map[string]*fakeResult
type fakeResult struct {
body string
urls []string
}
func (f *fakeFetcher) Fetch(url string) (string, []string, error) {
if res, ok := (*f)[url]; ok {
return res.body, res.urls, nil
}
return "", nil, fmt.Errorf("not found: %s", url)
}
// fetcher is a populated fakeFetcher.
var fetcher = &fakeFetcher{
"http://golang.org/": &fakeResult{
"The Go Programming Language",
[]string{
"http://golang.org/pkg/",
"http://golang.org/cmd/",
},
},
"http://golang.org/pkg/": &fakeResult{
"Packages",
[]string{
"http://golang.org/",
"http://golang.org/cmd/",
"http://golang.org/pkg/fmt/",
"http://golang.org/pkg/os/",
},
},
"http://golang.org/pkg/fmt/": &fakeResult{
"Package fmt",
[]string{
"http://golang.org/",
"http://golang.org/pkg/",
},
},
"http://golang.org/pkg/os/": &fakeResult{
"Package os",
[]string{
"http://golang.org/",
"http://golang.org/pkg/",
},
},
}
Putting a default
statement in your select
changes the way select works. Without a default statement select will block waiting for any messages on the channels. With a default statement select will run the default statement every time there is nothing to read from the channels. In your code I think this makes an infinite loop. Putting the fmt.Print
statement in is allowing the scheduler to schedule other goroutines.
If you change your code like this then it works properly, using select in a non blocking way which allows the other goroutines to run properly.
for {
select {
case todo := <-toDoList:
if todo.depth > 0 && !visited[todo.url] {
crawling++
visited[todo.url] = true
go crawl(todo, fetcher, toDoList, doneCrawling)
}
case <-doneCrawling:
crawling--
}
if crawling == 0 {
break
}
}
You can make your original code work if you use GOMAXPROCS=2 which is another hint that the scheduler is busy in an infinite loop.
Note that goroutines are co-operatively scheduled. What I don't fully understand about your problem is that select
is a point where the goroutine should yield - I hope someone else can explain why it isn't in your example.