I want to use Go to download stock price spreadsheets from Yahoo Finance. I'll be making an HTTP request for every stock in its own goroutine. I have a list of around 2500 symbols, but instead of making 2500 requests in parallel, I'd prefer to make 250 at a time. In Java I'd create a thread pool and reuse threads as they become free. I was trying to find something similar, a goroutine pool, if you will, but was unable to find any resources. I'd appreciate it if someone could tell me how to accomplish this or point me to resources on it. Thanks!
The simplest way, I suppose, is to create 250 goroutines and pass them a channel, which the main goroutine uses to send links to the worker goroutines listening on it.
Once all links have been sent, you close the channel, and each goroutine exits after finishing its current job.
To keep the main goroutine from exiting before the workers have processed their data, you can use sync.WaitGroup.
Here is some code to illustrate what I described above (not a final working version, but it shows the point):
package main

import (
	"fmt"
	"sync"
)

func worker(linkChan <-chan string, wg *sync.WaitGroup) {
	// Decrement the WaitGroup counter as soon as this goroutine returns
	defer wg.Done()
	for url := range linkChan {
		// Analyze the value and do the job here; printing is only a placeholder
		fmt.Println("processing", url)
	}
}

func main() {
	// Placeholder data; replace with your own list of links
	yourLinksSlice := []string{"http://example.com/a", "http://example.com/b"}
	lCh := make(chan string)
	wg := new(sync.WaitGroup)
	// Register each worker with the WaitGroup and start it
	for i := 0; i < 250; i++ {
		wg.Add(1)
		go worker(lCh, wg)
	}
	// Hand each link to whichever goroutine is free to receive it
	for _, link := range yourLinksSlice {
		lCh <- link
	}
	// Close the channel so the range loops in the workers terminate
	close(lCh)
	// Wait for all workers to finish (otherwise they die when main returns)
	wg.Wait()
}
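If you also need the outcome of each download back in the main goroutine, one possible extension of the same pattern is a second channel for results. The following is only a sketch under assumptions: the names `result`, `fetchFunc`, `httpFetch`, and `fetchAll` are hypothetical and not part of the answer above, and the fetch step is passed in as a function so the pool can be tried without network access.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
)

// result pairs a URL with the outcome of fetching it (hypothetical type).
type result struct {
	url  string
	size int
	err  error
}

// fetchFunc is a hypothetical hook: it does the per-URL work.
type fetchFunc func(url string) (int, error)

// httpFetch downloads url and returns the body size in bytes.
func httpFetch(url string) (int, error) {
	resp, err := http.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	n, err := io.Copy(io.Discard, resp.Body)
	return int(n), err
}

// fetchAll runs at most `workers` fetches at a time and
// returns one result per URL, in completion order.
func fetchAll(urls []string, workers int, fetch fetchFunc) []result {
	jobs := make(chan string)
	results := make(chan result)
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range jobs {
				n, err := fetch(u)
				results <- result{url: u, size: n, err: err}
			}
		}()
	}

	// Feed jobs from a separate goroutine so results can be drained concurrently.
	go func() {
		for _, u := range urls {
			jobs <- u
		}
		close(jobs)
	}()

	// Close results once every worker has returned.
	go func() {
		wg.Wait()
		close(results)
	}()

	out := make([]result, 0, len(urls))
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	// Placeholder inputs and a stub fetch; pass httpFetch for real downloads.
	urls := []string{"first", "second", "third"}
	stub := func(u string) (int, error) { return len(u), nil }
	for _, r := range fetchAll(urls, 2, stub) {
		fmt.Println(r.url, r.size, r.err)
	}
}
```

Results arrive in whatever order the workers finish, so the output order is not guaranteed to match the input order. For the original task you would call something like `fetchAll(urls, 250, httpFetch)`.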