Get links from a Google search in C#

gilibi · Mar 3, 2011 · Viewed 8k times

I'm trying to write a simple Google search in C# that runs a query of my choice and retrieves the first 50 links. After searching thoroughly for a similar tool or a suitable API, I realized that most of them are obsolete. My first attempt was to create a simple HttpWebRequest and scan the received WebResponse for "href=", which turned out to be unrewarding (too many redundant matches) and very frustrating. I do have access to a Google API, but I'm not sure how to use it for this purpose, although I know there is a limit of 1,000 queries per day.



MarkKiessling · Jan 4, 2015

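Here is working code. You will need to add a form with a few simple controls: a text box named txtKeyWords, a button named btn1, and a list box named listBox1. The code uses HtmlAgilityPack to parse the results page.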

using HtmlAgilityPack;
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.IO;
using System.Linq;
using System.Net;
using System.ServiceModel.Syndication;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using System.Xml;

namespace Search
{
    public partial class Form1 : Form
    {
        // load snippet
        HtmlAgilityPack.HtmlDocument htmlSnippet = new HtmlAgilityPack.HtmlDocument();

        public Form1()
        {
            InitializeComponent();
        }

        private void btn1_Click(object sender, EventArgs e)
        {
            // NOTE: the URL prefix is missing from the code as posted; a Google
            // search URL of this form is assumed here.
            string searchUrl = "https://www.google.com/search?q=" + Uri.EscapeDataString(txtKeyWords.Text.Trim());

            // Download the raw HTML of the results page in 8 KB chunks.
            StringBuilder sb = new StringBuilder();
            byte[] resultsBuffer = new byte[8192];
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(searchUrl);
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (Stream resStream = response.GetResponseStream())
            {
                int count;
                do
                {
                    count = resStream.Read(resultsBuffer, 0, resultsBuffer.Length);
                    if (count != 0)
                    {
                        sb.Append(Encoding.ASCII.GetString(resultsBuffer, 0, count));
                    }
                }
                while (count > 0);
            }

            // Parse the downloaded page with HtmlAgilityPack.
            HtmlAgilityPack.HtmlDocument html = new HtmlAgilityPack.HtmlDocument();
            html.OptionOutputAsXml = true;
            html.LoadHtml(sb.ToString());
            HtmlNode doc = html.DocumentNode;

            // SelectNodes returns null when no node matches the XPath.
            HtmlNodeCollection links = doc.SelectNodes("//a[@href]");
            if (links == null)
            {
                return;
            }

            foreach (HtmlNode link in links)
            {
                string hrefValue = link.GetAttributeValue("href", string.Empty);
                // Keep only result links of the form /url?q=http://...&... and skip
                // Google's own links. Note that https:// targets do not pass this check.
                if (!hrefValue.ToUpper().Contains("GOOGLE") && hrefValue.Contains("/url?q=") && hrefValue.ToUpper().Contains("HTTP://"))
                {
                    int index = hrefValue.IndexOf("&");
                    if (index > 0)
                    {
                        // Cut off the tracking parameters, then strip the redirect prefix.
                        hrefValue = hrefValue.Substring(0, index);
                        listBox1.Items.Add(hrefValue.Replace("/url?q=", ""));
                    }
                }
            }
        }
    }
}
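
The link clean-up step in the handler (cut at the first "&", then strip "/url?q=") can also be pulled into a small standalone function, which makes it easier to test without a form or a network call. The sketch below is only an illustration, not part of the code above: the class name GoogleHref and the method ExtractTarget are hypothetical, and it assumes result links still have the "/url?q=<target>&..." shape that the code above relies on. Unlike the inline check, it also accepts https:// targets and decodes percent-encoded characters.

```csharp
using System;

public static class GoogleHref
{
    // Hypothetical helper: returns the target URL from an href such as
    // "/url?q=http://example.com/page&sa=U", or null when the href does
    // not have that shape or the target is not an http(s) URL.
    public static string ExtractTarget(string href)
    {
        const string prefix = "/url?q=";
        if (string.IsNullOrEmpty(href) || !href.StartsWith(prefix, StringComparison.Ordinal))
        {
            return null;
        }

        // Everything after the prefix, up to the first '&', is the target.
        string rest = href.Substring(prefix.Length);
        int amp = rest.IndexOf('&');
        string encoded = amp >= 0 ? rest.Substring(0, amp) : rest;

        // The q parameter may be percent-encoded, so decode it.
        string target = Uri.UnescapeDataString(encoded);

        bool isHttp = target.StartsWith("http://", StringComparison.OrdinalIgnoreCase)
                   || target.StartsWith("https://", StringComparison.OrdinalIgnoreCase);
        return isHttp ? target : null;
    }
}
```

Inside the foreach loop, the handler could then call GoogleHref.ExtractTarget(hrefValue) and add the result to listBox1 only when it is not null.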