How does the Google Docs PDF viewer work?

Jeeva Subburaj · Jan 26, 2010 · Viewed 35.4k times

I am curious how the Google Docs PDF viewer works. It's not Flash like scribd.com; it looks like pure HTML. Any idea how they did it?

Sample link to view the PDF

Answer

Ben Everard · Jan 26, 2010

Google is simply serving up an image (right click -> save as), with an overlay to highlight text.

You should check out this SO question where others go into more detail.

You should also look through the source of your PDF link; it would appear Google is passing the PDF link through to be converted into an image.

Example:

<script type="text/javascript">
  var gviewElement = document.getElementById('gview');
  var config = {
    'api': false,
    'chrome': true,
    'csi': true,
    'ddUrl': "http://www.idfcmf.com/downloads/monthly_fund/2009/IDFC-Premier-Equityfund-jan10.pdf",
    'element': gviewElement,
    'embedded': false,
    'initialQuery': "",
    'oivUrl': "http://docs.google.com/viewer?url\x3dhttp%3A%2F%2Fwww.idfcmf.com%2Fdownloads%2Fmonthly_fund%2F2009%2FIDFC-Premier-Equityfund-jan10.pdf",
    'sdm': 200,
    'userAuthenticated': true
  };

  var gviewApp = _createGView(config);
  gviewApp.setProgress(50);

  window.jstiming.load.name = 'view';
  window.jstiming.load.tick('_dt');
</script>
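
The `oivUrl` value in the config above looks like the viewer endpoint followed by the percent-encoded PDF URL (`\x3d` is just a JavaScript escape for `=`). Here is a minimal sketch of building such a link, assuming the endpoint works the way that string suggests. `buildViewerUrl` is a hypothetical helper of mine, not something from Google's code:

```javascript
// Hypothetical helper: wraps a PDF URL in a viewer link of the same shape
// as the 'oivUrl' string seen in the page source. The endpoint is copied
// from that string; its behaviour is assumed, not documented here.
function buildViewerUrl(pdfUrl) {
  // encodeURIComponent turns ':' into %3A and '/' into %2F,
  // which matches the encoding visible in 'oivUrl'.
  return "http://docs.google.com/viewer?url=" + encodeURIComponent(pdfUrl);
}

var viewerUrl = buildViewerUrl(
  "http://www.idfcmf.com/downloads/monthly_fund/2009/IDFC-Premier-Equityfund-jan10.pdf"
);
console.log(viewerUrl);
```

Running this prints a string identical to the `oivUrl` above once the `\x3d` escape is read as `=`.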

Edit

Also, if you view the PDF viewer in Firefox with Firebug, you will notice that when you 'highlight' text it's really only enabling a load of divs. I'm guessing Google scans the document using OCR, detects where the text is, and provides a matrix of coordinates on which to base the div placement. When you click and drag, it interrogates the mouse pointer location to determine which divs to display.
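
To make that guess concrete, here is a rough sketch of the idea, not Google's actual code. It assumes the server sends one bounding box per word in page-image pixel coordinates. The `wordBoxes` data and the `boxesInDrag` and `highlightStyle` functions are all made up for illustration:

```javascript
// Hypothetical coordinate data: one box per detected word,
// measured in pixels relative to the page image.
var wordBoxes = [
  { text: "Google", x: 10, y: 10, w: 60, h: 14 },
  { text: "Docs",   x: 75, y: 10, w: 40, h: 14 },
  { text: "viewer", x: 10, y: 30, w: 55, h: 14 }
];

// Return the boxes that overlap the rectangle spanned by the
// mouse-down point (start) and the current pointer position (end).
function boxesInDrag(boxes, start, end) {
  var left = Math.min(start.x, end.x);
  var right = Math.max(start.x, end.x);
  var top = Math.min(start.y, end.y);
  var bottom = Math.max(start.y, end.y);
  return boxes.filter(function (b) {
    return b.x < right && b.x + b.w > left &&
           b.y < bottom && b.y + b.h > top;
  });
}

// Build the inline style for one absolutely positioned highlight div
// that would be layered over the page image.
function highlightStyle(b) {
  return "position:absolute;left:" + b.x + "px;top:" + b.y +
         "px;width:" + b.w + "px;height:" + b.h + "px";
}

// Simulate a drag from (5, 5) to (80, 20): it covers the first line only.
var selected = boxesInDrag(wordBoxes, { x: 5, y: 5 }, { x: 80, y: 20 });
console.log(selected.map(function (b) { return b.text; }));
console.log(highlightStyle(selected[0]));
```

In a browser you would call `boxesInDrag` from a `mousemove` handler and show or hide one div per returned box. A real viewer probably selects in reading order rather than by a plain rectangle, so treat the overlap test as a simplification.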