Google Books - Hints and Tips



Google Books is already a very useful resource, but in some respects it is still not very user-friendly. I have tried to collect here a few useful hints and tips for finding and using material.

[Thanks to Al Magary for generously providing some of the information below.]

Contents

The Google Books database contains three kinds of books:

  1. Books considered to be out of copyright, or "public domain".
  2. Books scanned in libraries but still in copyright.
  3. Books submitted by publishers.

(Note that Google applies different copyright criteria according to which country the visitor is in. Many books accessible to visitors in the U.S.A. are not accessible to those elsewhere. See "Problems accessing material outside the U.S.A." below.)

The texts of all these books can be searched, but how much is shown depends on the status of the book. Out-of-copyright books can be viewed in their entirety. For books submitted by publishers, a page or two either side of each search result can be viewed (this may require the visitor to register before viewing the pages). For other books, only "snippets" at most - typically two or three lines containing each search result - can be seen.

Note that although this is (I think) how the system is meant to work, there are numerous anomalies. Many books do not appear to be accessible in any form, although they appear in lists of search results, and some books can be viewed only as "snippets" even though they are clearly out of copyright.

Searching

Google allows searching of the texts of books in its database, using either a basic or an advanced search form.

Currently the advanced search form, in addition to the usual Google search options, allows the visitor to supply search terms for the title, author and/or publisher, together with a range of publication dates. For recent works, the ISBN can also be specified.
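The same kind of fielded search can also be written as a single query string. The sketch below is only an illustration and rests on assumptions: it uses the intitle:, inauthor:, inpublisher: and isbn: operators documented for the Google Books API, and it builds a URL for that API's volumes endpoint rather than for the advanced search form itself, whose own parameters may differ. The function names are made up for this example, and the date range offered by the form is left out because I am not sure how to express it this way.

```python
from urllib.parse import urlencode


def build_books_query(text="", title="", author="", publisher="", isbn=""):
    """Combine free text and fielded terms into one query string.

    Each word of a fielded value gets its own operator prefix, so
    a two-word title becomes two intitle: terms.
    """
    parts = []
    if text:
        parts.append(text)
    for operator, value in (
        ("intitle", title),
        ("inauthor", author),
        ("inpublisher", publisher),
        ("isbn", isbn),
    ):
        parts.extend(f"{operator}:{word}" for word in value.split())
    return " ".join(parts)


def build_search_url(query):
    """Return a Google Books API search URL for the query (no request is made)."""
    return "https://www.googleapis.com/books/v1/volumes?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_books_query(text="manor", title="Feudal Aids", author="Maxwell")
    print(query)
    print(build_search_url(query))
```

Because the author, title and date details in the database are often unreliable, it may be worth starting with fewer fielded terms and narrowing down only if too many results come back.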

It's worth bearing in mind that for older books the text that is being searched has been arrived at using O.C.R. (optical character recognition) software, and may be inaccurate if the quality of the scanned images is poor. It's also important to be aware that the details of author, title and date in Google's database are frequently inaccurate. In particular, for multi-volume works there is often no information about which volume is concerned, and the given date of publication may refer to the series as a whole. For medieval texts, Google usually credits the original author of the manuscript rather than the modern editor.

Navigating

Once a book of interest has been found, Google allows the visitor to search for text within that book. The list of search results provides links to images of the pages where the search text has been found.

Alternatively, the reader can browse through the book, viewing images of the pages. Normally (but not always), links are provided to the title page, the table of contents and the index. Once a page has been selected, back and forward arrows are displayed above and below the image for moving to the previous and next pages, together with an input box where a page number can be entered. The box normally accepts arabic numerals (1, 2, 3, ...) for pages in the main body of the book, and roman numerals (i, ii, iii, ...) for the title page, table of contents, preface and so on. The page image of the table of contents may also contain clickable links leading to the first pages of the sections.

For out-of-copyright works, the reader can also usually download the whole book as a single PDF file. In the latest version (as of November 2012) this can be done by clicking on the drop-down menu at the top right and then clicking the "Download PDF" item. But no doubt it will be different in a few weeks' time.

Problems accessing material outside the U.S.A.

As far as I understand, visitors from the U.S.A. should in theory be able to view all books published in the U.S.A. before 1923 in their entirety, though in practice many such books are available only in restricted formats. In other countries, where copyright is determined by the length of time since the death of the author, Google is applying much more stringent criteria based on the date of publication. For example, for visitors in the U.K. only works published before 1865 are available.

To complicate matters further, Google also appears to be restricting access to books published outside the U.S.A. later than 1908.

Of course, the great majority of books published between 1865 and 1923 are legally out of copyright in the U.K. and most other countries. I have tried to clarify Google's policy on future access to these books, without success.
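The date rules described above can be summarised in a few lines of code. This is only a sketch of the rules as I understand them, not a statement of Google's actual policy. It assumes that the 1908 cut-off for books published outside the U.S.A. applies to visitors in the U.S.A., which is my reading rather than something I can confirm, and it covers only the two visitor countries for which dates are given here. The function name and the country codes are invented for the example.

```python
def full_view_expected(pub_year, published_in_usa, visitor_country):
    """Guess whether a book should, in theory, be fully viewable.

    Returns True or False for visitors in the U.S.A. ("US") or the
    U.K. ("UK"), and None for any other country, since no cut-off
    date for other countries is known here.
    """
    if visitor_country == "US":
        if published_in_usa:
            # Books published in the U.S.A. before 1923.
            return pub_year < 1923
        # Assumption: books published elsewhere are restricted if
        # published later than 1908.
        return pub_year <= 1908
    if visitor_country == "UK":
        # Only works published before 1865 are available.
        return pub_year < 1865
    return None


if __name__ == "__main__":
    print(full_view_expected(1900, True, "US"))   # U.S. book, U.S. visitor
    print(full_view_expected(1900, True, "UK"))   # same book, U.K. visitor
```

In practice the result is only a starting point, since many books that ought to be fully viewable under these rules are offered only in restricted formats.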

A practical solution to the problem is to use an anonymous proxy server, which makes it impossible for Google to determine which country the visitor is in - for example, ProxySite (it is necessary to select a proxy server location in the U.S.A.). The visitor simply copies and pastes the Google URL into a form on the website of the proxy server - ensuring that the domain name is google.com, not google.co.uk - and from then on can follow links normally, and can even edit the Google URL to navigate to specific pages.
[Update, December 2020: Unfortunately it is now difficult to use proxy servers to visit Google Books. ProxySite is still enabling me to view and download books as PDF files, though other features of the site do not work. Some other suggested solutions can be found in this article by Klaus Graf (in German).]
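The step of making sure that the domain name is google.com before pasting the URL into the proxy form can be done mechanically. The following is a minimal sketch using only the Python standard library. The function name is my own, and the book identifier and page parameter in the example URL are placeholders rather than a real book.

```python
from urllib.parse import urlsplit, urlunsplit


def force_google_com(url):
    """Rewrite a Google URL on a country domain to use google.com.

    Anything before "google" in the host name (such as "books" or
    "www") is kept. URLs that are not on a Google host are returned
    unchanged.
    """
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    if "google" not in labels:
        return url
    # Keep everything up to and including "google", then append "com".
    index = labels.index("google")
    host = ".".join(labels[: index + 1] + ["com"])
    if parts.port:
        host += f":{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # Placeholder URL: "EXAMPLE" is not a real book identifier.
    print(force_google_com("https://books.google.co.uk/books?id=EXAMPLE&pg=PA10"))
```

The path and query string are passed through untouched, so any page reference already present in the URL survives the rewrite.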