Question or problem about Python programming:
I’m trying to run a google search query from a python app. Is there any python interface out there that would let me do this? If there isn’t does anyone know which Google API will enable me to do this. Thanks.
How to solve the problem:
Solution 1:
There’s a simple example here (peculiarly missing some quotes;-). Most of what you’ll see on the web is Python interfaces to the old, discontinued SOAP API — the example I’m pointing to uses the newer and supported AJAX API, that’s definitely the one you want!-)
Edit: here’s a more complete Python 2.6 example with all the needed quotes &c;-)…:
#!/usr/bin/python import json import urllib def showsome(searchfor): query = urllib.urlencode({'q': searchfor}) url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % query search_response = urllib.urlopen(url) search_results = search_response.read() results = json.loads(search_results) data = results['responseData'] print 'Total results: %s' % data['cursor']['estimatedResultCount'] hits = data['results'] print 'Top %d hits:' % len(hits) for h in hits: print ' ', h['url'] print 'For more results, see %s' % data['cursor']['moreResultsUrl'] showsome('ermanno olmi')
Solution 2:
Here is Alex’s answer ported to Python3
#!/usr/bin/python3 import json import urllib.request, urllib.parse def showsome(searchfor): query = urllib.parse.urlencode({'q': searchfor}) url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % query search_response = urllib.request.urlopen(url) search_results = search_response.read().decode("utf8") results = json.loads(search_results) data = results['responseData'] print('Total results: %s' % data['cursor']['estimatedResultCount']) hits = data['results'] print('Top %d hits:' % len(hits)) for h in hits: print(' ', h['url']) print('For more results, see %s' % data['cursor']['moreResultsUrl']) showsome('ermanno olmi')
Solution 3:
Here’s my approach to this: http://breakingcode.wordpress.com/2010/06/29/google-search-python/
A couple code examples:
# Get the first 20 hits for: "Breaking Code" WordPress blog from google import search for url in search('"Breaking Code" WordPress blog', stop=20): print(url) # Get the first 20 hits for "Mariposa botnet" in Google Spain from google import search for url in search('Mariposa botnet', tld='es', lang='es', stop=20): print(url)
Note that this code does NOT use the Google API, and is still working to date (January 2012).
Solution 4:
I am new in python and I was investigating how to do this. None of the provided examples are working properly for me. Some are blocked by google if you make many (few) requests, some are outdated.
Parsing the google search html (adding the header in the request) will work until google changes the html structure again. You can use the same logic to search in any other search engine, looking into the html (view-source).
import urllib2 def getgoogleurl(search,siteurl=False): if siteurl==False: return 'http://www.google.com/search?q='+urllib2.quote(search) else: return 'http://www.google.com/search?q=site:'+urllib2.quote(siteurl)+'%20'+urllib2.quote(search) def getgooglelinks(search,siteurl=False): #google returns 403 without user agent headers = {'User-agent':'Mozilla/11.0'} req = urllib2.Request(getgoogleurl(search,siteurl),None,headers) site = urllib2.urlopen(req) data = site.read() site.close() #no beatifulsoup because google html is generated with javascript start = data.find('
(Edit 1: Adding a parameter to narrow the google search to a specific site)
(Edit 2: When I added this answer I was coding a Python script to search subtitles. I recently uploaded it to Github:
Subseek)