multithreading - Python - Multithreaded Proxy Tester
I'm building a proxy checker using multiple threads, specifically a thread pool from:

from multiprocessing.dummy import Pool as ThreadPool

The HTTP requests are made with urllib2.
What I want is for each proxy to run 20 requests. With a single thread that would take too long; that's where the power of multithreading comes in. So once I set a proxy I want to run 20 requests, and manage 2 things. 1. count the exceptions and dump the proxy if too many occur. 2. save the average response time and present it later.
I haven't managed to implement the above. I have implemented it with 1 thread:
import socket
import ssl
import time
import urllib
import urllib2
import httplib

proxylist = []

def loadproxysfromfile(filename):
    global proxylist
    with open(filename) as f:
        proxylist = [line.rstrip('\n') for line in f]

def seturllib2proxy(proxyaddress):
    proxy = urllib2.ProxyHandler({
        'http': "http://" + proxyaddress,
        'https': "https://" + proxyaddress
    })
    opener = urllib2.build_opener(proxy)
    urllib2.install_opener(opener)

def timingrequest(proxy, url):
    error = False
    seturllib2proxy(proxy)
    start = time.time()
    try:
        req = urllib2.Request(url)
        urllib2.urlopen(req, timeout=5)  # opening the request (getting the response)
    except (urllib2.URLError, httplib.BadStatusLine, ssl.SSLError, socket.error) as e:
        error = True
    end = time.time()
    timing = end - start
    if error:
        print "Error with proxy " + proxy
        return 0
    else:
        print proxy + " request to " + url + " took: %s" % timing + " seconds."
        return timing

# main
loadproxysfromfile("proxylist.txt")

for proxy in proxylist:
    print "Testing: " + proxy
print "\n"

request_num = 20
error_tolerance_num = 3

resultlist = []

for proxy in proxylist:
    avgtime = 0
    errorcount = 0
    for x in range(0, request_num):
        result = timingrequest(proxy, 'https://www.google.com')
        if result == 0:
            errorcount += 1
            if errorcount >= error_tolerance_num:
                break
        else:
            avgtime += result
    if errorcount < error_tolerance_num:
        avgtime = avgtime / (request_num - errorcount)
        resultlist.append(proxy + " has an average response time of: %s" % avgtime)

print '\n'
print "Results summary:"
print "-----------------"
for res in resultlist:
    print res
The things that must be done are, for every proxy: wait until all 20 requests are over before changing to the next proxy, and somehow synchronize the threads when accumulating the total used to calculate the average response time (which must not take the exceptions into account).
The best solution I've read so far is using from multiprocessing.dummy import Pool as ThreadPool and pool.map(func, iterable), but I can't figure out how to implement it in my code.