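How to Get and Validate Free High-Anonymity Proxy IPs in Python

Author: 偶尔敲代码

This article shows how to collect free high-anonymity proxy IPs with Python and how to validate them. Corrections and notes on anything overlooked are welcome.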

Proxy IPs

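Proxy IPs are usually grouped into three anonymity levels: transparent proxies, which pass the client's real IP on to the target server; anonymous proxies, which hide the real IP but still reveal that a proxy is in use; and high-anonymity (elite) proxies, which hide both.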
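
As a rough illustration of the three levels, the sketch below classifies a proxy from the request headers that the target server receives. The function name classify_anonymity and the header lists are assumptions made for this example, not a standard; real proxies vary in which headers they add, so treat the result as a heuristic.

```python
import re

IPV4 = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")

# Headers that commonly carry the client's address (assumed list, not exhaustive).
CLIENT_IP_HEADERS = ("x-forwarded-for", "x-real-ip", "forwarded", "client-ip")
# Headers whose mere presence suggests a proxy is in the path (assumed list).
PROXY_MARKER_HEADERS = ("via", "x-forwarded-for", "forwarded", "proxy-connection")


def classify_anonymity(headers, real_ip):
    """Classify a proxy from the headers seen by the target server."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): str(value) for name, value in headers.items()}

    # Transparent: the client's real IP is passed along in some header.
    forwarded = " ".join(lowered.get(name, "") for name in CLIENT_IP_HEADERS)
    if real_ip in IPV4.findall(forwarded):
        return "transparent"

    # Anonymous: the real IP is hidden, but proxy use is still visible.
    if any(name in lowered for name in PROXY_MARKER_HEADERS):
        return "anonymous"

    # High anonymity: neither the real IP nor any proxy marker is visible.
    return "high-anonymity"
```

For example, a request that arrives with only a Via header and no trace of the client's address would be classified as "anonymous".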

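In some situations a high-anonymity proxy IP enables certain "uses". Several years ago, a self-study website awarded points whenever a friend clicked your invitation link, and the site checked only the source IP of each click, so masking requests behind high-anonymity proxy IPs was enough to farm points.

Doing that is not something to encourage today, but the technique itself is neutral, so this article takes the occasion to show how to collect and validate proxy IPs in Python.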

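Collecting proxy IPs

My day job does not involve this kind of thing, so the websites used here are simply ones I found through a quick web search; being free and working was the only requirement.

I found two proxy-list websites. One banned my IP during collection; the other went fairly smoothly. The code is as follows: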

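import re
import time

import requests

ip = []  # collected proxies as "host:port" strings, shared with the validation step


def kxdaili():
    for i in range(10):  # scrape the first 10 pages
        url = f'http://www.kxdaili.com/dailiip/1/{i+1}.html'
        response = requests.get(url=url, timeout=10)  # timeout so a stalled page cannot hang the run
        # Capture the contents of <td>...</td> cells that consist only of digits and dots
        pattern = r'<td>([\d.]+)</td>'
        results = re.findall(pattern, response.text)
        print(results)
        # The matches alternate between IP and port, so pair them up two at a time
        for n in range(10):
            try:
                ip_temp = results[2*n] + ":" + results[2*n+1]
                #print(ip_temp)
                ip.append(ip_temp)
            except IndexError:
                print("No more entries")
                break
        # XPath raised an error here, oddly, even though the same paths worked in the browser's dev tools
        # (possibly because browsers insert <tbody>, which may be absent from the raw HTML)
        #ip = result.xpath(f"/html/body/div[2]/div[2]/div[2]/div[2]/div[1]/div[2]/table/tbody/tr[{n + 1}]/td[1]/text()")[0]
        #port = result.xpath(f"/html/body/div[2]/div[2]/div[2]/div[2]/div[1]/div[2]/table/tbody/tr[{n + 1}]/td[2]/text()")[0]

        time.sleep(5)  # pause between pages to avoid hammering the site
    print(ip)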
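
The regular expression above captures every numeric table cell and relies on IPs and ports strictly alternating, so a stray numeric cell would shift the pairing. A slightly stricter variant, sketched below, matches an IP cell immediately followed by a port cell in a single pattern. The names ROW_PATTERN and parse_proxies and the sample HTML are hypothetical, and the layout (IP in the first column, port in the second) is assumed from the code above.

```python
import re

# An IPv4-looking cell directly followed by a port cell (whitespace allowed between tags).
ROW_PATTERN = re.compile(
    r"<td>\s*(\d{1,3}(?:\.\d{1,3}){3})\s*</td>\s*<td>\s*(\d{1,5})\s*</td>"
)


def parse_proxies(html):
    """Return "host:port" strings for each IP cell that is followed by a port cell."""
    return [f"{host}:{port}" for host, port in ROW_PATTERN.findall(html)]


# Made-up rows for illustration; the fourth cell of the first row is a numeric
# value that the looser pattern would have picked up as well.
sample_html = """
<tr><td>192.0.2.10</td><td>8080</td><td>elite</td><td>1.2</td></tr>
<tr><td>192.0.2.11</td>
    <td>3128</td><td>elite</td></tr>
"""

print(parse_proxies(sample_html))
```

Because the pairing happens inside the pattern, an extra numeric column such as a response time no longer throws off the IP and port alignment.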

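Validating the IPs

The proxy IPs collected above still need to be validated, mainly for anonymity and availability. If a proxy is not anonymous enough, the target server will detect it right away, which gets in the way of further work.

The validation URL is: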

http://httpbin.org/get?show_env=1

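The code below sets a 5-second timeout to test availability and uses the response from the URL above to judge the anonymity level. The results should be acceptable, and other endpoints can be used for the check as well.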

import requests


def ceshi():
    # Sample of the list produced by one collection run (abridged):
    #ip = ['47.100.90.127:4444', '47.96.70.163:8888', '117.74.65.207:8118', ..., '117.74.65.215:9443']
    url = 'http://httpbin.org/get?show_env=1'
    for proxy in ip:  # `ip` is the list of "host:port" strings collected earlier
        host = proxy.split(":")[0]
        proxies = {
            'http': f'http://{proxy}',
            #'https': 'http://60.182.184.172:8888'
        }
        try:
            response = requests.get(url, proxies=proxies, timeout=5)
            #print(response.text)
            # If the proxy's own IP shows up in the echoed request, treat it as anonymous.
            # This is a rough heuristic: it does not prove the real client IP is hidden.
            if host in response.text:
                print(proxy, "-----------anonymous")
            else:
                print(proxy, "-----------not anonymous")
        except requests.Timeout:
            print(proxy, "-----------request timed out")
        except requests.RequestException:
            print(proxy, "-----------request failed")
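
The check above only looks for the proxy's address in the response. A stricter test is to confirm that your own public IP appears nowhere in what httpbin echoes back. The sketch below uses only the standard library; hides_real_ip and fetch_via_proxy are hypothetical helper names, and it assumes the response is httpbin's usual JSON with "origin" and "headers" fields.

```python
import json
import re
import urllib.request

TEST_URL = "http://httpbin.org/get?show_env=1"
IPV4 = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def hides_real_ip(body_text, real_ip):
    """Return True if real_ip appears nowhere in the request echoed by httpbin."""
    data = json.loads(body_text)
    # "origin" may hold several comma-separated addresses when proxies are chained.
    seen = set(IPV4.findall(str(data.get("origin", ""))))
    # Forwarding headers such as X-Forwarded-For can also leak the client address.
    for value in data.get("headers", {}).values():
        seen.update(IPV4.findall(str(value)))
    return real_ip not in seen


def fetch_via_proxy(proxy, timeout=5.0):
    """Fetch TEST_URL through an HTTP proxy given as "host:port" (performs network I/O)."""
    handler = urllib.request.ProxyHandler({"http": f"http://{proxy}"})
    opener = urllib.request.build_opener(handler)
    with opener.open(TEST_URL, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

In practice you would first learn your real IP with a direct request (httpbin also offers an /ip endpoint for this), then call hides_real_ip(fetch_via_proxy(proxy), real_ip) for each candidate and treat any exception as "unavailable".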

Summary


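Free proxy IPs are inevitably unstable in both availability and anonymity, but they are fine for casual use or for learning.

If you can afford it, choose a paid service, and never use proxies for illegal purposes.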

