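The open-source version of Tide (潮汐) makes use of fofa's banner rules.
First, Tide defines a CMS list called cms_finger_list; its entries are the same as the data in the SQLite table.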
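It then defines the following attributes. These are ANSI escape sequences for coloured terminal output: W resets the colour, and G, R, O and B select bold green, red, yellow and blue.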
    self.W = '\033[0m'
    self.G = '\033[1;32m'
    self.R = '\033[1;31m'
    self.O = '\033[1;33m'
    self.B = '\033[1;34m'
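As a small illustration of how constants like these are typically used, here is a minimal sketch. The colorize helper is hypothetical and not part of Tide; only the escape values are taken from the code above.

```python
# ANSI SGR escape sequences, same values as the attributes above.
W = '\033[0m'     # reset all attributes
G = '\033[1;32m'  # bold green
R = '\033[1;31m'  # bold red
O = '\033[1;33m'  # bold yellow
B = '\033[1;34m'  # bold blue


def colorize(text, color):
    """Wrap text in a colour code and reset afterwards (hypothetical helper)."""
    return '%s%s%s' % (color, text, W)


if __name__ == '__main__':
    # On an ANSI-capable terminal this line is shown in bold green.
    print(colorize('[+] http://example.com ExampleCMS', G))
```

Ending every coloured string with W matters: without the reset, the colour would carry over to whatever the terminal prints next.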
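Note also that Tide's fingerprinting works over HTTP; it is not a scan of host ports. The code analysed here is my own rewrite, but it is essentially unchanged from the original, so let's walk through it.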
    cms = Cmsscanner(target_url, self.request_timeout, self.pwd)
    fofa_finger = cms.run()
Let's look at Cmsscanner:
    def __init__(self, *params):
        self.target, self.request_timeout, self.pwd = params
        self.start = time.time()
        self.finger = []
        # NOTE: the header key is 'UserAgent', not the standard 'User-Agent'.
        self.agent = {'UserAgent': 'Mozilla/5.0 (Windows; U; MSIE 9.0; WIndows NT 9.0; en-US))'}
        # Patterns that pull the quoted keyword out of a rule string.
        self.rtitle = re.compile(r'title="(.*)"')
        self.rheader = re.compile(r'header="(.*)"')
        self.rbody = re.compile(r'body="(.*)"')
        self.rbracket = re.compile(r'\((.*)\)')
The run method that gets called looks like this:
    def run(self):
        try:
            header, body, title = self.get_info()
            for _id in range(1, int(self.count())):
                try:
                    self.handle(_id, header, body, title)
                except Exception as e:
                    pass
        except Exception as e:
            print(e)
        finally:
            return self.finger
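run depends on count() and check(), which are not shown in these notes. Below is a minimal sketch of what such lookups might look like, assuming a SQLite table with one row per fingerprint. The table name fingerprints, the columns id, name and rule, and the sample rows are all hypothetical, chosen only for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical fingerprint table."""
    conn = sqlite3.connect(':memory:')
    conn.execute(
        'CREATE TABLE fingerprints (id INTEGER PRIMARY KEY, name TEXT, rule TEXT)'
    )
    conn.executemany(
        'INSERT INTO fingerprints (id, name, rule) VALUES (?, ?, ?)',
        [
            (1, 'ExampleCMS', 'title="Example CMS"'),
            (2, 'DemoShop', 'body="demo-shop"||header="X-Demo"'),
        ],
    )
    return conn


def count(conn):
    """Number of fingerprint rows (assumed to be what run() loops over)."""
    return conn.execute('SELECT COUNT(id) FROM fingerprints').fetchone()[0]


def check(conn, _id):
    """Return the (name, rule) pair for one fingerprint id."""
    return conn.execute(
        'SELECT name, rule FROM fingerprints WHERE id = ?', (_id,)
    ).fetchone()


if __name__ == '__main__':
    conn = build_demo_db()
    # Loop to count + 1 so that the last id is included.
    for _id in range(1, count(conn) + 1):
        print(check(conn, _id))
```

One thing worth checking against the real code: if count() returns the number of rows and the ids run from 1 to N, then range(1, N) as written in run stops at N-1 and the last fingerprint is never tested.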
It calls the get_info method, so let's see what that does:
    def get_info(self):
        """Fetch information about the web page."""
        try:
            r = requests.get(url=self.target, headers=self.agent,
                             timeout=self.request_timeout, verify=False)
            content = r.text
            try:
                title = BS(content, 'lxml').title.text.strip()
                return str(r.headers), content, title.strip('\n')
            except:
                # No usable <title>: fall back to an empty title.
                return str(r.headers), content, ''
        except Exception as e:
            pass
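get_info sends an ordinary GET request to the target and returns three things: the response headers, the body, and the title. run then iterates over every fingerprint in the database and processes each one. Let's look at the handle method in detail: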
    def handle(self, _id, header, body, title):
        """Fetch the key (rule) from the database and match it."""
        name, key = self.check(_id)
        # Any one condition is enough: a||b
        if '||' in key and '&&' not in key and '(' not in key:
            for rule in key.split('||'):
                if self.check_rule(rule, header, body, title):
                    self.finger.append(name)
                    # print '%s[+] %s %s%s' % (G, self.target, name, W)
                    break
        # Only a single condition
        elif '||' not in key and '&&' not in key and '(' not in key:
            if self.check_rule(key, header, body, title):
                self.finger.append(name)
                # print '%s[+] %s %s%s' % (G, self.target, name, W)
        # All conditions must hold: a&&b
        elif '&&' in key and '||' not in key and '(' not in key:
            num = 0
            for rule in key.split('&&'):
                if self.check_rule(rule, header, body, title):
                    num += 1
            if num == len(key.split('&&')):
                self.finger.append(name)
                # print '%s[+] %s %s%s' % (G, self.target, name, W)
        else:
            # OR of terms, one of which is a bracketed AND group: 1||2||(3&&4)
            if '&&' in re.findall(self.rbracket, key)[0]:
                for rule in key.split('||'):
                    if '&&' in rule:
                        num = 0
                        for _rule in rule.split('&&'):
                            if self.check_rule(_rule, header, body, title):
                                num += 1
                        if num == len(rule.split('&&')):
                            self.finger.append(name)
                            # print '%s[+] %s %s%s' % (G, self.target, name, W)
                            break
                    else:
                        if self.check_rule(rule, header, body, title):
                            self.finger.append(name)
                            # print '%s[+] %s %s%s' % (G, self.target, name, W)
                            break
            else:
                # AND of terms, one of which is a bracketed OR group: 1&&2&&(3||4)
                # NOTE: num is reset on every iteration, so with two or more
                # &&-separated terms it cannot reach the required count. The calls
                # below also pass (title, body, header), whereas check_rule expects
                # (header, body, title).
                for rule in key.split('&&'):
                    num = 0
                    if '||' in rule:
                        for _rule in rule.split('||'):
                            if self.check_rule(_rule, title, body, header):
                                num += 1
                                break
                    else:
                        if self.check_rule(rule, title, body, header):
                            num += 1
                if num == len(key.split('&&')):
                    self.finger.append(name)
                    # print '%s[+] %s %s%s' % (G, self.target, name, W)
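The branching in handle boils down to evaluating a small boolean expression over rules joined by || and &&, with optional parentheses. As a point of comparison, here is a minimal sketch of a recursive evaluator for the same rule syntax. It is not Tide's code: split_top_level and evaluate are hypothetical names, && is assumed to bind tighter than ||, and, like the original, it does not treat operators or brackets inside quoted keywords specially.

```python
def split_top_level(expr, op):
    """Split expr on op ('||' or '&&'), ignoring operators inside parentheses."""
    parts, depth, start, i = [], 0, 0, 0
    while i < len(expr):
        ch = expr[i]
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
        elif depth == 0 and expr.startswith(op, i):
            parts.append(expr[start:i])
            i += len(op)
            start = i
            continue
        i += 1
    parts.append(expr[start:])
    return parts


def evaluate(expr, predicate):
    """Evaluate a rule expression; predicate(rule) decides each single rule."""
    expr = expr.strip()
    or_parts = split_top_level(expr, '||')
    if len(or_parts) > 1:
        return any(evaluate(part, predicate) for part in or_parts)
    and_parts = split_top_level(expr, '&&')
    if len(and_parts) > 1:
        return all(evaluate(part, predicate) for part in and_parts)
    if expr.startswith('(') and expr.endswith(')'):
        # The whole expression is one bracketed group: evaluate its contents.
        return evaluate(expr[1:-1], predicate)
    return bool(predicate(expr))


if __name__ == '__main__':
    # Stand-in predicate: rules 'a' and 'c' match, everything else does not.
    matched = {'a', 'c'}
    print(evaluate('b||(a&&c)', lambda rule: rule in matched))
    print(evaluate('a&&(b||d)', lambda rule: rule in matched))
```

In the real scanner the predicate would be a call such as check_rule(rule, header, body, title), so one code path would cover all five cases that handle spells out by hand.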
Next, let's see how each individual rule is handled by the check_rule function:
    def check_rule(self, key, header, body, title):
        """Fingerprint identification for a single rule."""
        try:
            if 'title="' in key:
                # Title keywords are compared case-insensitively.
                if re.findall(self.rtitle, key)[0].lower() in title.lower():
                    return True
            elif 'body="' in key:
                if re.findall(self.rbody, key)[0] in body:
                    return True
            else:
                if re.findall(self.rheader, key)[0] in header:
                    return True
        except Exception as e:
            pass
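In other words, it essentially checks whether the home page's title, headers or body contain the keyword given in each rule. The regular expressions are only used to pull the quoted keyword out of the rule string; the match itself is a plain substring test, case-insensitive for the title and case-sensitive for the body and headers. When a rule matches, the fingerprint's name is recorded as a hit.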
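To make the single-rule check concrete, here is a small self-contained sketch that mirrors check_rule on made-up data. match_rule is a hypothetical standalone function, and the sample header, body and title strings are invented for the example.

```python
import re

# Same patterns as in __init__: they extract the quoted keyword from a rule.
RULE_PATTERNS = {
    'title': re.compile(r'title="(.*)"'),
    'body': re.compile(r'body="(.*)"'),
    'header': re.compile(r'header="(.*)"'),
}


def match_rule(rule, header, body, title):
    """Return True if the keyword in rule occurs in the matching page part."""
    for field, haystack in (('title', title), ('body', body), ('header', header)):
        found = RULE_PATTERNS[field].search(rule)
        if found:
            keyword = found.group(1)
            if field == 'title':
                # Title keywords are compared case-insensitively.
                return keyword.lower() in haystack.lower()
            return keyword in haystack
    return False


if __name__ == '__main__':
    # Invented sample response data.
    header = "{'Server': 'nginx', 'X-Powered-By': 'ExampleCMS'}"
    body = '<html><meta name="generator" content="ExampleCMS 1.0"></html>'
    title = 'Welcome to Example CMS'
    for rule in ('title="example cms"', 'body="examplecms"', 'header="X-Powered-By"'):
        print(rule, match_rule(rule, header, body, title))
```

The second rule fails only because body matching is case-sensitive, which is a detail worth keeping in mind when writing fingerprints for this scheme.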