This repository has been archived by the owner on Oct 24, 2020. It is now read-only.
forked from EFForg/https-everywhere
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathtest-generator.py
executable file
·53 lines (45 loc) · 1.89 KB
/
test-generator.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
#!/usr/bin/python2.7
"""
Help automate the process of generating test URLs for rulesets.
Many rulesets have complicated regexes like this:
^http://s3(?:-website)?(-ap-(?:nor|sou)theast-1|-(?:eu|us)-west-\d|-external-\d|-sa-east-1)?\.amazonaws\.com/
Under the new requirements for ruleset coverage testing, we need a test URL that
covers each of these branches. Fortunately, the exrex library can automate
expanding the branches of the regex. This script uses that library to generate a
set of plausible test URLs. NOTE: Usually these test URLs will need manual
verification. Some may not actually exist. Also, if a URL contains a wildcard,
e.g. '.', exrex will attempt to substitute all possible values, creating an
explosion of test URLs. We attempt to detect this by finding regexes that
generate more than a thousand test URLs, and not printing any output for those.
You will have to manually find test cases for URLs with broad wildcards.
Usage:
./utils/test-generator.py src/chrome/content/rules/AmazonAWS.xml
# ... Paste output into your ruleset ...
# Then test the ruleset:
python2.7 https-everywhere-checker/src/https_everywhere_checker/check_rules.py \
https-everywhere-checker/manual.checker.config
src/chrome/content/rules/AmazonAWS.xml
"""
import exrex
import lxml
from lxml import etree
import sys
def generate(regex):
i = 0
urls = []
for url in exrex.generate(regex):
i += 1
if i > 1000:
break
urls.append(url)
if i <= 1000:
for url in urls:
print "<test url=\"%s\" />" % url
for xmlFname in sys.argv[1:]:
ruleset = etree.parse(file(xmlFname)).getroot()
xpath_from = etree.XPath("/ruleset/rule/@from")
for from_attrib in xpath_from(ruleset):
generate(from_attrib)
xpath_exclusion = etree.XPath("/ruleset/exclusion/@pattern")
for pattern_attrib in xpath_exclusion(ruleset):
generate(pattern_attrib)