Version 2.00b: Many improvements

- Minor bug fix to path parsing to avoid problems with /.$foo/
- Improved PHP error detection (courtesy of Niels Heinen)
- Improved dictionary logic (courtesy of Niels Heinen), plus new documentation of the same
- Improved support for file.ext keywords in the dictionary
- Fixed missing content_checks() in unknown_check_callback() (courtesy of Niels Heinen)
- Fixed an oversight in dictionary case sensitivity
- Improved pivots.txt data
- Support for supplementary read-only dictionaries (-W +dict)
- Change to directory detection to work around a certain sneaky server behavior
- TODO: Revise dictionaries!!!
commit 6b2d33edca
parent b199943c9d
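Of note for users: the supplementary read-only dictionary support adds a '+' prefix to the existing -W syntax. The exact semantics are covered by the new dictionary documentation (dictionaries/README-FIRST); an illustrative invocation (file names made up, not taken from the docs) might be:

$ ./skipfish -W skipfish.wl -W +extra-keywords.wl -o output_dir \
  http://www.example.com/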
ChangeLog | 28

@@ -1,3 +1,29 @@
+Version 2.00b:
+--------------
+
+  - Minor bug fix to path parsing to avoid problems with /.$foo/,
+
+  - Improved PHP error detection (courtesy of Niels Heinen),
+
+  - Improved dictionary logic (courtesy of Niels Heinen) and new
+    documentation of the same,
+
+  - Improved support for file.ext keywords in the dictionary,
+
+  - Fixed missing content_checks() in unknown_check_callback()
+    (courtesy of Niels Heinen),
+
+  - Improved an oversight in dictionary case sensitivity,
+
+  - Improved pivots.txt data,
+
+  - Support for supplementary read-only dictionaries (-W +dict),
+
+  - Change to directory detection to work around a certain sneaky
+    server behavior.
+
+  - TODO: Revise dictionaries!!!
+
 Version 1.94b:
 --------------
@@ -9,7 +35,7 @@ Version 1.94b:
 
 Version 1.93b:
 --------------
 
-  - Major fix to URL XSS detection logic.
+  - Major fix to URL XSS detection logic (courtesy of Niels Heinen).
 
 Version 1.92b:
 --------------
Makefile | 2

@@ -20,7 +20,7 @@
 #
 
 PROGNAME = skipfish
-VERSION  = 1.94b
+VERSION  = 2.00b
 
 OBJFILES = http_client.c database.c crawler.c analysis.c report.c
 INCFILES = alloc-inl.h string-inl.h debug.h types.h http_client.h \
README | 342

@@ -12,30 +12,30 @@ skipfish - web application security scanner
 1. What is skipfish?
 --------------------
 
 Skipfish is an active web application security reconnaissance tool. It
 prepares an interactive sitemap for the targeted site by carrying out a
 recursive crawl and dictionary-based probes. The resulting map is then
 annotated with the output from a number of active (but hopefully
 non-disruptive) security checks. The final report generated by the tool is
 meant to serve as a foundation for professional web application security
 assessments.
 
 -------------------------------------------------
 2. Why should I bother with this particular tool?
 -------------------------------------------------
 
 A number of commercial and open source tools with analogous functionality are
 readily available (e.g., Nikto, Nessus); stick to the one that suits you
 best. That said, skipfish tries to address some of the common problems
 associated with web security scanners. Specific advantages include:
 
 * High performance: 500+ requests per second against responsive Internet
   targets, 2000+ requests per second on LAN / MAN networks, and 7000+ requests
   against local instances have been observed, with a very modest CPU, network,
   and memory footprint. This can be attributed to:
 
   * Multiplexing single-thread, fully asynchronous network I/O and data
     processing model that eliminates memory management, scheduling, and IPC
     inefficiencies present in some multi-threaded clients.
 
   * Advanced HTTP/1.1 features such as range requests, content compression,
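The first sub-bullet above is the heart of the performance story. For illustration only - this is not skipfish's actual event loop - here is a minimal, self-contained C sketch of single-threaded select() multiplexing, with two pipes standing in for two HTTP connections:

/* Sketch of single-threaded, readiness-driven I/O multiplexing. */

#include <stdio.h>
#include <unistd.h>
#include <sys/select.h>

int main(void) {
  int a[2], b[2];

  if (pipe(a) || pipe(b)) return 1;      /* stand-ins for two sockets */

  write(a[1], "one", 3);
  write(b[1], "two", 3);

  for (int done = 0; done < 2; ) {
    fd_set rd;
    int maxfd = (a[0] > b[0]) ? a[0] : b[0];

    FD_ZERO(&rd);
    FD_SET(a[0], &rd);
    FD_SET(b[0], &rd);

    /* A single thread sleeps until *any* connection has data - no
       per-connection threads, no locks, no IPC. */
    if (select(maxfd + 1, &rd, NULL, NULL, NULL) <= 0) return 1;

    for (int i = 0; i < 2; i++) {
      int fd = i ? b[0] : a[0];
      if (FD_ISSET(fd, &rd)) {
        char buf[16];
        printf("fd %d: read %zd bytes\n", fd, read(fd, buf, sizeof(buf)));
        done++;
      }
    }
  }

  return 0;
}

The point of the pattern is that descriptor readiness, not thread scheduling, drives all work - which is what eliminates the locking and IPC costs named in the bullet.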
@@ -45,8 +45,8 @@ associated with web security scanners. Specific advantages include:
 * Smart response caching and advanced server behavior heuristics are used to
   minimize unnecessary traffic.
 
 * Performance-oriented, pure C implementation, including a custom
   HTTP stack.
 
 * Ease of use: skipfish is highly adaptive and reliable. The scanner features:
 
@@ -60,34 +60,34 @@ associated with web security scanners. Specific advantages include:
 * Automatic wordlist construction based on site content analysis.
 
 * Probabilistic scanning features to allow periodic, time-bound assessments
   of arbitrarily complex sites.
 
 * Well-designed security checks: the tool is meant to provide accurate
   and meaningful results:
 
   * Handcrafted dictionaries offer excellent coverage and permit thorough
     $keyword.$extension testing in a reasonable timeframe.
 
   * Three-step differential probes are preferred to signature checks for
     detecting vulnerabilities.
 
   * Ratproxy-style logic is used to spot subtle security problems:
     cross-site request forgery, cross-site script inclusion, mixed content
     issues, MIME- and charset mismatches, incorrect caching directives, etc.
 
   * Bundled security checks are designed to handle tricky scenarios:
     stored XSS (path, parameters, headers), blind SQL or XML injection,
     or blind shell injection.
 
   * Report post-processing drastically reduces the noise caused by any
     remaining false positives or server gimmicks by identifying repetitive
     patterns.
 
 That said, skipfish is not a silver bullet, and may be unsuitable for certain
 purposes. For example, it does not satisfy most of the requirements outlined
 in WASC Web Application Security Scanner Evaluation Criteria (some of them on
 purpose, some out of necessity); and unlike most other projects of this type,
 it does not come with an extensive database of known vulnerabilities for
 banner-type checks.
 
 -----------------------------------------------------
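To get a feel for the volume behind "thorough $keyword.$extension testing in a reasonable timeframe": with purely illustrative numbers, a dictionary of 2,000 keywords crossed with 50 extensions implies on the order of 2,000 x 50 = 100,000 file.ext probes for every fuzzed directory - which is why dictionary selection (see section 5) dominates total scan time.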
@@ -104,7 +104,7 @@ A rough list of the security checks offered by the tool is outlined below.
   * Server-side XML / XPath injection (including blind vectors).
   * Format string vulnerabilities.
   * Integer overflow vulnerabilities.
   * Locations accepting HTTP PUT.
 
 * Medium risk flaws (potentially leading to data compromise):
 
@@ -121,7 +121,7 @@ A rough list of the security checks offered by the tool is outlined below.
   * Generic MIME types on renderables.
   * Incorrect or missing charsets on renderables.
   * Conflicting MIME / charset info on renderables.
   * Bad caching directives on cookie setting responses.
 
 * Low risk issues (limited impact or low specificity):
 
@@ -135,7 +135,7 @@ A rough list of the security checks offered by the tool is outlined below.
   * HTML forms with no XSRF protection.
   * Self-signed SSL certificates.
   * SSL certificate host name mismatches.
   * Bad caching directives on less sensitive content.
 
 * Internal warnings:
 
@@ -144,7 +144,7 @@ A rough list of the security checks offered by the tool is outlined below.
   * Failed 404 behavior checks.
   * IPS filtering detected.
   * Unexpected response variations.
   * Seemingly misclassified crawl nodes.
 
 * Non-specific informational entries:
 
@@ -170,14 +170,14 @@ A rough list of the security checks offered by the tool is outlined below.
   * Generic MIME type on less significant content.
   * Incorrect or missing charset on less significant content.
   * Conflicting MIME / charset information on less significant content.
   * OGNL-like parameter passing conventions.
 
 Along with a list of identified issues, skipfish also provides summary
 overviews of document types and issue types found; and an interactive
 sitemap, with nodes discovered through brute-force denoted in a distinctive
 way.
 
 NOTE: As a conscious design decision, skipfish will not redundantly complain
 about highly non-specific issues, including but not limited to:
 
 * Non-httponly or non-secure cookies,
@@ -186,51 +186,51 @@ about highly non-specific issues, including but not limited to:
 * Filesystem path disclosure in error messages,
 * Server or framework version disclosure,
 * Servers supporting TRACE or OPTIONS requests,
 * Mere presence of certain technologies, such as WebDAV.
 
 Most of these aspects are easy to inspect in a report if so desired - for
 example, all the HTML forms are listed separately, so are new cookies or
 interesting HTTP headers - and the expectation is that the auditor may opt to
 make certain design recommendations based on this data where appropriate.
 That said, these occurrences are not highlighted as a specific security flaw.
 
 -----------------------------------------------------------
 4. All right, I want to try it out. What do I need to know?
 -----------------------------------------------------------
 
 First and foremost, please do not be evil. Use skipfish only against services
 you own, or have permission to test.
 
 Keep in mind that all types of security testing can be disruptive. Although
 the scanner is designed not to carry out malicious attacks, it may
 accidentally interfere with the operations of the site. You must accept the
 risk, and plan accordingly. Run the scanner against test instances where
 feasible, and be prepared to deal with the consequences if things go wrong.
 
 Also note that the tool is meant to be used by security professionals, and is
 experimental in nature. It may return false positives or miss obvious
 security problems - and even when it operates perfectly, it is simply not
 meant to be a point-and-click application. Do not take its output at face
 value.
 
 Running the tool against vendor-supplied demo sites is not a good way to
 evaluate it, as they usually approximate vulnerabilities very imperfectly; we
 made no effort to accommodate these cases.
 
 Lastly, the scanner is simply not designed for dealing with rogue and
 misbehaving HTTP servers - and offers no guarantees of safe (or sane)
 behavior there.
 
 --------------------------
 5. How to run the scanner?
 --------------------------
 
 To compile it, simply unpack the archive and try make. Chances are, you will
 need to install libidn first.
 
 Next, you need to copy the desired dictionary file from dictionaries/ to
 skipfish.wl. Please read dictionaries/README-FIRST carefully to make the
 right choice. This step has a profound impact on the quality of scan results
 later on.
 
 Once you have the dictionary selected, you can try:
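In practice, the preparation steps described above boil down to something like this (the dictionary file name is illustrative - pick one per dictionaries/README-FIRST):

$ cp dictionaries/minimal.wl skipfish.wl
$ make
$ ./skipfish -o output_dir http://www.example.com/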
@@ -243,16 +243,16 @@ the following syntax:
 
 $ ./skipfish -o output_dir @../path/to/url_list.txt
 
 The tool will display some helpful stats while the scan is in progress. You
 can also switch to a list of in-flight HTTP requests by pressing return.
 
 In the example above, skipfish will scan the entire www.example.com
 (including services on other ports, if linked to from the main page), and
 write a report to output_dir/index.html. You can then view this report with
 your favorite browser (JavaScript must be enabled; and because of recent
 file:/// security improvements in certain browsers, you might need to access
 results over HTTP). The index.html file is static; actual results are stored
 as a hierarchy of JSON files, suitable for machine processing or different
 presentation frontends if needs be. In addition, a list of all the discovered
 URLs will be saved to a single file, pivots.txt, for easy postprocessing.
 
@@ -262,40 +262,40 @@ report will be non-destructively annotated by adding red background to all
 new or changed nodes; and blue background to all new or changed issues
 found.
 
 Some sites may require authentication; for simple HTTP credentials, you can
 try:
 
 $ ./skipfish -A user:pass ...other parameters...
 
 Alternatively, if the site relies on HTTP cookies instead, log in in your
 browser or using a simple curl script, and then provide skipfish with a
 session cookie:
 
 $ ./skipfish -C name=val ...other parameters...
 
 Other session cookies may be passed the same way, one per each -C option.
 
 Certain URLs on the site may log out your session; you can combat this in two
 ways: by using the -N option, which causes the scanner to reject attempts to
 set or delete cookies; or with the -X parameter, which prevents matching URLs
 from being fetched:
 
 $ ./skipfish -X /logout/logout.aspx ...other parameters...
 
 The -X option is also useful for speeding up your scans by excluding /icons/,
 /doc/, /manuals/, and other standard, mundane locations along these lines. In
 general, you can use -X and -I (only spider URLs matching a substring) to
 limit the scope of a scan any way you like - including restricting it only to
 a specific protocol and port:
 
 $ ./skipfish -I http://example.com:1234/ ...other parameters...
 
 A related function, -K, allows you to specify parameter names not to fuzz
 (useful for applications that put session IDs in the URL, to minimize noise).
 
 Another useful scoping option is -D - allowing you to specify additional
 hosts or domains to consider in-scope for the test. By default, all hosts
 appearing in the command-line URLs are added to the list - but you can use -D
 to broaden these rules, for example:
 
 $ ./skipfish -D test2.example.com -o output-dir http://test1.example.com/
 
@@ -304,61 +304,61 @@ $ ./skipfish -D test2.example.com -o output-dir http://test1.example.com/
 
 $ ./skipfish -D .example.com -o output-dir http://test1.example.com/
 
 In some cases, you do not want to actually crawl a third-party domain, but
 you trust the owner of that domain enough not to worry about cross-domain
 content inclusion from that location. To suppress warnings, you can use the
 -B option, for example:
 
 $ ./skipfish -B .google-analytics.com -B .googleapis.com ...other
   parameters...
 
 By default, skipfish sends minimalistic HTTP headers to reduce the amount of
 data exchanged over the wire; some sites examine User-Agent strings or header
 ordering to reject unsupported clients, however. In such a case, you can use
 -b ie, -b ffox, or -b phone to mimic one of the two popular browsers (or
 iPhone).
 
 When it comes to customizing your HTTP requests, you can also use the -H
 option to insert any additional, non-standard headers; or -F to define a
 custom mapping between a host and an IP (bypassing the resolver). The latter
 feature is particularly useful for not-yet-launched or legacy services.
 
 Some sites may be too big to scan in a reasonable timeframe. If the site
 features well-defined tarpits - for example, 100,000 nearly identical user
 profiles as a part of a social network - these specific locations can be
 excluded with -X or -S. In other cases, you may need to resort to other
 settings: -d limits crawl depth to a specified number of subdirectories; -c
 limits the number of children per directory; -x limits the total number of
 descendants per crawl tree branch; and -r limits the total number of requests
 to send in a scan.
 
 An interesting option is available for repeated assessments: -p. By
 specifying a percentage between 1 and 100%, it is possible to tell the
 crawler to follow fewer than 100% of all links, and try fewer than 100% of
 all dictionary entries. This - naturally - limits the completeness of a scan,
 but unlike most other settings, it does so in a balanced, non-deterministic
 manner. It is extremely useful when you are setting up time-bound, but
 periodic assessments of your infrastructure. Another related option is -q,
 which sets the initial random seed for the crawler to a specified value. This
 can be used to exactly reproduce a previous scan to compare results.
 Randomness is relied upon most heavily in the -p mode, but also for making a
 couple of other scan management decisions elsewhere.
 
 Some particularly complex (or broken) services may involve a very high number
 of identical or nearly identical pages. Although these occurrences are by
 default grayed out in the report, they still use up some screen estate and
 take a while to process on JavaScript level. In such extreme cases, you may
 use the -Q option to suppress reporting of duplicate nodes altogether, before
 the report is written. This may give you a less comprehensive understanding
 of how the site is organized, but has no impact on test coverage.
 
 In certain quick assessments, you might also have no interest in paying any
 particular attention to the desired functionality of the site - hoping to
 explore non-linked secrets only. In such a case, you may specify -P to
 inhibit all HTML parsing. This limits the coverage and takes away the ability
 for the scanner to learn new keywords by looking at the HTML, but speeds up
 the test dramatically. Another similarly crippling option that reduces the
 risk of persistent effects of a scan is -O, which inhibits all form parsing
 and submission steps.
 
 Some sites that handle sensitive user data care about SSL - and about getting
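The -p / -q paragraph above describes a simple mechanism: a seeded pseudo-random generator decides, per link or keyword, whether to try it, so the same seed reproduces the same decisions. A toy C sketch of the idea (not skipfish's actual implementation):

#include <stdio.h>
#include <stdlib.h>

/* Follow roughly `percent`% of candidate items. */
static int should_try(int percent) {
  return (rand() % 100) < percent;
}

int main(void) {
  int seed = 42;        /* stand-in for the -q value */
  int percent = 25;     /* stand-in for the -p value */
  int i, tried = 0;

  srand(seed);          /* same seed => same subset => reproducible scan */

  for (i = 0; i < 1000; i++)
    if (should_try(percent)) tried++;

  printf("Tried %d of 1000 candidate links (~%d%%)\n", tried, percent);
  return 0;
}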
@@ -368,45 +368,45 @@ this. The scanner will complain about situations such as http:// scripts
 being loaded on https:// pages - but will disregard non-risk scenarios such
 as images.
 
 Likewise, certain pedantic sites may care about cases where caching is
 restricted on HTTP/1.1 level, but no explicit HTTP/1.0 caching directive is
 given. Specifying -E in the command line causes skipfish to log all such
 cases carefully.
 
 Lastly, in some assessments that involve self-contained sites without
 extensive user content, the auditor may care about any external e-mails or
 HTTP links seen, even if they have no immediate security impact. Use the -U
 option to have these logged.
 
 Dictionary management is a special topic, and - as mentioned - is covered in
 more detail in dictionaries/README-FIRST. Please read that file before
 proceeding. Some of the relevant options include -W to specify a custom
 wordlist, -L to suppress auto-learning, -V to suppress dictionary updates, -G
 to limit the keyword guess jar size, -R to drop old dictionary entries, and
 -Y to inhibit expensive $keyword.$extension fuzzing.
 
 Skipfish also features a form auto-completion mechanism in order to maximize
 scan coverage. The values should be non-malicious, as they are not meant to
 implement security checks - but rather, to get past input validation logic.
 You can define additional rules, or override existing ones, with the -T
 option (-T form_field_name=field_value, e.g. -T login=test123 -T
 password=test321 - although note that -C and -A are a much better method of
 logging in).
 
 There is also a handful of performance-related options. Use -g to set the
 maximum number of connections to maintain, globally, to all targets (it is
 sensible to keep this under 50 or so to avoid overwhelming the TCP/IP stack
 on your system or on the nearby NAT / firewall devices); and -m to set the
 per-IP limit (experiment a bit: 2-4 is usually good for localhost, 4-8 for
 local networks, 10-20 for external targets, 30+ for really lagged or
 non-keep-alive hosts). You can also use -w to set the I/O timeout (i.e.,
 skipfish will wait only so long for an individual read or write), and -t to
 set the total request timeout, to account for really slow or really fast
 sites.
 
 Lastly, -f controls the maximum number of consecutive HTTP errors you are
 willing to see before aborting the scan; and -s sets the maximum length of a
 response to fetch and parse (longer responses will be truncated).
 
 When scanning large, multimedia-heavy sites, you may also want to specify -e.
 This prevents binary documents from being kept in memory for reporting
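The throttling and timeout options above combine freely; an illustrative set of conservative values for an external target (the numbers are made up, not recommendations from the docs) could be:

$ ./skipfish -g 40 -m 10 -w 10 -t 30 -f 20 -s 200000 -o output_dir \
  http://www.example.com/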
@@ -421,20 +421,20 @@ Oh, and real-time scan statistics can be suppressed with -u.
 6. But seriously, how to run it?
 --------------------------------
 
 A standard, authenticated scan of a well-designed and self-contained site
 (warns about all external links, e-mails, mixed content, and caching header
 issues):
 
 $ ./skipfish -MEU -C "AuthCookie=value" -X /logout.aspx -o output_dir \
   http://www.example.com/
 
 Five-connection crawl, but no brute-force; pretending to be MSIE and
-trusting example.com content):
+trusting example.com content:
 
 $ ./skipfish -m 5 -LV -W /dev/null -o output_dir -b ie -B example.com \
   http://www.example.com/
 
 Brute force only (no HTML link extraction), limited to a single directory and
 timing out after 5 seconds:
 
 $ ./skipfish -P -I http://www.example.com/dir1/ -o output_dir -t 5 -I \
@@ -471,15 +471,15 @@ applications.
 8. Known limitations / feature wishlist
 ---------------------------------------
 
 Below is a list of features currently missing in skipfish. If you wish to
 improve the tool by contributing code in one of these areas, please let me
 know:
 
 * Buffer overflow checks: after careful consideration, I suspect there is
   no reliable way to test for buffer overflows remotely. Much like the actual
   fault condition we are looking for, proper buffer size checks may also
   result in uncaught exceptions, 500 messages, etc. I would love to be proved
   wrong, though.
 
 * Fully-fledged JavaScript XSS detection: several rudimentary checks are
   present in the code, but there is no proper script engine to evaluate
 
@@ -490,15 +490,15 @@ know:
   they were much lower priority at the time of this writing.
 
 * Security checks and link extraction for third-party, plugin-based
   content (Flash, Java, PDF, etc).
 
 * Password brute-force and numerical filename brute-force probes.
 
 * Search engine integration (vhosts, starting paths).
 
 * VIEWSTATE decoding.
 
 * NTLM and digest authentication.
 
 * More specific PHP tests (eval injection, RFI).
 
@@ -506,7 +506,7 @@ know:
   a #define directive in config.h. Adding support for HTTPS proxying is
   more complicated, and still in the works.
 
-* Scan resume option.
+* Scan resume option, better runtime info.
 
 * Option to limit document sampling or save samples directly to disk.
 
@@ -514,16 +514,20 @@ know:
 * Config file support.
 
-* A database for banner / version checks?
+* Scheduling and management web UI.
+
+* QPS throttling and maximum scan time limit.
+
+* A database for banner / version checks or other configurable rules?
 
 -------------------------------------
 9. Oy! Something went horribly wrong!
 -------------------------------------
 
 There is no web crawler so good that there wouldn't be a web framework to one
 day set it on fire. If you encounter what appears to be bad behavior (e.g., a
 scan that takes forever and generates too many requests, completely bogus
 nodes in scan output, or outright crashes), please first check our known
 issues page:
 
 http://code.google.com/p/skipfish/wiki/KnownIssues
 
@@ -536,8 +540,8 @@ $ make clean debug
 
 $ ./skipfish [...previous options...] 2>logfile.txt
 
 You can then inspect logfile.txt to get an idea what went wrong; if it looks
 like a scanner problem, please scrub any sensitive information from the log
 file and send it to the author.
 
 If the scanner crashed, please recompile it as indicated above, and then type:
 
@@ -552,8 +556,8 @@ $ gdb --batch -ex back ./skipfish core
 10. Credits and feedback
 ------------------------
 
 Skipfish is made possible thanks to the contributions of, and valuable
 feedback from, Google's information security engineering team.
 
 If you have any bug reports, questions, suggestions, or concerns regarding
 the application, the author can be reached at lcamtuf@google.com.
analysis.c | 47

@@ -930,7 +930,7 @@ add_link:
 
   i = 0;
 
-  while ((ext = wordlist_get_extension(i++))) {
+  while ((ext = wordlist_get_extension(i++, 0))) {
     u32 ext_len = strlen((char*)ext);
 
     if (clean_len > ext_len + 2 &&
@@ -2280,11 +2280,32 @@ static void check_for_stuff(struct http_request* req,
     return;
   }
 
-  if (strstr((char*)res->payload, "<b>Fatal error</b>:") ||
-      strstr((char*)res->payload, "<b>Parse error</b>:") ||
-      strstr((char*)res->payload, "</b> on line <b>")) {
-    problem(PROB_ERROR_POI, req, res, (u8*)"PHP error", req->pivot, 0);
-    return;
+  if ((tmp = (u8*)strstr((char*)res->payload, " on line "))) {
+    u32 off = 512;
+
+    while (tmp - 1 > res->payload && !strchr("\r\n", tmp[-1])
+           && off--) tmp--;
+
+    if (off && (!prefix(tmp, "Warning: ") || !prefix(tmp, "Notice: ") ||
+        !prefix(tmp, "Fatal error: ") || !prefix(tmp, "Parse error: ") ||
+        !prefix(tmp, "Deprecated: ") ||
+        !prefix(tmp, "Strict Standards: ") ||
+        !prefix(tmp, "Catchable fatal error: "))) {
+      problem(PROB_ERROR_POI, req, res, (u8*)"PHP error (text)", req->pivot, 0);
+      return;
+    }
+
+    if (off && !prefix(tmp, "<b>") && (!prefix(tmp + 3, "Warning</b>: ") ||
+        !prefix(tmp + 3, "Notice</b>: ") ||
+        !prefix(tmp + 3, "Fatal error</b>: ") ||
+        !prefix(tmp + 3, "Parse error</b>: ") ||
+        !prefix(tmp + 3, "Deprecated</b>: ") ||
+        !prefix(tmp + 3, "Strict Standards</b>: ") ||
+        !prefix(tmp + 3, "Catchable fatal error</b>: "))) {
+      problem(PROB_ERROR_POI, req, res, (u8*)"PHP error (HTML)", req->pivot, 0);
+      return;
+    }
+
   }
 
   if (strstr((char*)res->payload, "<b>Warning</b>: MySQL: ") ||
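To see what the new text-mode branch above actually matches, here is a small standalone C program replaying its logic against a canned PHP warning. The prefix() helper is approximated here (returning 0 when the first argument starts with the second, which is how the diff uses it); the real definition lives in skipfish's own headers:

#include <stdio.h>
#include <string.h>

/* Approximation of skipfish's prefix(): 0 when str starts with pfx. */
static int prefix(const char* str, const char* pfx) {
  return strncmp(str, pfx, strlen(pfx));
}

int main(void) {
  const char* payload =
    "<html>\n"
    "Warning: mysql_connect(): access denied in /var/www/x.php on line 12\n";

  const char* tmp = strstr(payload, " on line ");
  int off = 512;

  if (!tmp) return 0;

  /* Back up to the start of the line (at most 512 chars), as the new
     check_for_stuff() logic does. */
  while (tmp - 1 > payload && !strchr("\r\n", tmp[-1]) && off--) tmp--;

  if (off && (!prefix(tmp, "Warning: ") || !prefix(tmp, "Notice: ") ||
      !prefix(tmp, "Fatal error: ") || !prefix(tmp, "Parse error: ")))
    printf("PHP error (text) detected at: %.40s...\n", tmp);

  return 0;
}

Compared to the old version, which only matched three HTML-bolded patterns, anchoring on " on line " and then classifying the start of the line catches both plain-text and HTML-formatted PHP diagnostics.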
@@ -2326,12 +2347,26 @@ static void check_for_stuff(struct http_request* req,
   if (strstr((char*)sniffbuf, "<cross-domain-policy>")) {
     problem(PROB_FILE_POI, req, res, (u8*)
             "Flash cross-domain policy", req->pivot, 0);
 
+    /*
+    if (strstr((char*)res->payload, "domain=\"*\""))
+      problem(PROB_CROSS_WILD, req, res, (u8*)
+              "Cross-domain policy with wildcard rules", req->pivot, 0);
+    */
+
     return;
   }
 
   if (strstr((char*)sniffbuf, "<access-policy>")) {
     problem(PROB_FILE_POI, req, res, (u8*)"Silverlight cross-domain policy",
             req->pivot, 0);
 
+    /*
+    if (strstr((char*)res->payload, "uri=\"*\""))
+      problem(PROB_CROSS_WILD, req, res, (u8*)
+              "Cross-domain policy with wildcard rules", req->pivot, 0);
+    */
+
     return;
   }
 
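The commented-out checks above grep the policy body for wildcard grants. A toy C program showing the kind of payload that would trip them once enabled (the policy content is invented for illustration):

#include <stdio.h>
#include <string.h>

int main(void) {
  /* A maximally permissive Flash cross-domain policy. */
  const char* payload =
    "<?xml version=\"1.0\"?>\n"
    "<cross-domain-policy>\n"
    "  <allow-access-from domain=\"*\" />\n"
    "</cross-domain-policy>\n";

  /* Same substring tests as the diff: file type, then wildcard rule. */
  if (strstr(payload, "<cross-domain-policy>") &&
      strstr(payload, "domain=\"*\""))
    printf("Cross-domain policy with wildcard rules\n");

  return 0;
}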
config.h | 4

@@ -27,9 +27,9 @@
 
 #define SHOW_SPLASH 1 /* Annoy user with a splash screen */
 
 /* Define this to enable experimental HTTP proxy support, through the -J
    option in the command line. This mode will not work as expected for
-   HTTPS requests at this point. */
+   HTTPS requests at this time - sorry. */
 
 // #define PROXY_SUPPORT 1
 
crawler.c | 100

@@ -354,7 +354,7 @@ static void secondary_ext_init(struct pivot_desc* pv, struct http_request* req,
 
   i = 0;
 
-  while ((ex = wordlist_get_extension(i))) {
+  while ((ex = wordlist_get_extension(i, 0))) {
     u8* tmp = ck_alloc(strlen((char*)base_name) + strlen((char*)ex) + 2);
     u32 c;
 
@@ -382,6 +382,7 @@ static void secondary_ext_init(struct pivot_desc* pv, struct http_request* req,
     n->par.v[tpar] = tmp;
 
     n->user_val = 1;
+    n->with_ext = 1;
 
     memcpy(&n->same_sig, &res->sig, sizeof(struct http_sig));
 
@@ -1814,6 +1815,7 @@ static void crawl_par_dict_init(struct pivot_desc* pv) {
   struct http_request* n;
   u8 *kw, *ex;
   u32 i, c;
+  u8 specific;
 
   /* Too many requests still pending, or already done? */
 
@@ -1832,7 +1834,7 @@ restart_dict:
   i = 0;
 
   kw = (pv->pdic_guess ? wordlist_get_guess : wordlist_get_word)
-       (pv->pdic_cur_key);
+       (pv->pdic_cur_key, &specific);
 
   if (!kw) {
 
@@ -1878,10 +1880,11 @@ restart_dict:
 
   /* Schedule probes for all extensions for the current word, but
      only if the original parameter contained '.' somewhere,
-     and only if string is not on the try list. */
+     and only if string is not on the try list. Special handling
+     for specific keywords with '.' inside. */
 
-  if (strchr((char*)TPAR(pv->req), '.'))
-    while (!no_fuzz_ext && (ex = wordlist_get_extension(i))) {
+  if (!no_fuzz_ext && strchr((char*)TPAR(pv->req), '.'))
+    while ((ex = wordlist_get_extension(i, specific))) {
 
       u8* tmp = ck_alloc(strlen((char*)kw) + strlen((char*)ex) + 2);
 
@@ -1901,6 +1904,7 @@ restart_dict:
       ck_free(TPAR(n));
       TPAR(n) = tmp;
       n->callback = par_dict_callback;
+      n->with_ext = 1;
       pv->pdic_pending++;
       in_dict_init = 1;
       async_request(n);
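The new `specific` out-parameter threaded through wordlist_get_word() and wordlist_get_extension() above ties into the commit's improved file.ext keyword support. A toy sketch of one plausible reading - this is not the real wordlist code - where keywords that already carry an extension are flagged so generic extension fuzzing can be narrowed for them:

#include <stdio.h>
#include <string.h>

typedef unsigned char u8;

/* Toy wordlist; in skipfish this lives in the dictionary structures. */
static const char* words[] = { "admin", "backup.sql", "login" };

static const char* toy_get_word(unsigned int idx, u8* specific) {
  if (idx >= sizeof(words) / sizeof(words[0])) return NULL;
  /* A keyword containing '.' names a concrete file.ext, rather than a
     bare stem to cross with every extension in the dictionary. */
  *specific = (strchr(words[idx], '.') != NULL);
  return words[idx];
}

int main(void) {
  u8 specific;
  unsigned int i = 0;
  const char* kw;

  while ((kw = toy_get_word(i++, &specific)))
    printf("%-12s specific=%u -> %s\n", kw, (unsigned)specific,
           specific ? "probe as-is / narrowed extension set"
                    : "combine with every extension");
  return 0;
}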
@@ -2333,6 +2337,7 @@ static u8 dir_404_callback(struct http_request* req,
   }
 
   memcpy(&req->pivot->r404[i], &res->sig, sizeof(struct http_sig));
+
   req->pivot->r404_cnt++;
 
   /* Is this a new signature not seen on parent? Notify if so,
 
@@ -2379,7 +2384,7 @@ schedule_next:
 
   /* Aaand schedule all the remaining probes. */
 
-  while ((nk = wordlist_get_extension(cur_ext++))) {
+  while ((nk = wordlist_get_extension(cur_ext++, 0))) {
     u8* tmp = ck_alloc(strlen(BOGUS_FILE) + strlen((char*)nk) + 2);
 
     n = req_copy(RPREQ(req), req->pivot, 1);
 
@@ -2388,6 +2393,7 @@ schedule_next:
     replace_slash(n, tmp);
     ck_free(tmp);
     n->callback = dir_404_callback;
+    n->with_ext = 1;
     n->user_val = 1;
 
     /* r404_pending is at least 1 to begin with, so this is safe
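dir_404_callback() above belongs to skipfish's differential 404 detection: probe known-bogus names (BOGUS_FILE plus each extension) first, record the response signatures, then judge later probes against those signatures rather than trusting status codes. A simplified standalone sketch of the idea - skipfish's real http_sig is far richer than this toy checksum:

#include <stdio.h>
#include <string.h>

struct toy_sig { int code; unsigned long hash; };

static unsigned long toy_hash(const char* body) {
  unsigned long h = 5381;
  while (*body) h = h * 33 + (unsigned char)*body++;
  return h;
}

static struct toy_sig make_sig(int code, const char* body) {
  struct toy_sig s = { code, toy_hash(body) };
  return s;
}

static int same_page(struct toy_sig* a, struct toy_sig* b) {
  return a->code == b->code && a->hash == b->hash;
}

int main(void) {
  /* Probe a known-bogus name first to learn what "not found" looks like
     on a server that answers 200 to everything... */
  struct toy_sig r404 = make_sig(200, "<html>Oops, nothing here</html>");

  /* ...then judge a real probe against that signature, not the code. */
  struct toy_sig probe = make_sig(200, "<html>Secret admin panel</html>");

  printf("probe is %s\n",
         same_page(&probe, &r404) ? "just another 404" : "a real resource");
  return 0;
}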
@@ -2655,6 +2661,7 @@ static void crawl_dir_dict_init(struct pivot_desc* pv) {
   struct http_request* n;
   u8 *kw, *ex;
   u32 i, c;
+  u8 specific;
 
   /* Too many requests still pending, or already moved on to
      parametric tests? */
 
@@ -2682,7 +2689,8 @@ static void crawl_dir_dict_init(struct pivot_desc* pv) {
 
 restart_dict:
 
-  kw = (pv->guess ? wordlist_get_guess : wordlist_get_word)(pv->cur_key);
+  kw = (pv->guess ? wordlist_get_guess : wordlist_get_word)
+       (pv->cur_key, &specific);
 
   if (!kw) {
 
@@ -2739,39 +2747,42 @@ restart_dict:
   }
 
   /* Schedule probes for all extensions for the current word,
-     likewise. */
+     likewise. Make an exception for specific keywords that
+     already contain a period. */
 
   i = 0;
 
-  while (!no_fuzz_ext && (ex = wordlist_get_extension(i))) {
+  if (!no_fuzz_ext)
+    while ((ex = wordlist_get_extension(i, specific))) {
 
     u8* tmp = ck_alloc(strlen((char*)kw) + strlen((char*)ex) + 2);
 
     sprintf((char*)tmp, "%s.%s", kw, ex);
 
     for (c=0;c<pv->child_cnt;c++)
       if (!((is_c_sens(pv) ? strcmp : strcasecmp)((char*)tmp,
           (char*)pv->child[c]->name))) break;
 
     if (pv->fuzz_par != -1 &&
         !((is_c_sens(pv) ? strcmp : strcasecmp)((char*)tmp,
         (char*)pv->req->par.v[pv->fuzz_par]))) c = pv->child_cnt;
 
     if (c == pv->child_cnt) {
       n = req_copy(pv->req, pv, 1);
       replace_slash(n, tmp);
       n->callback = dir_dict_callback;
+      n->with_ext = 1;
       pv->pending++;
       in_dict_init = 1;
       async_request(n);
       in_dict_init = 0;
     }
 
     ck_free(tmp);
 
     i++;
   }
 
   pv->cur_key++;
@ -2917,6 +2928,7 @@ u8 fetch_unknown_callback(struct http_request* req, struct http_response* res) {
|
||||||
n = req_copy(req, req->pivot, 1);
|
n = req_copy(req, req->pivot, 1);
|
||||||
set_value(PARAM_PATH, NULL, (u8*)"", -1, &n->par);
|
set_value(PARAM_PATH, NULL, (u8*)"", -1, &n->par);
|
||||||
n->callback = unknown_check_callback;
|
n->callback = unknown_check_callback;
|
||||||
|
n->with_ext = req->with_ext;
|
||||||
async_request(n);
|
async_request(n);
|
||||||
|
|
||||||
/* This is the initial callback, keep the response. */
|
/* This is the initial callback, keep the response. */
|
||||||
|
@ -2974,13 +2986,34 @@ static u8 unknown_check_callback(struct http_request* req,
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
if (par)
|
if (par) {
|
||||||
for (i=0;i<par->r404_cnt;i++)
|
for (i=0;i<par->r404_cnt;i++)
|
||||||
if (same_page(&res->sig, &par->r404[i])) break;
|
if (same_page(&res->sig, &par->r404[i])) break;
|
||||||
|
|
||||||
|
/* Do not use extension-originating signatures for settling non-extension
|
||||||
|
cases. */
|
||||||
|
|
||||||
|
if (i && !req->with_ext) i = par->r404_cnt;
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
if ((!par && res->code == 404) || (par && i != par->r404_cnt) ||
|
if ((!par && res->code == 404) || (par && i != par->r404_cnt) ||
|
||||||
(RPRES(req)->code < 300 && res->code >= 300 && RPRES(req)->pay_len)) {
|
(RPRES(req)->code < 300 && res->code >= 300 && RPRES(req)->pay_len)) {
|
||||||
|
|
||||||
|
DEBUG("REASON X\n");
|
||||||
|
if (par) DEBUG("same_404 = %d\n", i != par->r404_cnt);
|
||||||
|
DEBUG("par = %p\n", par);
|
||||||
|
if (par) DEBUG("par->r404_cnt = %d\n", par->r404_cnt);
|
||||||
|
DEBUG("res->code = %d\n", res->code);
|
||||||
|
DEBUG("parent code = %d\n", RPRES(req)->code);
|
||||||
|
DEBUG("parent len = %d\n", RPRES(req)->pay_len);
|
||||||
|
|
||||||
|
// (!par && res->code == 404) || - NIE
|
||||||
|
// (par && i != par->r404_cnt) || - TAK
|
||||||
|
// (RPRES(req)->code < 300 && res->code >= 300 && RPRES(req)->pay_len))
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
req->pivot->type = PIVOT_FILE;
|
req->pivot->type = PIVOT_FILE;
|
||||||
|
|
||||||
} else {
|
} else {
|
||||||
|
@ -2999,6 +3032,11 @@ assume_dir:
|
||||||
|
|
||||||
req->pivot->type = PIVOT_DIR;
|
req->pivot->type = PIVOT_DIR;
|
||||||
|
|
||||||
|
/* Perform content checks before discarding the old payload. */
|
||||||
|
|
||||||
|
if (!same_page(&RPRES(req)->sig, &res->sig))
|
||||||
|
content_checks(RPREQ(req), RPRES(req));
|
||||||
|
|
||||||
/* Replace original request, response with new data. */
|
/* Replace original request, response with new data. */
|
||||||
|
|
||||||
destroy_request(RPREQ(req));
|
destroy_request(RPREQ(req));
|
||||||
|
|
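A note on the new with_ext flag threaded through the hunks above: extension-based probes now tag their requests, and unknown_check_callback() refuses to settle a no-extension request against a 404 signature that came from an extension probe. A minimal standalone sketch of that filtering rule - not skipfish code, and assuming (as the `if (i && !req->with_ext)` check suggests) that r404[0] holds the signature of the bare, no-extension probe:

  #include <stdio.h>

  /* r404[0]: bare-probe signature; r404[1..cnt-1]: extension-probe ones. */
  static int counts_as_404(int match_idx, int r404_cnt, int with_ext) {

    if (match_idx == r404_cnt) return 0;       /* nothing matched at all   */

    if (match_idx > 0 && !with_ext) return 0;  /* extension-derived sig
                                                  must not settle a bare
                                                  (no-extension) case      */
    return 1;
  }

  int main() {
    printf("%d\n", counts_as_404(1, 3, 0));    /* 0 */
    printf("%d\n", counts_as_404(1, 3, 1));    /* 1 */
    printf("%d\n", counts_as_404(0, 3, 0));    /* 1 */
    return 0;
  }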
135  database.c

@@ -57,11 +57,17 @@ u32 max_depth = MAX_DEPTH,

 u8  dont_add_words;                  /* No auto dictionary building */

+#define KW_SPECIFIC 0
+#define KW_GENERIC  1
+#define KW_GEN_AUTO 2
+
 struct kw_entry {
   u8* word;                          /* Keyword itself               */
   u32 hit_cnt;                       /* Number of confirmed sightings */
   u8  is_ext;                        /* Is an extension?             */
   u8  hit_already;                   /* Had its hit count bumped up? */
+  u8  read_only;                     /* Read-only dictionary?        */
+  u8  class;                         /* KW_*                         */
   u32 total_age;                     /* Total age (in scan cycles)   */
   u32 last_age;                      /* Age since last hit           */
 };

@@ -71,11 +77,19 @@ static struct kw_entry*

 static u32 keyword_cnt[WORD_HASH];   /* Per-bucket keyword counts */

-static u8 **extension,               /* Extension list            */
-          **guess;                   /* Keyword candidate list    */
+struct ext_entry {
+  u32 bucket;
+  u32 index;
+};
+
+static struct ext_entry *extension,  /* Extension list            */
+                        *sp_extension;
+
+static u8 **guess;                   /* Keyword candidate list    */

 u32 guess_cnt,                       /* Number of keyword candidates  */
     extension_cnt,                   /* Number of extensions          */
+    sp_extension_cnt,                /* Number of specific extensions */
     keyword_total_cnt,               /* Current keyword count         */
     keyword_orig_cnt;                /* At-boot keyword count         */

@@ -818,7 +832,7 @@ static inline u32 hash_word(u8* str) {


 /* Adds a new keyword candidate to the global "guess" list. This
-   list is always case-insensitive. */
+   list is case-sensitive. */

 void wordlist_add_guess(u8* text) {
   u32 target, i, kh;

@@ -830,7 +844,7 @@ void wordlist_add_guess(u8* text) {
   if (!text || !text[0] || strlen((char*)text) > MAX_WORD) return;

   for (i=0;i<guess_cnt;i++)
-    if (!strcasecmp((char*)text, (char*)guess[i])) return;
+    if (!strcmp((char*)text, (char*)guess[i])) return;

   kh = hash_word(text);

@@ -853,10 +867,10 @@ void wordlist_add_guess(u8* text) {


 /* Adds a single, sanitized keyword to the list, or increases its hit count.
-   Keyword list is case-insensitive - first capitalization wins. */
+   Keyword list is case-sensitive. */

-static void wordlist_confirm_single(u8* text, u8 is_ext, u32 add_hits,
-                                    u32 total_age, u32 last_age) {
+static void wordlist_confirm_single(u8* text, u8 is_ext, u8 class, u8 read_only,
+                                    u32 add_hits, u32 total_age, u32 last_age) {
   u32 kh, i;

   if (!text || !text[0] || strlen((char*)text) > MAX_WORD) return;

@@ -866,7 +880,7 @@ static void wordlist_confirm_single(u8* text, u8 is_ext, u32 add_hits,
   kh = hash_word(text);

   for (i=0;i<keyword_cnt[kh];i++)
-    if (!strcasecmp((char*)text, (char*)keyword[kh][i].word)) {
+    if (!strcmp((char*)text, (char*)keyword[kh][i].word)) {

       /* Known! Increase hit count, and if this is now
          tagged as an extension, add to extension list. */

@@ -875,13 +889,19 @@ static void wordlist_confirm_single(u8* text, u8 is_ext, u32 add_hits,
        keyword[kh][i].hit_cnt += add_hits;
        keyword[kh][i].hit_already = 1;
        keyword[kh][i].last_age = 0;

+       if (!keyword[kh][i].read_only && read_only)
+         keyword[kh][i].read_only = 1;
+
      }

      if (!keyword[kh][i].is_ext && is_ext) {
        keyword[kh][i].is_ext = 1;

-        extension = ck_realloc(extension, (extension_cnt + 1) * sizeof(u8*));
-        extension[extension_cnt++] = keyword[kh][i].word;
+        extension = ck_realloc(extension, (extension_cnt + 1) *
+                               sizeof(struct ext_entry));
+        extension[extension_cnt].bucket  = kh;
+        extension[extension_cnt++].index = i;
      }

      return;

@@ -896,6 +916,8 @@ static void wordlist_confirm_single(u8* text, u8 is_ext, u32 add_hits,

   keyword[kh][i].word      = ck_strdup(text);
   keyword[kh][i].is_ext    = is_ext;
+  keyword[kh][i].class     = class;
+  keyword[kh][i].read_only = read_only;
   keyword[kh][i].hit_cnt   = add_hits;
   keyword[kh][i].total_age = total_age;
   keyword[kh][i].last_age  = last_age;

@@ -906,8 +928,21 @@ static void wordlist_confirm_single(u8* text, u8 is_ext, u32 add_hits,
   if (!total_age) keyword[kh][i].hit_already = 1;

   if (is_ext) {
-    extension = ck_realloc(extension, (extension_cnt + 1) * sizeof(u8*));
-    extension[extension_cnt++] = keyword[kh][i].word;
+
+    extension = ck_realloc(extension, (extension_cnt + 1) *
+                           sizeof(struct ext_entry));
+    extension[extension_cnt].bucket  = kh;
+    extension[extension_cnt++].index = i;
+
+    if (class == KW_SPECIFIC) {
+
+      sp_extension = ck_realloc(sp_extension, (sp_extension_cnt + 1) *
+                                sizeof(struct ext_entry));
+      sp_extension[sp_extension_cnt].bucket  = kh;
+      sp_extension[sp_extension_cnt++].index = i;
+
+    }
+
   }

 }

@@ -946,6 +981,18 @@ void wordlist_confirm_word(u8* text) {
     }
   }

+  /* If the format is foo.bar, check if the entire string is a known keyword.
+     If yes, don't try to look up and add individual components. */
+
+  if (ppos != -1) {
+
+    u32 kh = hash_word(text);
+
+    for (i=0;i<keyword_cnt[kh];i++)
+      if (!strcasecmp((char*)text, (char*)keyword[kh][i].word)) return;
+
+  }
+
   /* Too many dots? Tokenize class paths and domains as individual keywords,
      still. */

@@ -972,22 +1019,22 @@ void wordlist_confirm_word(u8* text) {
   if (tlen == 1 || tlen - ppos > 12) return;

   if (ppos && ppos != tlen - 1 && !isdigit(text[ppos] + 1)) {
-    wordlist_confirm_single(text + ppos + 1, 1, 1, 0, 0);
+    wordlist_confirm_single(text + ppos + 1, 1, KW_GEN_AUTO, 0, 1, 0, 0);
     text[ppos] = 0;
-    wordlist_confirm_single(text, 0, 1, 0, 0);
+    wordlist_confirm_single(text, 0, KW_GEN_AUTO, 0, 1, 0, 0);
     text[ppos] = '.';
     return;
   }

  }

-  wordlist_confirm_single(text, 0, 1, 0, 0);
+  wordlist_confirm_single(text, 0, KW_GEN_AUTO, 0, 1, 0, 0);
 }


 /* Returns wordlist item at a specified offset (NULL if no more available). */

-u8* wordlist_get_word(u32 offset) {
+u8* wordlist_get_word(u32 offset, u8* specific) {
   u32 cur_off = 0, kh;

   for (kh=0;kh<WORD_HASH;kh++) {

@@ -997,32 +1044,42 @@ u8* wordlist_get_word(u32 offset) {

   if (kh == WORD_HASH) return NULL;

+  *specific = (keyword[kh][offset - cur_off].is_ext == 0 &&
+               keyword[kh][offset - cur_off].class == KW_SPECIFIC);
+
   return keyword[kh][offset - cur_off].word;
 }


 /* Returns keyword candidate at a specified offset (or NULL). */

-u8* wordlist_get_guess(u32 offset) {
+u8* wordlist_get_guess(u32 offset, u8* specific) {
   if (offset >= guess_cnt) return NULL;
+  *specific = 0;
   return guess[offset];
 }


 /* Returns extension at a specified offset (or NULL). */

-u8* wordlist_get_extension(u32 offset) {
-  if (offset >= extension_cnt) return NULL;
-  return extension[offset];
+u8* wordlist_get_extension(u32 offset, u8 specific) {
+
+  if (!specific) {
+    if (offset >= extension_cnt) return NULL;
+    return keyword[extension[offset].bucket][extension[offset].index].word;
+  }
+
+  if (offset >= sp_extension_cnt) return NULL;
+  return keyword[sp_extension[offset].bucket][sp_extension[offset].index].word;
 }


 /* Loads keywords from file. */

-void load_keywords(u8* fname, u32 purge_age) {
+void load_keywords(u8* fname, u8 read_only, u32 purge_age) {
   FILE* in;
   u32 hits, total_age, last_age, lines = 0;
-  u8 type;
+  u8 type[3];
   s32 fields;
   u8 kword[MAX_WORD + 1];
   char fmt[32];

@@ -1036,19 +1093,28 @@ void load_keywords(u8* fname, u32 purge_age) {
     return;
   }

-  sprintf(fmt, "%%c %%u %%u %%u %%%u[^\x01-\x1f]", MAX_WORD);
+  sprintf(fmt, "%%2s %%u %%u %%u %%%u[^\x01-\x1f]", MAX_WORD);

-  while ((fields = fscanf(in, fmt, &type, &hits, &total_age, &last_age, kword))
+  while ((fields = fscanf(in, fmt, type, &hits, &total_age, &last_age, kword))
         == 5) {

+    u8 class = KW_GEN_AUTO;
+
+    if (type[0] != 'e' && type[0] != 'w')
+      FATAL("Wordlist '%s': bad keyword type in line %u.\n", fname, lines + 1);
+
+    if (type[1] == 's') class = KW_SPECIFIC; else
+      if (type[1] == 'g') class = KW_GENERIC;
+
     if (!purge_age || last_age < purge_age)
-      wordlist_confirm_single(kword, (type == 'e'), hits,
+      wordlist_confirm_single(kword, (type[0] == 'e'), class, read_only, hits,
                               total_age + 1, last_age + 1);
     lines++;
     fgetc(in); /* sink \n */
   }

   if (fields != -1 && fields != 5)
-    FATAL("Wordlist '%s': syntax error in line %u.\n", fname, lines + 1);
+    FATAL("Wordlist '%s': syntax error in line %u.\n", fname, lines);

   if (!lines)
     WARN("Wordlist '%s' contained no valid entries.", fname);

@@ -1110,11 +1176,21 @@ void save_keywords(u8* fname) {
   }

   for (kh=0;kh<WORD_HASH;kh++)
-    for (i=0;i<keyword_cnt[kh];i++)
-      fprintf(out,"%c %u %u %u %s\n", keyword[kh][i].is_ext ? 'e' : 'w',
+    for (i=0;i<keyword_cnt[kh];i++) {
+      u8 class = '?';
+
+      if (keyword[kh][i].read_only) continue;
+
+      if (keyword[kh][i].class == KW_SPECIFIC) class = 's'; else
+        if (keyword[kh][i].class == KW_GENERIC) class = 'g';
+
+      fprintf(out,"%c%c %u %u %u %s\n", keyword[kh][i].is_ext ? 'e' : 'w',
+              class,
               keyword[kh][i].hit_cnt, keyword[kh][i].total_age,
               keyword[kh][i].last_age, keyword[kh][i].word);

+    }
+
   SAY(cLGN "[+] " cNOR "Wordlist '%s' updated (%u new words added).\n",
       fname, keyword_total_cnt - keyword_orig_cnt);

@@ -1409,8 +1485,9 @@ void destroy_database() {
     ck_free(keyword[kh]);
   }

-  /* Extensions just referenced keyword[][].word entries. */
+  /* Extensions just referenced keyword[][] entries. */
   ck_free(extension);
+  ck_free(sp_extension);

   for (i=0;i<guess_cnt;i++) ck_free(guess[i]);
   ck_free(guess);
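With the loader above accepting a two-character type tag, a line such as 'ws 4 10 0 cgi-bin' parses into kind 'w' plus class 's' (KW_SPECIFIC), while '?' and any other second character fall through to KW_GEN_AUTO. A standalone sketch of the parsing step (simplified from load_keywords(); the sample line is invented):

  #include <stdio.h>

  int main() {
    char type[3];                    /* two tag characters plus NUL         */
    unsigned hits, total_age, last_age;
    char kword[65];

    const char* line = "ws 4 10 0 cgi-bin";

    if (sscanf(line, "%2s %u %u %u %64[^\x01-\x1f]",
               type, &hits, &total_age, &last_age, kword) == 5)
      printf("kind=%c class=%c hits=%u word=%s\n",
             type[0], type[1], hits, kword);

    return 0;                        /* kind=w class=s hits=4 word=cgi-bin  */
  }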

database.h

@@ -375,19 +375,19 @@ void wordlist_confirm_word(u8* text);

 /* Returns wordlist item at a specified offset (NULL if no more available). */

-u8* wordlist_get_word(u32 offset);
+u8* wordlist_get_word(u32 offset, u8* specific);

 /* Returns keyword candidate at a specified offset (or NULL). */

-u8* wordlist_get_guess(u32 offset);
+u8* wordlist_get_guess(u32 offset, u8* specific);

 /* Returns extension at a specified offset (or NULL). */

-u8* wordlist_get_extension(u32 offset);
+u8* wordlist_get_extension(u32 offset, u8 specific);

 /* Loads keywords from file. */

-void load_keywords(u8* fname, u32 purge_age);
+void load_keywords(u8* fname, u8 read_only, u32 purge_age);

 /* Saves all keywords to a file. */

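The new 'specific' parameter visible in these prototypes selects between the two extension lists built in database.c. Since entries there are (bucket, index) references into the keyword hash table rather than separate string copies, hit counts and flags live in exactly one place, and lookup becomes a double index. A compact, self-contained illustration of the indirection (simplified types and invented data, not the actual tables):

  #include <stdio.h>

  #define WORD_HASH 2

  struct kw_entry  { const char* word; };
  struct ext_entry { unsigned bucket; unsigned index; };

  static struct kw_entry keyword[WORD_HASH][2] = {
    { { "php" }, { "html" } },
    { { "cgi" }, { 0 } }
  };

  /* Each extension names a slot in keyword[][] instead of copying a string. */
  static struct ext_entry extension[3] = { { 0, 0 }, { 0, 1 }, { 1, 0 } };

  int main() {
    unsigned i;
    for (i = 0; i < 3; i++)
      printf("%s\n", keyword[extension[i].bucket][extension[i].index].word);
    return 0;  /* prints: php, html, cgi */
  }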

dictionaries/README-FIRST

@@ -3,11 +3,75 @@ This directory contains four alternative, hand-picked Skipfish dictionaries.
 PLEASE READ THIS FILE CAREFULLY BEFORE PICKING ONE. This is *critical* to
 getting good results in your work.

-----------------
-Dictionary modes
-----------------
+------------------------
+Key command-line options
+------------------------

-The basic modes you should be aware of (in order of request cost):
+The dictionary to be used by the tool can be specified with the -W option,
+and must conform to the format outlined at the end of this document. If you
+omit -W in the command line, 'skipfish.wl' is assumed. This file does not
+exist by default. That part is by design: THE SCANNER WILL MODIFY THE
+SUPPLIED FILE UNLESS SPECIFICALLY INSTRUCTED NOT TO.
+
+That's because the scanner automatically learns new keywords and extensions
+based on any links discovered during the scan, and on random sampling of
+site contents. The information is consequently stored in the dictionary
+for future reuse, along with other bookkeeping information useful for
+determining which keywords perform well, and which ones don't.
+
+All this means that it is very important to maintain a separate dictionary
+for every separate set of unrelated target sites. Otherwise, undesirable
+interference will occur.
+
+With this out of the way, let's quickly review the options that may be used
+to fine-tune various aspects of dictionary handling:
+
+  -L      - do not automatically learn new keywords based on site content.
+
+            This option should not normally be used in most scanning
+            modes; if supplied, the scanner will not be able to discover
+            and leverage technology-specific terms and file extensions
+            unique to the architecture of the targeted site.
+
+  -G num  - change the jar size for keyword candidates.
+
+            Up to <num> candidates are randomly selected from site
+            content, and periodically retried during brute-force checks;
+            when one of them results in a unique non-404 response, it is
+            promoted to the dictionary proper. Unsuccessful candidates are
+            gradually replaced with new picks, and then discarded at the
+            end of the scan. The default jar size is 256.
+
+  -V      - prevent the scanner from updating the dictionary file.
+
+            Normally, the primary read-write dictionary specified with the
+            -W option is updated at the end of the scan to add any newly
+            discovered keywords, and to update keyword usage stats. Using
+            this option eliminates this step.
+
+  -R num  - purge all dictionary entries that had no non-404 hits for
+            the last <num> scans.
+
+            This option prevents dictionary creep in repeated assessments,
+            but needs to be used with care: it will permanently nuke a
+            part of the dictionary!
+
+  -Y      - inhibit full ${filename}.${extension} brute-force.
+
+            In this mode, the scanner will only brute-force one component
+            at a time, trying all possible keywords without any extension,
+            and then trying to append extensions to any otherwise discovered
+            content.
+
+            This greatly improves scan times, but reduces coverage. Scan modes
+            2 and 3 shown in the next section make use of this flag.
+
+--------------
+Scanning modes
+--------------
+
+The basic dictionary-dependent modes you should be aware of (in order of the
+associated request cost):

 1) Orderly crawl with no DirBuster-like brute-force at all. In this mode, the
    scanner will not discover non-linked resources such as /admin,

@@ -20,24 +84,25 @@ The basic modes you should be aware of (in order of request cost):

 2) Orderly scan with minimal extension brute-force. In this mode, the scanner
    will not discover resources such as /admin, but will discover cases such as
-   /index.php.old:
+   /index.php.old (once index.php itself is spotted during an orderly crawl):

    cp dictionaries/extensions-only.wl dictionary.wl
    ./skipfish -W dictionary.wl -Y [...other options...]

    This method is only slightly more request-intensive than #1, and therefore,
-   generally recommended in cases where time is of essence. The cost is about
-   100 requests per fuzzed location.
+   is a marginally better alternative in cases where time is of essence. It's
+   still not recommended for most uses. The cost is about 100 requests per
+   fuzzed location.

 3) Directory OR extension brute-force only. In this mode, the scanner will only
    try fuzzing the file name, or the extension, at any given time - but will
    not try every possible ${filename}.${extension} pair from the dictionary.

    cp dictionaries/complete.wl dictionary.wl
    ./skipfish -W dictionary.wl -Y [...other options...]

    This method has a cost of about 2,000 requests per fuzzed location, and is
    recommended for rapid assessments, especially when working with slow
    servers or very large services.

 4) Normal dictionary fuzzing. In this mode, every ${filename}.${extension}

@@ -61,41 +126,29 @@ The basic modes you should be aware of (in order of request cost):
    reasonably responsive servers; but it may be prohibitively expensive
    when dealing with very large or very slow sites.

-As should be obvious, the -W option points to a dictionary to be used; the
-scanner updates the file based on scan results, so please always make a
-target-specific copy - do not use the master file directly, or it may be
-polluted with keywords not relevant to other targets.
-
-Additional options supported by the aforementioned modes:
-
-  -L      - do not automatically learn new keywords based on site content.
-            This option should not be normally used in most scanning
-            modes; *not* using it significantly improves the coverage of
-            minimal.wl.
-
-  -G num  - specifies jar size for keyword candidates selected from the
-            content; up to <num> candidates are kept and tried during
-            brute-force checks; when one of them results in a unique
-            non-404 response, it is promoted to the dictionary proper.
-
-  -V      - prevents the scanner from updating the dictionary file with
-            newly discovered keywords and keyword usage stats (i.e., all
-            new findings are discarded on exit).
-
-  -Y      - inhibits full ${filename}.${extension} brute-force: the scanner
-            will only brute-force one component at a time. This greatly
-            improves scan times, but reduces coverage. Modes 2 and 3
-            shown above make use of this flag.
-
-  -R num  - purges all dictionary entries that had no non-404 hits for
-            the last <num> scans. Prevents dictionary creep in repeated
-            assessments, but use with care!
-
------------------------------
-More about dictionary design:
------------------------------
+----------------------------------
+Using separate master dictionaries
+----------------------------------
+
+A recently introduced feature allows you to load any number of read-only
+supplementary dictionaries in addition to the "main" read-write one (-W
+dictionary.wl).
+
+This is a convenient way to isolate (and be able to continually update) your
+customized top-level wordlist, whilst still acquiring site-specific data in
+a separate file. The following syntax may be used to accomplish this:
+
+  ./skipfish -W initially_empty_site_specific_dict.wl -W +supplementary_dict1.wl \
+    -W +supplementary_dict2.wl [...other options...]
+
+Only the main dictionary will be modified as a result of the scan, and only
+newly discovered site-specific keywords will be appended there.
+
+----------------------------
+More about dictionary design
+----------------------------

 Each dictionary may consist of a number of extensions, and a number of
 "regular" keywords. Extensions are considered just a special subset of the
 keyword list.

@@ -103,29 +156,74 @@ You can create custom dictionaries, conforming to this format:

 type hits total_age last_age keyword

-...where 'type' is either 'e' or 'w' (extension or wordlist); 'hits' is the
-total number of times this keyword resulted in a non-404 hit in all previous
-scans; 'total_age' is the number of scan cycles this word is in the dictionary;
-'last_age' is the number of scan cycles since the last 'hit'; and 'keyword' is
-the actual keyword.
-
-Do not duplicate extensions as keywords - if you already have 'html' as an 'e'
-entry, there is no need to also create a 'w' one.
-
-There must be no empty or malformed lines, or comments, in the wordlist file.
-Extension keywords must have no leading dot (e.g., 'exe', not '.exe'), and all
-keywords should be NOT url-encoded (e.g., 'Program Files', not
-'Program%20Files'). No keyword should exceed 64 characters.
-
-If you omit -W in the command line, 'skipfish.wl' is assumed. This file does
-not exist by default; this is by design.
-
-The scanner will automatically learn new keywords and extensions based on any
-links discovered during the scan; and will also analyze pages and extract
-words to use as keyword candidates.
-
-Tread carefully; poor wordlists are one of the reasons why some web security
-scanners perform worse than expected. You will almost always be better off
-narrowing down or selectively extending the supplied set (and possibly
-contributing back your changes upstream!), than importing a giant wordlist
-scored elsewhere.
+...where 'type' is either 'e' or 'w' (extension or keyword), followed by a
+qualifier (explained below); 'hits' is the total number of times this keyword
+resulted in a non-404 hit in all previous scans; 'total_age' is the number of
+scan cycles this word is in the dictionary; 'last_age' is the number of scan
+cycles since the last 'hit'; and 'keyword' is the actual keyword.
+
+Qualifiers alter the meaning of an entry in the following way:
+
+  wg - a generic keyword that is not associated with any specific server-side
+       technology. Examples include 'backup', 'accounting', or 'logs'. These
+       will be indiscriminately combined with every known extension (e.g.,
+       'backup.php') during the fuzzing process.
+
+  ws - a technology-specific keyword that is unlikely to have a random
+       extension; for example, with 'cgi-bin', testing for 'cgi-bin.php' is
+       usually a waste of time. Keywords tagged this way will be combined only
+       with a small set of technology-agnostic extensions - e.g., 'cgi-bin.old'.
+
+       NOTE: Technology-specific keywords that, in the real world, are always
+       paired with a single, specific extension, should be combined with said
+       extension in the 'ws' entry itself, rather than trying to accommodate
+       them with 'wg' rules. For example, 'MANIFEST.MF' is OK.
+
+  eg - a generic extension that is not specific to any well-defined technology,
+       or may pop up in administrator- or developer-created auxiliary content.
+       Examples include 'bak', 'old', 'txt', or 'log'.
+
+  es - a technology-specific extension, such as 'php' or 'cgi', that is
+       unlikely to spontaneously accompany random 'ws' keywords.
+
+Skipfish leverages this distinction by only trying the following brute-force
+combinations:
+
+  /some/path/wg_keyword              ('index')
+  /some/path/ws_keyword              ('cgi-bin')
+  /some/path/eg_extension            ('old')
+  /some/path/es_extension            ('php')
+
+  /some/path/wg_keyword.eg_extension ('index.old')
+  /some/path/wg_keyword.es_extension ('index.php')
+
+  /some/path/ws_keyword.eg_extension ('cgi-bin.old')
+
+To decide between 'wg' and 'ws', consider if you are likely to ever encounter
+files such as ${this_word}.php or ${this_word}.class. If not, tag the keyword
+as 'ws'.
+
+Similarly, to decide between 'eg' and 'es', think about the possibility of
+encountering cgi-bin.${this_ext} or formmail.${this_ext}. If it seems unlikely,
+choose 'es'.
+
+For your convenience, all legacy keywords and extensions, as well as any entries
+detected automatically, will be stored in the dictionary with a '?' qualifier.
+This is equivalent to 'g', and is meant to assist the user in reviewing and
+triaging any automatically acquired dictionary data.
+
+Other notes about dictionaries:
+
+- Do not duplicate extensions as keywords - if you already have 'html' as an
+  'e' entry, there is no need to also create a 'w' one.
+
+- There must be no empty or malformed lines, or comments, in the wordlist
+  file. Extension keywords must have no leading dot (e.g., 'exe', not '.exe'),
+  and all keywords should be NOT url-encoded (e.g., 'Program Files', not
+  'Program%20Files'). No keyword should exceed 64 characters.
+
+- Tread carefully; poor wordlists are one of the reasons why some web security
+  scanners perform worse than expected. You will almost always be better off
+  narrowing down or selectively extending the supplied set (and possibly
+  contributing back your changes upstream!), than importing a giant wordlist
+  scored elsewhere.

@@ -16,6 +16,7 @@ e 1 1 1 class
 e 1 1 1 cnf
 e 1 1 1 conf
 e 1 1 1 config
+e 1 1 1 core
 e 1 1 1 cpp
 e 1 1 1 cs
 e 1 1 1 csproj

@@ -587,7 +588,6 @@ w 1 1 1 cookies
 w 1 1 1 copies
 w 1 1 1 copy
 w 1 1 1 copyright
-w 1 1 1 core
 w 1 1 1 corp
 w 1 1 1 corpo
 w 1 1 1 corporate

@@ -16,6 +16,7 @@ e 1 1 1 class
 e 1 1 1 cnf
 e 1 1 1 conf
 e 1 1 1 config
+e 1 1 1 core
 e 1 1 1 cpp
 e 1 1 1 cs
 e 1 1 1 csproj

@@ -11,6 +11,7 @@ e 1 1 1 class
 e 1 1 1 cnf
 e 1 1 1 conf
 e 1 1 1 config
+e 1 1 1 core
 e 1 1 1 cpp
 e 1 1 1 csproj
 e 1 1 1 csv

@@ -556,7 +557,6 @@ w 1 1 1 cookies
 w 1 1 1 copies
 w 1 1 1 copy
 w 1 1 1 copyright
-w 1 1 1 core
 w 1 1 1 corp
 w 1 1 1 corpo
 w 1 1 1 corporate

http_client.c

@@ -565,6 +565,22 @@ void tokenize_path(u8* str, struct http_request* req, u8 add_slash) {
       value = url_decode_token(cur + !first_el, next_seg - !first_el, 0);
     }

+    /* If the extracted segment is just '.' or '..', but is followed by
+       something else than '/', skip one separator. */
+
+    if (!name && cur[next_seg] && cur[next_seg] != '/' &&
+        (!strcmp((char*)value, ".") || !strcmp((char*)value, ".."))) {
+
+      next_seg = strcspn((char*)cur + next_seg + 1, "/;,!$?#") + next_seg + 1;
+
+      ck_free(name);
+      ck_free(value);
+
+      value = url_decode_token(cur + !first_el, next_seg - !first_el, 0);
+
+    }
+
     switch (first_el ? '/' : *cur) {

       case ';': set_value(PARAM_PATH_S, name, value, -1, &req->par); break;
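The hunk above is the /.$foo/ fix from the ChangeLog: the first strcspn() stops at the '$' separator, leaving a bare '.' segment, so the code extends the span past one more separator and re-decodes the whole element as a single token. A standalone demo of the span extension (the sample path is hypothetical, and the '..' case is elided):

  #include <stdio.h>
  #include <string.h>

  int main() {
    const char* cur = ".$foo/bar";
    size_t next_seg = strcspn(cur, "/;,!$?#");   /* stops at '$', so len 1 */

    /* Segment is "." and the next char is not '/': skip one separator. */
    if (cur[next_seg] && cur[next_seg] != '/' &&
        next_seg == 1 && cur[0] == '.')
      next_seg = strcspn(cur + next_seg + 1, "/;,!$?#") + next_seg + 1;

    printf("%.*s\n", (int)next_seg, cur);        /* prints ".$foo"         */
    return 0;
  }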

http_client.h

@@ -97,12 +97,12 @@ struct http_request {
   u16  port;                         /* Port number to connect to */

   u8*  orig_url;                     /* Copy of the original URL  */

   struct param_array par;            /* Parameters, headers, cookies */

   struct pivot_desc *pivot;          /* Pivot descriptor          */

   u32  user_val;                     /* Can be used freely        */
+  u8   with_ext;                     /* Extension-based probe?    */

   u8 (*callback)(struct http_request*, struct http_response*);
                                      /* Callback to invoke when done */
31  report.c

@@ -303,7 +303,7 @@ static void compute_counts(struct pivot_desc* pv) {
 /* Helper to JS-escape data. Static buffer, will be destroyed on
    subsequent calls. */

-static inline u8* js_escape(u8* str) {
+static inline u8* js_escape(u8* str, u8 sp) {
   u32 len;
   static u8* ret;
   u8* opos;

@@ -316,7 +316,7 @@ static inline u8* js_escape(u8* str) {
   opos = ret = __DFL_ck_alloc(len * 4 + 1);

   while (len--) {
-    if (*str > 0x1f && *str < 0x80 && !strchr("<>\\'\"", *str)) {
+    if (*str > (sp ? 0x20 : 0x1f) && *str < 0x80 && !strchr("<>\\'\"", *str)) {
       *(opos++) = *(str++);
     } else {
       sprintf((char*)opos, "\\x%02x", *(str++));

@@ -343,7 +343,7 @@ static void output_scan_info(u64 scan_time, u32 seed) {
   if (!f) PFATAL("Cannot open 'summary.js'");

   fprintf(f, "var sf_version = '%s';\n", VERSION);
-  fprintf(f, "var scan_date = '%s';\n", js_escape(ct));
+  fprintf(f, "var scan_date = '%s';\n", js_escape(ct, 0));
   fprintf(f, "var scan_seed = '0x%08x';\n", seed);
   fprintf(f, "var scan_ms = %llu;\n", (long long)scan_time);

@@ -370,12 +370,12 @@ static void describe_res(FILE* f, struct http_response* res) {
     case STATE_OK:
       fprintf(f, "'fetched': true, 'code': %u, 'len': %u, 'decl_mime': '%s', ",
               res->code, res->pay_len,
-              js_escape(res->header_mime));
+              js_escape(res->header_mime, 0));

       fprintf(f, "'sniff_mime': '%s', 'cset': '%s'",
               res->sniffed_mime ? res->sniffed_mime : (u8*)"[none]",
               js_escape(res->header_charset ? res->header_charset
-                        : res->meta_charset));
+                        : res->meta_charset, 0));
       break;

     case STATE_DNSERR:

@@ -514,18 +514,18 @@ static void output_crawl_tree(struct pivot_desc* pv) {

     fprintf(f, "  { 'dupe': %s, 'type': %u, 'name': '%s%s",
             pv->child[i]->dupe ? "true" : "false",
-            pv->child[i]->type, js_escape(pv->child[i]->name),
+            pv->child[i]->type, js_escape(pv->child[i]->name, 0),
             (pv->child[i]->fuzz_par == -1 || pv->child[i]->type == PIVOT_VALUE)
             ? (u8*)"" : (u8*)"=");

     fprintf(f, "%s', 'dir': '%s', 'linked': %d, ",
             (pv->child[i]->fuzz_par == -1 || pv->child[i]->type == PIVOT_VALUE)
             ? (u8*)"" :
-            js_escape(pv->child[i]->req->par.v[pv->child[i]->fuzz_par]),
+            js_escape(pv->child[i]->req->par.v[pv->child[i]->fuzz_par], 0),
             tmp, pv->child[i]->linked);

     p = serialize_path(pv->child[i]->req, 1, 1);
-    fprintf(f, "'url': '%s', ", js_escape(p));
+    fprintf(f, "'url': '%s', ", js_escape(p, 0));
     ck_free(p);

     describe_res(f, pv->child[i]->res);

@@ -557,7 +557,7 @@ static void output_crawl_tree(struct pivot_desc* pv) {

     fprintf(f, "  { 'severity': %u, 'type': %u, 'extra': '%s', ",
             PSEV(pv->issue[i].type) - 1, pv->issue[i].type,
-            pv->issue[i].extra ? js_escape(pv->issue[i].extra) : (u8*)"");
+            pv->issue[i].extra ? js_escape(pv->issue[i].extra, 0) : (u8*)"");

     describe_res(f, pv->issue[i].res);

@@ -658,7 +658,7 @@ static void output_summary_views() {
       save_req_res(m_samp[i].req[c], m_samp[i].res[c], 0);
       if (chdir("..")) PFATAL("chdir unexpectedly fails!");
       fprintf(f, "  { 'url': '%s', 'dir': '%s/%s', 'linked': %d, 'len': %d"
-              " }%s\n", js_escape(p), tmp, tmp2,
+              " }%s\n", js_escape(p, 0), tmp, tmp2,
               m_samp[i].req[c]->pivot->linked, m_samp[i].res[c]->pay_len,
               (c == use_samp - 1) ? " ]" : ",");
       ck_free(p);

@@ -693,9 +693,9 @@ static void output_summary_views() {
       if (chdir((char*)tmp2)) PFATAL("chdir unexpectedly fails!");
       save_req_res(i_samp[i].i[c]->req, i_samp[i].i[c]->res, 0);
       if (chdir("..")) PFATAL("chdir unexpectedly fails!");
-      fprintf(f, "  { 'url': '%s', ", js_escape(p));
+      fprintf(f, "  { 'url': '%s', ", js_escape(p, 0));
       fprintf(f, "'extra': '%s', 'dir': '%s/%s' }%s\n",
-              i_samp[i].i[c]->extra ? js_escape(i_samp[i].i[c]->extra) :
+              i_samp[i].i[c]->extra ? js_escape(i_samp[i].i[c]->extra, 0) :
               (u8*)"", tmp, tmp2,
               (c == use_samp - 1) ? " ]" : ",");
       ck_free(p);

@@ -763,10 +763,12 @@ static void save_pivots(FILE* f, struct pivot_desc* cur) {
   u8* url = serialize_path(cur->req, 1, 1);

   fprintf(f, "%s %s ", cur->req->method ? cur->req->method : (u8*)"GET",
-          js_escape(url));
+          js_escape(url, 0));

   ck_free(url);

+  fprintf(f, "name=%s ", js_escape(cur->name, 1));
+
   switch (cur->type) {
     case PIVOT_SERV: fprintf(f, "type=serv "); break;
     case PIVOT_DIR:  fprintf(f, "type=dir ");  break;

@@ -785,7 +787,8 @@ static void save_pivots(FILE* f, struct pivot_desc* cur) {
   }

   if (cur->res)
-    fprintf(f, "dup=%u %scode=%u len=%u notes=%u\n", cur->dupe,
+    fprintf(f, "dup=%u %s%scode=%u len=%u notes=%u\n", cur->dupe,
+            cur->bogus_par ? "bogus " : "",
             cur->missing ? "returns_404 " : "",
             cur->res->code, cur->res->pay_len, cur->issue_cnt);
   else
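The extra js_escape() argument exists for the new name=%s field written by save_pivots(): with sp set, 0x20 no longer passes the printable test, so spaces are emitted as \x20 and each name stays a single token in the pivots.txt output. A simplified restatement of the character test (not the full function, which writes into a static buffer):

  #include <stdio.h>
  #include <string.h>

  static void escape(const char* str, int sp) {
    while (*str) {
      unsigned char c = (unsigned char)*str++;
      if (c > (sp ? 0x20 : 0x1f) && c < 0x80 && !strchr("<>\\'\"", c))
        putchar(c);
      else
        printf("\\x%02x", c);
    }
    putchar('\n');
  }

  int main() {
    escape("a b", 0);   /* prints: a b     */
    escape("a b", 1);   /* prints: a\x20b  */
    return 0;
  }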

skipfish.1

@@ -83,10 +83,6 @@ do not parse HTML and other documents to find new links
 .B \-o dir
 write output to specified directory (required)
 .TP
-.B \-J
-be less noisy about MIME / charset mismatches on probably
-static content
-.TP
 .B \-M
 log warnings about mixed content or non-SSL password forms
 .TP

@@ -147,6 +143,9 @@ timeout on idle HTTP connections (default: 10 s)
 .TP
 .B \-s s_limit
 response size limit (default: 200000 B)
+.TP
+.B \-e
+do not keep binary responses for reporting
 .TP
 .B \-h, \-\-help
13  skipfish.c

@@ -239,7 +239,7 @@ int main(int argc, char** argv) {
   u32 loop_cnt = 0, purge_age = 0, seed;
   u8 dont_save_words = 0, show_once = 0, be_quiet = 0, display_mode = 0,
      has_fake = 0;
-  u8 *wordlist = (u8*)DEF_WORDLIST, *output_dir = NULL;
+  u8 *wordlist = NULL, *output_dir = NULL;

   struct termios term;
   struct timeval tv;

@@ -421,7 +421,12 @@ int main(int argc, char** argv) {
       break;

     case 'W':
-      wordlist = (u8*)optarg;
+      if (optarg[0] == '+') load_keywords((u8*)optarg + 1, 1, 0);
+      else {
+        if (wordlist)
+          FATAL("Only one -W parameter permitted (unless '+' used).");
+        wordlist = (u8*)optarg;
+      }
       break;

     case 'b':

@@ -526,7 +531,9 @@ int main(int argc, char** argv) {
   if (max_connections < max_conn_host)
       max_connections = max_conn_host;

-  load_keywords((u8*)wordlist, purge_age);
+  if (!wordlist) wordlist = (u8*)DEF_WORDLIST;
+
+  load_keywords(wordlist, 0, purge_age);

   /* Schedule all URLs in the command line for scanning. */

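Taken together, these skipfish.c changes implement the '-W +dict' syntax from the ChangeLog: '+'-prefixed wordlists are loaded immediately as read-only, at most one plain -W dictionary is accepted, and the DEF_WORDLIST default applies only when no -W was given at all. A typical invocation (paths are illustrative):

  ./skipfish -W site_specific.wl -W +dictionaries/complete.wl \
    -o output_dir http://example.com/

Only site_specific.wl is updated when the scan completes; the supplementary dictionary is never written back.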