Commit 819b5be0 authored by Mario Rimann, committed by Georg Ringer

[BUGFIX] Links on external pages don't get indexed

Allows the crawler to start indexing at a specific file such as
www.domain.tld/foobar.html instead of only at www.domain.tld/

This change only affects the comparison against the base URL. It
enables the crawler to start at, for example, a file that contains
a manually generated list of links to follow. Before this change,
checkUrl() rejected even links to targets on the same domain when
the base URL pointed to a file instead of "/", because the base URL
was then not part of the target URL.
Any path is now stripped from the base URL for this comparison, so
crawling can also start from a file.
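
The comparison described above can be illustrated with a minimal
sketch. It is written in Python for brevity and is not the actual
TYPO3 code: the helper names strip_path, check_url_old and
check_url_new are hypothetical, and the check is modelled as a simple
prefix test, which is an assumption about how checkUrl() compares the
two URLs.

```python
from urllib.parse import urlsplit


def strip_path(base_url: str) -> str:
    """Reduce a base URL to scheme and host, dropping any path or file."""
    parts = urlsplit(base_url)
    return f"{parts.scheme}://{parts.netloc}/"


def check_url_old(url: str, base_url: str) -> bool:
    """Assumed old behaviour: the full base URL must prefix the target."""
    return url.startswith(base_url)


def check_url_new(url: str, base_url: str) -> bool:
    """Assumed new behaviour: only scheme and host of the base URL count."""
    return url.startswith(strip_path(base_url))


base = "http://www.domain.tld/foobar.html"
target = "http://www.domain.tld/page2.html"

print(check_url_old(target, base))  # False: base URL with file is not a prefix
print(check_url_new(target, base))  # True: path was stripped before comparing
```

With the path stripped, a link on the same domain passes the check
even when crawling starts at a file, while links to other hosts are
still rejected.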

Change-Id: I2727a9a447754b88d2c279c24b32b5c3a2df26c0
Resolves: #16534
Releases: 6.2, 6.1, 6.0, 4.7, 4.5
Reviewed-on: https://review.typo3.org/6990
Reviewed-by: Michael Stucki
Tested-by: Michael Stucki
Reviewed-by: Georg Ringer
Tested-by: Georg Ringer
parent 485c07f0