Package us.codecraft.webmagic.utils
Class UrlUtils
java.lang.Object
us.codecraft.webmagic.utils.UrlUtils
URL and HTML utilities.
- Since:
- 0.1.0
- Author:
- code4crafter@gmail.com
-
Constructor Summary
-
Method Summary
Modifier and Type, Method: Description
- static String canonicalizeUrl(String url, String refer): canonicalizeUrl. Borrowed from Jsoup.
- convertToRequests(Collection<String> urls)
- convertToUrls(Collection<Request> requests)
- static String encodeIllegalCharacterInUrl: Deprecated.
- static String fixIllegalCharacterInUrl
- static String getCharset(String contentType)
- static String getDomain
- static String getHost
- static String removePort(String domain)
- static String removeProtocol(String url)
-
Constructor Details
-
UrlUtils
public UrlUtils()
-
-
Method Details
-
canonicalizeUrl
Canonicalizes url, using refer as the referring URL. Borrowed from Jsoup.
- Parameters:
- url - the URL to canonicalize
- refer - the referring URL
- Returns:
- the canonicalized URL
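The idea of canonicalizing a URL against a referring URL can be sketched with the JDK alone. This is a minimal illustration, not the UrlUtils source: the class CanonicalizeSketch and its canonicalize method are hypothetical names, and returning an empty string on a parse failure is an assumption made for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical stand-in for illustration; not the UrlUtils implementation.
class CanonicalizeSketch {

    // Resolves url against refer using java.net.URI.
    // Returns "" when either argument cannot be parsed (an assumption of this sketch).
    static String canonicalize(String url, String refer) {
        try {
            URI base = new URI(refer);
            return base.resolve(new URI(url)).toString();
        } catch (URISyntaxException e) {
            return "";
        }
    }

    public static void main(String[] args) {
        String refer = "http://example.com/a/b/page.html";
        // Relative path, resolved against the directory of the referring page
        System.out.println(canonicalize("../img/logo.png", refer));
        // Root-relative path with a query string
        System.out.println(canonicalize("/list?page=2", refer));
        // Already absolute, returned unchanged
        System.out.println(canonicalize("http://other.example/x", refer));
    }
}
```

Relative links extracted from a page are typically passed through a step like this before being queued, so that every queued URL is absolute.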
-
encodeIllegalCharacterInUrl
Deprecated.
- Parameters:
- url - url
- Returns:
- new url
-
fixIllegalCharacterInUrl
-
getHost
-
removeProtocol
-
getDomain
-
removePort
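This page gives no descriptions for removeProtocol, getDomain and removePort. Going only by their names and the parameter lists in the method summary, one plausible reading is sketched below. The class DomainSketch and its methods are hypothetical, and the actual UrlUtils behavior may differ, for example on URLs with user info or IPv6 hosts.

```java
// Hypothetical stand-in for illustration; not the UrlUtils implementation.
class DomainSketch {

    // Drops a leading "scheme://" prefix if one is present.
    static String stripProtocol(String url) {
        return url.replaceFirst("^\\w+://", "");
    }

    // Drops everything from the first ':' onward, i.e. the port part of "host:port".
    static String stripPort(String domain) {
        int colon = domain.indexOf(':');
        return colon == -1 ? domain : domain.substring(0, colon);
    }

    // Strips the protocol, cuts at the first '/', then strips the port.
    static String domainOf(String url) {
        String rest = stripProtocol(url);
        int slash = rest.indexOf('/');
        String hostAndPort = slash == -1 ? rest : rest.substring(0, slash);
        return stripPort(hostAndPort);
    }

    public static void main(String[] args) {
        String url = "https://example.com:8080/a/b";
        System.out.println(stripProtocol(url));
        System.out.println(stripPort("example.com:8080"));
        System.out.println(domainOf(url));
    }
}
```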
-
convertToRequests
-
convertToUrls
-
getCharset
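The parameter name contentType suggests that the charset is read from a Content-Type header value. A minimal sketch of that kind of extraction follows. The class CharsetSketch, the method charsetOf, the regular expression, and the null result for a missing charset are all assumptions of this example, not the documented behavior of getCharset.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; not the UrlUtils implementation.
class CharsetSketch {

    // Matches charset=VALUE, with optional whitespace and an optional opening quote.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*[\"']?([^\\s;\"']+)", Pattern.CASE_INSENSITIVE);

    // Returns the charset token, or null when the header value carries none.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8"));
        System.out.println(charsetOf("text/html; Charset=\"gbk\""));
        System.out.println(charsetOf("text/html"));
    }
}
```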
-