Package us.codecraft.webmagic.selector
Class Xpath2Selector
- java.lang.Object
  - us.codecraft.webmagic.selector.Xpath2Selector
- All Implemented Interfaces:
  NodeSelector, us.codecraft.webmagic.selector.Selector
public class Xpath2Selector extends java.lang.Object implements us.codecraft.webmagic.selector.Selector, NodeSelector
A selector that supports XPath 2.0. It wraps HtmlCleaner and Saxon HE.
- Author:
  code4crafter@gmail.com, hooy
- Date: 13-4-21 Time: 9:39 AM
Constructor Summary
Constructor                                Description
Xpath2Selector(java.lang.String xpathStr)
-
Method Summary
Modifier and Type                      Method                                  Description
static Xpath2Selector                  newInstance(java.lang.String xpathStr)
protected static org.w3c.dom.Document  parse(java.lang.String text)
java.lang.String                       select(java.lang.String text)
java.lang.String                       select(org.w3c.dom.Node node)           Extract single result in text. If there is more than one result, only the first will be chosen.
java.util.List<java.lang.String>       selectList(java.lang.String text)
java.util.List<java.lang.String>       selectList(org.w3c.dom.Node node)       Extract all results in text.
org.w3c.dom.Node                       selectNode(java.lang.String text)
org.w3c.dom.Node                       selectNode(org.w3c.dom.Node node)
java.util.List<org.w3c.dom.Node>       selectNodes(java.lang.String text)
java.util.List<org.w3c.dom.Node>       selectNodes(org.w3c.dom.Node node)
-
Method Detail
-
newInstance
public static Xpath2Selector newInstance(java.lang.String xpathStr)
-
select
public java.lang.String select(java.lang.String text)
- Specified by:
  select in interface us.codecraft.webmagic.selector.Selector
-
select
public java.lang.String select(org.w3c.dom.Node node)
Description copied from interface: NodeSelector
Extract single result in text.
If there is more than one result, only the first will be chosen.
- Specified by:
  select in interface NodeSelector
- Parameters:
  node - node
- Returns:
  result
-
selectList
public java.util.List<java.lang.String> selectList(java.lang.String text)
- Specified by:
  selectList in interface us.codecraft.webmagic.selector.Selector
-
selectList
public java.util.List<java.lang.String> selectList(org.w3c.dom.Node node)
Description copied from interface: NodeSelector
Extract all results in text.
- Specified by:
  selectList in interface NodeSelector
- Parameters:
  node - node
- Returns:
  results
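
The difference between select(Node) and selectList(Node) described above (first result only versus all results) can be illustrated with a small, self-contained sketch. It is not the Xpath2Selector implementation, which wraps HtmlCleaner and Saxon HE. As a stand-in it uses the JDK's built-in XPath 1.0 engine on well-formed XML. The class FirstVersusAllSketch, its parseXml helper, and the choice to return null when nothing matches are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of WebMagic. It mirrors the select/selectList
// contract using the JDK's XPath 1.0 engine on well-formed XML.
public class FirstVersusAllSketch {

    // Collects the text content of every node the expression matches.
    public static List<String> selectList(Node node, String xpathStr) {
        try {
            NodeList matches = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpathStr, node, XPathConstants.NODESET);
            List<String> results = new ArrayList<>();
            for (int i = 0; i < matches.getLength(); i++) {
                results.add(matches.item(i).getTextContent());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid XPath: " + xpathStr, e);
        }
    }

    // Keeps only the first match. Returning null for "no match" is an
    // assumption of this sketch, not documented behaviour of Xpath2Selector.
    public static String select(Node node, String xpathStr) {
        List<String> results = selectList(node, xpathStr);
        return results.isEmpty() ? null : results.get(0);
    }

    // Parses well-formed XML only; real-world HTML would first need a
    // cleaner such as HtmlCleaner.
    public static Document parseXml(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseXml("<ul><li>one</li><li>two</li></ul>");
        System.out.println(select(doc, "//li"));     // first match only
        System.out.println(selectList(doc, "//li")); // every match
    }
}
```

In this sketch the single-result method is defined in terms of the list-returning one, which keeps the two consistent by construction. Whether Xpath2Selector is structured the same way is not stated on this page.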
-
selectNode
public org.w3c.dom.Node selectNode(java.lang.String text)
-
selectNode
public org.w3c.dom.Node selectNode(org.w3c.dom.Node node)
-
selectNodes
public java.util.List<org.w3c.dom.Node> selectNodes(java.lang.String text)
-
selectNodes
public java.util.List<org.w3c.dom.Node> selectNodes(org.w3c.dom.Node node)
-
parse
protected static org.w3c.dom.Document parse(java.lang.String text) throws javax.xml.parsers.ParserConfigurationException
- Throws:
  javax.xml.parsers.ParserConfigurationException
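
A method that turns text into an org.w3c.dom.Document typically has to obtain a DOM builder or serializer first, and that step is one plausible source of the checked ParserConfigurationException in the signature above. The sketch below shows the same signature shape using only the JDK parser. It accepts well-formed XML only and is a stand-in, not the HtmlCleaner-based parsing that Xpath2Selector performs. ParseSketch and rootTagName are hypothetical names introduced here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical stand-in, not WebMagic code: same signature shape as
// parse(String), backed by the JDK XML parser instead of HtmlCleaner.
public class ParseSketch {

    public static Document parse(String text) throws ParserConfigurationException {
        // newDocumentBuilder() is the call that may throw ParserConfigurationException.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            return builder.parse(new InputSource(new StringReader(text)));
        } catch (SAXException | IOException e) {
            // Malformed input is reported as an unchecked exception in this sketch.
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    // Shows how a caller might handle the checked exception.
    public static String rootTagName(String text) {
        try {
            return parse(text).getDocumentElement().getTagName();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No usable XML parser configuration", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootTagName("<html><body/></html>"));
    }
}
```

Because parse is declared protected static, only subclasses and classes in the same package can call it directly. Either kind of caller has to catch or propagate the exception, as rootTagName does here.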