First trial of script-based exploration succeeds!

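FlvExport has long had one major limitation: only I can add the sites it explores, which makes it hard to extend. Yet exploring different sites really relies on the same prewritten, modular pieces; the only difference is how they are combined. For that reason I have recently been looking into script-based exploration, so that the pieces already written can be combined more flexibly.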
 
Today the first experiment succeeded: the program can now extract specific content from a web page by following a script I wrote.
 
The scripts are written in XML. Here is an example:
<?xml version = "1.0" encoding = "UTF-8" standalone = "no"?>
<target name = "Sina Digest" default = "getNameAndURL">
 <task name = "init">
  <define name = "flvNames" type = "vector"/>
  <define name = "flvURLs" type = "vector"/>
  <define name = "curLine" type = "string"/>
  <define name = "curURL" type = "string"/>
  <define name = "curName" type = "string"/>
 </task>
 <task name = "getNameAndURL" depends = "init">
  <connection type = "url" value = "http://v.blog.sina.com.cn/">
   <getLine dest = "curLine"/>
   <while condition = "NE" left = "curLine" right = "null">
    <regex src = "curLine" dest = "curURL"><![CDATA[http://v.blog.sina.com.cn/b/[^"]*html]]></regex>
    <if condition = "NE" left = "curURL" right = "null">
     <vector method = "add" src = "curURL" dest = "flvURLs"/>
     <call name = "findName" src = "curLine" dest = "curName"/>
     <if condition = "NE" left = "curName" right = "null">
      <vector method = "add" src = "curName" dest = "flvNames"/>
     </if>
    </if>
    <getLine dest = "curLine"/>
   </while>
  </connection>
 </task>
</target>
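
For readers who find plain code easier to follow than XML, the control flow of the getNameAndURL task can be sketched in Python roughly as below. This is only an illustration of what the script describes, not FlvExport's actual implementation. The regular expression is copied verbatim from the CDATA section. The findName task is not shown in the example, so `find_name` here is a hypothetical stand-in that assumes the name sits in a `title="..."` attribute on the same line, and the sample lines are made up so the sketch runs without a network connection.

```python
import re

# Pattern copied verbatim from the <regex> element of the script.
URL_PATTERN = re.compile(r'http://v.blog.sina.com.cn/b/[^"]*html')


def find_name(line):
    """Hypothetical stand-in for the script's findName task.

    Assumption: the video name is stored in a title="..." attribute
    on the same line. Returns None when no name is found.
    """
    match = re.search(r'title="([^"]*)"', line)
    return match.group(1) if match else None


def get_name_and_url(lines, name_finder=find_name):
    """Mirror the getNameAndURL task: scan lines, collect URLs and names."""
    flv_names = []  # <define name="flvNames" type="vector"/>
    flv_urls = []   # <define name="flvURLs" type="vector"/>
    for cur_line in lines:                     # <getLine> inside <while>
        match = URL_PATTERN.search(cur_line)   # <regex src="curLine" ...>
        if match is None:                      # curURL is "null": skip line
            continue
        flv_urls.append(match.group(0))        # <vector method="add" ...>
        cur_name = name_finder(cur_line)       # <call name="findName" ...>
        if cur_name is not None:
            flv_names.append(cur_name)
    return flv_names, flv_urls


# Made-up sample lines standing in for the page at the connection URL.
sample_page = [
    '<a href="http://v.blog.sina.com.cn/b/123-456.html" title="Demo clip">Demo clip</a>',
    '<p>no video link here</p>',
    '<a href="http://v.blog.sina.com.cn/b/789-012.html">untitled</a>',
]
names, urls = get_name_and_url(sample_page)
```

As in the script, a URL is stored even when no name is found for it, so the two lists can end up with different lengths.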
