|User: bruce lee -- 2011-09-09|
|Type: Text file parser|
|I have many website URLs and want to know whether every site has "about", "product" and "contact" hyperlinks, and whether the "about" page contains the word "chemical". If not, delete that URL.|
|Hint: You need to download and install "Replace Pioneer" on Windows to complete the following steps.|
|The question is somewhat complicated, so here we only provide a solution that removes every web page that does not contain certain words, such as 'about', 'product' and 'contact'.|
1. Prepare a web page list file in which each line contains one address starting with 'http'.
2. Open the 'Tools->Batch Runner' menu and select 'import list' to import the web page list file.
3. Click the 'fast replace' button to open the 'fast replace' window.
4. Click 'add' to add a new rule:
* set 'search' to:
* set 'replace' to:
5. Check the 'reg exp' and 'extract' options.
6. Click 'start', select 'output to single file', choose a file for the output, and you are done.
Screenshot 1: Fast_Replace_Window
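
For readers without Replace Pioneer, the same filtering idea can be sketched in Python. This is only an illustration of the approach, not what the tool does internally: the function names (has_all_keywords, fetch, keep_urls) are hypothetical, and the check is a plain case-insensitive substring search of the page source, which is an assumption and does not verify that the words are actual hyperlinks.

```python
import urllib.request

# Words every page must contain (taken from the question above).
KEYWORDS = ("about", "product", "contact")


def has_all_keywords(html, keywords=KEYWORDS):
    """Return True if every keyword occurs somewhere in html (case-insensitive)."""
    lowered = html.lower()
    return all(k.lower() in lowered for k in keywords)


def fetch(url, timeout=10):
    """Download a page and decode it leniently.

    Returns '' on connection errors (OSError, which includes URLError),
    so unreachable pages are simply treated as not matching.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except OSError:
        return ""


def keep_urls(urls, fetch_page=fetch, keywords=KEYWORDS):
    """Keep only addresses starting with 'http' whose page has all keywords.

    fetch_page is injectable so the filter can be tried without network access.
    """
    return [
        u for u in urls
        if u.startswith("http") and has_all_keywords(fetch_page(u), keywords)
    ]
```

To use it, read the web page list file into a list of stripped lines, pass that list to keep_urls, and write the returned addresses to the output file, one per line.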