

1392.Text file parser -- How to extract tables from many html files into one csv file?

User: jame -- 2017-04-14
Hits: 3422
Type: Text file parser   
Description:
How to extract tables from many HTML files into one CSV file?
I have some HTML files downloaded from a website. Each file contains tables, and every table body looks like </thead> <tbody>xxxxxxxxxx</tbody>
Input Sample:
html file 1: 
... 
</thead> 
<tbody><tr><td><a href="fjzt-x.html?uid=xxxxxx">data11</a></td> 
<td class="bzt">data12</td> 
<td>data13</td> 
    <td>data14</td> 
<td>data15</td> 
<td>data16</td> 
<td>data17</td> 
<td class="tdb"><span id="sxxxxxxx"></span></td> 
<td class="tdb"><span id="zfxxxxxxx"></span></td> 
<td class="bzt">--</td><td></td> 
</tr> 
<script src="https://hq.sinajs.cn/list=data18" type="text/javascript" charset="gbk"></script> 
<script type="text/javascript">getprice1('xxxxxxxx',xxxxxxx,x
Output Sample:
data11,data12,data13,data14,data15,data16,data17,data18 
data21,data22,data23,data24,data25,data26,data27,data28 
data31,data32,data33,data34,data35,data36,data37,data38 
data41,data42,data43,data44,data45,data46,data47,data48
Answer:
Hint: You need to download and install "Replace Pioneer" on Windows to complete the following steps.
1. open the "Tools->Batch Runner" menu
2. drag multiple files from the Windows file browser into the "batch runner" window
3. click "add" to add a new rule
* set "search" to: 
 
* set "replace" to: 
 
* click "ok" 
4. make sure the following options are checked:
"Reg exp", "Cross line" and "Extract" 
5. click "start" 
6. click "output to single file" and select output file, done. 
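
For readers without Replace Pioneer, the same extraction can be approximated with a short script. The following is a minimal Python sketch, not the tool's own method. It assumes each data row is a <tr> inside <tbody>, that the first seven <td> cells hold data11..data17, and that the eighth value is the code after "list=" in the hq.sinajs.cn script tag that follows the row, as in the input sample. The function names (extract_rows, html_files_to_csv), the seven-cell cutoff, and the UTF-8 default encoding are assumptions made for illustration.

```python
import csv
import html
import re
from pathlib import Path

# One <tr>...</tr> plus whatever follows it up to the next row (or the end).
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>(.*?)(?=<tr\b|</tbody>|\Z)", re.S | re.I)
CELL_RE = re.compile(r"<td\b[^>]*>(.*?)</td>", re.S | re.I)
TAG_RE = re.compile(r"<[^>]+>")
# Assumption: the 8th field is the code in the script URL after "list=".
CODE_RE = re.compile(r"hq\.sinajs\.cn/list=([^\"'&\s]+)", re.I)
TBODY_RE = re.compile(r"<tbody\b[^>]*>(.*?)(?:</tbody>|\Z)", re.S | re.I)


def extract_rows(page, n_cells=7):
    """Return one list per table row: the first n_cells cell texts plus the script code."""
    rows = []
    for body in TBODY_RE.findall(page):
        for match in ROW_RE.finditer(body):
            # Strip inner tags (links, spans) and decode entities in each cell.
            cells = [html.unescape(TAG_RE.sub("", c)).strip()
                     for c in CELL_RE.findall(match.group(1))]
            row = cells[:n_cells]
            code = CODE_RE.search(match.group(2))
            if code:
                row.append(code.group(1))
            rows.append(row)
    return rows


def html_files_to_csv(paths, out_path, encoding="utf-8"):
    """Append the rows of every HTML file in paths to a single CSV file."""
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in paths:
            page = Path(path).read_text(encoding=encoding, errors="replace")
            writer.writerows(extract_rows(page))
```

A possible call is html_files_to_csv(sorted(glob.glob("*.html")), "tables.csv") after "import glob"; pass a different encoding if the downloaded pages are not UTF-8. Regular expressions are used here only because the sample markup is very uniform; irregular pages would need a real HTML parser.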
 
Note:
1. The output is very close to the required format; one more replacement pass can finish the cleanup.
2. If you just need to extract a standard table, set 'search' to:
set 'replace' to:
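
For the plain "standard table" case, where every cell of every row is wanted, a parser-based approach may be more robust than patterns. The sketch below is one possible way to do it with Python's standard html.parser module; the class name TableRows and the helper table_rows are hypothetical names chosen for illustration, and the code assumes tables are not nested.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def _end_row(self):
        if self._row is not None:
            self.rows.append(self._row)
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._end_row()  # tolerate a missing </tr> before a new row
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            self._end_row()


def table_rows(page):
    """Return a list of rows, each a list of cell strings."""
    parser = TableRows()
    parser.feed(page)
    parser.close()
    return parser.rows
```

The returned rows can be passed directly to csv.writer(...).writerows(...) to build the combined CSV file.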

Screenshot 1:  Fast_Replace_Window


Similar Examples:
How to extract titles from many html files into a txt file? (87%)
How to extract tables from html files into csv file automatically? (79%)
How to extract titles of all html files and save them to one file? (75%)
How to merge many files into one file? (73%)
How to extract/parse title from many html files and join together? (72%)
How to extract multiple fields from data file and create a csv file? (67%)
How to extract text from many webpage files and form a database file? (66%)
How to merge files in all folders into one folder? (65%)
