读书人

How to process a file in PHP and generate files from the extracted content

Published: 2013-11-03 15:39:14 Author: rapoo

How can PHP process a file and extract parts of its content into new files?
This is the original HTML file:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
<title>无标题文档</title>
</head>
<body>
<div class="header"></div>

<div class="warp">
<div><br /></div>
<div><br /></div>
<div><br /></div>
<div><br /></div>
</div>

<div class="footer"></div>
</body>
</html>



This is the HTML file after processing:



<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Start -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
<title>无标题文档</title>
</head>
<body>
<div class="header"></div>
<!-- End -->


<!-- Start -->
<div class="warp">
<div><br /></div>
<div><br /></div>
<div><br /></div>
<div><br /></div>
</div>
<!-- End -->


<!-- Start -->
<div class="footer"></div>
</body>
</html>
<!-- End -->




Then extract the content fragments and generate the following three files:


Extract this part to generate head.html:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Start -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
<title>无标题文档</title>
</head>
<body>
<div class="header"></div>
<!-- End -->




Extract this part to generate index.html:

<!-- Start -->
<div class="warp">
<div><br /></div>
<div><br /></div>
<div><br /></div>
<div><br /></div>
</div>
<!-- End -->




Extract this part to generate foot.html:
<!-- Start -->
<div class="footer"></div>
</body>
</html>
<!-- End -->


[Solution]
$s =<<< HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Start -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
<title>无标题文档</title>
</head>
<body>
<div class="header"></div>
<!-- End -->


<!-- Start -->
<div class="warp">
<div><br /></div>
<div><br /></div>
<div><br /></div>
<div><br /></div>
</div>
<!-- End -->


<!-- Start -->
<div class="footer"></div>
</body>
</html>
<!-- End -->
HTML;
preg_match_all('/.*?<!-- End -->/is', $s, $r);
print_r($r[0]);
Output:
Array
(
[0] => <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Start -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
<title>无标题文档</title>
</head>
<body>
<div class="header"></div>
<!-- End -->
[1] =>


<!-- Start -->
<div class="warp">
<div><br /></div>
<div><br /></div>
<div><br /></div>
<div><br /></div>
</div>
<!-- End -->
[2] =>


<!-- Start -->
<div class="footer"></div>
</body>
</html>
<!-- End -->
)


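Write to files:
$fn = array('head.html', 'index.html', 'foot.html'); // file names in fragment order
foreach ($r[0] as $i => $fragment) {
    // ltrim() drops the blank lines left between fragments
    file_put_contents($fn[$i], ltrim($fragment));
}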
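
The matching and writing steps above can be folded into two small helpers. This is only a minimal sketch: the function names `splitMarkedHtml` and `writeFragments` are hypothetical, it assumes the fragments appear in the same order as the file names, and trimming the leading blank lines is an addition rather than part of the answer.

```php
<?php
/**
 * Split marked-up HTML into fragments, each ending with "<!-- End -->".
 * Hypothetical helper; not part of the original answer.
 *
 * @param string $html HTML containing <!-- Start --> / <!-- End --> markers
 * @return string[] fragments in document order
 */
function splitMarkedHtml(string $html): array
{
    // Lazy match up to and including each End marker; /s lets . span newlines.
    preg_match_all('/.*?<!-- End -->/s', $html, $matches);
    // Drop the blank lines that separate fragments in the source.
    return array_map('ltrim', $matches[0]);
}

/**
 * Write each fragment to the file name at the same position.
 * Hypothetical helper; $dir is the target directory.
 */
function writeFragments(array $fragments, array $fileNames, string $dir): void
{
    if (count($fragments) !== count($fileNames)) {
        throw new RuntimeException('Fragment count does not match file count');
    }
    foreach ($fragments as $i => $fragment) {
        $path = $dir . DIRECTORY_SEPARATOR . $fileNames[$i];
        file_put_contents($path, $fragment . PHP_EOL);
    }
}
```

To work from a file on disk, as the question asks, one could call `splitMarkedHtml(file_get_contents('page.html'))` and pass the result to `writeFragments()` together with `array('head.html', 'index.html', 'foot.html')`; the path `page.html` is a placeholder.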

[Solution]
$s =<<< HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
<title>无标题文档</title>
</head>
<body>
<div class="header"></div>

<div class="warp">
<div><br /></div>
<div><br /></div>
<div><br /></div>
<div><br /></div>
</div>

<div class="footer"></div>
</body>
</html>
HTML;
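// Split after the DOCTYPE line and at blank lines, keeping the delimiters.
// \s* lets the blank-line branch match both LF and CRLF line endings.
$ar = preg_split("/(<!.+?>\s+|\r?\n\s*\r?\n)/s", $s, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
$d = array(PHP_EOL.'<!-- End -->', '<!-- Start -->'.PHP_EOL);
foreach($ar as $i=>$v) {
    // odd indexes are content (prefix Start), even indexes are delimiters (prefix End)
    if($i) echo $d[$i%2];
    echo $v;
}
echo $d[0]; // close the last fragment
Output: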
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Start -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
<title>无标题文档</title>
</head>
<body>
<div class="header"></div>
<!-- End -->

<!-- Start -->
<div class="warp">
<div><br /></div>
<div><br /></div>
<div><br /></div>
<div><br /></div>
</div>
<!-- End -->

<!-- Start -->
<div class="footer"></div>
</body>
</html>
<!-- End -->
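
The marker insertion shown in this answer can also be wrapped in a function that returns the result as a string, which makes it easier to save or to feed into the extraction step. This is a sketch under stated assumptions: the name `addMarkers` is hypothetical, the input must start with a DOCTYPE declaration and contain no other `<!...>` constructs (such as comments), and sections must be separated by blank lines.

```php
<?php
/**
 * Wrap each blank-line-separated section in Start/End marker comments.
 * Hypothetical helper; assumes $html begins with a DOCTYPE declaration.
 */
function addMarkers(string $html): string
{
    // Split after the DOCTYPE and at blank lines; captured delimiters stay in the result.
    $parts = preg_split(
        '/(<!.+?>\s+|\r?\n\s*\r?\n)/s',
        $html,
        -1,
        PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
    );
    $markers = array(PHP_EOL . '<!-- End -->', '<!-- Start -->' . PHP_EOL);
    $out = '';
    foreach ($parts as $i => $part) {
        // Odd indexes hold content (prefix Start); even indexes hold delimiters (prefix End).
        if ($i > 0) {
            $out .= $markers[$i % 2];
        }
        $out .= $part;
    }
    // Close the final section.
    return $out . $markers[0];
}
```

The odd/even alternation only holds because the DOCTYPE is captured as the first delimiter; input without a leading DOCTYPE would shift the markers by one position.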
