酷!學園

Digest => 拾人牙慧 => Topic author: 劍客 on 2001-10-19 10:23

Topic: Filtering out Chinese characters
Author: 劍客 2001-10-19 10:23
From: statue.bbs@bbs.cynix.com.tw (statue)
Subject: [Cle-devel] Re: b5strip: strip out Chinese big5 chars
To: cle-devel@linux.org.tw

※ Quoting jidanni@deadspam.com (Dan Jacobson):
> Dan> How can I strip out all big5 characters in a file on GNU Linux?  I do
> Dan> $ strings file, but it isn't perfect.  I'm trying to make a lazy
> Dan> alternative English only version of my webpage :-)
> OK, thanks for the tips that were posted. Here's what I will
> contribute to linux.org.tw, and to anyone else, attached.
> Sorry I didn't make a man page.  I will also put it on my website,
> http://www.geocities.com/jidanni/ Tel+886-4-25854780 積丹尼
You may try this: http://freebsd.sinica.edu.tw/~statue/zh-tut/perl.html

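    #!/usr/bin/perl -w
    # Usage: ./bg5rm.pl filename
    # Writes a copy of the file with Big5 characters removed to
    # filename.bg5rm and also prints it to standard output.
    use strict;
    my $ifname = $ARGV[0];
    defined $ifname or die "usage: $0 filename\n";
    open(IF, "<", $ifname) or die "cannot open $ifname: $!\n";
    open(OF, ">", "${ifname}.bg5rm") or die "cannot write ${ifname}.bg5rm: $!\n";
    # One Big5 character: lead byte 0xA1-0xF9, then trail byte
    # 0x40-0x7E or 0xA1-0xFE.
    my $big5 = qr/[\xA1-\xF9][\x40-\x7E\xA1-\xFE]/;
    while (<IF>) {
      s/$big5//g;
      print OF $_;
      print $_;
    }
    close(IF);
    close(OF);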
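
To see what the pattern does without touching any files, here is a minimal sketch that applies the same byte ranges to a string in memory. The helper name `strip_big5` and the sample string are made up for illustration and are not part of the script above. It also assumes the input really is Big5 and is handled as raw bytes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): remove every
# two-byte Big5 sequence from a byte string and return the result.
sub strip_big5 {
    my ($bytes) = @_;
    # Lead byte 0xA1-0xF9, then trail byte 0x40-0x7E or 0xA1-0xFE.
    $bytes =~ s/[\xA1-\xF9][\x40-\x7E\xA1-\xFE]//g;
    return $bytes;
}

# Assumed sample: "\xA4\xA4\xA4\xE5" is two Big5 characters (four bytes)
# sitting between two ASCII words.
my $sample = "Hello \xA4\xA4\xA4\xE5 world\n";
print strip_big5($sample);
```

Only the Big5 byte pairs are dropped. The ASCII text around them, including the spaces on both sides, is left as it was, so the output still has two spaces between the words.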
--

[ This post was edited by 劍客 on 2002-01-25 15:29 ]