Author Topic: Filtering out Chinese characters  (Read 4686 times)

劍客

Filtering out Chinese characters
« on: 2001-10-19 10:23 »
From: statue.bbs@bbs.cynix.com.tw (statue)
Subject:[Cle-devel] Re: b5strip: strip out Chinese big5 chars
To: cle-devel@linux.org.tw

※ Quoting jidanni@deadspam.com (Dan Jacobson):
> Dan> How can I strip out all big5 characters in a file on GNU Linux?  I do
> Dan> $ strings file, but it isn't perfect.  I'm trying to make a lazy
> Dan> alternative English only version of my webpage :smile:
> OK, thanks for the tips that were posted. Here's what I will
> contribute to linux.org.tw, and to anyone else, attached.
> Sorry I didn't make a man page.  I will also put it on my website,
> http://www.geocities.com/jidanni/ Tel+886-4-25854780 積丹尼
You may try this: http://freebsd.sinica.edu.tw/~statue/zh-tut/perl.html

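    #!/usr/bin/perl -w
    # Usage: ./bg5rm.pl filename
    # Writes a copy of the file with Big5 characters removed to filename.bg5rm
    # and echoes the same stripped text to standard output.
    use strict;
    my $ifname = $ARGV[0];
    defined $ifname or die "usage: $0 filename\n";
    open(IF, "<", $ifname) or die "cannot open $ifname: $!\n";
    open(OF, ">", "${ifname}.bg5rm") or die "cannot create ${ifname}.bg5rm: $!\n";
    # A Big5 character is a lead byte 0xA1-0xF9 followed by a trail byte
    # 0x40-0x7E or 0xA1-0xFE.
    my $big5 = qr/[\xA1-\xF9][\x40-\x7E\xA1-\xFE]/;
    while (<IF>) {
      s/$big5//g;
      print OF $_;
      print $_;
    }
    close(IF);
    close(OF);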
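
To see what the substitution does without touching any files, here is a minimal sketch that applies the Big5 byte ranges the script is built around (lead byte 0xA1-0xF9, trail byte 0x40-0x7E or 0xA1-0xFE) to an in-memory string. The helper name strip_big5 and the sample string are assumptions made for illustration and are not part of the script above; the bytes \xA4\xA4\xA4\xE5 are the Big5 encoding of 中文.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns a copy of
# the given byte string with every two-byte Big5 sequence removed.
sub strip_big5 {
    my ($bytes) = @_;
    # Lead byte 0xA1-0xF9, then trail byte 0x40-0x7E or 0xA1-0xFE.
    $bytes =~ s/[\xA1-\xF9][\x40-\x7E\xA1-\xFE]//g;
    return $bytes;
}

# Sample line: ASCII text around the two Big5 characters for "中文".
my $sample = "Hello \xA4\xA4\xA4\xE5 world\n";

# Prints "Hello  world" (two spaces, since only the Big5 bytes are removed).
print strip_big5($sample);
```

Plain ASCII bytes are all below 0x80, so they can never match the lead-byte class and pass through unchanged. The sketch assumes the input really is well-formed Big5; a stray high byte in other data could pair up with the byte after it and be removed as well.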
--

[ This post was edited by: 劍客 on 2002-01-25 15:29 ]