
Using Java's IO classes to read a short English text and count its words

Posted: 2012-01-19 00:22:27 Author: rapoo

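A program that uses Java's IO classes to read a short English text and count its words; have a think about it, everyone.
Use Java's IO classes to read the short English text below and count the number of words in it.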
In late 1995, the Java programming language burst onto the Internet scene and gained instant

celebrity status. The promise of Java technology was that it would become the universal glue

that connects users with information, whether that information comes from web servers,

databases, information providers, or any other imaginable source. Indeed, Java is in a

unique position to fulfill this promise. It is an extremely solidly engineered language that

has gained acceptance by all major vendors, except for Microsoft. Its built-in security and

safety features are reassuring both to programmers and to the users of Java programs. Java

even has built-in support that makes advanced programming tasks, such as network

programming, database connectivity, and multithreading, straightforward.

Since 1995, Sun Microsystems has released six major revisions of the Java Development Kit.

Over the course of the last nine years, the Application Programming Interface (API) has

grown from about 200 to over 3,000 classes. The API now spans such diverse areas as user

interface construction, database management, internationalization, security, and XML

processing. JDK 5.0, released in 2004, is the most impressive update of the Java language

since the original Java release.

The book you have in your hand is the first volume of the seventh edition of the Core Java 2

book. With the publishing of each edition, the book followed the release of the Java

Development Kit as quickly as possible, and each time, we rewrote the book to take advantage

of the newest Java features. In this edition, we are enthusiastic users of generic

collections, the enhanced for loop, and other exciting features of JDK 5.0.

As with the previous editions of this book, we still target serious programmers who want to



put Java to work on real projects. We still guarantee no nervous text or dancing tooth-

shaped characters. We think of you, our reader, as a programmer with a solid background in a

programming language. But you do not need to know C++ or object-oriented programming. Based

on the responses we have received to the earlier editions of this book, we remain confident

that experienced Visual Basic, C, or COBOL programmers will have no trouble with this book.

(You don't even need any experience in building graphical user interfaces for Windows, UNIX,

or the Macintosh.)




[Solution]
369 words in total.

import java.io.*;
import java.util.*;
import java.util.regex.*;

public class MyTest {

    public static void main(String[] args) {
        try {
            BufferedReader br = new BufferedReader(new FileReader("d:\\test\\hello.txt"));
            // Collect the non-empty lines of the article.
            ArrayList<String> article = new ArrayList<String>();
            String s;
            while ((s = br.readLine()) != null) {
                if (s.length() > 0) article.add(s);
            }
            br.close();
            // A word is a letter followed by one or more word characters, or a number
            // with a comma or period inside it, such as 5.0 or 3,000.
            // Single-letter words such as "a" and plain integers such as 1995 are not matched.
            Pattern p = Pattern.compile("[a-zA-Z]\\w+|\\d+[,\\.]\\d+");
            Matcher m;
            int count = 0;
            for (String t : article) {
                m = p.matcher(t);
                int pos = 0;
                // Continue each search from the end of the previous match.
                while (m.find(pos)) {
                    pos = m.end();
                    count++;
                }
            }
            System.out.println(count);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
[Solution]
The reply above is the correct answer.
[Solution]

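// Requires: import java.io.*; import java.util.regex.*;
public static void main(String[] args) throws IOException {
  BufferedReader br = new BufferedReader(new FileReader("english.txt"));
  // Note: the counter starts at 1, so the printed total is one more than the number of matches.
  int i = 1;
  String s;
  // A token is a run of characters that are neither whitespace nor a period.
  Pattern p = Pattern.compile("([^\\s\\.]+)");
  while ((s = br.readLine()) != null) {
    Matcher m = p.matcher(s);
    while (m.find()) {
      i++;
    }
  }
  br.close();
  System.out.println(i);
}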

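The output is 381; I'm not sure whether that is right.
[Solution]
Heh, I have been wondering whether the word count can be computed without regular expressions, because I haven't learned them yet and have only just finished the IO chapter. So I wrote the program below. It probably has problems, and corrections are welcome. Let's see whether this can be done without regex.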

import java.io.*;

public class Count_words {
    public static void main(String[] args) throws Exception {
        BufferedReader br = new BufferedReader(new FileReader(new File("english.txt")));
        String str;
        int count = 0;
        while ((str = br.readLine()) != null) {
            // Skip blank lines so they are not counted as words.
            if (str.trim().isEmpty()) {
                continue;
            }
            // Split on single spaces; consecutive spaces inside a line would produce empty tokens.
            String[] c = str.trim().split(" ");
            count = count + c.length;
        }
        br.close();
        System.out.println("Total number of words: " + count);
    }
}

The result is 377 words.
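On the question of counting without regular expressions: note that String.split itself takes a regex, so the program above still relies on one indirectly. Below is a minimal sketch that avoids regex entirely by scanning characters, assuming a "word" is any maximal run of non-whitespace characters. The class name NoRegexWordCount and the in-memory sample text are made up for illustration; with a real file you would wrap a FileReader instead of a StringReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class NoRegexWordCount {

    // Counts maximal runs of non-whitespace characters in one line.
    static int countInLine(String line) {
        int count = 0;
        boolean inWord = false;
        for (int i = 0; i < line.length(); i++) {
            if (Character.isWhitespace(line.charAt(i))) {
                inWord = false;          // a run, if any, has ended
            } else if (!inWord) {
                inWord = true;           // first character of a new run
                count++;
            }
        }
        return count;
    }

    // Reads every line from the reader and sums the per-line counts.
    static int countWords(BufferedReader reader) throws IOException {
        int total = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            total += countInLine(line);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: a blank line and repeated spaces are included on purpose.
        String sample = "In late 1995, the Java\n\n  programming   language\n";
        try (BufferedReader br = new BufferedReader(new StringReader(sample))) {
            System.out.println(countWords(br)); // prints 7
        }
    }
}
```

Blank lines and repeated spaces need no special handling here, because only the transition from whitespace to non-whitespace increments the counter.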
[Solution]
This problem is actually quite complicated: it involves repeated words and tense variations.
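One way to read the "repeated words" concern is to count distinct words rather than every occurrence. Here is a rough sketch, assuming that a word is a run of letters and that case is ignored. The class name DistinctWords and the sample sentence are made up for illustration. It makes no attempt to merge tense variations (for example "release" and "released"), which would need some form of stemming.

```java
import java.util.Set;
import java.util.TreeSet;

class DistinctWords {

    // Returns the set of distinct lower-cased letter runs found in the text.
    static Set<String> distinctWords(String text) {
        Set<String> words = new TreeSet<>();
        StringBuilder current = new StringBuilder();
        // Iterate one position past the end so the last word is flushed too.
        for (int i = 0; i <= text.length(); i++) {
            char c = i < text.length() ? text.charAt(i) : ' ';
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                words.add(current.toString());
                current.setLength(0);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Hypothetical sample: 9 words in total, 6 of them distinct.
        String sample = "Java is Java, and the Java language is solid.";
        System.out.println(distinctWords(sample).size()); // prints 6
    }
}
```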
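[Solution]
I think bao110908(bao)(bao)'s program is good, but the regex may have a small problem.

Pattern p = Pattern.compile("([^\\s\\.]+)");
Because it splits on the period, it probably counts at least the final ")" here as a word of its own. In addition, something like 5.0 is treated as two words, so the count comes out too high.

I think it is better to treat each run of non-whitespace characters as a word:
Pattern p = Pattern.compile("[^\\s]+");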
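To make the difference between the patterns in this thread concrete, here is a small sketch that applies each of them to one test sentence. The class name PatternComparison and the sentence are made up for illustration; the three regexes are the ones quoted in the replies.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternComparison {

    // Counts how many non-overlapping matches of regex occur in text.
    static int countMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Hypothetical sentence containing "5.0", a one-letter word, and a trailing ".)".
        String text = "JDK 5.0 is a release (for UNIX.)";

        // Splits on periods: "5" and "0" are separate, and ")" is a token of its own.
        System.out.println(countMatches("[^\\s\\.]+", text));                   // prints 9
        // Runs of non-whitespace: "5.0" and "UNIX.)" are one token each.
        System.out.println(countMatches("[^\\s]+", text));                      // prints 7
        // Letter plus word characters, or a number like 5.0: the one-letter "a" is skipped.
        System.out.println(countMatches("[a-zA-Z]\\w+|\\d+[,\\.]\\d+", text)); // prints 6
    }
}
```

The three counts differ on the same input, which is consistent with the replies reporting different totals for the same article.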
[Solution]
This is a homework thread.
Verdict delivered.
