读书人

自个儿编写接口用于获取Hadoop Job co

发布时间: 2012-08-29 08:40:14 作者: rapoo

自己编写接口用于获取Hadoop Job conf 信息

Hadoop Job完成后可以设置回调接口,一个自定义的URL,比如我的:

?

public static RunningJob getJobById(String jobId) {Configuration conf = new Configuration();conf.set("mapred.job.tracker", Constants.MAP_REDUCE_URL);JobClient client;try {client = new JobClient(new JobConf(conf));return client.getJob(jobId);} catch (IOException e) {throw new RuntimeException(e);}}?

关键是这个RunningJob对象只可以获取jobname信息,但是无法获取我们设置的conf信息。为了解决这个问题,我写了一个jsp,放到namenode上,用于读取本地log文件,并将结果反馈给调用者。代码如下,希望对大家有帮助:

?

<%@ page  contentType="text/xml; charset=UTF-8"  import="javax.servlet.*"  import="javax.servlet.http.*"  import="java.io.*"  import="java.net.URL"  import="org.apache.hadoop.util.*"  import="javax.servlet.jsp.JspWriter"%><%!private static final long serialVersionUID = 1L;        public File getHistoryFile(final String jobId) {                File dir = new File("/opt/hadoop/logs/history/done/");                File[] files = dir.listFiles(new FilenameFilter() {                        public boolean accept(File dir, String name) {if (name.indexOf(jobId) >= 0 && name.endsWith(".xml")) {return true;}                                return false;                        }                });                if (files != null && files.length > 0) {                        return files[0];                }                return null;        }public void printXML(JspWriter out, String jobId) throws IOException {FileInputStream jobFile = null;String[] outputKeys = { "db_id", "channel_id", "channel_name", "user_id", "user_name", "job_day", "pub_format_type", "mapred.output.dir" };String line = "";try {jobFile = new FileInputStream(getLogFilePath(jobId));BufferedReader reader = new BufferedReader(new InputStreamReader(jobFile));while ((line = reader.readLine()) != null) {for (String key : outputKeys) {if (!line.startsWith("<property>") || line.indexOf("<name>" + key + "</name>") >= 0) {out.println(line);break;}}}} catch (Exception e) {out.println("Failed to retreive job configuration for job '" + jobId + "!");out.println(e);} finally {if (jobFile != null) {try {jobFile.close();} catch (IOException e) {}}}}private File getLogFilePath(String jobId) {String logDir = System.getProperty("hadoop.log.dir");if (logDir == null || logDir.length() == 0) {logDir = "/opt/hadoop/logs/";}File logFile = new File(logDir + File.separator + jobId + "_conf.xml");return logFile.exists() ? logFile : getHistoryFile(jobId);}%><%  response.setContentType("text/xml");  final String jobId = request.getParameter("jobid");  if (jobId == null) {    out.println("<h2>Missing 'jobid' for fetching job configuration!</h2>");    return;  }  printXML(out, jobId);%>
?

? ? 这里有个要点,运行中和刚完成的job xml文件放到了"/opt/hadoop/logs"下,归档的job xml放到了“/opt/hadoop/logs/history/done/”,所以要判断第一个地方找不到文件,去第二个地方找。

?

? ? 功能很简单,但是很有用。

?

--heipark

?

?

?

读书人网 >软件架构设计

热点推荐