Introduction to basic Pig operations

This article introduces basic Pig operations: how Pig relates to Hadoop, and the commands most often used in practice.

What is Pig?

My understanding: Pig is to Hadoop what a shell is to Linux, so I use Pig to work with files on Hadoop whenever I can.

1. Change to the HADOOP_HOME directory.
2. Run sh bin/hadoop
This prints usage information for the available commands:
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  fsck                 run a DFS filesystem checking utility
  fs                   run a generic filesystem user client
  balancer             run a cluster balancing utility
  jobtracker           run the MapReduce job Tracker node
  pipes                run a Pipes job
  tasktracker          run a MapReduce task Tracker node
  job                  manipulate MapReduce jobs
  queue                get information regarding JobQueues
  version              print the version
  jar <jar>            run a jar file
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME <src>* <dest> create a hadoop archive
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
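
Steps 1 and 2 above can be wrapped in a small shell function. This is only a sketch: run_hadoop is a hypothetical helper name, not something shipped with Hadoop, and it assumes the HADOOP_HOME environment variable points at the installation directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): run the hadoop launcher script
# from $HADOOP_HOME, mirroring steps 1 and 2 above.
run_hadoop() {
    # Assumption: HADOOP_HOME points at the Hadoop installation directory.
    if [ -z "${HADOOP_HOME:-}" ]; then
        echo "HADOOP_HOME is not set" >&2
        return 1
    fi
    if [ ! -f "$HADOOP_HOME/bin/hadoop" ]; then
        echo "bin/hadoop not found under $HADOOP_HOME" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is left unchanged.
    (cd "$HADOOP_HOME" && sh bin/hadoop "$@")
}
```

Calling run_hadoop with no arguments should print the usage text shown above; run_hadoop version would pass the version subcommand through to the launcher.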

Common Pig commands

ls / pwd / cd

For example, to check the size of a file:

grunt> fs -du -h -s <filename>
19.4 G  <filename>
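
To see where a figure like 19.4 G comes from, here is a rough sketch of the byte-to-unit conversion that the -h flag performs. human_size is a hypothetical helper written for illustration, the byte count is made up to match the output above, and Hadoop's own rounding and formatting may differ in detail.

```shell
#!/bin/sh
# Hypothetical helper: approximate the human-readable sizes that
# `fs -du -h` prints, by dividing by 1024 until the value fits the unit.
human_size() {
    awk -v bytes="$1" 'BEGIN {
        split("K M G T P", unit, " ")
        i = 0
        while (bytes >= 1024 && i < 5) { bytes /= 1024; i++ }
        if (i == 0) printf "%d\n", bytes            # plain byte count
        else        printf "%.1f %s\n", bytes, unit[i]
    }'
}

# Illustrative byte count (an assumption, not taken from a real cluster).
human_size 20830591386   # prints "19.4 G"
```

Dropping -h from the fs -du command makes it report the raw byte count instead, which is easier to use in scripts.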

grunt> help

Commands:
<pig latin statement>; - See the PigLatin manual for details: http://hadoop.apache.org/pig
File system commands:
    fs <fs arguments> - Equivalent to Hadoop dfs command: http://hadoop.apache.org/common/docs/current/hdfs_shell.html
Diagnostic commands:
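    describe <alias>[::<alias>] - Show the schema for the alias. Inner aliases can be described as A::B.
    explain [-script <pigscript>] [-out <path>] [-brief] [-dot|-xml] [-param <param_name>=<param_value>]
        [-param_file <file_name>] [<alias>] - Show the execution plan to compute the alias or for entire script.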
        -script - Explain the entire script.
        -out - Store the output into directory rather than print to stdout.
        -brief - Don't expand nested plans (presenting a smaller graph for overview).
        -dot - Generate the output in .dot format. Default is text format.
        -xml - Generate the output in .xml format. Default is text format.
        -param <param_name>=<param_value> - See parameter substitution for details.
        -param_file <file_name> - See parameter substitution for details.
        alias - Alias to explain.
    dump <alias> - Compute the alias and writes the results to stdout.
Utility Commands:
    exec [-param <param_name>=param_value] [-param_file <file_name>]