26 May 2015

info

  1. Sqoop is a tool for efficiently transferring bulk data between Hadoop and structured datastores such as relational databases

  2. sqoop.apache.org

install using brew

  1. install

         $ brew install sqoop
         ==> Installing dependencies for sqoop: hive, zookeeper
         ==> Installing sqoop dependency: hive
         ==> Downloading https://www.apache.org/dyn/closer.cgi?path=hive/hive-1.1.0/apache-hive-1.1.0-bin.tar.gz
         ==> Best Mirror http://apache.fayea.com/hive/hive-1.1.0/apache-hive-1.1.0-bin.tar.gz
         ######################################################################## 100.0%
         ==> Caveats
         Hadoop must be in your path for hive executable to work.
         After installation, set $HIVE_HOME in your profile:
           export HIVE_HOME=/usr/local/Cellar/hive/1.1.0/libexec
    
         If you want to use HCatalog with Pig, set $HCAT_HOME in your profile:
           export HCAT_HOME=/usr/local/Cellar/hive/1.1.0/libexec/hcatalog
    
         You may need to set JAVA_HOME:
           export JAVA_HOME="$(/usr/libexec/java_home)"
         ==> Summary
         🍺  /usr/local/Cellar/hive/1.1.0: 701 files, 99M, built in 4.9 minutes
         ==> Installing sqoop dependency: zookeeper
         ==> Downloading https://homebrew.bintray.com/bottles/zookeeper-3.4.6_1.yosemite.bottle.1.tar.gz
         ######################################################################## 100.0%
         ==> Pouring zookeeper-3.4.6_1.yosemite.bottle.1.tar.gz
         ==> Caveats
         To have launchd start zookeeper at login:
             ln -sfv /usr/local/opt/zookeeper/*.plist ~/Library/LaunchAgents
         Then to load zookeeper now:
             launchctl load ~/Library/LaunchAgents/homebrew.mxcl.zookeeper.plist
         Or, if you don't want/need launchctl, you can just run:
             zkServer start
         ==> Summary
         🍺  /usr/local/Cellar/zookeeper/3.4.6_1: 207 files, 13M
         ==> Installing sqoop
         ==> Downloading http://www.apache.org/dyn/closer.cgi?path=sqoop/1.4.5/sqoop-1.4.5.bin__hadoop-2.0.4-alpha.tar.gz
         ==> Best Mirror http://apache.dataguru.cn/sqoop/1.4.5/sqoop-1.4.5.bin__hadoop-2.0.4-alpha.tar.gz
    
         curl: (28) Connection timed out after 5005 milliseconds
         Trying a mirror...
         ==> Downloading https://archive.apache.org/dist/sqoop/1.4.5/sqoop-1.4.5.bin__hadoop-2.0.4-alpha.tar.gz
         ######################################################################## 100.0%
         ==> Caveats
         Hadoop, Hive, HBase and ZooKeeper must be installed and configured
         for Sqoop to work.
         ==> Summary
         🍺  /usr/local/Cellar/sqoop/1.4.5: 66 files, 5.8M, built in 48 seconds
    
  2. dependencies

         $ brew deps sqoop
         hadoop
         hbase
         hive
         python
         zookeeper
    
    1. hadoop
    2. hbase
    3. hive
    4. python
    5. zookeeper
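
Since the Sqoop caveat says Hadoop, Hive, HBase and ZooKeeper must be installed, a quick check that their launchers are on the PATH can save some guessing. This is only a minimal sketch: the executable names (hadoop, hbase, hive, zkServer, sqoop) are assumptions based on the Homebrew formulae above, and missing_tools is a helper name made up for this example.

```shell
#!/bin/sh
# Print each named command that is not found on PATH, one per line.
# (missing_tools is a hypothetical helper, not part of Sqoop or Homebrew.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Executable names are assumptions based on the Homebrew formulae above.
missing=$(missing_tools hadoop hbase hive zkServer sqoop)
if [ -n "$missing" ]; then
    printf 'not on PATH:\n%s\n' "$missing"
else
    echo 'all tools found on PATH'
fi
```

Anything listed as missing can be installed with brew or added to the PATH before moving on to setup.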

setup

  1. run sqoop (it fails until JAVA_HOME is set)

         $ sqoop
         Warning: /usr/local/Cellar/sqoop/1.4.5/libexec/bin/../../hcatalog does not exist! HCatalog jobs will fail.
         Please set $HCAT_HOME to the root of your HCatalog installation.
         Warning: /usr/local/Cellar/sqoop/1.4.5/libexec/bin/../../accumulo does not exist! Accumulo imports will fail.
         Please set $ACCUMULO_HOME to the root of your Accumulo installation.
         Warning: /usr/local/Cellar/sqoop/1.4.5/libexec/bin/../../zookeeper does not exist! Accumulo imports will fail.
         Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
         +======================================================================+
         |                    Error: JAVA_HOME is not set                       |
         +----------------------------------------------------------------------+
         | Please download the latest Sun JDK from the Sun Java web site        |
         |     > http://www.oracle.com/technetwork/java/javase/downloads        |
         |                                                                      |
         | HBase requires Java 1.7 or later.                                    |
         +======================================================================+
         Error: Could not find or load main class org.apache.sqoop.Sqoop
    
  2. set JAVA_HOME on a Mac by adding the export line to .bash_profile

         $ pico .bash_profile
         export JAVA_HOME=`/usr/libexec/java_home -v 1.8`
    
  3. set HCAT_HOME the same way

         $ pico .bash_profile
         export HCAT_HOME=/usr/local/Cellar/hive/1.1.0/libexec/hcatalog
    
  4. install the JDBC drivers (see my post 2015-05-25-apache-hive-on-mac-osx-yosemite, plus the connectors and drivers pages and the documentation)

         $ curl -L 'http://www.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.35.tar.gz/from/http://mysql.he.net/' | tar xz
         $ cd mysql-connector-java-5.1.35
         $ cp mysql-connector-java-5.1.35-bin.jar /usr/local/Cellar/sqoop/1.4.5/libexec/lib
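
After the steps above, it may help to confirm in a new shell that the variables are set and the driver jar landed in Sqoop's lib directory. The following is a rough sketch, not an official Sqoop check: check_var, has_mysql_driver and the SQOOP_LIB variable are hypothetical names used only here, and the default path is the one from the cp command above.

```shell
#!/bin/sh
# Succeed if the directory in $1 contains at least one MySQL Connector/J jar.
# (has_mysql_driver is a hypothetical helper for this example.)
has_mysql_driver() {
    for jar in "$1"/mysql-connector-java-*.jar; do
        [ -e "$jar" ] && return 0
    done
    return 1
}

# Report whether the variable named by $1 is set to a non-empty value.
check_var() {
    eval "value=\${$1:-}"
    if [ -n "$value" ]; then
        echo "$1 is set"
    else
        echo "$1 is NOT set"
    fi
}

check_var JAVA_HOME
check_var HCAT_HOME

# SQOOP_LIB is an assumed override; the default is the path used by cp above.
SQOOP_LIB=${SQOOP_LIB:-/usr/local/Cellar/sqoop/1.4.5/libexec/lib}
if has_mysql_driver "$SQOOP_LIB"; then
    echo "MySQL driver found in $SQOOP_LIB"
else
    echo "MySQL driver missing from $SQOOP_LIB"
fi
```

If everything reports as set and found, a command such as sqoop list-databases --connect jdbc:mysql://localhost/ --username root -P should be able to reach MySQL; the host and user there are placeholders for your own setup.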
    

