install hadoop
$ brew install hadoop
==> Downloading http://www.apache.org/dyn/closer.cgi?path=hadoop/common/hadoop-2.6.0/h
==> Best Mirror http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.6.0/hadoop-2.
######################################################################## 100.0%
==> Caveats
In Hadoop's config file:
/usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/hadoop-env.sh,
/usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/mapred-env.sh and
/usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/yarn-env.sh
$JAVA_HOME has been set to be the output of:
/usr/libexec/java_home
==> Summary
🍺 /usr/local/Cellar/hadoop/2.6.0: 6140 files, 307M, built in 8.9 minutes
- hadoop will be installed in the directory /usr/local/Cellar/hadoop
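- optionally, check that /usr/libexec/java_home resolves to a JDK, since the brew caveats derive $JAVA_HOME from its output
$ /usr/libexec/java_home
# should print the home directory of the default JDK; if it fails, install a JDK first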
configuring hadoop
- create a soft-link (see my post on the ln command for the difference between a soft link and a hard link)
$ cd /usr/local
$ ln -s Cellar/hadoop/2.6.0 hadoop
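- optionally, verify the link; ls should show it pointing at Cellar/hadoop/2.6.0
$ ls -l /usr/local/hadoop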
- edit hadoop-env.sh
$ cd hadoop/libexec/etc/hadoop/
$ pico hadoop-env.sh

find the HADOOP_OPTS line, comment it out and replace it:

# export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="
- edit core-site.xml
$ pico core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hdfs/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
- edit mapred-site.xml
$ pico mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9010</value>
  </property>
</configuration>
- edit hdfs-site.xml
$ pico hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
- create the aliases hstart and hstop
$ cd
$ pico .bash_profile

alias hstart="/usr/local/hadoop/sbin/start-dfs.sh;/usr/local/hadoop/sbin/start-yarn.sh"
alias hstop="/usr/local/hadoop/sbin/stop-yarn.sh;/usr/local/hadoop/sbin/stop-dfs.sh"
- and execute
$ source ~/.bash_profile
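- optionally, check that the shell picked up the new aliases
$ type hstart hstop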
- before we can run hadoop we first need to format HDFS
$ hdfs namenode -format
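- if the format succeeds, the namenode metadata should end up under the hadoop.tmp.dir set in core-site.xml (dfs/name is the default subdirectory in hadoop 2.x); assuming the paths above, a quick check is
$ ls /usr/local/hadoop/hdfs/tmp/dfs/name/current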
ssh localhost
- check ssh keys
- nothing needs to be done here if you have already generated ssh keys
- to verify, just check for the existence of the ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub files
- if not, the keys can be generated using
$ ssh-keygen -t rsa -P ""
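- a one-liner for the check above: if ls reports "No such file or directory", generate the keys as shown
$ ls ~/.ssh/id_rsa ~/.ssh/id_rsa.pub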
- enable remote login
- check remote login
# system preferences -> sharing -> remote login
$ sudo systemsetup -setremotelogin on
Password:
setremotelogin: remote login is already On.
$ sudo systemsetup -setremotelogin off
Do you really want to turn remote login off? If you do, you will lose this connection and can only turn it back on locally at the server (yes/no)? yes
$ sudo systemsetup -setremotelogin on
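- the current state can also be queried without toggling it
$ sudo systemsetup -getremotelogin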
- authorize ssh keys
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
- try to login
$ ssh localhost
> last login: Fri Mar ...
$ exit
running hadoop
- now we can run hadoop just by typing
$ hstart
- and stop it using
$ hstop
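- after hstart, the running daemons can be checked with jps; in this pseudo-distributed setup you should see NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager
$ jps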
download examples
- to test the installation you need the hadoop-examples jar
- test them out using
$ hadoop jar </path/to/hadoop-examples file> pi 10 100
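- the examples jar also ships inside the hadoop distribution installed by brew, so you can point the command above at it instead of a separate download; the exact path may differ, so locate it first
$ find /usr/local/Cellar/hadoop -name "hadoop-mapreduce-examples-*.jar"
$ hadoop jar /usr/local/Cellar/hadoop/2.6.0/libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 10 100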
good to know
- hadoop web interface
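- assuming the default ports, the hadoop 2.x web UIs are
$ open http://localhost:50070   # NameNode (HDFS)
$ open http://localhost:8088    # ResourceManager (YARN)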
errors
- failed to start namenode
- run the namenode in the foreground to see what is wrong
$ hdfs namenode
- and the problem is…
- which can be fixed by reformatting the namenode
$ hadoop namenode -format
- no such file or directory
$ hstart
$ hdfs dfs -ls /
- then we need to create the default directory structure hadoop expects
$ whoami
> spaceship
$ hdfs dfs -mkdir -p /user/spaceship
> ...
$ hdfs dfs -ls
> ...
$ hdfs dfs -put book.txt
> ...
$ hdfs dfs -ls
> ...
> found 1 items