AUG 30, 2013 14:50
Last time I wrote about Hadoop configuration. I would like to make a clarification: I work with Cloudera's Hadoop distribution, and what I write here applies to that distro only.
Today's blog will be very short. It covers the important files and commands that you, as a Hadoop developer/admin, need to familiarize yourself with.
Important Files
There are a few important configuration files that you need to know about. All of them are found under /etc/hadoop/conf:
- hadoop-env.sh: Environment-specific configuration (shell variables such as JAVA_HOME)
- core-site.xml: System-level Hadoop configuration
- hdfs-site.xml: HDFS settings
- mapred-site.xml: MapReduce settings
- masters: Contains a list of hosts that act as Hadoop masters
- slaves: Contains a list of hosts that act as Hadoop slaves
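To give a flavor of what the XML files look like, here is a minimal core-site.xml sketch. The hostname namenode01 is a placeholder, not a value from a real cluster, and newer Hadoop releases spell the property fs.defaultFS:

    <?xml version="1.0"?>
    <!-- Minimal sketch of /etc/hadoop/conf/core-site.xml; namenode01 is a placeholder -->
    <configuration>
      <property>
        <!-- URI of the default file system; points clients at the NameNode -->
        <name>fs.default.name</name>
        <value>hdfs://namenode01:8020</value>
      </property>
    </configuration>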
Important Commands
Before you start using the following commands, you will need to make sure that the NameNode is running on your host. If it is not, you will get a network exception.
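One quick way to check is sketched below, assuming the JDK's jps tool is on your PATH; the service name shown is the CDH4 package name and may differ in other versions:

    # List running Java processes; a healthy host shows NameNode among them
    jps

    # On CDH you can also ask the init script directly (hadoop-hdfs-namenode
    # is the CDH4 package name and may differ between releases)
    sudo service hadoop-hdfs-namenode status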
- "hadoop fs –ls /" Lists the files in HDFS root directory
- "hadoop job –lis"t Lists all the jobs running
What’s next?
As I mentioned, today's blog is very short due to my heavy workload. In my next blog I will elaborate on the files and commands discussed today.
If you or your company is interested in more information on this and other topics, be sure to keep reading my blog.
Also, as I am with the IEEE Computer Society, I should mention that technology resources are available 24/7, along with specific training on custom topics. Here is the IEEE CS program link if you are interested: TechLeader Training Partner Program, http://www.computer.