Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 5 additions & 5 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -1,14 +1,14 @@
FROM sequenceiq/hadoop-docker:2.6.0
FROM sequenceiq/hadoop-docker:2.7.0
MAINTAINER SequenceIQ

#support for Hadoop 2.6.0
RUN curl -s http://d3kbcqa49mib13.cloudfront.net/spark-1.6.1-bin-hadoop2.6.tgz | tar -xz -C /usr/local/
RUN cd /usr/local && ln -s spark-1.6.1-bin-hadoop2.6 spark
#support for Hadoop 2.7.0
RUN curl -s http://d3kbcqa49mib13.cloudfront.net/spark-2.0.0-bin-hadoop2.7.tgz | tar -xz -C /usr/local/
RUN cd /usr/local && ln -s spark-2.0.0-bin-hadoop2.7 spark
ENV SPARK_HOME /usr/local/spark
RUN mkdir $SPARK_HOME/yarn-remote-client
ADD yarn-remote-client $SPARK_HOME/yarn-remote-client

RUN $BOOTSTRAP && $HADOOP_PREFIX/bin/hadoop dfsadmin -safemode leave && $HADOOP_PREFIX/bin/hdfs dfs -put $SPARK_HOME-1.6.1-bin-hadoop2.6/lib /spark
RUN $BOOTSTRAP && $HADOOP_PREFIX/bin/hadoop dfsadmin -safemode leave && $HADOOP_PREFIX/bin/hdfs dfs -put $SPARK_HOME-2.0.0-bin-hadoop2.7/jars /spark

ENV YARN_CONF_DIR $HADOOP_PREFIX/etc/hadoop
ENV PATH $PATH:$SPARK_HOME/bin:$HADOOP_PREFIX/bin
Expand Down
14 changes: 7 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,12 +10,12 @@ The base Hadoop Docker image is also available as an official [Docker image](htt

##Pull the image from Docker Repository
```
docker pull sequenceiq/spark:1.6.0
docker pull sequenceiq/spark:2.0.0
```

## Building the image
```
docker build --rm -t sequenceiq/spark:1.6.0 .
docker build --rm -t sequenceiq/spark:2.0.0
```

## Running the image
Expand All @@ -24,16 +24,16 @@ docker build --rm -t sequenceiq/spark:1.6.0 .
* in your /etc/hosts file add $(boot2docker ip) as host 'sandbox' to make it easier to access your sandbox UI
* open yarn UI ports when running container
```
docker run -it -p 8088:8088 -p 8042:8042 -p 4040:4040 -h sandbox sequenceiq/spark:1.6.0 bash
docker run -it -p 8088:8088 -p 8042:8042 -p 4040:4040 -h sandbox sequenceiq/spark:2.0.0 bash
```
or
```
docker run -d -h sandbox sequenceiq/spark:1.6.0 -d
docker run -d -h sandbox sequenceiq/spark:2.0.0 -d
```

## Versions
```
Hadoop 2.6.0 and Apache Spark v1.6.0 on Centos
Hadoop 2.7.0 and Apache Spark v2.0.0 on Centos
```

## Testing
Expand Down Expand Up @@ -71,7 +71,7 @@ spark-submit \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
$SPARK_HOME/lib/spark-examples-1.6.0-hadoop2.6.0.jar
$SPARK_HOME/lib/spark-examples-2.0.0-hadoop2.7.0.jar
```

Estimating Pi (yarn-client mode):
Expand All @@ -84,7 +84,7 @@ spark-submit \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
$SPARK_HOME/lib/spark-examples-1.6.0-hadoop2.6.0.jar
$SPARK_HOME/lib/spark-examples-2.0.0-hadoop2.7.0.jar
```

### Submitting from the outside of the container
Expand Down
2 changes: 1 addition & 1 deletion bootstrap.sh
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ cd $HADOOP_PREFIX/share/hadoop/common ; for cp in ${ACP//,/ }; do echo == $cp;
sed s/HOSTNAME/$HOSTNAME/ /usr/local/hadoop/etc/hadoop/core-site.xml.template > /usr/local/hadoop/etc/hadoop/core-site.xml

# setting spark defaults
echo spark.yarn.jar hdfs:///spark/spark-assembly-1.6.0-hadoop2.6.0.jar > $SPARK_HOME/conf/spark-defaults.conf
echo spark.yarn.jar hdfs:///spark/spark-yarn_2.11-2.0.0.jar > $SPARK_HOME/conf/spark-defaults.conf
cp $SPARK_HOME/conf/metrics.properties.template $SPARK_HOME/conf/metrics.properties

service sshd start
Expand Down