...
To find the IP address, the following command can be entered in a shell outside of PySpark:
```
nslookup cn462cnXX   # XX is the compute node number
```
Alternatively, a user can ping the node; the reply will contain the compute node's IP address.
```
ping cn462cnXX   # XX is the compute node number
```
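The lookup and the link can also be combined in one step. The sketch below uses `getent` to resolve the hostname and prints the resulting Web UI URL; `localhost` stands in for the `cn462cnXX` hostname, and 8080 is assumed to be Spark's default standalone master Web UI port:

```shell
# resolve the node's IP address and build the Web UI URL from it
# (localhost is a stand-in for the compute node hostname, e.g. cn462cnXX;
#  8080 is Spark's default standalone master Web UI port)
ip=$(getent hosts localhost | awk '{print $1; exit}')
echo "http://${ip}:8080"
```

On the cluster, substituting the actual compute node hostname yields the link to paste into a browser.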
Once the IP address is placed within the HTTP link, the Web UI should load and look like the following:
...
Submission script
The spark-submit command from the previous submission script example can run the Spark Python code. Another way is to call pyspark directly and feed it the Python script, as in the following submission script:
```
#!/bin/bash
#SBATCH --job-name=jobNameHere   # create a short name for your job
#SBATCH --nodes=2                # node count
#SBATCH --ntasks=252             # total number of tasks across all nodes
#SBATCH --mem=20G                # memory per node
#SBATCH --time=00:05:00          # total run time limit (HH:MM:SS)
#SBATCH --no-requeue             # do not requeue the job if it is preempted

module load spark/3.5.0-mine

# start the standalone Spark cluster and record the master URL
start-all.sh
echo $MASTER | tee master.txt

# feed the Python script to the pyspark shell, then shut the cluster down
pyspark < script.py
stop-all.sh
```