...

To find the IP address, the following command can be entered outside of PySpark:

Code Block
nslookup cn462cnXX  (where XX is the compute node number)

Alternatively, a user can ping the node; the reply will contain the compute node's IP address.

Code Block
ping cn462cnXX  (where XX is the compute node number)

Once the IP address is placed into the HTTP link, the Web UI should load and look like the following:
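With the resolved IP in hand, the link can be built as in the sketch below. The IP value is a placeholder, and port 8080 is the Spark standalone master's default Web UI port, which is an assumption about this site; adjust if your installation uses a different port.

```shell
# Build the Web UI link from the IP returned by nslookup/ping.
# 10.1.2.3 is a placeholder; 8080 is the assumed default Web UI port.
NODE_IP=10.1.2.3
echo "http://${NODE_IP}:8080"
```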

...

Submission script

The spark-submit command from the previous submission script example should be able to run the Spark Python code.
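As a sketch, such a spark-submit call might look like the following, assuming the master host name has been saved to master.txt (as the submission script does with `tee`) and that the cluster uses the Spark standalone default port 7077. The host name cn01 is a placeholder; the snippet writes a sample master.txt so it is self-contained, and echoes the command rather than running it (drop the final `echo` to actually submit).

```shell
# Sketch only: submit script.py to the standalone master recorded in
# master.txt. Port 7077 is the Spark standalone default (an assumption
# about this site); cn01 is a placeholder host name.
echo cn01 > master.txt
MASTER_HOST=$(cat master.txt)
echo spark-submit --master "spark://${MASTER_HOST}:7077" script.py
```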

Another way to call pyspark and pass it the Python script is the following:

Code Block
#!/bin/bash
#SBATCH --job-name=jobNameHere   # create a short name for your job
#SBATCH --nodes=2                # node count
#SBATCH --ntasks=252             # total number of tasks across all nodes
#SBATCH --mem=20G                # memory per node
#SBATCH --time=00:05:00          # total run time limit (HH:MM:SS)
#SBATCH --no-requeue             # do not requeue the job after a failure

module load spark/3.5.0-mine

start-all.sh                     # start the Spark master and workers
echo $MASTER | tee master.txt    # record the master host for later reference

pyspark < script.py              # feed the script to the PySpark shell on stdin

stop-all.sh                      # shut the Spark cluster down when the job finishes