{"data":{"markdownRemark":{"html":"<h2 id=\"running-on-hadoop\"><a href=\"#running-on-hadoop\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Running on Hadoop</h2>\n<p>The Simudyne SDK only requires a Hadoop-based environment with HDFS, YARN, and Spark 3.2+ installed on your cluster. Optionally, if you wish to use Parquet with Hive, Hive will also need to be installed; <a href=\":version/reference/data_export/hive\">see here for how to configure it</a>.</p>\n<div class=\"ui segment warning message\">\nSpecific Versions\n<ul>\n<li>Spark 2 - use version 2.3.x of the Simudyne SDK</li>\n<li>Spark 3 - use version 2.5.2+ of the Simudyne SDK</li>\n</ul>\n<p>Please note that you should not use versions 2.4.0-2.5.0 of the SDK if you wish to use Spark. Version 2.4 uses Scala 2.12, which is only supported by Spark 3, and the libraries that complete this Spark 3 support are included only in version 2.5.2+.</p>\n</div>\n<p>Because the SDK runs on both the driver and worker nodes, Java 8+ is required on all nodes; most recent Hadoop installations will satisfy this.</p>\n<p>Our recommended setup is to use <a href=\"https://www.cloudera.com/products/cloudera-data-platform.html\">Cloudera CDP</a> for connecting to your existing, Azure, or AWS environments. A Data Engineering template that includes Spark is recommended for use with Simudyne's SDK.</p>\n<p>However, as long as you have a valid Hadoop cluster with Spark, Spark on YARN, and HDFS, you should be able to work with the SDK. 
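As a sketch, submitting a packaged simulation to YARN might look like the following (the class name and JAR name here are illustrative placeholders, not part of the SDK):</p>\n<div class=\"gatsby-highlight\" data-language=\"bash\"><pre class=\"language-bash\"><code class=\"language-bash\"># the --class and JAR names below are placeholders for your own simulation\nspark-submit --master yarn --deploy-mode cluster --class org.example.MyModelMain my-simulation-assembly.jar</code></pre></div>\n<p>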
The core requirement is the ability to submit a Spark job that includes a single packaged JAR file of your simulation, along with any configuration or data sources it needs.</p>\n<h2 id=\"setting-up-spark\"><a href=\"#setting-up-spark\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Setting up Spark</h2>\n<p>The first requirement is to install Spark, running either standalone or on top of Hadoop YARN. <strong>The required version is Spark 3.2+.</strong></p>\n<p>We recommend using the version of Spark running on Cloudera products: <a href=\"https://www.cloudera.com/products/open-source/apache-hadoop/apache-spark.html\">https://www.cloudera.com/products/open-source/apache-hadoop/apache-spark.html</a></p>\n<p>Once Spark is installed, you can check that it is running correctly by launching the Spark shell in a terminal:</p>\n<div class=\"gatsby-highlight\" data-language=\"bash\"><pre class=\"language-bash\"><code class=\"language-bash\">./bin/spark-shell</code></pre></div>\n<p>\n  <a\n    class=\"gatsby-resp-image-link\"\n    href=\"/static/bash_spark-shell-cb318999964e5c815243c27abc78d4e5-68bde.png\"\n    style=\"display: block\"\n    target=\"_blank\"\n    rel=\"noopener\"\n  >\n  \n  <span\n    class=\"gatsby-resp-image-wrapper\"\n    style=\"position: relative; display: block; padding: 20px; max-width: 642px; margin-left: auto; margin-right: auto;\"\n  >\n    <span\n      class=\"gatsby-resp-image-background-image\"\n      style=\"padding-bottom: 47.28971962616823%; position: relative; bottom: 0; left: 0; background-image: 
url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAJCAYAAAAywQxIAAAACXBIWXMAABJ0AAASdAHeZh94AAABQUlEQVQoz5VS2W6DQAxcQjkDhDOcAYWiqj+VKFKi/P8fTD1uidKqL3kYeXdtj2cMZvn4xDzPOB6PP3HGsizo+h6HYUAvkRjk3DStxqIokKYpkiRBGIZ6Nsbgfr/DtP0BbdOgbVslyPMcu90OcRwr2LDdbrU5iiJ945k1fLdtW2tIeL1eYbphRN912O/3mKZJlVFBVVUPsFkhSpjLsgxlWSLLM30PgkAJb7ebKBy+FTaCcRwVdV0rVLXcmSMB1VcS10FlUSq5H/hKeD6fYab5/UHQidJUCgppWu1RAS36vq9KaNN1XSVYQduMp9MJJowTKfRhWZYm1uQrWHsulwuM9eYgkOme56kCTnccR+Nms3mdMElzsRSrRX4tRlrkr8A7B9EuBxAc+nfQL8KiqpHKnkjAfXH5JCIJC7mKZ/yn+pnwCwBa1gEi5E8DAAAAAElFTkSuQmCC'); background-size: cover; display: block;\"\n    >\n      <picture>\n        <source\n          srcset=\"/static/bash_spark-shell-cb318999964e5c815243c27abc78d4e5-38156.webp 173w,\n/static/bash_spark-shell-cb318999964e5c815243c27abc78d4e5-85678.webp 345w,\n/static/bash_spark-shell-cb318999964e5c815243c27abc78d4e5-5c2c3.webp 690w,\n/static/bash_spark-shell-cb318999964e5c815243c27abc78d4e5-e2dab.webp 1035w,\n/static/bash_spark-shell-cb318999964e5c815243c27abc78d4e5-efaec.webp 1070w\"\n          sizes=\"(max-width: 642px) 100vw, 642px\"\n          type=\"image/webp\"\n        />\n        <source\n          srcset=\"/static/bash_spark-shell-cb318999964e5c815243c27abc78d4e5-c006b.png 173w,\n/static/bash_spark-shell-cb318999964e5c815243c27abc78d4e5-484fe.png 345w,\n/static/bash_spark-shell-cb318999964e5c815243c27abc78d4e5-51909.png 690w,\n/static/bash_spark-shell-cb318999964e5c815243c27abc78d4e5-7c4f0.png 1035w,\n/static/bash_spark-shell-cb318999964e5c815243c27abc78d4e5-68bde.png 1070w\"\n          sizes=\"(max-width: 642px) 100vw, 642px\"\n          type=\"image/png\"\n        />\n        <img\n          class=\"gatsby-resp-image-image\"\n          style=\"width: 100%; height: 100%; margin: 0; vertical-align: middle; position: absolute; top: 0; left: 0; box-shadow: inset 0px 0px 0px 400px white;\"\n          src=\"/static/bash_spark-shell-cb318999964e5c815243c27abc78d4e5-51909.png\"\n          alt=\"bash spark shell\"\n 
         title=\"\"\n        />\n      </picture>\n      </span>\n  </span>\n  \n  </a>\n    </p>\n<p>You need to identify your <strong>Spark master URL</strong>, which points to the master node of your cluster. Above, the master URL indicates Spark is running locally (master = local[*]).\nOn a standalone cluster, the master URL will generally be of the form <code class=\"language-text\">spark://host:port</code>.</p>","headings":[{"value":"Running on Hadoop","depth":2},{"value":"Setting up Spark","depth":2}],"frontmatter":{"title":"Distributed Requirements","toc":null,"experimental":null}},"site":{"siteMetadata":{"title":"Simudyne Docs","latestVersion":"2.6"}}},"pageContext":{"absolutePath":"/home/vsts/work/1/s/content/2.5/reference/distributed_computation/distributed_requirements.md","versioned":true,"version":"2.5","kind":"reference","pagePath":"/reference/distributed_computation/distributed_requirements","chronology":{"prev":{"name":"Distributed Computation","path":"/reference/distributed_computation"},"next":{"name":"Spark Setup","path":"/reference/distributed_computation/spark_setup"}},"lastUpdated":"2026-04-21T13:56:54.861Z"}}