{"data":{"markdownRemark":{"html":"<p>The Simudyne SDK has built-in output generators for all of its run types, in multiple formats. The following pages describe any restrictions, requirements, or configuration specific to each output format. However, most information on using these outputs is the same regardless of format. </p>\n<h2 id=\"output-formats\"><a href=\"#output-formats\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Output Formats</h2>\n<ul>\n<li><a href=\":version/reference/data_export/parquet\">Parquet</a>: Parquet is the preferred output format for the SDK due to its ability to minimize file size and work with data science tools</li>\n<li><a href=\":version/reference/data_export/json\">JSON</a>: Creates static JSON files for consumption; not to be confused with the output returned via the REST API</li>\n<li><a href=\":version/reference/data_export/csv\">CSV</a>: CSV tables are easy to import and work with in Excel</li>\n<li><a href=\":version/reference/data_export/sql\">MySQL</a>: Currently only MySQL tables are supported</li>\n<li><a href=\":version/reference/data_export/hive\">HIVE via Parquet</a>: Allows you to use HIVE tables on an existing or external cluster in the Parquet format</li>\n<li><a href=\":version/reference/data_export/h2\">H2 &#x26; Other Connections</a>: A brief explanation of other ways to handle output connections.</li>\n</ul>\n<h2 id=\"batch-run-export\"><a href=\"#batch-run-export\" aria-hidden=\"true\" 
class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Batch Run Export</h2>\n<p>By default, when running a batch run, Agent and Link data is not serialised, and so not output to parquet. This is to reduce the amount of data being held in memory when sending the batch results to the console. If the data is being output to parquet and does not need to be viewed on the console, the in memory data storage can be turned off allowing the Simudyne SDK to export Agent and Link data to parquet as well as the general Model data. This is done by setting the config field <code class=\"language-text\">core.return-data</code> to <code class=\"language-text\">false</code>. 
</p>\n<p>For large model runs that produce a lot of data, setting this config field to false will also reduce the amount of memory being held by the simulation, which can help avoid potential OutOfMemory exceptions and improve the efficiency of the model.</p>\n<p>If the data does not need to be displayed on the console and Agent and Link data is not needed either, the config fields <code class=\"language-text\">core-abm.serialize.agents</code> and <code class=\"language-text\">core-abm.serialize.links</code> should be set to false to avoid generating unnecessary data.</p>\n<h2 id=\"scenario-run-export\"><a href=\"#scenario-run-export\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Scenario Run Export</h2>\n<p>Scenario runs do not hold the data in memory because they are not managed by the console, and the data cannot be viewed on the console. This means that Agent and Link data is serialised by default, and so should be explicitly turned off if not needed. (Use the config fields <code class=\"language-text\">core-abm.serialize.agents</code> and <code class=\"language-text\">core-abm.serialize.links</code> to control this.)</p>\n<p>Data export format for scenario runs is controlled via the POST request sent to start the scenario run. (See the scenario REST specification for more details on the POST request <a href=\":version/rest_api/scenario\">here</a>.)</p>\n<p>By default, the scenario will output data as JSON files. 
To specify the output format as parquet, set the 'format' field in the 'output' section of the POST request.</p>\n<div class=\"gatsby-highlight\" data-language=\"json\"><pre class=\"language-json\"><code class=\"language-json\"><span class=\"token punctuation\">{</span>\n  //Other scenario json fields\n  <span class=\"token property\">\"output\"</span><span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span><span class=\"token property\">\"uri\"</span><span class=\"token operator\">:</span> <span class=\"token string\">\"/path/to/export/to\"</span> <span class=\"token punctuation\">,</span> <span class=\"token property\">\"format\"</span><span class=\"token operator\">:</span> <span class=\"token string\">\"parquet\"</span><span class=\"token punctuation\">}</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<h2 id=\"model-sampler-export\"><a href=\"#model-sampler-export\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Model Sampler Export</h2>\n<p>The model sampler will always output data to parquet. 
As with scenarios, the data is not held in memory, so Agent and Link data is serialised by default and should be explicitly turned off if not needed, using the config fields <code class=\"language-text\">core-abm.serialize.agents</code> and <code class=\"language-text\">core-abm.serialize.links</code>.</p>\n<h2 id=\"output-directory-structure\"><a href=\"#output-directory-structure\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Output Directory Structure</h2>\n<p>When exporting data to parquet, the folder layout can be specified using the config field <code class=\"language-text\">core.export.folder-structure</code>. Two options are supported for this field: <code class=\"language-text\">group-by-type</code> and <code class=\"language-text\">group-by-run</code>. 
If no value is specified, it will default to <code class=\"language-text\">group-by-type</code>.</p>\n<h3 id=\"group-by-type-structure\"><a href=\"#group-by-type-structure\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Group by type structure</h3>\n<p>When the output folder structure is group by type, a folder is created for each output table type, and an output file for each run is created inside these folders.</p>\n<p>For this example, the root export directory passed through the config field <code class=\"language-text\">core.export-path</code> is /exportFolder. 
The output is also shown as parquet files, but could be JSON/CSV/etc.</p>\n<p class=\"code-header\">Group by type batch output folders</p>\n<div class=\"gatsby-highlight\" data-language=\"text\"><pre class=\"language-text\"><code class=\"language-text\">/exportFolder/\n    {simulation_id}/\n        runs/\n            root/\n              run000.parquet\n              run001.parquet\n              run002.parquet\n            root__system__Agents__Cell\n              run000.parquet\n              run001.parquet\n              run002.parquet\n            metadata.json\n            finished.json</code></pre></div>\n<ul>\n<li>exportFolder -> This is the root export directory</li>\n<li>{simulation_id} -> This is the UUID created for every run of the simulation (This is the ID used with the REST API)</li>\n<li>runs -> The root folder for all output run data</li>\n<li>root -> The data for each output table type will be in its own folder    </li>\n<li>run000.parquet, run001.parquet -> The output files created for each run.</li>\n<li>metadata.json -> A file containing some metadata about the data produced.</li>\n<li>finished.json -> An empty file created to signal that no new data will be added to this directory.</li>\n</ul>\n<p class=\"code-header\">Group by type scenario output folders</p>\n<div class=\"gatsby-highlight\" data-language=\"text\"><pre class=\"language-text\"><code class=\"language-text\">/exportFolder/\n    {simulation_id}/\n        runs/\n            root/\n              scenario0run0001.parquet\n              scenario0run0002.parquet\n              scenario0run0003.parquet\n            root__system__Agents__Cell\n              scenario0run0001.parquet\n              scenario0run0002.parquet\n              scenario0run0003.parquet\n            metadata.json\n            finished.json</code></pre></div>\n<ul>\n<li>exportFolder -> This is the root export directory</li>\n<li>{simulation_id} -> This is the UUID created for every run of the simulation (This is 
the ID used with the REST API)</li>\n<li>runs -> The root folder for all output run data</li>\n<li>root -> The data for each output table type will be in its own folder</li>\n<li>scenario0run0001.parquet, scenario0run0002.parquet -> The output files created for each run.</li>\n<li>metadata.json -> A file containing some metadata about the data produced.</li>\n<li>finished.json -> An empty file created to signal that no new data will be added to this directory.</li>\n</ul>\n<p>The model sampler output folders will match the scenario output folders. </p>\n<h3 id=\"group-by-run-structure\"><a href=\"#group-by-run-structure\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Group by run structure</h3>\n<p>When the output folder structure is group by run, a folder is created for each simulation run, and an output file for each table type is created inside these folders.</p>\n<p>For this example, the root export directory passed through the config field <code class=\"language-text\">core.export-path</code> is /exportFolder.</p>\n<p class=\"code-header\">Group by run batch output folders</p>\n<div class=\"gatsby-highlight\" data-language=\"text\"><pre class=\"language-text\"><code class=\"language-text\">/exportFolder/\n    {simulation_id}/\n        runs/\n            run000/\n              root001.parquet\n              root__system__Agents__Cell001.parquet\n            run001/\n              root001.parquet\n              root__system__Agents__Cell001.parquet\n            run002/\n              root001.parquet\n              
root__system__Agents__Cell001.parquet\n            metadata.json\n            finished.json</code></pre></div>\n<ul>\n<li>exportFolder -> This is the root export directory</li>\n<li>{simulation_id} -> This is the UUID created for every run of the simulation (This is the ID used with the REST API)</li>\n<li>runs -> The root folder for all output run data</li>\n<li>run000 -> The data for each run of the simulation will be in its own folder</li>\n<li>root001.parquet, root__system__Agents__Cell001.parquet -> The output files created.</li>\n<li>metadata.json -> A file containing some metadata about the data produced.</li>\n<li>finished.json -> An empty file created to signal that no new data will be added to this directory.</li>\n</ul>\n<p class=\"code-header\">Group by run scenario output folders</p>\n<div class=\"gatsby-highlight\" data-language=\"text\"><pre class=\"language-text\"><code class=\"language-text\">/exportFolder/\n    {simulation_id}/\n        runs/\n            scenario0.run0/\n              root001.parquet\n              root__system__Agents__Cell001.parquet\n            metadata.json\n            finished.json</code></pre></div>\n<ul>\n<li>exportFolder -> This is the root export directory</li>\n<li>{simulation_id} -> This is the UUID created for every run of the simulation (This is the ID used with the REST API)</li>\n<li>runs -> The root folder for all output run data</li>\n<li>scenario0.run0 -> The data for each scenario and run will be in its own folder</li>\n<li>root001.parquet, root__system__Agents__Cell001.parquet -> The output files created.</li>\n<li>metadata.json -> A file containing some metadata about the data produced.</li>\n<li>finished.json -> An empty file created to signal that no new data will be added to this directory.</li>\n</ul>\n<p>The model sampler output folders will match the scenario output folders. 
</p>\n<h3 id=\"metadatajson\"><a href=\"#metadatajson\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>metadata.json</h3>\n<p>A metadata file is added to the data export giving details about the data. The metadata contains</p>\n<ul>\n<li>model_name -> The name of the model that we can use to query the API</li>\n<li>source -> Simudyne</li>\n<li>source_version -> The version of The Simudyne SDK that produced this data</li>\n<li>format -> Parquet</li>\n<li>creation_date -> The date this data was produced</li>\n<li>schema -> The nested schema that matches this data output</li>\n<li>custom -> Custom data that can be passed through in the create simulation request</li>\n</ul>\n<h3 id=\"finishedjson\"><a href=\"#finishedjson\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>finished.json</h3>\n<p>This is an empty file created at the end of a run to let you know that no new output files will be created in this directory.</p>\n<div class=\"ui segment info message\">\n<h4>Metadata/Finished.json locations</h4>\nThe location of these files will be in the main 
output directory if running in batch mode, due to the grouping of output locations.\n<p>Also, if you are working with a SQL/Hive location, you will need to specify this output folder for these files alongside the corresponding connection URL for the external tables.</p>\n</div>","headings":[{"value":"Output Formats","depth":2},{"value":"Batch Run Export","depth":2},{"value":"Scenario Run Export","depth":2},{"value":"Model Sampler Export","depth":2},{"value":"Output Directory Structure","depth":2},{"value":"Group by type structure","depth":3},{"value":"Group by run structure","depth":3},{"value":"metadata.json","depth":3},{"value":"finished.json","depth":3}],"frontmatter":{"title":"Data Output","toc":true,"experimental":null}},"site":{"siteMetadata":{"title":"Simudyne Docs","latestVersion":"2.6"}}},"pageContext":{"absolutePath":"/home/vsts/work/1/s/content/2.6/reference/data_export.md","versioned":false,"version":"2.6","kind":"reference","pagePath":"/reference/data_export","chronology":{"prev":{"name":"Multirun Output","path":"/reference/data_management/multirun"},"next":{"name":"Parquet","path":"/reference/data_export/parquet"}},"lastUpdated":"2026-04-21T13:56:54.868Z"}}