{"data":{"markdownRemark":{"html":"<p>The other data formats on <a href=\":version/reference/data_export\">Data Export</a> all use the default built-in output, which is managed either by configuring what data is serialized or through <a href=\":version/reference/data_management/channels\">Output Channels</a>. However, there may be cases where these types of output do not fit into your data science workflow. This page gives a brief example using H2, an in-memory relational database, to explain how a custom output can be set up. </p>\n<p>As of version 2.4, the Simudyne SDK uses <a href=\"https://github.com/51zero/eel-sdk/\">Eel</a>, a toolkit for manipulating data in the Hadoop ecosystem, to write outputs in Parquet, ORC, or CSV format to locations such as the local file system, HDFS, or Hive tables. Existing users can write output locally and/or distributed on a cluster, and do not need to change anything in their code to make use of this new output technology and enjoy the performance benefits that it provides. 
</p>\n<p>However, because Eel is included in the SDK, users who want to develop a customized data pipeline have additional options that they should feel free to take advantage of.</p>\n<h2 id=\"explaining-source--sink\"><a href=\"#explaining-source--sink\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Explaining Source &#x26; Sink</h2>\n<p>The core data structure in Eel is a DataStream consisting of a Schema and <code class=\"language-text\">n</code> rows containing values for each field in the schema. Conceptually, a DataStream is similar to a table in a normal relational database.</p>\n<p>With Eel, DataStreams can be read from a variety of Sources and, correspondingly, written out to Sinks. These Source/Sink types range from Hive and JDBC databases to Java collections, JSON, and Parquet.</p>\n<div class=\"ui segment info message\">\n  <h4>Supported Eel Projects</h4>\n  By default, Eel Core supports a variety of basic Source/Sink formats such as Parquet and CSV. We also include Hive and Orc so that users can work natively with Hive via our SDK and easily work with Orc-formatted files.\n  \n  However, HBase, Kafka, and Kudu are not included. Users who wish to work with these technologies do so at their own risk. 
If you include these parts of the Eel SDK's libraries, make sure to EXCLUDE the Hive, Hadoop, and Log4J packages from those libraries to avoid conflicting versions.\n</div>\n<p>Here is a basic example, in Scala, of the code structure for a Source/Sink:</p>\n<div class=\"gatsby-highlight\" data-language=\"java\"><pre class=\"language-java\"><code class=\"language-java\">val source <span class=\"token operator\">=</span> <span class=\"token function\">CsvSource</span><span class=\"token punctuation\">(</span><span class=\"token keyword\">new</span> <span class=\"token class-name\">Path</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"input_historical.csv\"</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span>\nval sink <span class=\"token operator\">=</span> <span class=\"token function\">ParquetSink</span><span class=\"token punctuation\">(</span><span class=\"token keyword\">new</span> <span class=\"token class-name\">Path</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"output_npv.pq\"</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span>\nsource<span class=\"token punctuation\">.</span><span class=\"token function\">toDataStream</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">.</span><span class=\"token function\">filter</span><span class=\"token punctuation\">(</span>_<span class=\"token punctuation\">.</span><span class=\"token function\">get</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"npv\"</span><span class=\"token punctuation\">)</span> <span class=\"token operator\">></span> <span class=\"token number\">0</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">.</span><span class=\"token function\">to</span><span class=\"token punctuation\">(</span>sink<span 
class=\"token punctuation\">)</span></code></pre></div>\n<p>Here we read in a CSV file that presumably contains multiple columns and rows of data, and define a Parquet output for the result. We then create a DataStream from the source and process that stream with actions such as <code class=\"language-text\">map</code>, <code class=\"language-text\">filter</code>, <code class=\"language-text\">take</code>, and <code class=\"language-text\">drop</code>. In this case we add a filter that keeps a row only if its NPV value is greater than 0. Rows that pass the filter are written, with the same schema as the input, to the Parquet Sink via the <code class=\"language-text\">.to()</code> method.</p>\n<h2 id=\"defining-a-custom-schema\"><a href=\"#defining-a-custom-schema\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Defining a Custom Schema</h2>\n<p>Note that if you wish to use Custom Output Channels, please refer to our <a href=\":version/reference/data_management/channels\">Reference Docs</a> and use the standard Parquet or Hive outputs.</p>\n<p>However, because of the versatility of Eel, it might be faster or make more sense to define a custom Source/Sink operation within your model that does\nnot have to handle the serialization used by various Simudyne SDK structures. 
As such, the following short example shows how to define a schema when using your own custom\nSource/Sink code.</p>\n<div class=\"gatsby-highlight\" data-language=\"java\"><pre class=\"language-java\"><code class=\"language-java\">val personDetailsStruct <span class=\"token operator\">=</span> Field<span class=\"token punctuation\">.</span><span class=\"token function\">createStructField</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"PERSON_DETAILS\"</span><span class=\"token punctuation\">,</span>\n  <span class=\"token function\">Seq</span><span class=\"token punctuation\">(</span>\n\t<span class=\"token function\">Field</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"NAME\"</span><span class=\"token punctuation\">,</span> StringType<span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span>\n\t<span class=\"token function\">Field</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"AGE\"</span><span class=\"token punctuation\">,</span> IntType<span class=\"token punctuation\">.</span>Signed<span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span>\n\t<span class=\"token function\">Field</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"SALARY\"</span><span class=\"token punctuation\">,</span> <span class=\"token function\">DecimalType</span><span class=\"token punctuation\">(</span><span class=\"token function\">Precision</span><span class=\"token punctuation\">(</span><span class=\"token number\">38</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span> <span class=\"token function\">Scale</span><span class=\"token punctuation\">(</span><span class=\"token number\">5</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span>\n\t<span 
class=\"token function\">Field</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"CREATION_TIME\"</span><span class=\"token punctuation\">,</span> TimestampMillisType<span class=\"token punctuation\">)</span>\n  <span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">)</span>\nval schema <span class=\"token operator\">=</span> <span class=\"token function\">StructType</span><span class=\"token punctuation\">(</span>personDetailsStruct<span class=\"token punctuation\">)</span>\n\nval rows <span class=\"token operator\">=</span> <span class=\"token function\">Vector</span><span class=\"token punctuation\">(</span>\n  <span class=\"token function\">Vector</span><span class=\"token punctuation\">(</span><span class=\"token function\">Vector</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"Fred\"</span><span class=\"token punctuation\">,</span> <span class=\"token number\">50</span><span class=\"token punctuation\">,</span> <span class=\"token function\">BigDecimal</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"50000.99000\"</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span> <span class=\"token keyword\">new</span> <span class=\"token class-name\">Timestamp</span><span class=\"token punctuation\">(</span>System<span class=\"token punctuation\">.</span><span class=\"token function\">currentTimeMillis</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span>\n  <span class=\"token function\">Vector</span><span class=\"token punctuation\">(</span><span class=\"token function\">Vector</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"Gary\"</span><span class=\"token punctuation\">,</span> <span 
class=\"token number\">50</span><span class=\"token punctuation\">,</span> <span class=\"token function\">BigDecimal</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"20000.34000\"</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span> <span class=\"token keyword\">new</span> <span class=\"token class-name\">Timestamp</span><span class=\"token punctuation\">(</span>System<span class=\"token punctuation\">.</span><span class=\"token function\">currentTimeMillis</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span>\n  <span class=\"token function\">Vector</span><span class=\"token punctuation\">(</span><span class=\"token function\">Vector</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"Alice\"</span><span class=\"token punctuation\">,</span> <span class=\"token number\">50</span><span class=\"token punctuation\">,</span> <span class=\"token function\">BigDecimal</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"99999.98000\"</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span> <span class=\"token keyword\">new</span> <span class=\"token class-name\">Timestamp</span><span class=\"token punctuation\">(</span>System<span class=\"token punctuation\">.</span><span class=\"token function\">currentTimeMillis</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">)</span></code></pre></div>\n<p>The rows Vector can instead be populated by an agent action within the step function.</p>\n<div 
class=\"gatsby-highlight\" data-language=\"java\"><pre class=\"language-java\"><code class=\"language-java\"><span class=\"token keyword\">static</span> Action<span class=\"token generics function\"><span class=\"token punctuation\">&lt;</span>Person<span class=\"token punctuation\">></span></span> <span class=\"token function\">updateSalary</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span> <span class=\"token punctuation\">{</span>\n    <span class=\"token keyword\">return</span> Action<span class=\"token punctuation\">.</span><span class=\"token function\">create</span><span class=\"token punctuation\">(</span>\n        Person<span class=\"token punctuation\">.</span><span class=\"token keyword\">class</span><span class=\"token punctuation\">,</span>\n        p <span class=\"token operator\">-</span><span class=\"token operator\">></span> <span class=\"token punctuation\">{</span>\n\t\t\tinterim_row <span class=\"token operator\">=</span> <span class=\"token function\">Vector</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"Fred\"</span><span class=\"token punctuation\">,</span> <span class=\"token number\">50</span><span class=\"token punctuation\">,</span> <span class=\"token function\">BigDecimal</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"50000.99000\"</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span> <span class=\"token keyword\">new</span> <span class=\"token class-name\">Timestamp</span><span class=\"token punctuation\">(</span>System<span class=\"token punctuation\">.</span><span class=\"token function\">currentTimeMillis</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span>\n\t\t\trows<span class=\"token punctuation\">.</span><span class=\"token function\">addAll</span><span class=\"token 
punctuation\">(</span>interim_row<span class=\"token punctuation\">)</span>\n\t\t<span class=\"token punctuation\">}</span>\n\t<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<p>This example structure can then be written out to a Parquet Sink like this:</p>\n<div class=\"gatsby-highlight\" data-language=\"java\"><pre class=\"language-java\"><code class=\"language-java\">DataStream<span class=\"token punctuation\">.</span><span class=\"token function\">fromValues</span><span class=\"token punctuation\">(</span>schema<span class=\"token punctuation\">,</span> rows<span class=\"token punctuation\">)</span><span class=\"token punctuation\">.</span><span class=\"token function\">to</span><span class=\"token punctuation\">(</span><span class=\"token function\">ParquetSink</span><span class=\"token punctuation\">(</span>parquetFilePath<span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span></code></pre></div>\n<h2 id=\"creating-an-h2-sink-via-jdbc\"><a href=\"#creating-an-h2-sink-via-jdbc\" aria-hidden=\"true\" class=\"anchor\"><svg aria-hidden=\"true\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Creating an H2 Sink via JDBC</h2>\n<p>If you are working with H2, you will need to include the driver in your code by adding <code class=\"language-text\">org.h2.Driver</code> to your imports.</p>\n<p>Given a database connection string and a table name, this example returns a JdbcSink, which can be used as the output in the same way as the Parquet Sink shown above. 
For H2, the database string should look something like <code class=\"language-text\">jdbc:h2:~/test</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"java\"><pre class=\"language-java\"><code class=\"language-java\">def <span class=\"token function\">createJDBCSink</span><span class=\"token punctuation\">(</span>dbName<span class=\"token operator\">:</span> String<span class=\"token punctuation\">,</span> tableName<span class=\"token operator\">:</span> String<span class=\"token punctuation\">)</span><span class=\"token operator\">:</span> JdbcSink <span class=\"token operator\">=</span> <span class=\"token punctuation\">{</span>\n\tval driver <span class=\"token operator\">=</span> <span class=\"token string\">\"org.h2.Driver\"</span> \n\n\t<span class=\"token keyword\">try</span> <span class=\"token punctuation\">{</span>\n\t  Class<span class=\"token punctuation\">.</span><span class=\"token function\">forName</span><span class=\"token punctuation\">(</span>driver<span class=\"token punctuation\">)</span>\n\t<span class=\"token punctuation\">}</span> <span class=\"token keyword\">catch</span> <span class=\"token punctuation\">{</span>\n\t  <span class=\"token keyword\">case</span> _<span class=\"token operator\">:</span> Throwable <span class=\"token operator\">=</span><span class=\"token operator\">></span> <span class=\"token function\">println</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"JDBC Driver not Found\"</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n\t<span class=\"token punctuation\">}</span>\n\n\tval dataSource <span class=\"token operator\">=</span> <span class=\"token keyword\">new</span> <span class=\"token class-name\">BasicDataSource</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span>\n\tdataSource<span class=\"token punctuation\">.</span><span class=\"token function\">setDriverClassName</span><span class=\"token 
punctuation\">(</span>driver<span class=\"token punctuation\">)</span>\n\tdataSource<span class=\"token punctuation\">.</span><span class=\"token function\">setUrl</span><span class=\"token punctuation\">(</span>dbName<span class=\"token punctuation\">)</span>\n\tdataSource<span class=\"token punctuation\">.</span><span class=\"token function\">setUsername</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"username1\"</span><span class=\"token punctuation\">)</span>\n\tdataSource<span class=\"token punctuation\">.</span><span class=\"token function\">setPassword</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"password1\"</span><span class=\"token punctuation\">)</span>\n\tdataSource<span class=\"token punctuation\">.</span><span class=\"token function\">setPoolPreparedStatements</span><span class=\"token punctuation\">(</span><span class=\"token boolean\">false</span><span class=\"token punctuation\">)</span>\n\n\t<span class=\"token keyword\">new</span> <span class=\"token class-name\">JdbcSink</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span> <span class=\"token operator\">=</span><span class=\"token operator\">></span> dataSource<span class=\"token punctuation\">.</span><span class=\"token function\">getConnection</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span> tableName<span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">}</span></code></pre></div>","headings":[{"value":"Explaining Source & Sink","depth":2},{"value":"Defining a Custom Schema","depth":2},{"value":"Creating an H2 Sink via JDBC","depth":2}],"frontmatter":{"title":"H2 and JDBC","toc":null,"experimental":null}},"site":{"siteMetadata":{"title":"Simudyne 
Docs","latestVersion":"2.6"}}},"pageContext":{"absolutePath":"/home/vsts/work/1/s/content/2.6/reference/data_export/h2.md","versioned":false,"version":"2.6","kind":"reference","pagePath":"/reference/data_export/h2","chronology":{"prev":{"name":"Hive via Parquet","path":"/reference/data_export/hive"},"next":{"name":"Run and Deploy","path":"/reference/run_deploy"}},"lastUpdated":"2026-04-21T13:56:54.868Z"}}